
how is this related to existing systems like riemann? #29

Closed
timofeytt opened this issue Jul 25, 2020 · 2 comments

@timofeytt
What is the overlapping functionality with existing systems like riemann and how can they be used better together?

@timofeytt (Author)

A note about metrics-clojure would be helpful, too.

BrunoBonacci self-assigned this Jul 25, 2020
BrunoBonacci added the question (a request for information or clarification) label Jul 25, 2020
@BrunoBonacci (Owner)

Hi @timofeytt

You are correct that Riemann somewhat overlaps with µ/log: both are event-based systems, although in Riemann the basic event is a metric event (an event that describes or samples a metric).
In µ/log, each event is a free-form, pure-data event. As in Riemann, you have a number of categorical properties (tags) which you can use to "slice & dice" the events and group them the way you want; but, unlike Riemann, µ/log doesn't constrain the user to a single numerical field.
If an event needs multiple numerical properties to describe it fully, in µ/log you can pack all that information into a single event.
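For instance, a single µ/log event can carry several numerical fields alongside the categorical tags (the event name and fields below are hypothetical; the `log` and `start-publisher!` calls are µ/log's actual API):

```clojure
(require '[com.brunobonacci.mulog :as u])

;; print events to the console (many other publishers are available)
(u/start-publisher! {:type :console})

;; one free-form event with several categorical properties (used for
;; slicing & dicing) AND several numerical properties together
(u/log ::http-request
       :method     :get        ;; categorical
       :endpoint   "/orders"   ;; categorical
       :status     200         ;; categorical
       :latency-ms 42          ;; numerical
       :body-size  1834        ;; numerical
       :db-time-ms 12)         ;; numerical
```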

Another difference is that the core of Riemann is a streaming and aggregation engine which lets you turn raw data into high-level (meaningful) insights; µ/log (at this stage) is just a client that produces the raw information.
It is entirely possible to write a µ/log publisher that sends µ/log events to Riemann in its expected format.
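A rough sketch of what such a publisher could look like, assuming µ/log's custom-publisher protocol; the `event->riemann` conversion and `send-to-riemann!` are hypothetical placeholders (the actual delivery would use a Riemann client library):

```clojure
(require '[com.brunobonacci.mulog.buffer :as rb])

(defn event->riemann
  "Hypothetical conversion: one Riemann metric event per numerical field."
  [{:keys [mulog/event-name mulog/timestamp] :as event}]
  (for [[k v] event :when (number? v)]
    {:service (str (name event-name) "." (name k))
     :metric  v
     :time    (quot timestamp 1000)}))

(deftype RiemannPublisher [config buffer]
  com.brunobonacci.mulog.publisher.PPublisher
  (agent-buffer [_] buffer)    ;; where events are queued
  (publish-delay [_] 500)      ;; attempt delivery every 500 ms
  (publish [_ buffer]
    ;; send the buffered events to Riemann, then clear the buffer
    (doseq [event (map second (rb/items buffer))
            riemann-event (event->riemann event)]
      (send-to-riemann! config riemann-event))   ;; hypothetical delivery fn
    (rb/clear buffer)))

(defn riemann-publisher [config]
  (RiemannPublisher. config (rb/agent-buffer 10000)))
```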

Regarding metrics-clojure, the difference is more fundamental. metrics-clojure, like many other libraries, is basically a metering system. Events happen on a remote system and are aggregated at the source; then, from time to time, the metric is sampled and the sample is sent to a collection system. Because the events are aggregated at the source, you are not able to slice & dice the metrics at query time unless you have expressly captured that particular dimension.
I'm very familiar with this approach: I used it for many years and I even wrote a Clojure wrapper for it (TRACKit!). Some tools like Prometheus try to overcome the lack of categorical dimensions by providing a hybrid approach, but it is still not as rich as µ/log.

The benefits of switching to an event-based system are enormous, although not very apparent at the start.
Instrumenting your code with a metrics library to produce a rich set of metrics is very tedious and time-consuming.
For example, if you instrument only your web-service request handlers with µ/log, you can answer questions such as:

  • how many requests have I received in the last week?
  • how many requests by day/hour/minute/second?
  • how many requests by user over time?
  • how many requests by endpoint over time?
  • how many requests were failures (4xx or 5xx)?
  • of the failing requests, how many were for a specific endpoint?
  • which users issued the failing requests?
  • how do the failing requests differ from the successful ones?
  • what's the latency distribution of the successful requests vs the failed ones?
  • which content-type/content-encoding was used?
  • what's the distribution of the failures by host/JVM?
  • what are the JVM metrics (GC/memory/etc.) of the failing hosts during that time?
  • how are the latencies split between internal processing and external calls (db queries, caches, etc.)?

…and much more. All this from one single, good log instrumentation point.

To achieve the same with a metrics system you would need several dozen metrics to be collected and published.
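To make the point concrete, here is a sketch of the single instrumentation point that makes all of the above questions answerable — the middleware, field names, and `x-user-id` header are hypothetical; the µ/log calls are the library's real API:

```clojure
(require '[com.brunobonacci.mulog :as u])

;; host/JVM-level dimensions attached to every event
(u/set-global-context! {:app-name "orders-service" :env "prod"})

(defn wrap-instrumentation
  "Hypothetical ring middleware: log one rich event per request."
  [handler]
  (fn [req]
    (let [start    (System/nanoTime)
          response (handler req)]
      (u/log ::http-request
             :endpoint     (:uri req)
             :method       (:request-method req)
             :content-type (get-in req [:headers "content-type"])
             :user         (get-in req [:headers "x-user-id"]) ;; hypothetical header
             :status       (:status response)
             :latency-ms   (quot (- (System/nanoTime) start) 1000000))
      response)))
```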

µ/log works incredibly well with Elasticsearch, which is an amazing tool to slice and dice the data the way you need.
A side of Elasticsearch which is not very well known is that it also has a very fast and robust aggregation engine.

The final point is that traditional systems consider logs, metrics, and traces (the "three pillars of observability") to be different things; in reality, they are all different forms of events. For example, the same events that you use for logs and to capture metrics can also represent traces. In µ/log, if you add a Zipkin publisher you get the traces collected and visualised as follows:

[screenshot: "disruption traces" visualised in Zipkin]

All of this comes from simple µ/log instrumentation.
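For reference, the tracing above needs nothing more than starting the Zipkin publisher and wrapping the work in µ/log's `trace` (the URL, event name, and `process-order` function below are illustrative):

```clojure
(require '[com.brunobonacci.mulog :as u])

;; send events/traces to a local Zipkin instance (illustrative URL)
(u/start-publisher! {:type :zipkin :url "http://localhost:9411/"})

;; `u/trace` times the body, records the outcome, and propagates the
;; trace context to nested traces -- the very same event also serves
;; as a log line and as a source of metrics.
(u/trace ::process-order
  [:order-id "ord-123"]          ;; extra key/value pairs for the event
  (process-order "ord-123"))     ;; hypothetical function
```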
