Sifting layer, decide what writer to use on the fly #2877

Open
mateiandrei94 opened this issue Feb 12, 2024 · 0 comments

Feature Request

Crates

tracing_subscriber

Motivation

I come from the Java world. In Java I used a logging library called Logback.
Logback has 2 core notions:

  • Logger : similar to spans and events in tracing
  • Appender : similar to subscriber in tracing

Logback has a SiftingAppender.
Log4j2 (also a popular Java logging library) has a similar RoutingAppender.
It's akin to a tracing subscriber layer that creates and caches layers on the fly based on information about the event and its associated spans, and routes each event to the correct cached layer (or, if none exists yet, creates one on the fly, caches it, and routes the event to the newly created layer).

Considerations

I have read issue #971.
Yes, if you know the possible values in advance, you can do this with existing layers and per-layer filtering. An example would be a bool key where, depending on the value, you have 3 different layers (none, true, false). The point of sifting is that you do not know the possible values at compile time, so you need to create layers dynamically at runtime.

Proposal

Implement a "SiftingLayer", which is a tracing subscriber layer. Perhaps it could be placed in the layer module beside the Identity layer and named SiftingLayer or RoutingLayer; the name doesn't really matter.

Use case

Imagine you're not a big corporation: you don't use the ELK Stack and you can't afford a database for logs. What you can do is write a log file.
Your application has thousands of users and requests, all going to the same log file.
Wouldn't it be nice to have a SiftingLayer that writes to a different file based on information about (for example) the requester? In this example the application would create a span that lasts for the entire "request", and this span would record a field, for example requester_id. The SiftingLayer would sift based on requester_id, creating a new fmt layer with a file writer whose filename could look like format!("{}/{}.log", root_folder, requester_id);. The result would be a folder containing one log file per requester_id.
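
To make the per-requester file idea concrete, here is a minimal std-only sketch of the writer side. It is not tracing_subscriber code: PerRequesterFiles, write_line, path_for and open_file_count are hypothetical names invented for illustration, and a real SiftingLayer would hand the opened file to an fmt layer rather than writing lines itself.

```rust
use std::collections::HashMap;
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::PathBuf;

/// Hypothetical helper: keeps one append-mode log file open per requester id.
pub struct PerRequesterFiles {
    root_folder: PathBuf,
    open_files: HashMap<String, File>,
}

impl PerRequesterFiles {
    pub fn new(root_folder: impl Into<PathBuf>) -> Self {
        Self {
            root_folder: root_folder.into(),
            open_files: HashMap::new(),
        }
    }

    /// Path of the log file for a requester: `<root_folder>/<requester_id>.log`.
    /// Real code should sanitize `requester_id` before using it in a path.
    pub fn path_for(&self, requester_id: &str) -> PathBuf {
        self.root_folder.join(format!("{}.log", requester_id))
    }

    /// Appends `line` to the requester's file, opening the file on first use.
    pub fn write_line(&mut self, requester_id: &str, line: &str) -> io::Result<()> {
        if !self.open_files.contains_key(requester_id) {
            fs::create_dir_all(&self.root_folder)?;
            let file = OpenOptions::new()
                .create(true)
                .append(true)
                .open(self.path_for(requester_id))?;
            self.open_files.insert(requester_id.to_string(), file);
        }
        let file = self
            .open_files
            .get_mut(requester_id)
            .expect("file was inserted above");
        writeln!(file, "{}", line)
    }

    /// Number of distinct requester files opened so far.
    pub fn open_file_count(&self) -> usize {
        self.open_files.len()
    }
}
```

Calling write_line("alice", ...) twice and write_line("bob", ...) once leaves two files in the root folder, which is the "n log files, one per requester_id" outcome described above.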

How should the new feature be implemented, and why?

The SiftingLayer should be implemented as a tracing subscriber layer which would take in as parameters:

  • a selector, which expresses sifting interest. It could be a simple key-value vec, where the key is either a metadata key (level, name, target, etc.) or a structured field, such as custom_key in the following example: info!(custom_key = 7, "message");, and the value is the default to use when the field value is unknown at the time an event is emitted. Alternatively, the selector could simply be a vec of keys without default values (in which case the layer builder would receive an Option instead of a value).
    Or, better yet, it could be a struct, for example:
pub struct SiftSelector {
    level: bool,
    name: bool,
    target: bool,
    // ... and the rest of the metadata about spans and events
    fields: Vec<String>,
}
  • a closure (FnMut?) that returns a layer based on the set of values for the selected keys. Something similar is in the layer documentation; however, the functionality of creating a layer should be passed in as a parameter to the SiftingLayer:
pub fn create_layer<S>(self,
   // The other parameters should give me the values of what I selected.
   // For instance, if I selected target, I expect key = target (since I selected it)
   // and value = the value of the target, wrapped in an Option.
) -> Box<dyn Layer<S> + Send + Sync + 'static>
    where
        S: tracing_core::Subscriber,
        for<'a> S: LookupSpan<'a>,
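
As a rough illustration of how the selector could turn an event into the "set of values for the selected keys", here is a std-only sketch. EventView, SiftKey and sift_key are hypothetical stand-ins, not tracing types; in a real layer the level and target would presumably come from the event metadata, and field values would have to be collected with a field visitor. The name flag from the struct above is left out for brevity.

```rust
/// Selector in the spirit of the proposal: which metadata and which
/// structured fields take part in sifting.
#[derive(Debug, Clone, Default)]
pub struct SiftSelector {
    pub level: bool,
    pub target: bool,
    pub fields: Vec<String>,
}

/// Hypothetical, simplified view of an event (a stand-in, not a tracing type).
pub struct EventView<'a> {
    pub level: &'a str,
    pub target: &'a str,
    pub fields: &'a [(&'a str, &'a str)],
}

/// The values of the selected keys. Unselected metadata is `None`; a selected
/// field that the event does not carry is `(name, None)`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SiftKey {
    pub level: Option<String>,
    pub target: Option<String>,
    pub fields: Vec<(String, Option<String>)>,
}

/// Projects an event onto the keys named by the selector.
pub fn sift_key(selector: &SiftSelector, event: &EventView<'_>) -> SiftKey {
    SiftKey {
        level: selector.level.then(|| event.level.to_string()),
        target: selector.target.then(|| event.target.to_string()),
        fields: selector
            .fields
            .iter()
            .map(|name| {
                let value = event
                    .fields
                    .iter()
                    .find(|&&(key, _)| key == name.as_str())
                    .map(|&(_, value)| value.to_string());
                (name.clone(), value)
            })
            .collect(),
    }
}
```

Because SiftKey derives Eq and Hash, it can serve directly as the lookup key for whatever storage holds the created layers, and it is the natural argument to pass to create_layer.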

Lastly, the create_layer function should be called by the SiftingLayer only once per tuple of selected values.
What do I mean by that?
Imagine I chose a SiftSelector that selects level and a custom field named customer_id.
Whenever there is a debug event with customer_id=1, the SiftingLayer would look up in its in-memory storage of dyn Layer whether there is a Layer for (debug, customer_id=1). If there is one, it uses it; otherwise it calls create_layer, passes it the tuple (debug, customer_id=1), and caches the result. Further events (or, I should say, layer "method" calls) with (debug, customer_id=1) are routed to that existing layer.
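
The "create once per tuple, then reuse" behavior can be sketched with a plain HashMap. This is only an illustration under assumptions: SiftCache, route and cached_layers are made-up names, the layer type L is left generic instead of being a boxed dyn Layer, and a real layer would need interior mutability and synchronization, since Layer methods take &self.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical cache: maps a tuple of selected values to the layer created for it.
pub struct SiftCache<K, L, F>
where
    F: FnMut(&K) -> L,
{
    create_layer: F,
    layers: HashMap<K, L>,
}

impl<K, L, F> SiftCache<K, L, F>
where
    K: Eq + Hash + Clone,
    F: FnMut(&K) -> L,
{
    pub fn new(create_layer: F) -> Self {
        Self {
            create_layer,
            layers: HashMap::new(),
        }
    }

    /// Returns the layer for `key`, calling `create_layer` only the first
    /// time this key is seen.
    pub fn route(&mut self, key: &K) -> &mut L {
        if !self.layers.contains_key(key) {
            let layer = (self.create_layer)(key);
            self.layers.insert(key.clone(), layer);
        }
        self.layers.get_mut(key).expect("layer was inserted above")
    }

    /// Number of layers created so far.
    pub fn cached_layers(&self) -> usize {
        self.layers.len()
    }
}
```

With a key type such as (&'static str, u32), routing ("debug", 1) twice and ("info", 1) once invokes the factory twice, once for each distinct tuple. An unbounded map like this grows with every new key, so a real implementation would probably also need some eviction or timeout policy.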

Why a Layer ?

Because Layers are composable. The SiftingLayer's job is to sift events based on some data about the span or event and route them to dynamically created layers.

  • Add any considered drawbacks.
    There are none. In fact, the SiftingLayer could be implemented in a separate crate; however, it's so ubiquitous that it deserves a place in the tracing_subscriber crate as a Layer implementation in the layer module.

Alternatives

  • Are there other ways to solve this problem that you've considered?
    Yes, I wrote my own application-specific layer that has hardcoded sifting keys based on my own application's needs.
  • What are their potential drawbacks?
    None, it's just tedious to write. Ideally this could all be done using a configuration file.
  • Why was the proposed solution chosen over these alternatives?
    Because I'm lazy and I don't want to have to write my own, which is potentially slower than an implementation written by Rust experts, which would also be "plug and play".
    Another reason is that in my implementation I'm using span extensions to track the values of span keys. In order to sift events based on span values I need to know those values. I think this is already done by the fmt layer, since it's capable of formatting, per event, the values of the keys of the spans that are entered. It doesn't make sense to keep yet another copy of those values, which is why I believe the SiftingLayer would benefit from being written by programmers who know exactly how the tracing subscriber works, and not by me.