New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An RFC for structured logging #296

Open
wants to merge 7 commits into
base: master
from

Conversation

Projects
None yet
@KodrAus
Contributor

KodrAus commented Oct 14, 2018

  • For #149
  • Includes #281 as an alternative design

Rendered

Sample API docs


Hi! So we've been working through some designs in #149 on how structured logging could be supported in log. I've since gone away and tried to capture the results of that discussion and my experiments over the year into a concrete proposal that we can refine further. It looks pretty long, but a fair chunk of that is blocks of source.

I don't necessarily expect us to merge this PR even once we're happy with it, it's really just an effort to get more directed feedback on a significant API using a process we're familiar with in the Rust community.

cc @sfackler @dpc (who I know has limited capacity) @Thomasdezeeuw

@diwic

This comment has been minimized.

diwic commented Oct 14, 2018

As for serde, I want to raise an issue of serde not being capable of serializing empty arrays correctly, i e, the type information gets lost. As this issue was closed as "wontfix", this made serde impossible to use for some formats (such as mine).

Maybe that limitation is not severe enough for this case - that's up to the rest of you to decide - but I think it's something to be aware of when choosing a serialization crate.

@dpc

This comment has been minimized.

dpc commented Oct 14, 2018

The only thing that worries me is the inability to express data that can be sent to another thread without serialization. One idea that I have is putting additional bound on every value: https://doc.rust-lang.org/std/borrow/trait.ToOwned.html

This way values are always taken as references, but a async-logger implementation can just call to_owned() on them and send a copy (possibly Boxed, orArced) to another thread.

@dpc

This comment has been minimized.

dpc commented Oct 14, 2018

@diwic while I typically love using serde, there were multiple times where I was not happy with it trying to implement the backend (some serialization logic etc). It was typically related to handling sequences. Each time I dug, there were good reasons for how things were the certain way, and there is no way to make a serialization library that is perfect in 100% of use cases. So I can relate, but it is a de facto standard, at least right now.

It seems to me, having serde dependency behind a feature flag, and thus the ability to support other solutions (or serde versions) is a good thing to have.

The pain point with libraries like logging is that they are interoperability point of the whole ecosystem, and having to introduce any breaking changes (like public dependency) can be a painful experience for everyone.

@sfackler

This comment has been minimized.

Member

sfackler commented Oct 14, 2018

I'm not convinced that a serde dependency is the right choice for this use case. The log crate needs to focus on being lightweight in terms of code generation in particular. I think a virtual-dispatch based approach may work out better there.

@bitemyapp

This comment has been minimized.

bitemyapp commented Oct 14, 2018

For some prior art, over in Haskell-land I've been using katip. Not perfect (perhaps particularly on the serialization side?), but it's worked better than previous libraries I'd used that attempted to provide single input / multi output structured logging w/ support for ordinary text logs.

A faculty for flipping between nginx-style / apache compatible / etc. text logging would be neat too.

@TimNN

This comment has been minimized.

TimNN commented Oct 14, 2018

I haven't wrapped my head around all the details yet, but requiring serde as a depending feels kind of heavy handed to me (especially since, as far as I can tell, there is no way for the top-level crate to disable the serde integration).

I would personally feel better about this if there was another feature, only for use in the top-level crate, that would act as a "kill switch", removing the dependency on serde. (I'm not sure if that is feasible. For the log-recording part it should be, I think; for the sinks I'm not so sure).

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 14, 2018

It sounds like we're not sold on depending on serde here. There certainly are some big complications. The rationale was that if we don't use serde::Serialize and serde::Serializer then we'll need to define traits that are conceptually very similar to them. We'll either have to comprimise in what can be represented by punting on the advanced APIs that make Serializer complex, like sequences and maps, or we'll have to introduce another API with extra cognitive load for libraries like uuid to implement, and for formats like json to support.

So maybe we can find an API that doesn't bake in serde, and can be naturally object-safe, but also doesn't require libraries to opt-in to supporting structured logging as well as generic serialization. What I'd really like to avoid is accumulating a random bag of arbitrary traits in the ecosystem over time that need to be implemented before new libraries can be considered useful.

@gnzlbg

This comment has been minimized.

gnzlbg commented Oct 14, 2018

I think that adding a dependency to include a trait is not a big issue. The problem is that in the case of serde, this dependency is huge, which adds a non-negligible cost to everybody using log even if they are not using structured logging.

If instead of requiring full serde, we would just require a thin serde-traits crate instead, then the cost of pulling the trait might be insignificant, and just adding it by default a non issue.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 14, 2018

The problem is that in the case of serde, this dependency is huge, which adds a non-negligible cost to everybody using log even if they are not using structured logging.

This might just be my opinion after being immersed in it for so long, but I can't think of any case where non-structured logging is preferable to structured logging. If it's available we should be using it.

serde effectively is already just a collection of traits and some default implementations, but is definitely a fairly heavy crate as far as compile times go. That cost is amortized if the library is already part of the dependency graph though.

@diwic

This comment has been minimized.

diwic commented Oct 14, 2018

So, the easiest way is just to say that keys are &str and values are Display. (Maybe there could be something that outputs the Debug string rather than the Display string.)
Maybe we could just start there?

Because the moment we cross that line and say that we want something else than strings in the log backend, may it be integers, arrays, maps, booleans or something else, the question becomes much more complex because a) we then have to decide what we want to support and what we want to not support and b) all log backends need to implement support for this as well.
Surely serde can help with both, but then we're stuck with serde's design choices. Are we happy with that?

@christophebiocca

This comment has been minimized.

christophebiocca commented Oct 14, 2018

One question that isn't clearly answered from reading the RFC:
Can metadata be interpolated into the log message itself? We used a home-built structured logging framework for python and have generally captured all arguments as metadata, even if they showed up in the message. That way we don't have to write arguments twice (once as metadata, once as logging arguments) Given that all metadata must support Display in some form this seems trivial to support?

@matthieu-m

This comment has been minimized.

matthieu-m commented Oct 14, 2018

Is it possible/desirable to have a context argument?

The examples show key-value pairs being passed explicitly, however I could imagine wanting to prepare a "map" of key-value pairs ahead of time (a context) and passing that to multiple logs.

@dpc

This comment has been minimized.

dpc commented Oct 14, 2018

@matthieu-m

Is it possible/desirable to have a context argument?

Your logging backend could implement it and be initialized with desired metadata, that would add to key-value pairs of each Record when serializing it.

If you want explicit context, that you could manipulate and pass at runtime, like in slog, then it seems it's not part of this RFC, and the general opinion is that "contextual logging" is a bit intrusive (because you have to keep passing the context, or use some form of scopes) and it shouldn't be part of log itself. It could however still be provided by libraries like slog, and by having structured logging in log the interoperability between them would be better.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 14, 2018

So, the easiest way is just to say that keys are &str and values are Display. (Maybe there could be something that outputs the Debug string rather than the Display string.)
Maybe we could just start there?

I like the sentiment here, but I don't think we can really start with Display and build up, because Display is simply not sufficient for structured logging. All you get are strings, which is really limiting for log servers that use that structure for querying later.

The examples show key-value pairs being passed explicitly, however I could imagine wanting to prepare a "map" of key-value pairs ahead of time (a context) and passing that to multiple logs.

Sure can! While preparing this RFC I put together an an experimental repo for this kind of contextual logging.

Can metadata be interpolated into the log message itself?

This is something we do in .NET using message templates. I think an alternative implementation of the log! macro, implemented using procedural macros, could do this. The syntax for the log! macro proposed here is pretty sparten to remain compatible and to encourage crates to experiment with their own log! macros. Interpolating from a single bag of data is one thing to try. Another is being able to log an impl Fail or impl Error as an argument directly in error!.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 14, 2018

My sticking points on the serialization question are:

  • Don't require libraries depend on log and implement a trait before their types can be logged
  • Don't limit the kinds of data structures that can be logged to anything less than serde supports

We possibly can find a way to achieve that without baking in serde directly, but creating a pervasive alternative serialization framework to support just for logging, or arbitrarily limiting what can be logged both seem like non-starters to me.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 14, 2018

The log crate needs to focus on being lightweight in terms of code generation in particular. I think a virtual-dispatch based approach may work out better there.

@sfackler this is using virtual dispatch by way of erased-serde, but that crate has to work fairly hard to make that possible. There aren't any generics involved outside of trait implementations, which would probably exist in any implementation.

The lightweight code generation requirement is an interesting one I haven't really thought about at all. Could you elaborate on it a bit more? Which usergroup is it important to?

@softprops

This comment has been minimized.

softprops commented Oct 14, 2018

I believe the author of serde may also have some ideas for a lighter weight serde alternative
https://github.com/dtolnay/miniserde/blob/master/README.md

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 15, 2018

Looking at the complexity concern with serde (just ignoring the compile-time / artifact-size concerns for now), we could make life easier for formats that aren't already in the context of serialization (like env_logger's human-centric format) by tweaking the required trait bounds for logging. Instead of picking either Display or Serialize for structured values and facading that decision behind another implementation of Serialize, we could always require both Debug and Serialize, letting implementations of Log pick the one they want through their Visitor. So if you're already writing logs using write! where structure is lost already you can keep doing that, and if you want to serialize the whole lot as a structure-retaining format like json you can do that too.

@diwic

This comment has been minimized.

diwic commented Oct 15, 2018

I like the sentiment here, but I don't think we can really start with Display and build up, because Display is simply not sufficient for structured logging. All you get are strings, which is really limiting for log servers that use that structure for querying later.

Strings are indeed all we get with such a proposal, which makes writing a backend to "log" a lot easier, than if it has to support all data types that someone somewhere has decided.

If there are technical difficulties with starting with "anything Display" as values, how about starting with only &str? That shouldn't cause any concerns with backwards compatibility if we add more types later.

@Dowwie

This comment has been minimized.

Dowwie commented Oct 15, 2018

This seems like a proposal for a white horse with black stripes that we are told ought not be considered a zebra. I hope that I'm wrong but signs are showing that the stdlibbers are hungry to grow stdlib beyond its original mandate.

I'm not an advocate of stdlib crates competing with those of the rest of the ecoystem. stdlib Rust should foster cooperation, not competition. So, what would I need slog for if its main features were made available in a stdlib log crate? This is a question I and others will have if core functional overlaps exist.

I'm very interested in seeing a proposal for structured logging that is novel and does not overlap. Why would the authors of slog want to refactor working code just to make it line with stdlib team designs? Why should they?

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 15, 2018

Strings are indeed all we get with such a proposal, which makes writing a backend to "log" a lot easier, than if it has to support all data types that someone somewhere has decided.

I totally agree with you there. Having to write a Serializer just to write values for a human-centric format is a formidable barrier-to-entry. The Using Visitor to print or serialize key value pairs section has some examples of how human-centric and machine-centric formats could support structured logging. The human-centric one is actually a bit intellectually dishonest though because it omits the few hundred lines of Serializer code for brevity. However, if we make all values Debug + Serialize instead of just Serialize, then a human-centric log format using std::fmt could be as simple as:

fn write_pretty(w: impl Write, r: &Record) -> io::Result<()> {
    // Write the first line of the log record
    ...

    // Write each key-value pair using the `WriteKeyValues` visitor
    record
        .key_values()
        .try_for_each(|k, v| writeln!(&mut w, "{}: {:?}", k, v))
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e.into_error()))
}

For properly structured formats, where a Serializer implementation already exists, like serde_json, the structured logging support can still be standard serde code:

fn write_json(w: impl Write, r: &Record) -> io::Result<()> {
    let r = SerializeRecord {
        lvl: r.level(),
        ts: epoch_millis(),
        msg: r.args().to_string(),
        props: r.key_values().serialize_as_map(),
    };

    serde_json::to_writer(w, &r)
}

#[derive(Serialize)]
struct SerializeRecord<KVS> {
    lvl: Level,
    ts: u64,
    msg: String,
    #[serde(flatten)]
    props: KVS,
}

So I think the Debug + Serialize approach could be a way to let implementors of Log use either the std::fmt machinery, or the serde::Serializer machinery as appropriate, without sacrificing structure altogether like we would with &str.

I have also been hacking around with serde itself following @gnzlbg's suggestion of a thin serde-traits library, and without all the impl blocks and other machinery on my box I can get a fresh serde to compile in ~2 seconds instead of ~9 (log currently takes ~1 second), so maybe there's an avenue to explore there too.

@dpc

This comment has been minimized.

dpc commented Oct 15, 2018

@Dowwie

Logging is an interoperability problem. Almost every crate does some logging, and if parts of ecosystem use different logging, then in practice it will degrade to lowest common denominator.

Also, I don't know about others but I don't want to be maintaining libraries if I could get the same functionality from (semi-)stdlib.

@KodrAus

I totally agree with you there. Having to write a Serializer just to write values for a human-centric format is a formidable barrier-to-entry.

Most people just take a one that was written and doesn't care. I have used json, yaml, cbor and probably many other serialization formats in Rust, and I never had to implement own serde serializer.

Even less so with logging, where people typically just want to dump a bunch of json logs to ElasticSearch. For slog rarely (never?) do I hear about people wanting human-centric input other than what slog-term implements. I mean ... "human-centric output" is just a bunch of lines. Anyone can have preferences, but it's just not worth implementing again for every project.

@Ralith

This comment has been minimized.

Ralith commented Oct 15, 2018

@KodrAus

Can metadata be interpolated into the log message itself?

This is something we do in .NET using message templates. I think an alternative implementation of the log! macro, implemented using procedural macros, could do this.

slog already does something like this, I think without proc macros; named format arguments are interpreted as structured metadata in addition to being formatted, while anonymous ones aren't. It's nice to use.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 15, 2018

Most people just take a one that was written and doesn't care. I have used json, yaml, cbor and probably many other serialization formats in Rust, and I never had to implement own serde serializer.

Well, slog-term does implement its own Serializer for its format, it's just not a serde one. I ended up going down a similar path in env_logger too while playing with the API this RFC proposes and writing a Serializer. Since there are a number of terminal loggers based on log already it does seem worth exploring how they could be better supported. I don't think the required Serialize + Debug bounds rather than Serialize would be a problem when converting from log into slog, but it might be a problem when converting from slog into log. If that's the case then it's not really suitable.

Even less so with logging, where people typically just want to dump a bunch of json logs to ElasticSearch. For slog rarely (never?) do I hear about people wanting human-centric input other than what slog-term implements.

This is true, there's a lot more value to get out of structured logging when you use a log store that supports working with structured data, like Elasticsearch. That's why I don't think we should punt on the actual structured serialization support for this RFC.

slog already does something like this, I think without proc macros; named format arguments are interpreted as structured metadata in addition to being formatted, while anonymous ones aren't. It's nice to use.

Ah gotcha. We could do the same thing in log!, but couldn't retain the structure of those format arguments without breaking existing users because the trait bounds would be different. It could be done without the same breakage using a proc macro though that's able to interpret the format string and know which values need to be interpolated from the one set of structured key-value pairs:

info!("the format string with a {number}"; number = 2);
@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 15, 2018

I've spent some time going around in circles on serde, so maybe we could look at some alternatives and their respective trade-offs properly? Like turning the Visitor into something like Serializer, like slog does.

The trade-off there that I see is that we'd then need another trait like Serialize that's implemented throughout the ecosystem for loggable types, probably also alongside serde anyways.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 16, 2018

Ok, I spent some time exploring the serialization API more to remove the required serde dependency from this RFC and came up with a proof-of-concept.

This API doesn't require serde by default, but can support it using a Cargo feature. Without the serde feature enabled it's super lightweight, the idea being that out-of-the-box we just support a limited set of primitive types as structured key-value pairs, but that set can be expanded to any type that implements serde::Serialize when the feature is opted into. The Serialize type from this crate doesn't become a public dependency because you can't implement it manually (that also lets us change its default bounds using Cargo features without fear of breakage), and a reasonable set of types is still supported even without the serde feature.

I think this approach can mitigate some of the concerns about serde being too heavyweight, but not miss the opportunity to take advantage of its already extensive ecosystem. Any thoughts?

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 18, 2018

Ok, what do we think of this API, based on the serialization proof-of-concept I posted before? I've stuck with the visit terminology over the serialize terminology because actually serializing a key-value pair is just one thing you might want to do with them. Other things include checking whether a specific pair is present and doing something with its primitive type, like comparing the timestamp on a record to one that was previously set in some ambient context to calculate the elapsed time.

I changed KeyValueSource to Source, Visitor to SourceVisitor, and Serialize to Visit. The Error type is basically the same as what's currently in the RFC.

// --- Set of key-value pairs --- //

pub trait Source {
    fn visit<'kvs>(&'kvs self, visitor: &mut dyn SourceVisitor<'kvs>) -> Result<(), Error>;


    // --- Provided adapter methods --- //
    // Using these you probably don't to build a `SourceVisitor` yourself

    fn erase(&self) -> ErasedSource
    where
        Self: Sized,
    {}

    fn try_for_each<F, E>(self, f: F) -> Result<(), Error>
    where
        Self: Sized,
        F: FnMut(Key, Value) -> Result<(), E>,
        E: Into<Error>,
    {}

    fn get<'kvs, Q>(&'kvs self, key: Q) -> Option<Value<'kvs>>
    where
        Q: Borrow<str>,
    {}

    #[cfg(feature = "serde")]
    fn serialize_as_map(self) -> SerializeAsMap<Self>
    where
        Self: Sized,
    {}

    ...
}

pub trait SourceVisitor<'kvs> {
    fn visit_pair(&self, key: Key<'kvs>, value: Value<'kvs>) -> Result<(), Error>;
}


// --- Individual key or value --- //

// Can't be implemented manually
// It's either a fixed set of `std` types or `T: Serialize`
pub trait Visit: private::Sealed {
    fn visit(&self, visitor: &mut dyn Visitor) -> Result<(), Error>;
}

#[cfg(not(feature = "serde"))]
impl Visit for $std_ty {} // where $std_ty is `u8`, `bool`, `&str`, `&Path` etc

#[cfg(feature = "serde")]
impl<T: Serialize + ?Sized> Visit for T {} // covers all $std_ty impls + a bunch more

pub trait Visitor {
    fn visit(&mut self, v: &dyn Visit) -> Result<(), Error> {
        v.visit(self)
    }

    fn visit_i64(&mut self, v: i64) -> Result<(), Error>;
    fn visit_u64(&mut self, v: u64) -> Result<(), Error>;
    fn visit_f64(&mut self, v: f64) -> Result<(), Error>;
    fn visit_bool(&mut self, v: bool) -> Result<(), Error>;
    fn visit_char(&mut self, v: char) -> Result<(), Error>;
    fn visit_str(&mut self, v: &str) -> Result<(), Error>;
    fn visit_bytes(&mut self, v: &[u8]) -> Result<(), Error>;

    // If a format is given a complex `serde` datastructure, like a map or sequence
    // A format might want to write a placeholder value like `null` instead
    fn unsupported(&mut self) -> Result<(), Error> {
        Err(Error::msg("unsupported value"))
    }
}


// --- Concrete types for keys and values --- //

// `ToOwned::Owned: Send + Sync + 'static`?
// use `Borrow<str>` as the generic trait bound for keys
pub struct Key<'k> {}

impl<'k> Key<'k> {
    fn as_str(&self) -> &str {}
}

// `Key` is always `Visit`
// If `serde` is available, it's also `Serialize`
#[cfg(not(feature = "serde"))]
impl<'k> Visit for Key<'k> {}
#[cfg(feature = "serde")]
impl<'k> Serialize for Key<'k> {}

// `ToOwned::Owned: Send + Sync + 'static`?
// use `Visit` as the generic trait bound for values
pub struct Value<'v> {}

// `Value` is always `Visit`
// If `serde` is available, it's also `Serialize`
#[cfg(not(feature = "serde"))]
impl<'v> Visit for Value<'v> {}
#[cfg(feature = "serde")]
impl<'v> Serialize for Value<'v> {}

the log! macros would be able to accept idents as keys, and any expression that implements Visit as values.

Why Visitor and SourceVisitor?

The API could be smaller if there was just one Visitor type:

trait Visitor {
    fn visit_key(&mut self, v: Key);
    fn visit_value(&mut self, v: Value);
    fn visit_pair(&mut self, k: Key, v: Value) {}

    fn visit_i64(&mut self, v: i64);
}

This might seem like a simplification, but what it's really doing is just making the sequence of operations in serializing a key-value pair implicit. In our format that implements Visitor we'd need to think about values being serialized before keys, keys or values being serialized multiple times, or being given a primitive like visit_str instead of a key or value and having to guess which one it is.

So having one trait, SourceVisitor, that surfaces key-value pairs, and another trait, Visitor, that looks at individual values lets the format decide exactly how it's going to work with its pairs. It also means we can lean on SourceVisitor to work with pairs consistently so that ergonomic methods like try_for_each and serialize_as_map can be provided on Source. With the combination of these two methods I actually haven't found myself even needing to write a SourceVisitor manually in my experiments with this API.

Why structs for Key and Value instead of trait objects?

Using structures for Key and Value is for backwards compatibility. With a dyn Trait or impl Trait we've got all our cards on the table, and it's much harder to introduce changes without breakage. Using structures we don't have that problem. We could also abstract over owned and borrowed keys and values, and provide some useful methods using static rather than dynamic dispatch.

@bitemyapp

This comment has been minimized.

bitemyapp commented Oct 18, 2018

@KodrAus apologies for my ignorance, does log already support or will this structured logging change add support for multi-output from a single structured log record?

@dpc

This comment has been minimized.

dpc commented Oct 18, 2018

@bitemyapp This is unrelated to this feature. Technically it is possible with log already, but log would have to embrace "composable loggers", provide a common trait for interoperability to make this effortless and common (Like slogs Drain) Please open another issue (if one does not already exist). As far as I can tell this RFC is compatible with functionality like this already.

@diwic

This comment has been minimized.

diwic commented Oct 18, 2018

Ok, what do we think of this API, based on the serialization proof-of-concept I posted before?

I guess there will be some bikeshedding over exactly what types are to be expected here? E g, I don't see myself wanting to log a char very often (and if I do, I could log it as a str).

Another thought is what to do with Option. We could make that skip the key-value pair if the option is None and log the unwrapped contents if the option is Some, maybe?

@dpc

This comment has been minimized.

dpc commented Oct 18, 2018

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 18, 2018

@bitemyapp Yeh, this is another feature that's a logical next step from structured logging, but I haven't tried to tackle it as part of this RFC besides trying not to design ourselves into a corner. Personally, I'm not sure that we need to support too much composition in log beyond what's needed on Record, because frameworks like slog already do this really well, and as a facade log doesn't necessarily need to care too much.

I guess there will be some bikeshedding over exactly what types are to be expected here? E g, I don't see myself wanting to log a char very often (and if I do, I could log it as a str).

@diwic Sure, let's colour some bikesheds!

  • We should add u128 and i128, I forgot about those.
  • Should we add methods for smaller numbers than 64-bit?
  • visit_char could default to visit_str, using a method like char::encode_utf8.
  • We should support Some and None in some way. Some could also default to just visit on its inner value.
  • Another type that might be worth treating explicitly is Instant.

What else do we think?

Other APIs for comparison

slog

pub trait Serializer {
    impl_default_as_fmt! {
        usize => emit_usize
    }
    impl_default_as_fmt! {
        isize => emit_isize
    }
    impl_default_as_fmt! {
        bool => emit_bool
    }
    impl_default_as_fmt! {
        char => emit_char
    }
    impl_default_as_fmt! {
        u8 => emit_u8
    }
    impl_default_as_fmt! {
        i8 => emit_i8
    }
    impl_default_as_fmt! {
        u16 => emit_u16
    }
    impl_default_as_fmt! {
        i16 => emit_i16
    }
    impl_default_as_fmt! {
        u32 => emit_u32
    }
    impl_default_as_fmt! {
        i32 => emit_i32
    }
    impl_default_as_fmt! {
        f32 => emit_f32
    }
    impl_default_as_fmt! {
        u64 => emit_u64
    }
    impl_default_as_fmt! {
        i64 => emit_i64
    }
    impl_default_as_fmt! {
        f64 => emit_f64
    }
    impl_default_as_fmt! {
        #[cfg(integer128)]
        u128 => emit_u128
    }
    impl_default_as_fmt! {
        #[cfg(integer128)]
        i128 => emit_i128
    }
    impl_default_as_fmt! {
        &str => emit_str
    }

    fn emit_unit(&mut self, key: Key) -> Result {
        self.emit_arguments(key, &format_args!("()"))
    }

    fn emit_none(&mut self, key: Key) -> Result {
        self.emit_arguments(key, &format_args!(""))
    }

    fn emit_arguments(&mut self, key: Key, val: &fmt::Arguments) -> Result;

    #[cfg(feature = "nested-values")]
    fn emit_serde(&mut self, key: Key, value: &SerdeValue) -> Result {
        value.serialize_fallback(key, &mut SerializerForward(self))
    }
}

Here we have default implementations for most of these methods that forward to emit_arguments. Based on these APIs, I can't see any issues with us adapting between log::Source and slog::KV.

serde

pub trait Serializer: Sized {
    type Ok;
    type Error: Error;

    type SerializeSeq: SerializeSeq<Ok = Self::Ok, Error = Self::Error>;
    type SerializeTuple: SerializeTuple<Ok = Self::Ok, Error = Self::Error>;
    type SerializeTupleStruct: SerializeTupleStruct<Ok = Self::Ok, Error = Self::Error>;
    type SerializeTupleVariant: SerializeTupleVariant<Ok = Self::Ok, Error = Self::Error>;
    type SerializeMap: SerializeMap<Ok = Self::Ok, Error = Self::Error>;
    type SerializeStruct: SerializeStruct<Ok = Self::Ok, Error = Self::Error>;
    type SerializeStructVariant: SerializeStructVariant<Ok = Self::Ok, Error = Self::Error>;

    fn serialize_bool(self, v: bool) -> Result<Self::Ok, Self::Error>;

    fn serialize_i8(self, v: i8) -> Result<Self::Ok, Self::Error>;
    fn serialize_i16(self, v: i16) -> Result<Self::Ok, Self::Error>;
    fn serialize_i32(self, v: i32) -> Result<Self::Ok, Self::Error>;
    fn serialize_i64(self, v: i64) -> Result<Self::Ok, Self::Error>;
    serde_if_integer128! {
        fn serialize_i128(self, v: i128) -> Result<Self::Ok, Self::Error> {
            let _ = v;
            Err(Error::custom("i128 is not supported"))
        }
    }

    fn serialize_u8(self, v: u8) -> Result<Self::Ok, Self::Error>;
    fn serialize_u16(self, v: u16) -> Result<Self::Ok, Self::Error>;
    fn serialize_u32(self, v: u32) -> Result<Self::Ok, Self::Error>;
    fn serialize_u64(self, v: u64) -> Result<Self::Ok, Self::Error>;
    serde_if_integer128! {
        fn serialize_u128(self, v: u128) -> Result<Self::Ok, Self::Error> {
            let _ = v;
            Err(Error::custom("u128 is not supported"))
        }
    }

    fn serialize_f32(self, v: f32) -> Result<Self::Ok, Self::Error>;
    fn serialize_f64(self, v: f64) -> Result<Self::Ok, Self::Error>;

    fn serialize_char(self, v: char) -> Result<Self::Ok, Self::Error>;
    fn serialize_str(self, v: &str) -> Result<Self::Ok, Self::Error>;
    fn serialize_bytes(self, v: &[u8]) -> Result<Self::Ok, Self::Error>;

    fn serialize_none(self) -> Result<Self::Ok, Self::Error>;
    fn serialize_some<T: ?Sized>(self, value: &T) -> Result<Self::Ok, Self::Error>
    where
        T: Serialize;

    fn serialize_unit(self) -> Result<Self::Ok, Self::Error>;
    fn serialize_unit_struct(self, name: &'static str) -> Result<Self::Ok, Self::Error>;
    fn serialize_unit_variant(
        self,
        name: &'static str,
        variant_index: u32,
        variant: &'static str,
    ) -> Result<Self::Ok, Self::Error>;

    fn serialize_newtype_struct<T: ?Sized>(
        self,
        name: &'static str,
        value: &T,
    ) -> Result<Self::Ok, Self::Error>
    where
        T: Serialize;
    fn serialize_newtype_variant<T: ?Sized>(
        self,
        name: &'static str,
        variant_index: u32,
        variant: &'static str,
        value: &T,
    ) -> Result<Self::Ok, Self::Error>
    where
        T: Serialize;

    fn serialize_seq(self, len: Option<usize>) -> Result<Self::SerializeSeq, Self::Error>;
    fn serialize_tuple(self, len: usize) -> Result<Self::SerializeTuple, Self::Error>;
    fn serialize_tuple_struct(
        self,
        name: &'static str,
        len: usize,
    ) -> Result<Self::SerializeTupleStruct, Self::Error>;
    fn serialize_tuple_variant(
        self,
        name: &'static str,
        variant_index: u32,
        variant: &'static str,
        len: usize,
    ) -> Result<Self::SerializeTupleVariant, Self::Error>;

    fn serialize_map(self, len: Option<usize>) -> Result<Self::SerializeMap, Self::Error>;
    fn serialize_struct(
        self,
        name: &'static str,
        len: usize,
    ) -> Result<Self::SerializeStruct, Self::Error>;
    fn serialize_struct_variant(
        self,
        name: &'static str,
        variant_index: u32,
        variant: &'static str,
        len: usize,
    ) -> Result<Self::SerializeStructVariant, Self::Error>;

    fn collect_seq<I>(self, iter: I) -> Result<Self::Ok, Self::Error>
    where
        I: IntoIterator,
        <I as IntoIterator>::Item: Serialize,
    {
        let iter = iter.into_iter();
        let mut serializer = try!(self.serialize_seq(iter.len_hint()));
        for item in iter {
            try!(serializer.serialize_element(&item));
        }
        serializer.end()
    }
    fn collect_map<K, V, I>(self, iter: I) -> Result<Self::Ok, Self::Error>
    where
        K: Serialize,
        V: Serialize,
        I: IntoIterator<Item = (K, V)>,
    {
        let iter = iter.into_iter();
        let mut serializer = try!(self.serialize_map(iter.len_hint()));
        for (key, value) in iter {
            try!(serializer.serialize_entry(&key, &value));
        }
        serializer.end()
    }
    #[cfg(any(feature = "std", feature = "alloc"))]
    fn collect_str<T: ?Sized>(self, value: &T) -> Result<Self::Ok, Self::Error>
    where
        T: Display,
    {
        use lib::fmt::Write;
        let mut string = String::new();
        write!(string, "{}", value).unwrap();
        self.serialize_str(&string)
    }
    #[cfg(not(any(feature = "std", feature = "alloc")))]
    fn collect_str<T: ?Sized>(self, value: &T) -> Result<Self::Ok, Self::Error>
    where
        T: Display;

    #[inline]
    fn is_human_readable(&self) -> bool {
        true
    }
}

Here we have a whole lot of extra complexity for just about any datastructure you can represent in Rust.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 20, 2018

One major drawback of the design I posted is that while it doesn't technically make serde a required dependency, it does make it implicitly required for any implementation of Log, because there's a chance that some of the Values on a log record can't be captured by the Visitor trait (some library might opt in to serde and log a Vec or something). That's why there's that Visitor::unsupported method, but it's a cop-out and the experience of having to deal with it would be poor.

So I've been thinking about how slog::Serializer by default forwards its methods to emit_arguments. We could do something similar in log, so instead of getting handed some unrepresentable value in your Visitor, you can always at least write it using std::fmt. That would mean that instead of doing a blanket implementation of Visit for any serde::Serialize, we'd do a blanket implementation of Visit for any serde::Serialize + std::fmt::Debug.

So updating the API from before:

// --- Set of key-value pairs --- //

pub trait Source {
    fn visit<'kvs>(&'kvs self, visitor: &mut dyn SourceVisitor<'kvs>) -> Result<(), Error>;


    // --- Provided adapter methods --- //
    // Using these you probably don't to build a `SourceVisitor` yourself

    fn erase(&self) -> ErasedSource
    where
        Self: Sized,
    {}

    fn try_for_each<F, E>(self, f: F) -> Result<(), Error>
    where
        Self: Sized,
        F: FnMut(Key, Value) -> Result<(), E>,
        E: Into<Error>,
    {}

    fn get<'kvs, Q>(&'kvs self, key: Q) -> Option<Value<'kvs>>
    where
        Q: Borrow<str>,
    {}

    #[cfg(feature = "serde")]
    fn serialize_as_map(self) -> SerializeAsMap<Self>
    where
        Self: Sized,
    {}

    ...
}

pub trait SourceVisitor<'kvs> {
    fn visit_pair(&self, key: Key<'kvs>, value: Value<'kvs>) -> Result<(), Error>;
}


// --- Individual key or value --- //

// Can't be implemented manually
// It's either a fixed set of `std` types or `T: Serialize`
pub trait Visit: private::Sealed {
    fn visit(&self, visitor: &mut dyn Visitor) -> Result<(), Error>;
}

#[cfg(not(feature = "serde"))]
impl Visit for $std_ty {} // where $std_ty is `u8`, `bool`, `&str`, `&Path` etc

#[cfg(feature = "serde")]
impl<T: Serialize + Debug + ?Sized> Visit for T {} // covers all $std_ty impls + a bunch more

pub trait Visitor {
    fn visit(&mut self, v: &dyn Visit) -> Result<(), Error> {
        v.visit(self)
    }

    fn visit_i64(&mut self, v: i64) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }
    fn visit_u64(&mut self, v: u64) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }

    #[cfg(feature = "i128")]
    fn visit_u128(&mut self, v: u128) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }
    #[cfg(feature = "i128")]
    fn visit_i128(&mut self, v: i128) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }

    fn visit_f64(&mut self, v: f64) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }
    fn visit_bool(&mut self, v: bool) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }
    fn visit_char(&mut self, v: char) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }
    fn visit_str(&mut self, v: &str) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }
    fn visit_bytes(&mut self, v: &[u8]) -> Result<(), Error> {
        self.visit_args(format_args!("{:?}", v))
    }

    fn visit_none(&mut self) -> Result<(), Error> {
        self.visit_args(format_args!("None"))
    }

    fn visit_args(&mut self, args: &std::fmt::Arguments) -> Result<(), Error>;
}


// --- Concrete types for keys and values --- //

// `ToOwned::Owned: Send + Sync + 'static`?
// use `Borrow<str>` as the generic trait bound for keys
pub struct Key<'k> {}

impl<'k> Key<'k> {
    fn as_str(&self) -> &str {}
}

// `Key` is always `Visit`
// If `serde` is available, it's also `Serialize`
#[cfg(not(feature = "serde"))]
impl<'k> Visit for Key<'k> {}
#[cfg(feature = "serde")]
impl<'k> Serialize for Key<'k> {}

// `ToOwned::Owned: Send + Sync + 'static`?
// use `Visit` as the generic trait bound for values
pub struct Value<'v> {}

// `Value` is always `Visit`
// If `serde` is available, it's also `Serialize`
#[cfg(not(feature = "serde"))]
impl<'v> Visit for Value<'v> {}
#[cfg(feature = "serde")]
impl<'v> Serialize for Value<'v> {}
impl<'v> Debug for Value<'v> {}

That way a library implementing Log could always get a reasonable representation of any loggable value without depending on serde.

@KodrAus KodrAus force-pushed the KodrAus:rfc/structured-logging branch from 06ee40f to b818223 Oct 28, 2018

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 29, 2018

Alrighty, I've gone ahead and updated the RFC based on the feedback so far. It removes serde as a required dependency, so we can support structured logging without opting in to serde support. Instead, it makes serde an optional extensibility point in the ecosystem so that log doesn't become a massive public dependency of its own to support structured logging in a way that's nice for users to consume. So with default features, log supports structured logging but doesn't bring in any new dependencies.

I've also added a section on how we could manage getting an owned version of the key-value pairs that is Send + Sync + 'static efficiently.

If you'd rather poke around an example implementation, I've got one here

@Thomasdezeeuw

Thank you @KodrAus for writing this RFC.

I however don't think this is the right way forward for this crate. Aiming to be the standard logging crate means supporting a lot of use cases, while not growing too big. And I my opinion these additions make the crate too complex and too large, even when leaving out serde.

The number of types and trait required to support this doesn't seem to be worth it, to me atleast. Eventhough I use serde myself I also think that serde is too complex, it only gets away with it because it has a limited number of backends (e.g. serde_json) and has nice macro support to derive the required frontend traits.

However serde only implements one way to serialise data structures. Maybe a different way of generic serialisation, one that is easier to implement, should be the way forward.

self.visit(visitor)
}
fn erased_get<'kvs>(&'kvs self, key: &str) -> Option<Value<'kvs>> {

This comment has been minimized.

@Thomasdezeeuw

Thomasdezeeuw Oct 30, 2018

Contributor

Missing =.

Show resolved Hide resolved rfcs/0000-structured-logging.md Outdated
Show resolved Hide resolved rfcs/0000-structured-logging.md Outdated
Show resolved Hide resolved rfcs/0000-structured-logging.md Outdated
Show resolved Hide resolved rfcs/0000-structured-logging.md Outdated
Show resolved Hide resolved rfcs/0000-structured-logging.md
Ok(())
}
}
```

This comment has been minimized.

@Thomasdezeeuw

Thomasdezeeuw Oct 30, 2018

Contributor

All the code examples seem somewhat repetitive, any possibility of providing a derive macro? Like serde does.

This comment has been minimized.

@KodrAus

KodrAus Oct 31, 2018

Contributor

Ah these examples are all just variations of the same thing; implementing the Source trait where the underlying data is in various different shapes. I'm not sure a macro would be worthwhile for deriving Source, because the way you'd surface key-value pairs is really dependent on what the underlying source is.

}
```

### `Error`

This comment has been minimized.

@Thomasdezeeuw

Thomasdezeeuw Oct 30, 2018

Contributor

Do we really need a custom error? Is there any serialisation format that can fail? Can't this be relaced by io::Error?

This comment has been minimized.

@KodrAus

KodrAus Oct 31, 2018

Contributor

Json can fail if you give it non-string keys, and any printing done through the Write trait can also fail. We do need our own error type because the log crate needs to work in no-std environments, and io::Error is only available in the standard library.

Show resolved Hide resolved rfcs/0000-structured-logging.md Outdated
Show resolved Hide resolved rfcs/0000-structured-logging.md Outdated
@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Oct 31, 2018

@Thomasdezeeuw thanks for taking the time to read through this giant document! I appreciate your input on how readable it is(n't). Before working through those though maybe we should try address the main criticism you've raised.

When talking about this API being too big are you talking about the surface area or the volume? I mean, do you think the public API is too big, or do you think there's too much going on behind the scenes to support it?

Personally, I don't think the public API is too big here, but there's definitely a lot more going on than in the simplest API we could imagine for key-value pairs.

If we look at the simplest possible representation we can see where the API proposed here abstracts over it:

&[
    ("key1", 42),
    ("key2", &["a", "b", "c"]),
    ("key3": Uuid::new_v4()),
];

There are two abstractions that we work with:

  • Source
  • Visit

The Source trait abstracts storage for key-value pairs:

&[
│     ("key1", 42),
│     ("key2", &["a", "b", "c"]),
│     ("key3": Uuid::new_v4()),
├ ];
└ `Source`

This abstraction is important for being able to convert records from other frameworks to and from records in log. If we only supported &[(&str, &Visit)] then we wouldn't be able to efficiently work with types from other frameworks, like BTreeMap<String, Value>, or Arc<CustomMap>.

The Visit trait abstracts serialization for structured values:

  &[
      ("key1", 42),
┌─────────────┴──┘
│     ("key2", &["a", "b", "c"]),
├─────────────┴────────────────┘
│     ("key3": Uuid::new_v4()),
├─────────────┴──────────────┘
│ ];
└ `Visit`

This abstraction is important (over just something like std::fmt::Debug) because we need to be able to inspect the structure of values in order to serialize them or maybe style them in a particular way in a terminal. We could just carry primitive values in an enum like serde_json::Value, but then we have the same problem we did with the concrete slice in Source. We make it more difficult for other logging frameworks to integrate with log when their values have a different underlying structure. Alternatively we could just carry a raw trait object and expect that to be downcast to a known concrete type. That wouldn't work either because log needs to support arbitrary log producers communicating with arbitrary log consumers. Unless you control both the producer and the consumer you couldn't reasonably know what to downcast that trait object to. So we need some serialization API that forces implementors to conform to some contract that lets anybody reasonably determine its structure.

The API proposed for Visit here is already a very simple serialization API. It's just one that can optionally carry serde::Serialize with it. That lets it punt a lot of complexity that would otherwise be introduced for arrays, maps, #[derive], formats, and managing breakage. We get all that for free(ish). There's internal (not public), module-local complexity to get these benefits, but I think localized complexity is much more manageable and is a fair tradeoff to make in log to offer the ecosystem the most appropriate API that it can for producing and consuming structured logs.

So I appreciate the need to keep log tight and fundamental. I think based on the usecases they need to support, both Source and Visit do achieve a good balance. I've also got the assumption that structured logging is a really fundamental feature, so its API deserves an appropriate amount of complexity for the benefit that it brings the Rust ecosystem.

@sfackler do you have any more thoughts?

@Thomasdezeeuw

This comment has been minimized.

Contributor

Thomasdezeeuw commented Nov 2, 2018

@KodrAus Saying that the crate would become too big wasn't clear, your right. What I meant was that the public API would become too big, i.e. it would have too many types (struct/traits). If I counted correctly 7 new types would be added to the public API. On top of that another couple helpers, e.g. SerializeAsMap, and more types when the serde feature is enabled, e.g. ErasedSource. The currently crate exposes 7 types (+ 2 builder types), which means the public API would be at least become twice as large.

For a "simple" logging facade it doesn't seem so simple to me.

But I do agree that structured logging is a fundamental feature. So perhaps with some structuring of the public API, e.g. by pushing most Log implemening types to the back, while pulling the logging macros to the front. This way "users" of the log crate keep a simple public API, while the "Log implementers" have to deal a more complex API.

P.S. I hope I wasn't too harsh with my feedback.

@Thomasdezeeuw

This comment has been minimized.

Contributor

Thomasdezeeuw commented Nov 2, 2018

@KodrAus To continue the discussion around the Source and Visitor traits; could this be solved with an a serde_json::Value like type in an iterator, so &dyn Iterator<Item = (&str, Value)>. This means that BTreeMap<String, Value>, and similiar types, can be chained onto it. And Value will keep a type around. Of course Value can't possibly support all types, so maybe a variant should be Other(&dyn Display), or perhaps a similiar trait. Or with a serde feature: Serde(&dyn serde::Serialize).

Upside is that the public API remains small, adds only a single type. Downside is that the order of the iterator is now unspecified, I'm unsure if this is important for you. And we still have fully typed value (only when they fit in Value) when logging.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Nov 4, 2018

For a "simple" logging facade it doesn't seem so simple to me.

Hmm, I wouldn't necessarily describe log as a simple logging facade. At least I wouldn't make simple its primary goal. Instead, I'd call log a fundamental logging facade. Simple is a nice goal, but fundamental is what would let log support the development of Rust's logging ecosystem rather than just being an obsolete artifact of it. Other people might have different opinions than me on that though.

could this be solved with an a serde_json::Value like type in an iterator

This is actually where I started earlier this year, but ran into some complications. The problem with an iterator approach is that it's not object safe, and making it object-safe requires some unappealing public machinery for progressing and resetting the iterator. So the Source and source::Visitor system is conceptually like IntoIterator and Iterator, but retains object safety by pushing the iteration inside the Source. You just get a chance to observe that iteration using a Visitor. This pattern lets us build a complimentary set of adapter methods like we already have with Iterator. Those adapters make it possible to normally work with a Source without having to define any Visitors at all. So the Source and source::Visitor approach results in the same number of traits to work with as we'd have with IntoIterator and Iterator.

The Value type you're describing isn't too far off the original version of Value from this RFC. I ended up changing it to Value, value::Visit and value::Visitor in the process of limiting serde's role in the API. It turns out it's roughly the same number of types. Even if you just have a concrete Value type you still need a trait like ToValue that can take an arbitrary T and turn it into a Value for the log! macros, and for records from other frameworks. Exposing the variants of that Value directly as an enum wouldn't be desirable because of backwards compatibility concerns. You'd either need a non-exhaustive catch-all arm when matching over the variants, or you wouldn't be able to add variants without breaking consumers. Neither of those options is appealing. So you'd end up needing a trait like value::Visitor anyways so callers could guarantee that all values can always be logged regardless of internal changes to the representation of the Value. So the Value, value::Visit and value::Visitor approach results in the same number of traits to work with as a concrete type does. I do feel like the value machinery deserves a lot of scrutiny though.

So I don't think just aiming to reduce the number of types is necessarily going to result in a more understandable public API. That lands us in a state where we make arbitrary limitations for the sake of an API we assume is more understandable. We want the API to be understandable through its use. I think the way to do that is to look at the interactions of these types for unnecessary complications, rather than just treating their existence as a complication. It is definitely worth making each type justify its existence though.

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Nov 4, 2018

To make it a little easier to explore, I've pushed up the API docs for this proposed API.

@KodrAus KodrAus referenced this pull request Nov 11, 2018

Open

Logging middleware #8

KodrAus added some commits Nov 11, 2018

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Nov 13, 2018

Working through the process of pushing serde further out of the proposed structured logging API, I've been playing with another alternative serialization API, that's similar to the direction a prototype tracing API for tokio is taking.

Something like this would remove all the wackiness around changing trait bounds based on Cargo features in order to avoid loggable types depending on log, but would create a pervasive public dependency, and put us in the business of maintaining a serialization framework. I've previously been trying not to do this. We might be able to mitigate those concerns though if we were aiming to build a common value serialization abstraction that all logging frameworks could depend on instead of defining their own.

@killercup killercup referenced this pull request Nov 16, 2018

Open

Log target #4

0 of 3 tasks complete
"This is the rendered {message}. It is not structured",
message = "message";
^^^^^^^^^^^^^^^^^^^
unstructured

This comment has been minimized.

@scottmcm

scottmcm Nov 17, 2018

This surprised me. I'm use to (not in rust) doing messages like "action {action_name} completed in {duration}ms", and still getting the properties as real columns that I can filter and pivot on.

This comment has been minimized.

@KodrAus

KodrAus Nov 24, 2018

Contributor

Yes, that's a better way to do things, but we couldn't do this and maintain potential structure without potential breakage to callers of log!, because the trait bounds within that format would change.

@Thomasdezeeuw Thomasdezeeuw referenced this pull request Nov 21, 2018

Closed

Initial implementation of adding key value pairs #281

0 of 2 tasks complete
@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Nov 28, 2018

Ok, I've got something more to share on the serialization front. This library (with rough source here) is similar to the last one I posted, but has a smaller surface area. It's no-std, object-safe and integrates optionally with serde. It seems like something we could build structured logging serialization off.

For integrating an external serialization framework like this into log, we could either:

  • use the same approach the RFC proposes with serde. This doesn't offer any benefit over what the RFC currently proposes.
  • depend on it and use its API. That means log would have more than zero dependencies by default. This was a problem for serde because it takes a while to compile and object-safety requires boxing.
  • pull it into log so it doesn't directly consume an external serialization framework. That means implementors of this value type would need to depend on log.
  • provide our own simplified API and bridge them, using Cargo features to enable support. That means we'd make callers of the log! macros have to work a bit harder to serialize their structured values.

Personally, I'm thinking the third option could be worth pursuing, since callers would already not be able to log their serde datastructures by default.

Does anybody else have any thoughts?

@KodrAus

This comment has been minimized.

Contributor

KodrAus commented Dec 5, 2018

I've managed to convince myself that it's not super important for structured logging in log to be able to capture all the types in the ecosystem that can be serialized by default. It's less user-friendly, but it leaves the door open for someone to build a better set of log! macros that can capture structured values differently, and could find their way back to log itself one day.

So if we don't need to bake a framework in out-of-the-box because we don't need to capture its values out-of-the-box, then we can probably move forward with a much slimmer values API, and see what direction tokio-trace and sval go in.

I was thinking a public API something like:

pub trait ToValue {
    fn to_value(&self) -> Value;
}

impl ToValue for u64 { .. }
impl ToValue for i64 { .. }
impl ToValue for str { .. }
// etc.

pub trait Visitor {
    // The only required method
    fn visit_any(&mut self, value: Value) -> Result<(), Error>;

    // Provided methods for primitives
    fn visit_u64(&mut self, value: u64) -> Result<(), Error> {
        self.visit_any(value.to_value())
    }
}

pub struct Value<'a>(_);

impl<'a> Value<'a> {
    pub fn from_debug(v: &dyn fmt::Debug) -> Self { .. }

    pub fn visit(&self, visitor: impl Visitor) -> Result<(), Error> { .. }
}

impl<'a> fmt::Debug for Value<'a> { .. }

Internally, we can use private methods on Value to retain the structure of integers and other primitives when forwarding to a Visitor.

As a provider of dataustructures, like uuid, you can implement the ToValue trait yourself, but the only way to get a hold of a Value is to call ToValue on another type that already implements it, or use Value::from_debug.

As a consumer of structured logs like env_logger, if you don't care about structure, you can just implement Visitor::visit_any. If you do care, you can override provided methods like Visitor::visit_u64.

Later, to extend this system to support another serialization framework, like sval, we would:

  • Add a public Value::from_sval method or something.
  • Derive sval::Value for Value as well as fmt::Debug.

For serde it would look basically the same, but could be supported in terms of sval.

That way, any implementation for Visitor could treat a Value like a sval::Value or a fmt::Debug without anything else having to change.

That could be a way for us to unblock on the serialization question.

@Thomasdezeeuw Thomasdezeeuw referenced this pull request Dec 12, 2018

Open

Improve Logging #21

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment