
Structured Logging #149

Open
alexcrichton opened this Issue May 19, 2017 · 60 comments

@alexcrichton (Member) commented May 19, 2017

One of the major points of discussion during the log evaluation was structured logging, specifically the log crate's relation to slog. In the review we ended up concluding that there's no reason to not add structured logging to the log crate and it'd be awesome to unify the slog/log ecosystems!

This issue is intended to serve as a tracking issue for concerns and ideas related to this. For example the specific APIs the log crate would have, how this would relate to slog, migration stories, etc.

@dpc commented May 19, 2017

Really? Awesome. Is that planned to be added before log 1.0.0 release?

@dtolnay (Member) commented May 19, 2017

The thought I had in the meeting was something like this:

info!("..."; MeaningfulType(information), PossiblyAnother(something_else));

where the currently selected logging implementation receives anything that matches its associated type, and any other types do nothing.
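For illustration only, here is one hedged sketch of how that dispatch could work, assuming a downcast-based mechanism and entirely invented names (StructuredLog, Payload, try_log are not proposed log APIs):

use std::any::Any;

// Hypothetical: the logger names the payload type it understands via an
// associated type; payloads of any other type are silently ignored.
trait StructuredLog {
    type Payload: 'static;

    fn log_structured(&self, payload: &Self::Payload);

    // The macro would hand over whatever the call site supplied; only values
    // matching the logger's associated type are passed through.
    fn try_log(&self, payload: &dyn Any) {
        if let Some(payload) = payload.downcast_ref::<Self::Payload>() {
            self.log_structured(payload);
        }
    }
}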

Key-value style structured logging in this paradigm may look something like this, where the kv macro is provided by the structured logging library to construct the appropriate type:

info!("..."; kv!("a" => a, "b" => b));

EDIT: or obviously it could provide its own macros that replace the ones in log:

info!("..."; "a" => a, "b" => b);
@alexcrichton (Member, Author) commented May 19, 2017

@dpc that's the current intention, yeah, preparing this for the 1.0 release. We'd probably do it in 0.3 if we could, but it's unlikely to be possible without breaking changes! We're of course very eager to hear about everything you've done with slog and the best way to make progress on this. Want to make sure you're kept very closely in the loop!

@dpc commented May 19, 2017

I will be happy to share my perspective and experiences during the discussion. Personally, I would be very happy to see as many features as possible from slog ported to log, so that slog could just reuse log, or even be rendered completely unnecessary. I've also pinged slog users in case they are interested - they probably know better than I do what they look for in structured logging.

The whole feature is implemented in slog, and of course, I tried to make it as good as I could have imagined, so my initial point of view is: just take a look at https://docs.rs/slog/2.0.5/slog/trait.KV.html and https://docs.rs/slog/2.0.5/slog/trait.Serializer.html , and potentially other slog pieces, and use it as a starting point for discussion. The KV + Serializer are loosely mimicking serde, just simplified (eg. no sequences support).

Edit: Also, reading the documentation of https://docs.rs/slog/2.0.5/slog/macro.log.html might give a good overview of the features supported. E.g. it's possible to use just someval in place of "k" => val if someval implements KV, so that someval can supply its own key(s) (useful for logging structs); or "k" => %v if v implements Display, etc.

@daboross commented May 20, 2017

When we do add structured logging, there's one question I'm wondering about: what do we encourage users to use?

Binary crates choosing either to go full-formatting (like current log) or full-data is not a problem, but if we continue to support both, what should libraries choose?

Having a library output debug information as structured logging will give the end reader more flexibility, but I'm guessing it would also add a bit more overhead. If all a binary crate wants is a console stream of readable log messages, is it best to construct the additional structured log records and then just use a default conversion to a string?

I'm also curious what the best practices for this in slog currently are, if that would be reasonable to expand on? So far my actual usage of log has been as a leveled println!() replacement with the extra bonus that I get debug messages from my dependencies as well. Not that this has to be answered here - I should probably just go read the documentation - but it might be useful context for adding an implementation to the log crate too!

@dpc commented May 20, 2017

Good question. I'm aware that not everyone buys into the whole structured logging concept. Especially in slog, where more "unusual" features are bundled together, people don't initially see the whole value proposition. Most people are used to "there is one global logger and I give it lines of text" logging and ask "why all this hassle?". I myself became "a believer" once I was working on a heavy-duty production service written in Go, where all the data was logged as JSON and collected into an Elasticsearch cluster, where business analytics and monitoring were performed on it.

It would be useful to check the experiences of other programming language communities. I'm not aware of any language that has first-class, core-encouraged, structured logging support, but maybe there are some.

IMO, if structured logging is to be supported, then it should be encouraged, since as far as I can see it is clearly better. Even if you just log a stream to a console, doesn't the screenshot on https://github.com/slog-rs/slog look better than a plain wall of text? Without structured logging, you can only distinguish the basics: logging level, timestamp, message, etc. With structured logging, there is much more flexibility. And libraries can't decide for every application using them what kind of logging it will need, so they should default to structured, since structured output can cheaply and gracefully be downgraded to a plain line of text.

The performance difference is negligible. In slog, all the structured-logging key-values amount to just one additional pointer to the KV data. If you do info!("received number: {}", n) then there will be one "formatting to string" pass done for you. If you do info!("received"; "number" => n); and then write it to a stream, there will also be one "formatting to string" pass done for you. As far as I can tell, logging performance considerations should mostly be: avoid copying and allocation, don't do anything until it's needed, prepare as much as possible at compile time (statics etc.), and log asynchronously (making sure passing logging data to another thread is cheap).

@daboross commented May 22, 2017

Thanks for going over that!

I think initially I was a bit worried that adding complexity to log would scare new library authors away from adding logging into their libraries, but I think it won't be too bad.

The console logging does look nice! I mean, I would think it'd be nicer with data "inline" with the message, but then that would sacrifice all the rest of the flexibility.

To me,

weak encryption algorithm 'xor' requested by `82.9.9.9:42381` (host: localhost:8080, version 0.5)

looks better than

weak encryption requested, algo: xor, port: 42381, peer_addr: 82.9.9.9, port: 8080, host: localhost, version: 0.5

... but I imagine having ecosystem-wide structured logging would be excellent for larger applications. The coloring available when data is separated like that isn't too bad either!

I think my ideal logging scheme would be something like info!("received number: {number}", number => n), with data both separated out into query-able structured records, and placed inline with the formatting - but that'd probably be a lot of work and complexity in teaching for little overall gain.

In any case, thanks for that!

alexcrichton referenced this issue May 21, 2017: Tracking issue for log evaluation #150 (Closed)
@sanmai-NL commented May 24, 2017

@dtolnay: I propose a hopefully leaner/less magical implementation than making KV part of a macro, as is done now in the slog crate. What do you think about the logging function taking a single value argument of a Serialize type? The deeper structure or representation of a record is then left to the programmer. It would be quite easy to declare a wrapper type for emitting the logging record as some sort of KV dictionary. The advantage is that any logging record has a meaningful, possibly varying type. And of course there's the less-magic aspect: new developers just starting to use logging would not have to learn the intricacies of another set of macros.

@bradleybeddoes commented Jun 19, 2017

I just wanted to highlight one of the features I like best about slog which is simply the ability to add structure to the resulting log message as different parts of the application get involved.

For example I have a Hyper Service. Each Request is given a unique logger instance early on in the call function (a simple clone of a root logger, which is cheap in slog).

Eventually the application accesses the logger and logs something about the Request it has been working on, such as:

// .. current logger is acquired
info!(logger, "stored widget")

What is interesting however is that prior to this the logger has been extended by some other code (that I may or may not know about):

// .. current logger is acquired
let logger = logger.new(o!("request_id" => "123abc"));
// .. updated logger is made available for others as current logger

The output (assuming JSON formatter) is then roughly:

{"ts":"2017-06-19T02:25:08.521265800Z","level":"INFO","msg":"stored widget","request_id":"123abc"}

If I also added some external code to do authentication it might end up as:

{"ts":"2017-06-19T02:25:08.521265800Z","level":"INFO","msg":"stored widget","request_id":"123abc","user_id":100}

I understand that passing a logger instance all over the place may seem clunky to some folks, but if we can figure out a way to maintain this functionality there are a lot of nice benefits to be gained.

@dpc commented Jun 19, 2017

@bradleybeddoes I believe the part you're describing is called "contextual logging". I'm not sure if we should open another issue or consider it a part of this issue.

@sfackler (Member) commented Jun 19, 2017

As long as it's possible to construct a Record via the public API (which we're going to support), I don't think a contextual logging library would be too much trouble to build on top of log.

@lilianmoraru commented Jul 8, 2017

I probably don't understand what "structured logging" means, but it seems like people assume that logging automatically means a terminal or streaming to a file and then reading that file, so you need it to look nice for that use case.

My personal interest (I guess I'm selfish) is to keep the cost of the logger as low as possible, because in my case (a lot of services on an embedded device, all sending logs over the network) people send thousands of logs per second (actually, this is the amount of logs coming from the device, not from every application separately) and I can't afford adding overhead.
Here is the project I'm thinking about (a bit inactive, sorry...): https://github.com/lilianmoraru/dlt-rs

More info about DLT, here: https://at.projects.genivi.org/wiki/display/PROJ/About+DLT-Transport

@KodrAus (Contributor) commented Jul 11, 2017

It would be useful to check experiences of any other programming language community. I'm not aware of any languages that have first-class, core-encouraged, structured logging support, but maybe there are some.

In .NET we have serilog which uses a standard message templating syntax for structured logging.

@nblumhardt and I spent some time a while ago exploring a similar design for Rust. The idea is to log both a textual log message template along with the individual properties and additional context that make up a message. It's up to the particular sink to decide how it consumes the template and the properties, maybe by substituting properties in, composing a new object with a property containing the message, whatever. Libraries don't care where their logs go.

I'll admit I haven't spent a lot of time lately digging into the state of structured logging in Rust, or the API log exposes besides its macros, so I'll go and catch up.

@nblumhardt commented Jul 11, 2017

One of the things we explored with emit was keeping very close to the API of the log crate - just adding a little bit of structured-logging-sauce on top :-)

I.e. here's a typical log event:

info!("Hello, {}!", env::var("USERNAME").unwrap());

And here's the structured extension along the lines of emit:

info!("Hello, {}!", user: env::var("USERNAME").unwrap());

The two events would produce exactly the same textual output, but by naming the user parameter, this would also be available as structured data attached to the event, if the log destination supported it.

Keeping a smooth continuum from completely unstructured, to fully-structured, log events might be preferable to splitting the API into two unrelated subsets, if it could be achieved in a non-breaking way :-)
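To make the continuum idea concrete, here is a rough sketch - not emit's actual implementation; the macro name and output format are invented - of how one invocation could feed the same named argument to both the rendered text and a structured key-value list:

macro_rules! info_structured {
    ($template:tt, $($name:ident : $value:expr),* $(,)?) => {{
        // Bind each value once so it is not evaluated twice.
        $(let $name = $value;)*
        // Textual rendering: substitute the named arguments into the template.
        let message = format!($template, $($name = $name),*);
        // Structured rendering: the same values, keyed by their parameter names.
        // (A real implementation would avoid this Vec allocation.)
        let properties: Vec<(&'static str, String)> =
            vec![$((stringify!($name), $name.to_string())),*];
        println!("{} {:?}", message, properties);
    }};
}

fn main() {
    let username = std::env::var("USERNAME").unwrap_or_else(|_| "world".to_string());
    info_structured!("Hello, {user}!", user: username);
}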

@dpc commented Jul 12, 2017

How is the performance of such structured logging proposals?

"Message Templates" seems like it will involve string parsing/manipulations. Eg. It looks to me that emit is dynamically allocating HashMap every time it logs anything: https://github.com/emit-rs/emit/blob/master/src/lib.rs#L144 . Might be OK for languages that are generally not very performant and overallocate all over the place, but probably in Rust niche, it can be a problem.

The way structured logging is done in slog is very performant, with ability to very quickly offload any formatting and output to different thread(s), with minimal work done by the current thread.

@KodrAus (Contributor) commented Jul 12, 2017

@dpc You're absolutely right, that's an important point. And we expect a core Rust crate to cater to those use cases where performance is paramount. But how you capture properties is more of an implementation detail, whether it's a hash map or some kind of linked list or something else. I think we both agree that the latency of logging from a caller's perspective should be absolutely minimal, and the reality in either case is that a sink has the freedom to consume the structured log however it needs to for its eventual destination.

The main thing I'd like to maintain is not having to pass a logger into the log calls, or having to carry a logger around in libraries for the sake of structured logging. I think it's still important to support that case, but in my experience it's been most helpful to grab the ambient logger and enrich it with some additional state as necessary, rather than chain a logger throughout your application.

@daboross commented Jul 12, 2017

I think it would be awesome if we could get something like emit's syntax working for this, glad you mentioned that nblumhardt.

Perhaps with a proc-macro-hack, or maybe even without, could the log!() macro be made to store a temporary for each argument, then put the arg both in a stack-allocated array of &[(&'static str, &Display)] and pass it on to format_args!()?

A Vec or HashMap could then be allocated by the logging backend if it needed to send the arguments to another thread, or to any other permanent storage.

@sfackler (Member) commented Jul 12, 2017

It's pretty straightforward to turn something like {foo: some_variable, bar: "some_literal"} into &[("foo", &some_variable), ("bar", &"some_literal")] in a normal macro-rules macro.

I would very much like to keep the "normal" logging path allocation free, and that shouldn't be too much harder to make happen with some structured logging.
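For illustration, a sketch of such a macro (the kvs! name is invented; this is not log's or any other crate's actual macro) that builds the slice on the stack without allocating:

use std::fmt::Display;

// Expand `{ key: value, ... }` into a borrowed slice of ("key", &dyn Display)
// pairs, built on the stack with no heap allocation.
macro_rules! kvs {
    ({ $($key:ident : $value:expr),* $(,)? }) => {
        &[ $( (stringify!($key), &$value as &dyn Display) ),* ]
    };
}

fn main() {
    let some_variable = 42;
    let pairs: &[(&'static str, &dyn Display)] =
        kvs!({ foo: some_variable, bar: "some_literal" });
    for (key, value) in pairs {
        println!("{}={}", key, value);
    }
}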

@dpc commented Jul 12, 2017

chain a logger throughout your application

This is part of "contextual logging", and is orthogonal to the structured logging provided by slog. One can use slog-scope and not worry about the Logger object if so inclined.

I was mostly wondering about:

log("User {username} logged in from {ip_address}", username, ipAddress)

from https://messagetemplates.org/. It is a nice idea, but I wonder how hard it would be to generate static information at compile time, using stable-Rust macros.

&[("foo", &some_variable), ("bar", &"some_literal")]

What would the type of that be? I was doing something like that in slog v1: &[(&'static str, &Value)] where Value was a trait, so that users could implement it for other types. But then I switched in slog v2 to just &KeyValuePairs where KeyValuePairs is a trait, since that had many benefits: it is possible to implement KeyValuePairs for any type (arrays, tuples, heterogeneous lists, structs, etc.), keys can be generated at runtime, lazy values are supported, etc. The common case is of course impl KeyValuePairs for [(&'static str, T)]. Since KeyValuePairs provides potentially multiple key-value pairs, it can control the order of the pairs, or be implemented for a combination of contexts, etc.
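A minimal sketch of that shape (names and signatures invented from the description above, not slog's actual definitions):

use std::fmt::Display;

// The consumer side: receives pairs one at a time.
trait Serializer {
    fn emit(&mut self, key: &str, value: &dyn Display);
}

// One argument can contribute any number of key-value pairs.
trait KeyValuePairs {
    fn serialize(&self, serializer: &mut dyn Serializer);
}

// The common case: a slice of (key, value) tuples.
impl<T: Display> KeyValuePairs for [(&'static str, T)] {
    fn serialize(&self, serializer: &mut dyn Serializer) {
        for (key, value) in self {
            serializer.emit(key, value);
        }
    }
}

// Two contexts can be combined without flattening them.
impl<A: KeyValuePairs, B: KeyValuePairs> KeyValuePairs for (A, B) {
    fn serialize(&self, serializer: &mut dyn Serializer) {
        self.0.serialize(serializer);
        self.1.serialize(serializer);
    }
}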

Now that I've looked at https://messagetemplates.org/, I kind of wish I had done it that way in slog. Though I still don't know how to write a macro for it.

@daboross commented Jul 12, 2017

@dpc

For basic logging, this is already somewhat supported with format_args!(). Any custom macro would just have to forward the arguments to it correctly, and then also be able to use them to create an array.

This works in today's log:

info!("a {whatisit} message! {ending}", whatisit="keyword", ending=":)");

http://play.integer32.com/?gist=09caba59aeed61eb13ec7c3261b7c97c&version=stable

@dpc commented Jul 12, 2017

Hmmm... I liked the fact that one does not have to repeat oneself in log("User {username} logged in from {ip_address}", username, ipAddress), but maybe this is even better, since it's not position dependent.

@dpc commented Jul 12, 2017

Also, how do we support "additional" key-values? E.g.

info!("a {whatisit} message! {ending}", whatisit="keyword", ending=":)", request_id=req_Id);

The request_id is assigned to the key-value pairs, but is not actually used in the textual message itself. Sometimes the amount of additional data might be quite big, and not everything is necessary in the text - some of it might be mostly useful for robots.

Somehow the macro would have to make a distinction between the values used in the text message and the additional ones, as format_args!() complains if any argument was not used (rightly so). It could be e.g. an additional separator.

info!("a {whatisit} message! {ending}", whatisit="keyword", ending=":)"; request_id=req_Id);

Also, what are the preferences for the delimiter between key and value? Slog uses => but it is kind of heavy. Would:

info!("a {whatisit} message! {ending}", whatisit: "keyword", ending: ":)"; request_id=req_Id);

be fine with everyone? Or are there any alternative preferences/ideas?

@fluffysquirrels commented Jul 12, 2017

As far as I can tell, neither emit nor slog seems to support arbitrary types as structured values (see emit's Value enum, slog's Serializer trait). I found myself wanting to log e.g. a UserContext struct but could not do so. I'd be very happy to see support for any value with a serde::Serialize implementation, or some similar trait. Of course as a workaround you can convert any struct to JSON and store that.

Regarding the logging macro syntax bikeshedding, I quite like keeping format_args! as a subset, so using = between key and value for string formatting. I slightly prefer ; over , between the string formatting and the structured arguments. I quite like : between key and value for the structured arguments, to match struct literal syntax, but => and = are also OK.

To summarise, I like:

info!("{} {msg}", "Hello", msg="world!"; k: v);
@dpc commented Jul 12, 2017

A lot of people asked for structured types in slog. Sadly it has many challenges; I didn't feel up to the task and just left it out. I think it's a complex subject, and maybe we could use a separate GitHub issue so we keep this one focused on the core of structured logging.

@KodrAus (Contributor) commented Jul 12, 2017

At first I didn't really like the syntax of separating replaceable properties from other properties, because I see the message as just a view over the properties. But thinking about it, rejecting logs at compile time with messages that aren't properly replaceable sounds useful.

smangelsdorf pushed a commit to gotham-rs/gotham that referenced this issue Aug 11, 2017

Add logging
Provides extensive logging across the project, especially at the
TRACE level.

Currently also tracking
rust-lang-nursery/log#149. Keen to have Gotham
move to structured logging as soon as practical.
@theduke commented Oct 28, 2017

What's the status here?

Anything that can be done to move this forward?

@KodrAus (Contributor) commented Jun 16, 2018

Hi everyone! I found some time to revisit this thread and thought I'd share the latest iteration I've been exploring. I don't want to build a sandcastle, so as always I'm keen for any input!

Strawman log API

NOTE: I've used some recent Rust features in my strawman implementation like impl Trait and dyn Trait. That's just because it's convenient. We don't actually need to bump the minimum version to make any of this work.

Some thoughts

I thought I'd share a few of the things I've tried to keep in mind when exploring the ways we could do structured logging in log first.

I'd like to be able to find a design for structured logging here that doesn't require a major breaking change to log. So the structured logging implementation I'm exploring is one that fits in the current design of the log crate.

Right now, log is your entrypoint: it's only got what you need to capture the input given to the log! macros and have that input handled by some registered callback. More sophisticated pipelining of log records by composing handlers together is the responsibility of another framework that uses Log as the on-thread capture of log records.

We might want to revisit that in the future, but that can be worked through orthogonally.

Error handling

Errors in logging pipelines are tricky and there's no single obvious way to handle them.

Right now, the visitors for KeyValues are infallible, because serialization failures to in-memory buffers are usually from malformed input, so there's not much you can do with an error. We might want to revisit this though.

Sending records across threads

The log macros always borrow from their environment. Sending records across threads means serializing and boxing.

This API is much simpler to deal with at the expense of allocations when materializing an owned record. In my experience, most logging pipelines are CPU-bound until the last mile where records are sent over a network or written to a file. Doing CPU-bound processing on-thread before offloading a potentially serialized record for batching IO on another thread is both simple and efficient.

It also means callers don't need to worry about whether the receiver of their records requires owned or borrowed data.
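A tiny sketch of that flow, with simplified stand-in types (OwnedRecord and the channel setup here are invented, not log's API):

use std::sync::mpsc;
use std::thread;

// A fully owned, pre-serialized record that no longer borrows from the call site.
struct OwnedRecord {
    message: String,
    key_values: Vec<(String, String)>,
}

fn main() {
    let (tx, rx) = mpsc::channel::<OwnedRecord>();

    // The last-mile IO happens off-thread.
    let writer = thread::spawn(move || {
        for record in rx {
            println!("{} {:?}", record.message, record.key_values);
        }
    });

    // On-thread: materialize the owned record (this is where the allocations are),
    // then hand it off cheaply.
    tx.send(OwnedRecord {
        message: "hello".to_string(),
        key_values: vec![("user".to_string(), "world".to_string())],
    })
    .unwrap();

    drop(tx);
    writer.join().unwrap();
}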

Design

log!

We can log values that implement ToValue using the standard log macros:

info!("hello, {}", user; kvs: {
    user,
});

(We might want to bikeshed the syntax; the strawman implementation implements the kvs: { ... } part of the macro, but I haven't tried to make it disambiguate from the args passed to the message yet)

It's useful to be able to tweak the way values are logged. We can use an alternative in-built adapter using a standard attribute syntax:

info!("hello, {}", user; kvs: {
    #[log(display)]
    user,
});

This would be useful for types that don't meet the Serialize + Display bound, like Paths and most Errors:

error!("something didn't work"; kvs: {
    #[log(error)]
    err,
    #[log(path)]
    file_path,
});

There's nothing magic about the adapter ident within #[log($adapter)], it just resolves to a path like log::key_values::adapter::map::$adapter. New adapters can be added as new functions to that module without touching the macro.

Custom adapters can also be written as functions in other crates and associated with key values:

info!("hello, {}", user; kvs: {
    #[log(with = some::adapter)]
    user,
});

The signature for some::adapter would be something like:

fn adapter<T: Bounds>(value: T) -> impl ToValue;

Decorating the log macro invocations means we don't need libraries in the ecosystem with types we want to log to opt-in to ToValue.

One drawback of using an attribute-like syntax for decorating key-value pairs is that you might expect other standard attributes like #[cfg] to work too. This might be easier to support using procedural macros instead of macro_rules.

Value

A Value is basically a Serialize || Display. This is so that in environments where we don't have an allocator and can't use erased_serde, we can still log using the same types by default. If erased_serde is unavailable, values will be serialized as strings using the Display implementation. If erased_serde is available, values will be serialized with their Serialize implementation.
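One way to picture that internally (illustrative only; the feature name and layout are invented, not the strawman's actual representation):

enum ValueInner<'a> {
    // Present only when an allocator (and therefore erased_serde) is available.
    #[cfg(feature = "erased-serde")]
    Serde(&'a dyn erased_serde::Serialize),
    // Always available: fall back to rendering the value as a string.
    Display(&'a dyn std::fmt::Display),
}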

ToValue

A type that can be converted into a Value. Anything that implements ToValue can be used as a value in the log macros.

trait ToValue {
    fn to_value(&self) -> Value;
}

There's a blanket implementation for Serialize + Display:

impl<T> ToValue for T where T: serde::Serialize + std::fmt::Display;

There are some interesting implications of requiring Display in addition to Serialize. It means values need to have a single canonical human-readable representation, which is reasonable for values you want to log. Large and complex structures that implement Serialize aren't usually appropriate for log records.

The Serialize + Display bound is met by most standard primitives like i32 and String, and interchange types in the ecosystem like Uuid. However, it does exclude types that you'd reasonably want to log, like Paths, and most Error or Fail implementations. This is where adapters on the log macros come in handy.

KeyValue

A single key-value pair.

trait KeyValue {
    fn key(&self) -> &str;
    fn value(&self) -> Value;
}

This trait has a blanket implementation for tuples:

impl<K, V> KeyValue for (K, V) where K: AsRef<str>, V: ToValue;

KeyValues

A set of key-value pairs that can be visited using a Visitor.

trait KeyValues {
    fn visit(&self, visitor: &mut dyn Visitor);
}

This trait is implemented for slices:

impl<KV> KeyValues for [KV] where KV: KeyValue;

If std is available, it's also implemented for collections like BTreeMap and HashMap.

See some example KeyValues implementations.

Visitor

A type that visits individual key-value pairs.

trait Visitor {
    fn visit_kv(&mut self, kv: &dyn KeyValue);
}

This might involve serializing, filtering, counting etc. The name Visitor is just so it's clearer how this API interacts with serde::Serialize.

See an example Visitor implementation.
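As an illustration of the sorts of things a Visitor could do (this is not from the linked example), a minimal one under the strawman traits might just count pairs:

// Counts the key-value pairs attached to a record, using the strawman
// Visitor/KeyValue traits sketched above.
struct CountingVisitor {
    count: usize,
}

impl Visitor for CountingVisitor {
    fn visit_kv(&mut self, _kv: &dyn KeyValue) {
        self.count += 1;
    }
}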

AsMap + AsSeq

There are a few extension traits to make it easy to serialize KeyValues using serde.

@dpc commented Jun 20, 2018

@KodrAus
Long overdue... I've found some time to take a look.

trait ToValue {
    fn to_value(&self) -> Value;
}

Most of the API is "push", while this one is "pull" - control flow is reversed and I think this will create problems. Some things might require allocating because there is no way to return something holding an internal reference here etc.

I would consider:

trait Value {
    fn serialize_to(&self, serializer: &mut dyn Serializer);
}

and

trait Serializer { // called Visitor in your example
    // make this #[cfg(serde)] or something?
    fn emit(&mut self, to: &dyn serde::Serialize);
    fn emit_display(&mut self, to: &dyn Display);
}

this way the API is always "push" and Value can always prepare anything on stack and pass a reference to it to the Serializer it got.
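To illustrate with the sketch above (self parameters filled in; still pseudocode rather than a real slog or log API): an implementation can build a temporary on its own stack frame and push a reference to it, which a pull-style fn value(&self) -> &X could never return:

struct FullName {
    first: &'static str,
    last: &'static str,
}

impl Value for FullName {
    fn serialize_to(&self, serializer: &mut dyn Serializer) {
        // The concatenated String lives on this stack frame; a pull-style API
        // could not hand out a reference to it.
        let full = format!("{} {}", self.first, self.last);
        serializer.emit_display(&full);
    }
}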

It was a common request to make Key something better than &'static str: https://github.com/slog-rs/slog/blob/master/src/key/dynamic.rs#L46

TBC.

@Thomasdezeeuw (Contributor) commented Jun 21, 2018

I've created PR #281 for consideration in this discussion. It uses the following design:

The trait to represent a key value pair:

pub trait KeyValue {
    fn key(&self) -> &fmt::Display;
    fn value(&self) -> &fmt::Display;
}

And for the log macro:

log!(Level::Error, "Message: {}", "arg1"; id = 123, key2 = "value2");
error!("Message"; id = 123, key2 = "value2");

Currently that means that the log (and other) macros use a &'static str as the key. But in case something else is required, the RecordBuilder can be used; the new kvs method accepts &[&KeyValue] (lifetimes elided).

@dpc commented Jun 21, 2018

@Thomasdezeeuw See my previous comment. Returning &X is not very practical. Let's say an implementation of KV has to create a value by concatenating two strings. Now it has a String on its stack - how is it supposed to return a reference to it?

@KodrAus (Contributor) commented Jun 21, 2018

@Thomasdezeeuw thanks for jumping in!

I think you've nailed the essence of what we're trying to do here, but I think the &dyn Display type for values is unfortunately limiting because it can't retain the structure of the original input. This is important for drains that emit records as structured data rather than text.

@dpc

Most of the API is "push", while this one is "pull" - control flow is reversed and I think this will create problems. Some things might require allocating because there is no way to return something holding an internal reference here etc.

In this case, the serde::Serialize implementation has the same scope as your proposed Value trait, so I don't think this is actually the case here. The goal of Value is to make it possible to keep erased-serde as a private dependency, so that between std and no_std contexts we're only adding APIs. It also abstracts the logic of determining whether to use Display or Serialize and offers us a chance to capture primitives on the stack in the no_std context and avoid displaying them as strings if we can get away with it.

@KodrAus (Contributor) commented Jun 21, 2018

@dpc That is a good point about the reversing of control flow though 👍 Sometimes that's a bit of a smell, so it deserves some more attention.

As for keys as Cow<'a, str>, that seems reasonable. We could support that using a similar facade to Value, since Cow isn't available in libcore.

@dpc commented Jun 22, 2018

@KodrAus Could you elaborate on the private dependency? I mean serde will have to be exported somewhere at least once.

@KodrAus (Contributor) commented Jun 22, 2018

@dpc That's right, serde would be a public dependency, but it supports no_std so that's not a problem. I've been trying to avoid making erased-serde a public dependency though, because it needs to allocate, so it can't work in no_std. So Value makes erased_serde an implementation detail: we can thread arbitrary object-safe &dyn erased_serde::Serializes around, while Value itself implements the non-object-safe serde::Serialize.

@Thomasdezeeuw (Contributor) commented Jun 22, 2018

For logging structs I was thinking about deconstructing them before logging, e.g.

let user = User {
    id: 123,
    name: "Bob".to_owned(),
};

log!(Level::Error, "Some message"; user_id = user.id, user_name = user.name);

Since most structures have at least some fields that might not need to be logged.

I also think that making serde part of the API (outside of the bits currently behind a feature flag) would be a mistake. I think serde is awesome and I use it as well, but that might not be the case for everyone. If log is to be the standard, go-to logging framework that everyone writing Rust code uses, it shouldn't depend on other crates, in my opinion.

@mthebridge commented Jun 22, 2018

Jumping in here because I have interest in this, and to throw in some extra thoughts...

I'm a big fan of just being able to log a single struct as one value and not having to write out all the fields each time.

In fact, at my company we're quite successfully using an extension we've written for slog (https://github.com/slog-rs/extlog) which allows defining logs entirely as objects, rather than as "string + KV data". It would be great if the log crate could do something similar - at least as far as having the log macros be able to take a log "object" rather than a string.

Something like:

/// Define the log.
#[derive(LogObject)]
#[LogDetails(message = "An event has happened", level = "info")]
pub struct MyAwesomeLog {
    pub name: String,
    pub count: u32,
}

// Then later...
log!(MyAwesomeLog { name: "my_user".to_string(), count: 42 });

This is equivalent to writing info!("An event has happened", name => "my_user".to_string(), count => 42); - but there are advantages to this approach: it gives you compile-time checking of log arguments; it defines the logs as an API, so you can document them and easily spot when they change (useful when you're using a distributed logging solution and building monitoring systems on top of your logs, to ensure they don't change without good reason); and it allows you to take advantage of high-performance log handling that doesn't need string parsing (if your log records get sent somewhere in a binary format).
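Purely as a hypothetical illustration of what such a derive might expand to (the LogObject trait here is invented; it is not extlog's actual trait or output):

// Invented trait for illustration; a real derive would target whatever trait the
// logging crate defines.
trait LogObject {
    const MESSAGE: &'static str;
    const LEVEL: &'static str;
    fn key_values(&self) -> Vec<(&'static str, String)>;
}

// What #[derive(LogObject)] on the MyAwesomeLog struct above could generate.
impl LogObject for MyAwesomeLog {
    const MESSAGE: &'static str = "An event has happened";
    const LEVEL: &'static str = "info";

    fn key_values(&self) -> Vec<(&'static str, String)> {
        vec![("name", self.name.clone()), ("count", self.count.to_string())]
    }
}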

@KodrAus (Contributor) commented Jun 22, 2018

@Thomasdezeeuw This does still have the problem that user_id can only be interpreted as text, not as a number.

I have a different opinion about leaning on serde: because any API for capturing structured data is inevitably going to look something like serde's, it makes sense to create that link, so that effort invested into serde provides improvements for log, and effort invested into Serialize implementations in the ecosystem results in more types that can be logged out of the box. I don't imagine serde diverging from its current path enough to make its use in log problematic.

What are some concrete downsides you can see from using serde?

@KodrAus (Contributor) commented Jun 22, 2018

@mthebridge Neat! I can see why some folks would find it appealing to log that way. I think we could achieve something similar in log by using the proposed KeyValues trait. As for the log macro itself, since macros are imported like other items in the 2018 Rust edition, I think it'd be straightforward to define your own log macros in your application that expect an implementation of LogObject and construct a RecordBuilder from it.

@Thomasdezeeuw (Contributor) commented Jun 22, 2018

@KodrAus I don't really need the actual type for the value; Display is good enough for me. However, when you do need the actual type, or a way to get it, things get more complicated, and indeed my proposed API won't work.

As for the dependency on serde, my main worry is stability. Currently log has a fairly small API surface, with no dependencies (again ignoring the bits behind feature flags). If log had a dependency on serde (or any other crate for that matter) and it were exposed in the API, it would also be fixed in it. Meaning that if serde were updated to v2, what should log do? Change to v2, possibly breaking all existing code? And what about old code that won't/can't be updated? Or what if the ecosystem moves away from serde to another serialisation crate? Again, what should log do? Stay with the now "old and unused" crate, or again publish a code-breaking v2?

I think that the API should remain small, yet flexible, so that log remains the standard for the years, and decades, to come.

@mthebridge I like the idea, but maybe that could be done in another crate? Like serde-derive, e.g. log-derive. And then have a trait inside the log crate to support it.

@mthebridge commented Jun 22, 2018

@Thomasdezeeuw Oh, yes - I'm not saying this should be in the core log crate, but that it would be good if code using this kind of approach can still use the same logging API. If the API is flexible enough, then I'm perfectly happy with adding extra crates on top. :)

@KodrAus Thanks, I wasn't aware of the proposals to change macro imports - I assume this is rust-lang/rust#35896? Overriding the log macro feels a bit hokey, but I guess if it works then that solves my concerns!

@KodrAus (Contributor) commented Jun 23, 2018

@Thomasdezeeuw Stability is a valid concern. However, I'm confident there's no way log could be the standard for years to come if we don't support capturing structured data, and since logging is usually a second-class citizen we'll get the best results if we leverage existing effort.

As for stability itself, serde is just about always a public dependency so I imagine any breaking changes to serde, if there were ever going to be any, would be made with backwards compatibility in mind because a lot of libraries and frameworks interact through its API. For the same reason I think any alternative serialization framework would need to integrate with serde in some way to gain momentum.

There's no concrete answer to the hypothetical scenario where serde v1 no longer allows users of log to capture the data it once did until that scenario becomes concrete. I imagine that's a case we as a community would need to work through and there are probably ways we could make old and new APIs work together, depending on what actually changes. So I feel like the stability risk is already shared by the ecosystem and outweighed by the benefits of making the vast majority of data types and formats now and in the future just work with log. I think the way I'm proposing we couple the crates together makes more sense than building a similar API ourselves. It's probably also worth noting that serde would still be optional, just enable some more features.

But those are just my thoughts. I'm sure @dtolnay has plenty of thoughts about serde's role as a public dependency in the crates ecosystem that might be totally different. It's definitely an important discussion to have.

@KodrAus (Contributor) commented Jul 9, 2018

I went back and had another look at the last API I was playing with based on @dpc's comment about control flow and rejigged it so keys and values are passed in to the Visitor rather than being pulled off them. That pushes the inversion up a layer so implementations of KeyValue can work with temporary state more easily. The ToValue trait remains the same, but isn't part of the core API.

I retained the Value struct as a way to keep erased_serde private and as a place we can do some trickery to retain the structure of primitive types like i32, bool, f32 etc in no-std contexts so they don't end up being emitted as Display. Key is a similar wrapper for strings that could support an owned as well as borrowed variant.

Overall, the public API could look something like this:

// --- RecordBuilder

impl<'a> RecordBuilder<'a> {
    fn key_values(&mut self, kvs: &'a dyn KeyValues) -> &mut RecordBuilder<'a>;
}


// --- Record

impl Record {
    fn key_values(&self) -> &dyn KeyValues;
    fn to_builder(&self) -> RecordBuilder;
}


// --- Visitor

trait Visitor {
    fn visit_pair(&mut self, k: Key, v: Value);
}

impl<'a, T: ?Sized> Visitor for &'a mut T where
    T: Visitor
{}


// --- KeyValues

trait KeyValues {
    fn visit(&self, visitor: &mut dyn Visitor);
}

impl<'a, T: ?Sized> KeyValues for &'a T where
    T: KeyValues
{}

impl<KV> KeyValues for [KV] where
    KV: KeyValue
{}

#[cfg(feature = "std")]
impl<KV> KeyValues for Vec<KV> where
    KV: KeyValue
{}

#[cfg(feature = "std")]
impl<K, V> KeyValues for BTreeMap<K, V> where
    for<'a> (&'a K, &'a V): KeyValue
{}

#[cfg(feature = "std")]
impl<K, V> KeyValues for HashMap<K, V> where
    for<'a> (&'a K, &'a V): KeyValue,
    K: Eq + Hash
{}


// --- KeyValue

trait KeyValue {
    fn visit(&self, visitor: &mut dyn Visitor);
}

impl<'a, T: ?Sized> KeyValue for &'a T where
    T: KeyValue
{}

impl<K, V> KeyValue for (K, V) where
    K: ToKey,
    V: ToValue
{}


// --- ToKey

trait ToKey {
    fn to_key(&self) -> Key;
}

impl<'a> ToKey for &'a dyn ToKey {}

impl<T> ToKey for T where
    T: AsRef<str>
{}


// --- ToValue

trait ToValue {
    fn to_value(&self) -> Value;
}

impl<'a> ToValue for &'a dyn ToValue {}

impl<T> ToValue for T where
    T: serde::Serialize + Display
{}


// --- Key

struct Key<'a>:
    ToKey +
    serde::Serialize +
    Display +
    Clone +
    PartialEq +
    Eq +
    PartialOrd +
    Ord +
    Hash +
    AsRef<str> +
    Borrow<str>;

impl<'a> Key<'a> {
    fn from_str(&'a str) -> Self;
    fn as_str(&self) -> &str;
}


// --- Value

struct Value<'a>:
    ToValue +
    serde::Serialize +
    Clone;

impl<'a> Value<'a> {
    fn new(v: &'a (impl serde::Serialize + Display)) -> Self;
    fn from_display(v: &'a impl Display) -> Self;
    fn from_serde(v: &'a impl serde::Serialize) -> Self;
}

I was thinking it might be good to put together some kind of RFC document for structured logging, including the various alternatives that have been proposed here and log's relationship with slog. Personally, I don't think we necessarily need to unify the log and slog ecosystems by deprecating one in favour of the other. It's not uncommon for a language ecosystem to include a few logging frameworks, each making different tradeoffs. It might be worth exploring the ways libraries and applications currently work through the combination of log and slog, and whether there's room for improvement.

What do you all think?

@dpc commented Jul 21, 2018

@KodrAus

Great work pushing it all forward. I like this API.

About slog and log relationship:

I am lacking time to maintain a separate ecosystem for Rust logging, and my family is going to get bigger soon, so I will have even less of it.

The reason slog was created is that I was working with Go codebases and libraries implementing structured/contextual logging (zap/log15), and I got sold on both ideas. It really makes sense in a bigger system, with many modules, components, workers, threads, etc.

If log had a "structured" part built in (as per this issue), then slog would really only have the "contextual" part left on top of log.

I would be very happy if slog could become just a small extension on top of log, or even be fully replaced by it, with the "contextual" parts used only by "enterprise-y" users - since contextual logging comes with the usability price of passing "context" (slog::Loggers) around.

The thing with contextual logging is that logging records with context carry a lot of data. For asynchronous logging, serializing them on the current thread is going to be slow. That is why in slog there is a distinction between "Borrowed" (immediate) KVs and "Owned" (long-lived, shared between many logging statements) KVs. Owned KVs are thread-safe and can be passed between threads without copying/serializing. The idea is: it costs a bit more to create them (they are long chains of Arcs), but they can be extended, manipulated, and passed cheaply to other threads. For log to support slog, there would have to be a way to pass thread-safe KeyValues. log might not necessarily have end-user APIs for using them, but the facilities would be there.
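A rough sketch of that owned, Arc-chained shape (simplified, with invented names; not slog's actual OwnedKV types):

use std::sync::Arc;

// Each context extension adds one node; cloning the whole context or sending it
// to another thread is just an Arc clone, with no copying or re-serializing.
struct Node {
    key: &'static str,
    value: String, // simplified: a real version would hold a thread-safe KV trait object
    parent: Option<Arc<Node>>,
}

#[derive(Clone, Default)]
struct OwnedContext {
    head: Option<Arc<Node>>,
}

impl OwnedContext {
    fn with(&self, key: &'static str, value: String) -> Self {
        OwnedContext {
            head: Some(Arc::new(Node {
                key,
                value,
                parent: self.head.clone(),
            })),
        }
    }

    fn for_each(&self, mut f: impl FnMut(&'static str, &str)) {
        let mut node = self.head.as_deref();
        while let Some(n) = node {
            f(n.key, n.value.as_str());
            node = n.parent.as_deref();
        }
    }
}

fn main() {
    let root = OwnedContext::default();
    let request = root.with("request_id", "123abc".to_string());
    let authed = request.with("user_id", "100".to_string());

    // The context is Send, so a logging thread can walk the chain later.
    std::thread::spawn(move || authed.for_each(|k, v| println!("{}={}", k, v)))
        .join()
        .unwrap();
}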

Other than that, I think the nice part about the slog ecosystem is composability. Slog has the concept of a Drain trait, and all the logging libraries implement it. Thanks to that you can take envlogger-like RUST_LOG-controlled filtering, combine it with an asynchronous logger, and output some stuff to a file, some to logstash in JSON, and some to syslog, all without much sweat. I would really love for log to have a trait like slog::Drain to allow that (see also #267).

I think you should push on with this and, after getting some opinions from the maintainers of log, move forward. :)

BTW, I had some time for sketching and I've created a branch of "how I would like to improve slog in the next major version": https://github.com/slog-rs/slog/blob/v3/src/lib.rs Just in case you're curious. It matches your API almost 1 to 1, and the small differences are mostly there to support "contextual" logging.

@KodrAus (Contributor) commented Mar 10, 2019

We've just merged an RFC for an implementation 🎉 I'll start working towards landing some of the key pieces of that RFC shortly.
