Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for kv_unstable (structured logging) #328

Open
15 of 29 tasks
KodrAus opened this issue Apr 29, 2019 · 31 comments
Open
15 of 29 tasks

Tracking issue for kv_unstable (structured logging) #328

KodrAus opened this issue Apr 29, 2019 · 31 comments

Comments

@KodrAus
Copy link
Contributor

@KodrAus KodrAus commented Apr 29, 2019

Tracking issue for the implementation of #296. See also #149

  • Initial implementation (#324)
  • Integration with frameworks and loggers:
  • Explore macro support (#343)
    • log!("msg {}", interpolated, { key: value }) (#353) - We ended up having to revert this in log, but are experimenting externally
    • log_kv_macro
    • log_macros
  • Integration with serialization frameworks:
  • Micro benchmarks
  • Documentation and examples
  • Stabilization review
  • Open questions
    • Should Source::count return a Result<usize> instead of a plain usize?
@yoshuawuyts
Copy link
Member

@yoshuawuyts yoshuawuyts commented Jul 2, 2019

@KodrAus thanks for all the work you've put into this (and linking to this issue from a forum post).

I was curious: is there a release planned that includes the beta version of kv? I actually wanted to use the beta for async-log, but given there wasn't any release yet it wasn't possible 😅

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Jul 4, 2019

Hi @yoshuawuyts! 👋

Right now any release of log we publish will include this API under the kv_unstable feature gate, but we're not ready to commit to backwards compatibility just yet. So if you depend on log from git then you can access the API:

[dependencies.log]
git = "https://github.com/rust-lang-nursery/log.git"
features = ["kv_unstable"]

I think what we need are a few draft PRs in some of these log frameworks that tease out any breaking changes we should make, and then will be able to publish a release that replaces the kv_unstable feature gate with a kv one.

@yoshuawuyts
Copy link
Member

@yoshuawuyts yoshuawuyts commented Jul 4, 2019

Right now any release of log we publish will include this API under the kv_unstable feature gate,

Oh oops, maybe I was being a bit vague. The last release on crates.io was done in October 2018, which means currently no releases ship with kv_unstable on crates.io. Which in turn (I believe) means we can't ship any versions to crates.io.

In order to help test this feature, it would be nice if there was a release on crates.io with the ["kv_unstable"] feature available. For experimental technology such as async-log I think we can get away with publishing a version that directly depends on this feature, and keep updating it as it's polished.

So my question is: could a version be of log be published to crates.io that includes the experimental flag?

P.S. Thanks for taking time out of your day to reply to my questions!

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Jul 6, 2019

In order to help test this feature, it would be nice if there was a release on crates.io with the ["kv_unstable"] feature available. For experimental technology such as async-log I think we can get away with publishing a version that directly depends on this feature, and keep updating it as it's polished.

Ah this sounds fair enough. I think we can get pretty far without needing to break anything, but it's probably going to happen at some point. But right now it is harder for folks to get started with these new APIs, so once #339 lands I'll put together a release so you can depend on it from crates.io and browse the docs on docs.rs.

@yoshuawuyts
Copy link
Member

@yoshuawuyts yoshuawuyts commented Jul 21, 2019

@KodrAus in the top-level post there's an entry about "Explore macro support". Do you have any plans to experiment with this?

We've integrated key-value logging in lrlna/femme#1; all that's missing now is a way to create key-value pairs and send them to a logger. Even a small code example would be helpful, but ideally there'd be a small log crate that could work with this.

Also: is it perhaps useful to open a separate issue to track/discuss logging macros?


edit 2019-07-21:
Ah, I think I figured out how to log a kv-pair!

let record = log::Record::builder().key_values(kv_pairs).build();
log::logger().log(&record);

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Jul 21, 2019

Ah yes that Record::key_values method is the way you can attach key-value pairs to a record in the absence of macro support. It needs to be a &dyn Source.

Opening a separate issue to design macro support sounds like a good idea! We’ve got a couple of constraints but also plenty of opportunity to improve things.

@yoshuawuyts
Copy link
Member

@yoshuawuyts yoshuawuyts commented Jul 22, 2019

Macro issue created: #343

@yoshuawuyts
Copy link
Member

@yoshuawuyts yoshuawuyts commented Sep 1, 2019

Created a crate for shared types in key-value logs (to mark pairs of logs) -- https://github.com/yoshuawuyts/log-types. Not sure yet how useful this is, but I think it might be a fun reference of how to create source/key/value structs, and probably a useful experiment (:

@yoshuawuyts yoshuawuyts mentioned this issue Sep 1, 2019
1 task
@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Sep 1, 2019

Having some standards around well-known key-value pairs is something I've also been exploring in a little wrapper. It seems like it'll be useful for a more composable logging ecosystem.

@yoshuawuyts
Copy link
Member

@yoshuawuyts yoshuawuyts commented Sep 1, 2019

@KodrAus Yeah for sure! -- I wonder which others to add; I only really thought of spans::{start,end} and http::{request,response}. Though I guess timestamps, host, ip might be common too?

Oh also, I saw femme is not listed yet; could we perhaps add it to the tracking issue?

bors bot added a commit to async-rs/async-std that referenced this issue Sep 16, 2019
199: remove custom log code in favor of macro crate r=stjepang a=yoshuawuyts

This removes our custom log macro code in favor of using [`kv-log-macro`](https://github.com/yoshuawuyts/kv-log-macro). This is a temporary crate that exists only until rust-lang/log#353 lands which enables progress on rust-lang/log#328. Thanks!

Co-authored-by: Yoshua Wuyts <yoshuawuyts@gmail.com>
bors bot added a commit to async-rs/async-std that referenced this issue Sep 16, 2019
199: remove custom log code in favor of macro crate r=yoshuawuyts a=yoshuawuyts

This removes our custom log macro code in favor of using [`kv-log-macro`](https://github.com/yoshuawuyts/kv-log-macro). This is a temporary crate that exists only until rust-lang/log#353 lands which enables progress on rust-lang/log#328. Thanks!

Co-authored-by: Yoshua Wuyts <yoshuawuyts@gmail.com>
@atroche
Copy link

@atroche atroche commented Sep 17, 2019

I'm hesitant to add to the list of todos because I really want this feature, but it might be worth adding fern as another crate to integrate with. I don't use it (I use slog), but it seems like it has quite a lot of usage (600,000+ all time downloads on crates.io).

@MOZGIII
Copy link

@MOZGIII MOZGIII commented Oct 16, 2019

Should Source::count return a Result<usize> instead of a plain usize?
What if we move the Source::count to a separate trait:

trait SourceCount: Source {
    fn count(&self) -> usize;
}

Rationale is it looks more reasonable to just make it a type-system question whether a certain type has a notion of a count of it's kvs or not.

And if that's not the desired use case for making count return Result<usize> - then what is? What do you see as possible error values? Why not just Option<usize>? Some lazy-evaluated kv sources might not have count known upfront, i.e. something like processes list (ps ax) might be collected on demand.

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Oct 17, 2019

What if we move the Source::count to a separate trait

@MOZGIII unfortunately a separate trait wouldn't really work because we'd lose the implementation when erasing as dyn Source.

What do you see as possible error values?

The desire for Result would just be to be able to pass through any errors from the default implementation that calls the fallible visit_pair method. But Option seems worthwhile!

Some lazy-evaluated kv sources might not have count known upfront

Hmm, I don't think the current implementation really supports this kind of scenario because pairs can be visited and fetched at any point. I also don't think it's really desirable to support either, because pairs may be inspected and visited multiple times any expensive work or externally changing information could be confusing to end consumers.

@MOZGIII
Copy link

@MOZGIII MOZGIII commented Oct 17, 2019

Given all that, I'm having hard time imagining a case when the count would be None (or Err). Do you have an example in mind?
How is the count used? Is it like iterator's size_hint?

@MOZGIII
Copy link

@MOZGIII MOZGIII commented Oct 17, 2019

Hmm, I don't think the current implementation really supports this kind of scenario because pairs can be visited and fetched at any point. I also don't think it's really desirable to support either, because pairs may be inspected and visited multiple times any expensive work or externally changing information could be confusing to end consumers.

From how the code is set up, it looks fine: the Source can be used at any time to collect values via Source::visit. There is an implementation note on visit fn A source should yield the same key-value pairs to a subsequent visitor unless that visitor itself fails. that advocates against sources being dynamic. After looking around the documentation more, I found the uses for Source, and it makes a lot of sense for it to be required to not be dynamic. 👍

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Jan 31, 2020

I've spent a little time looking at how we could support attempting to coerce structured values to concrete types. This is something that would make logging frameworks that want to enrich log::Records with their own well-known types nicer. There's a PR in #379

Once we have that and #378, we can add support for std::error::Error to structured values too. It might look something like:

impl<'v> Value<'v> {
    pub fn from_error<E>(err: &'v E) -> Self
    where
        E: Error + 'static;

    pub fn get_error(&self) -> Option<&(dyn Error + 'static)>;
}

@softprops
Copy link

@softprops softprops commented Jun 28, 2020

I see there's a 'kv_unstable' feature flag in the latest version of the log crate but it's unclear how to interface with in from an application. Is this not something we can do today?

@softprops
Copy link

@softprops softprops commented Jun 28, 2020

I found this commit which seems like it reverted an interface users could leverage is https://docs.rs/kv-log-macro the crate users should be guided towards?

@yoshuawuyts
Copy link
Member

@yoshuawuyts yoshuawuyts commented Jun 30, 2020

@softprops yeah, we're still in the process of figuring out the macros. Currently kv-log-macro is one of the most convenient ways to use the kv feature, but the goal is to eventually have this functionality in log's macros itself.

@pickfire
Copy link

@pickfire pickfire commented Jun 30, 2020

Have we figured out other ways to preserve states? The current log!("", { key: value }) will be quite duplicated when the same { key: value } needs to be reused in different scopes.

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Jun 30, 2020

@yoshuawuyts and I discussed the path to stabilization for the kv support and came to the conclusion that there's really not much blocking calling the kv module stable and continuing to work on it without breakage. I think we've explored enough of the design space to cover macro support and integration with tracing properly.

What about macros?

We don't currently have macro support in log itself, but since we're bound to std::fmt the macros we can provide in here in a backwards compatible way can't be as capable as what we produce externally. We will still provide macro support in log itself, but that doesn't need to block removing the kv_unstable feature gate.

Right now, if you want macro support you can check out kv_log_macro. It's an almost drop-in replacement for the existing macros.

I'm also experimenting with something completely greenfield for macros that I hope to talk about more when it's fleshed out.

So what's left?

I'd like to get #396 merged, which lets us capture values using a trait bound like impl Display + 'static but still determine at compile-time whether the actual value is one of a number of primitives like i32. Until 1.46.0 is released in August it'll have a small runtime hit though. I think that's the last breaking behavioural change we need to make then can do a thorough API review.

What's after stabilization?

@yoshuawuyts brought up the possibility of adding logging/tracing support into std, which I think is worth exploring. I've been framing it as a structured/contextual enhancement to std::fmt but going down that route might be the best possible way to make the existing log macros natively structured.

@softprops
Copy link

@softprops softprops commented Jul 6, 2020

What's the story for the kv visitor trait and serde. It seems today the visit api gives you a access to a Key and Value. You can get the &str value of a Key but values are a bit opaque. I know that they implement Display and Debug but i'd rather depend on their serde::Serialize impl when serializing them out.

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Jul 6, 2020

Hey @softprops 👋

It would be pretty unfortunate for structured values to only support the unstructured Debug or Display outputs. I wrote a bit about serde in #388 (comment).

The gist of it is we currently support serde in a roundabout way using a different structured framework, but should definitely add first-class serde support using erased-serde (we could fall back to the alternative framework when alloc isn’t available). There’s nothing blocking this and it can be implemented at any time.

@softprops
Copy link

@softprops softprops commented Jul 6, 2020

@KodrAus that's awesome! Is anyone actively working on this? I'd love to help but am not caught up on all the current and prior discussion.

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Jul 12, 2020

@softprops As far as I know it’s not something being actively worked on yet. If you’d like to take a look that would be great! Otherwise it was next on my list after #396, since that changes the shape of the value internals a bit.

To add serde support we’ll effectively add the following public API:

impl<‘v> Value<‘v> {
    pub fn from_serde(value: &’v (impl serde::Serialize + ‘static)) -> Self;
}

impl<‘v> serde::Serialize for Value<‘v> {}

The things we’d need to do internally would be:

  • Add optional dependencies on erased-serde and serde-fmt
  • Add a kv_unstable_serde feature that pulls in the std feature, sval/serde, serde-fmt and serde
  • Add a new kv::value::internal::serde module, probably following a similar structure to internal::sval
  • When both kv_unstable_sval and kv_unstable_serde are available, we can use sval‘s serde bridge to implement the Visitor::sval method in the serde module and the Visitor::serde method in the sval module
  • Use serde-fmt to implement the Visitor::debug and Visitor::display methods in the serde module
  • Add a new Serde variant to internal::Inner
  • Add a new serde method to internal::Visitor
  • Add kv_unstable_serde to our GitHub Actions on its own and combined with kv_unstable_sval

But if anybody waned to dig in I’d be happy to help! The best place to reach out would probably be on Zulip.

@softprops
Copy link

@softprops softprops commented Jul 12, 2020

This sounds great. I'll be sure to drop a note here if I start anything.

@uazu
Copy link

@uazu uazu commented Oct 8, 2020

I had some issues getting 0.4.11 structured logging working. I've written my own logging macros because I have some specific requirements, and I'm also writing my own Logger to receive all this data because the outgoing logging channels I have to interface to are custom. I chose to put it all through log crate because I also want to capture unstructured logging. Some observations/questions:

  • If the Visitor only has one method, it would be better to just replace it with a FnMut, making Logger code simpler
  • However right now to get a binary value out, the Visitor has to call various different to_* methods in turn on the Value to find what type it contains, so wouldn't it be better to turn that around and have several different methods on the Visitor which would be called according to the type of the value? e.g. visit_pair_u64, visit_pair_str, etc. The types handled by OpenTelemetry for example are limited to i64, bool, f64 and str, plus arrays and tables. So ignoring the nested values for the moment, and adding u64, that requires just five visitor methods to cover it.
  • It might be good to point people to the fact that Value implements Display, because for text-log output, I started out by querying the different to_* methods and formatting for each one. However this didn't work because I couldn't find a way to get a value created with Value::from_display out using the to_* methods. So maybe it would be best to have two visit calls, one for text-style output, which just takes a FnMut and gives it a &str for each value (which would point to either the original value or else an internal temporary format buffer), and another for getting binary data out, which has a Visitor with multiple entry-points (for i64/u64/f64/bool/str/etc). I haven't implemented binary output yet, but it seems like right now I'll have to try various to_* methods in turn, and then finally try a format!("{}", val). This doesn't seem very efficient. (The log crate is advertised as "lightweight".)

I was trying to understand the current code. There is a lot of dynamic dispatch going on, and a lot of type machinery. Is all this necessary? If it is, then fine. However did you consider something like a HCons-style approach? i.e. construct a nested type which would result in all the heterogenous keys/values being stored sequentially in stack memory with minimal padding (e.g. KvInt(("key", 123, KvStr(("key2", &strvar, KvEnd))))), and let the compiler build a single visit function for it all that accepts a Visitor and visits them all (where the Visitor has an entry-point for each fundamental supported type). That would make good use of static knowledge from the callsite to optimise at least the first level of processing. (Nested arrays or tables would need some kind of dynamic approach.) I can provide some prototype code if this isn't clear.

@uazu
Copy link

@uazu uazu commented Oct 9, 2020

Actually, thinking about it some more, would it be possible to make it work with a closure like this:

logger.log(&Record::builder()
           .whatever(...)
           .key_values(|visitor| {
               Value::from(&myvar1).visit(visitor, "key1");
               Value::from(&myvar2).visit(visitor, "key2");
               /// etc
           })
           .build());

That way, assuming all the Value-related calls are marked #[inline] it should all get inlined to produce flat code that just executes a series of calls on the visitor instance. This assumes that the Visitor has separate entry points for major types (u64, i64, f64, bool, str). The closure will build a hidden structure on the stack which contains the references to all the variables being logged. So this should be really efficient and lightweight. I could prototype this if it is of any interest.

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Dec 21, 2020

Hey @uazu! 👋

Sorry for taking so long to circle back and thanks for putting together your thoughts 😃

If the Visitor only has one method, it would be better to just replace it with a FnMut, making Logger code simpler

The reason Visitor is a trait instead of just a FnMut is that it gives us a place to add provided methods like get and count, so Visitor::visit_pair is like Iterator::next. You only implement that one method but get all the other combinators on top. For your last example, you can already do this:

logger.log(&Record::builder()
           .whatever(...)
           .key_values(&[
               ("key1", Value::from(&myvar1)),
               ("key2", Value::from(&myvar2)),
               /// etc
           ])
           .build());

However right now to get a binary value out, the Visitor has to call various different to_* methods in turn on the Value to find what type it contains, so wouldn't it be better to turn that around and have several different methods on the Visitor which would be called according to the type of the value? e.g. visit_pair_u64, visit_pair_str, etc. The types handled by OpenTelemetry for example are limited to i64, bool, f64 and str, plus arrays and tables. So ignoring the nested values for the moment, and adding u64, that requires just five visitor methods to cover it.

The goal of the Value::to_* methods is for more ad-hoc inspection. If you're writing a log sink that wants to do something particular with a timestamp or trace identifier for instance you can check whether the key-value pairs you're given contains a value called timestamp, and then check whether that timestamp is a string or a chrono::DateTime<Utc>. They're really inconvenient for trying to serialize structured values directly. In order to visit all pairs and serialize them you use either the sval::Value or serde::Serialize implementations which have the API you're describing.

The kicker for the visitor API is the plus arrays and tables part. Once you have to support complex datastructures the API becomes a lot more complex and the open design space a lot bigger than just simple primitives. Not supporting arrays and maps isn't really an option because you certainly want to log them. So we offer two options in log for properly serializing values: there's sval for no-std serialization and serde for wider compatibility. Both can handle any shaped datastructures and require fairly large APIs to do so.

There is a lot of dynamic dispatch going on, and a lot of type machinery. Is all this necessary? If it is, then fine. However did you consider something like a HCons-style approach?

I'd be interested to see a HCons-style approach! My experience with H-lists in Rust has been that they're somewhat going against the grain of the language and so require a lot more type machinery, various hacks, and tend to produce large and incomprehensible error messages. One of the goals of log as well is to generate as little code in callsites as possible, ideally just a single function call. But that doesn't mean exploring different approaches isn't worthwhile! We settled on trait objects fairly early and never deeply considered trying to go the other way into fully generic code.

@uazu
Copy link

@uazu uazu commented Dec 21, 2020

@KodrAus Thanks for replying.

In my opinion, since 'log' is supposed to be a lightweight logging crate, the key-value logging mechanism should be lightweight too. It doesn't need to support everything -- a superset of JSON and OpenTelemetry style values would be enough. It should be streamlined for dumping the data to a stream, since that's the most common operation.

In case it's of interest: My original reason to look into this was to add minimal key-value logging support to my Stakker crate. In the end I implemented my own minimal logging interface directly in Stakker, with the logging macros in a separate crate: stakker_log. This was done because none of the crates out there did quite what I needed. Also because I needed to transparently pass through a span ID (which is effectively an actor number) with the log records, to support span-based logging. Also, since Stakker is single-threaded, it's more efficient to log within the same thread, rather than to something which then has to use a Mutex or whatever. The intention is that the user-provided logging handler could either dump out directly, or forward logging to whatever logging crate the user prefers. This does also handle nested values (arrays and maps), although in a very minimal visitor style: just a closure at the point of logging which dumps the data as calls to a visitor closure which outputs it.

@KodrAus
Copy link
Contributor Author

@KodrAus KodrAus commented Dec 23, 2020

since 'log' is supposed to be a lightweight logging crate, the key-value logging mechanism should be lightweight too.

That's absolutely important! One of the balancing acts we're playing in log is that it needs to be lightweight, but because it exists between other libraries and frameworks it also needs to be permissive enough to retain as much context as it can when dealing with data from other sources. So by default it supports visiting pairs of displayable key-value pairs and additional features let it integrate with more feature-full serialization frameworks. Your thoughts did end up prompting #426, which makes it easier to serialize the complete set of key-value pairs as either a map or list.

In the end I implemented my own minimal logging interface directly in Stakker

Ah it's nice to see the kind of API you've been talking about in practice there. Your LogVisitor trait actually looks pretty similar to what I ended up with in sval. That library was actually spiked out within log originally, but I pulled it out because it become too big to maintain inline. That's the same story with value-bag, which is the basis of log's value API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
7 participants