Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'tracing' library instead of 'log'. #1579

Closed
wants to merge 4 commits into from

Conversation

jebrosen
Copy link
Collaborator

@jebrosen jebrosen commented Mar 15, 2021

Supersedes #1410.

I likely missed a few spots and will need to make a few minor changes in follow-on commits, but this basically "ready to review".

TODOs:

  • examples/todo should use rocket::tracing, not log
  • trace_env_logger example: hi should take and return an &str, not String. And every macro except trace is used via rocket::; should we just import them?
  • Some config values on launch are colored white and others aren't.
  • Change the hash used to disambiguate requests in the log, and/or the way it is formatted. Span Ids generated by Registry are a multi-level bit-packed generation index, and I threw something together that seemed to look nice on 64-bit systems.
  • Lots of lines now exceed 100 characters, largely those that name the trace/span. We should avoid this.
  • Remove in_scope calls
    • especially the large ones.
  • Benchmark compile and run time with and without tracing's compile time filters to estimate overhead
  • Integrate additions and changes made in logger.rs that have not been rebased into trace.rs
  • Review every log message and make sure it's been updated if necessary
    • name spans consistently, to avoid confusion with message
    • Use span fields in more places. e.g.: config errors
  • API organization, re-exports, and docs
    • Potentially re-export the entire tracing and tracing-subscriber crates, like is done for futures, tokio, and figment today.

@SergioBenitez SergioBenitez mentioned this pull request Mar 17, 2021
9 tasks
Copy link
Member

@SergioBenitez SergioBenitez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read through trace.rs itself - I'll hold off on this until the rest of the code has undergone review - but I've left some comments on the rest of the changes. Some higher-level comments:

  • The rocket library feature should be called tracing or similar. I see now that the log feature is for enabling compatibility with the log crate. It does not appear that we can disable tracing at all. What kind of compile-time overhead are we adding? What kind of runtime overhead are we adding?

  • Is the feature intended to be disabled? If so, it should be documented in
    lib.rs, be added as a tested feature in scripts/test.sh, and explicitly
    enabled in examples that make use of it. If not, why is it a feature at all?
    At the moment, codegen seem to depend on this being enabled, so either this
    should be fixed or it shouldn't be a feature.

  • If codegen depends on this, why are the --core tests passing?

  • What is the overhead this is adding? This is now in the critical path. If we
    could really disable the feature, resulting in trace macro calls compiling
    to nothing, we should be able to measure this accurately.

  • examples/todo should use rocket/tracing, not log.

  • The trace_env_logger example needs some motivation. Is it just there to
    check that log integration "works"? In any case, hi should take and
    return an &str, not String. What's more, every macro except trace is
    used via rocket::; should we just import them?

  • Some config values on launch are colored white and others aren't.

  • In some places, tracing fields come before the message and sometimes after.
    We should be consistent. trace!("msg", field = foo);

  • Sometimes span fields are used, sometimes they aren't. For example, when
    logging config errors, why aren't all of those {} fields?

  • Why add a name to some spans and not others? What is the utility? Can we
    just use module_name!() or something to that effect?

  • Lots of lines now exceed 100 characters, largely those that name the
    trace/span. We should avoid this.

  • The config changes are particularly difficult to read.

  • Running arbitrary, non-trace related code in_scope() feels quite wrong to
    me. There must be a better way.

  • It seems like we really want a second macro, to avoid all of the .in_scope(|| { }):

    // the `, $expr` should be optional.
    error_span!("Some message", {
        info!("inside the span's scope");
    });

contrib/lib/src/databases/connection.rs Outdated Show resolved Hide resolved
contrib/lib/src/helmet/helmet.rs Outdated Show resolved Hide resolved
@@ -40,7 +40,7 @@
//! This crate is expected to grow with time, bringing in outside crates to be
//! officially supported by Rocket.

#[allow(unused_imports)] #[macro_use] extern crate log;
#[allow(unused_imports)] #[macro_use] extern crate tracing;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using whatever rocket is using?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you suggest changing here? I assume #[macro_use] was used to avoid needing to add use statements to each file, and I'm not aware of another way to do this (aside from making all of the different logging macros top-level exports from rocket, and #[macro_use] extern crate rocket;)

contrib/lib/src/serve.rs Outdated Show resolved Hide resolved
contrib/lib/src/templates/context.rs Show resolved Hide resolved
warn!("secrets enabled without a stable `secret_key`");
info_!("disable `secrets` feature or configure a `secret_key`");
info_!("this becomes an {} in non-debug profiles", Paint::red("error"));
warn_span!("unconfigured_secret_key", "secrets enabled without a stable `secret_key`").in_scope(|| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with code in_scope.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I agree on this particular usage. The more distance, if statements, etc. between calling enter() and the end of the scope, the harder it is to tell which messages are still "indented"; that's why I have (so far) used in_scope here. There is also the alternative of info!(parent: span, ..), although it is more repetitive than the other options.

Comment on lines +112 to +114
.instrument(span)
.await

}.in_current_span());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both of these?

Copy link
Collaborator Author

@jebrosen jebrosen Apr 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instrument call is what set ups the Request: line in the log (and indents everything under it). (https://docs.rs/tracing/0.1.25/tracing/span/struct.Span.html#in-asynchronous-code explains why this is necessary)

The in_current_span is there for the same reason as the other call to it.

Comment on lines -354 to +357
pub(crate) async fn handle_error<'s, 'r: 's>(
pub(crate) fn handle_error<'s, 'r: 's>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? It seems unnecessary? The code also got harder to read.

In general, these tail instrument calls are hard to follow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could alternatively remain an async fn and use the #[instrument] attribute, like:

#[tracing::instrument(
    level = "warn",
    skip(self, status, req),
    fields(message = %format_args!("Responding with {} catcher.", Paint::red(&status)),
)]
pub(crate) async fn handle_error<'s, 'r: 's>(
    // ...

core/lib/src/server.rs Outdated Show resolved Hide resolved
}
});

// NOTE: `hyper` uses `tokio::spawn()` as the default executor.
#[derive(Clone)]
struct InCurrentSpanExecutor;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this here? Isn't this what .instrument() and .in_current_span() are for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is added so that hyper's calls to tokio::spawn will "inherit" the active spans (e.g. the current connection, request, etc.), instead of being logged at the root level.

@jebrosen
Copy link
Collaborator Author

I added most of those comments to the top-level comment as additional TODOs, and will keep working through them along with the other inline comments.

* The `trace_env_logger` example needs some motivation. Is it just there to
  check that `log` integration "works"?

I believe it was, yes. That, and/or as a demonstration to users how Rocket's and users's tracing messages look as viewed by a log-based logger. @hawkw ?

* In some places, tracing fields come before the message and sometimes after.
  We should be consistent. `trace!("msg", field = foo);`

The order is not flexible in that way: info!(field = value ..., "format", arguments) or info_span!("span_name", field = value ..., "format", arguments). I think this is a actually a consequence of:

* Why add a name to some spans and not others? What is the utility? Can we
  just use `module_name!()` or something to that effect?

Every span has a name. I mixed several of these up very early on, which now look like they have only a message, and never cleaned them all up.

* Running arbitrary, non-trace related code `in_scope()` feels quite wrong to
  me. There must be a better way.

Earlier versions of this PR did use the Span::enter API in more places, which returns a guard that exits the span when Dropped. However, the API depends on having some sort of task-local or thread-local state, which hasn't been nicely solved for async/await and is easy to misuse accidentally (see caveats in documentation). in_scope() is much more difficult to use in an incorrect way.

I do agree that some of the in_scope calls would be better as enter(), so that's also on the top-level todo list now.

@hawkw
Copy link
Contributor

hawkw commented Apr 5, 2021

Answering some of the questions from above:

@jebrosen

I believe it was, yes. That, and/or as a demonstration to users how Rocket's and users's tracing messages look as viewed by a log-based logger. @hawkw ?

Yeah, that's correct; the example was supposed to demonstrate how to use Rocket with the log ecosystem. I thought this was valuable to provide for users who might already have a preferred log logger that they want to continue to use. It's probably not hugely important to provide an example of this, but I thought it's better to have more examples when possible?

@SergioBenitez

  • Running arbitrary, non-trace related code in_scope() feels quite wrong to
    me. There must be a better way.

Running "arbitrary, non-trace related code" in in_scope is actually an intended use of the API; see the documentation for details. In general, correct use of tracing places any code that's part of the logical unit of work or context represented by a span inside that span --- this allows spans to be used not only for providing context to log messages, but also for timing code sections or for distributed tracing.

@jebrosen

  • In some places, tracing fields come before the message and sometimes after.
    We should be consistent. trace!("msg", field = foo);

The order is not flexible in that way: info!(field = value ..., "format", arguments) or info_span!("span_name", field = value ..., "format", arguments).

Yeah, this is correct. Format-string-based messages must always come last in the tracing macros, as the macros must parse them as any number of arbitrary token trees in order to pass them to format_args!. Spans always have names, which are string literals that come first in the macro invocation, and may optionally have format string messages at the end of the macro invocation.

@SergioBenitez

  • What is the overhead this is adding? This is now in the critical path. If we
    could really disable the feature, resulting in trace macro calls compiling
    to nothing, we should be able to measure this accurately.

See here for details on statically disabling specific tracing levels, or all traces, at compile-time. :)

Since the compile-time filtering is based on feature flags, which are always additive, a library dependency like Rocket should generally not enable them, since this prevents the user application from being able to configure them. However, we might want to document the existence of these feature flags in Rocket's docs so that users are aware of them, and they can be used for benchmarking.

@hawkw
Copy link
Contributor

hawkw commented Apr 5, 2021

  • The rocket library feature should be called tracing or similar. I see now that the log feature is for enabling compatibility with the log crate. It does not appear that we can disable tracing at all. What kind of compile-time overhead are we adding? What kind of runtime overhead are we adding?

  • Is the feature intended to be disabled? If so, it should be documented in
    lib.rs, be added as a tested feature in scripts/test.sh, and explicitly
    enabled in examples that make use of it. If not, why is it a feature at all?
    At the moment, codegen seem to depend on this being enabled, so either this
    should be fixed or it shouldn't be a feature.

It would also definitely be possible to make the tracing dependency itself optional. To do this, we would need to wrap the macros with Rocket's own macros that expand to nothing when the feature flag is disabled. I didn't do this since tracing's compile-time filtering is sufficient to avoid runtime overhead, and it was a lot of extra code. However, this would allow us to avoid downloading and compiling the tracing dependency when it's not in use.

jebrosen and others added 3 commits April 15, 2021 21:50
'tracing' is now used for all messages and for Rocket's default logger.

Co-authored-by: Eliza Weisman <eliza@buoyant.io>
Co-authored-by: Jeb Rosen <jeb@jebrosen.com>
@SergioBenitez SergioBenitez marked this pull request as draft June 30, 2021 21:16
@oren0e
Copy link

oren0e commented Jan 6, 2022

Hi, any news about when should we expect the move to tracing to be completed? I was just trying to do something related in #982, i.e., insert a request_id to monitor the full cycle of a request-response. Currently I have to setup a custom subscriber and formatting layer like:

    LogTracer::init().expect("Unable to setup log tracer!");

    let (non_blocking_writer, _guard) = tracing_appender::non_blocking(std::io::stdout());
    let formatting_layer = fmt::layer().json().with_writer(non_blocking_writer);
    let subscriber = Registry::default()
        .with(EnvFilter::new(env!("LOG_LEVEL")))
        .with(formatting_layer);
    tracing::subscriber::set_global_default(subscriber).expect("Failed to set global subscriber");

    rocket_builder().launch();

Which is not ideal because it interferes with rocket's logging configuration and then I end up getting ugly jsons with terminal color codes in raw form, which look like the following:

{"timestamp":"2022-01-06T10:23:04.323575Z","level":"INFO","fields":{"message":"\u001b[1;49;39mOutcome:\u001b[0m \u001b[49;32mSuccess\u001b[0m","log.target":"_","log.module_path":"rocket::rocket","log.file":"/Users/me/.cargo/registry/src/github.com-1ecc6299db9ec823/rocket-0.4.10/src/rocket.rs","log.line":303},"target":"_"}

And notice that I don't have a request_id here as well for some reason...

@SohumB
Copy link

SohumB commented Aug 12, 2022

Hi there,

I notice this PR has stalled. Is there anything a third party (like me) can do to help push it along? I'm eagerly looking forward to using tracing in Rocket and would absolutely be willing to put work into making it happen.

Thanks!

@SergioBenitez
Copy link
Member

I'm closing this as Jeb has indicated that he won't be able to push it forward. This is something I will personally prioritize for Rocket 0.6.

@SergioBenitez
Copy link
Member

By the way: if anyone is interested in making this happen, I would love to actively mentor that person.

@oren0e
Copy link

oren0e commented Mar 31, 2023

I might be interested, let me check availability over the weekend and I'll get back to you with an answer.

@SergioBenitez
Copy link
Member

@oren0e Sounds good! Hop on the Matrix channel if/when you're ready.

@oren0e
Copy link

oren0e commented Mar 31, 2023

ah what is the Matrix channel? You have a discord server?

@SergioBenitez
Copy link
Member

It's in the README and the website. Here's a direct link: https://chat.mozilla.org/#/room/#rocket:mozilla.org

@wrenix
Copy link

wrenix commented May 20, 2023

any news @oren0e or @SergioBenitez or newer state of work?

maybe i would like to take a look

@oren0e
Copy link

oren0e commented May 20, 2023

You're welcome to work on this, I will try to help with this as much as I can but I think I can't be the owner of this migration currently. I initially thought that I will have much more time to devote to this task where in reality I have something like only 10% of my initial estimation. I'm sorry that I inform of this only now.

@@ -62,8 +63,8 @@ async fn run_blocking<F, R>(job: F) -> R

macro_rules! dberr {
($msg:literal, $db_name:expr, $efmt:literal, $error:expr, $rocket:expr) => ({
rocket::error!(concat!("database ", $msg, " error for pool named `{}`"), $db_name);
error_!($efmt, $error);
error!(concat!("database ", $msg, " error for pool named `{}`"), $db_name);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please just one log per message / error + please keep logmessage clean from values (it is easier to indexing, search and anonymize in log systems, if all values are in defined files):

Suggested change
error!(concat!("database ", $msg, " error for pool named `{}`"), $db_name);
error!(
database.name = $msg,
database.pool = $db_name,
error = $error,
$efmt
);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants