Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code refactoring + cleanup + maybe some new features #84

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

piegamesde
Copy link

See #83.

At the moment, I'm trying to get rid of all large enums and replace them with traits or something else suitable.

I'm not sure where I'm actually heading with this, but I have a vague idea of exposing a lot of log_impl as a public API. The goal is to assimilate Output and Log, since at the moment they are 99% the same. I'd like to have outputs be individual structs instead of methods on the Output struct.

@piegamesde
Copy link
Author

Well, that turned weird quickly 😅. It usually never takes me long to wreck the commit history, and I also tend to run into the limits of Rust's type system fairly quickly …

Needless to say, I'm unsatisfied with my new approach. However, I see multiple alternatives going forward from here:

  • Remove the logging specific max_level (and thus, LogExt). It provides rather low benefit for the pain it causes. I can't really judge the performance gains it brings in the wild. Removing it would turn Dispatch into a glorified Log multiplexer, greatly reducing complexity.
  • Separate logger and builder. This means renaming LogExt to Output or something. But the key change would be to not let that trait extend Log, and instead have it provide a to_log method.

Copy link
Owner

@daboross daboross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far!

I know this is still a draft, but I've left some comments just since I had some time to look over it.

I'm definitely liking the changes here - opening everything up is great, and does seem to simplify things.

src/builders.rs Outdated
@@ -1184,103 +1029,17 @@ impl fmt::Debug for Dispatch {
"format",
&self.format.as_ref().map(|_| "<formatter closure>"),
)
.field("children", &self.children)
.field("children_count", &self.children.len())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Wondering if there's a simple way we could keep this functionality? It probably isn't too useful for production code, but I imagine printing the full logger could be useful when debugging complicated fern setups.

I don't think it should be too much of a burden to require Debug, but then again it would definitely mean requiring double-indirection if we ever want to accept Box<dyn Log>.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requiring Debug indeed isn't a hard constraint. The problem is that Rust is really difficult when working with trait objects, especially when multiple traits are involved. There simply is no such thing as dyn Log + Debug. I could build a custom trait, but that would lead me back to dyn LogExt which cannot be cast to dyn Log.

I'll see if I can come up with an external solution that isn't too expensive, but no guarantees

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I can propose you to try the following:

  • Attach a type information string to Output that gets filled in when constructed
  • When using general Log implementations instead of the provided methods, no information can be provided
  • The Dispatch struct contains compressed information of all its children
  • The log_impl::Dispatch does not. I do not want to keep this information after initialization
  • Especially, since a nested Dispatch gets converted into a log_impl::Dispatch, this means that you'll only get the full filter information for the outermost level, and only coarse structure information for inner dispatchers.

src/lib.rs Outdated
@@ -220,6 +238,11 @@ mod builders;
mod errors;
mod log_impl;

/// Logger implementations for different output targets
pub mod logger {
pub use crate::log_impl::{Panic, Reopen, Sender, Stderr, Stdout, Writer};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we expose these directly, I think we should at least make all the fields of these structs private - they're public to the crate right now, but exposing them to the world would lock us in on these particular structures, and that's a pain when updating code.

If we get rid of the configuration altogether, or rather, merge config and log_impl, the loggers name you have here would make a good new name for the module file as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree. I haven't decided on what API these will have, but probably simple a few constructors and From implementations. My goal is indeed to move them to an logger (or loggers module), but I'm using the export trick right now to keep the diff more manageable.

src/log_impl.rs Outdated
@@ -428,6 +446,12 @@ impl<'a> FormatCallback<'a> {
// No need to write this twice (used for Stdout and Stderr structs)
macro_rules! std_log_impl {
($ident:ident) => {
impl $ident {
pub fn new(stream: io::$ident) -> Self {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Stdout and Stderr constructors, if public, should take no arguments, as the streams they wrap are singletons.

src/builders.rs Outdated

/* Sadly we cannot store and compose std::fmt::Arguments due to lifetime issues.
* In order to avoid unnecessary write calls nevertheless, we must enumerate all options
*/
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried composing them using nested matches, with no temporary variables? I think something like this should work:

format_args!("{}{}", match time { ... => format_args!(...) }, match level { ... => format_args!(...) })

At least in the past, I've never been able to store the result of format_args! in a variable, but it seems to always work to let it live temporarily while being passed to another function. That's the origin of out.finish's design, too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, no. As soon as I wrap the inner format_args! in an expression, the error message appears. To be honest, this is really frustrating and I'll open an issue to change this in Rust because according to a quick search we're not the only ones running into this.

src/builders.rs Outdated
chrono: bool,
level: bool,
target: bool,
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all for this, and we can definitely keep it here for now, but I'd rather the final "refactor" PR not include any new features. Ideally I'd be able to merge one PR which refactors everything but keeps the exact same feature set, and then review and merge future PRs which depend on that refactoring separately.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Do you prefer to have the formatter feature PR first or the other way around? At the moment, they are completely independent from each other.

@daboross
Copy link
Owner

Needless to say, I'm unsatisfied with my new approach. However, I see multiple alternatives going forward from here:

* Remove the logging specific max_level (and thus, `LogExt`). It provides rather low benefit for the pain it causes. I can't really judge the performance gains it brings in the wild. Removing it would turn `Dispatch` into a glorified `Log` multiplexer, greatly reducing complexity.

* Separate logger and builder. This means renaming `LogExt` to `Output` or something. But the key change would be to not let that trait extend `Log`, and instead have it provide a `to_log` method.

Right - I'm down for option one. I believe I originally added max_level as part of a bigger plan to like, "optimize" the log chain during construction - thinking maybe fern could see a complex configuration, and through logical analysis, make it as simple as possible. That never really manifested, and while I still think it would be cool, it is not the direction the crate's been going in since then.

@daboross
Copy link
Owner

I'm not sure where I'm actually heading with this, but I have a vague idea of exposing a lot of log_impl as a public API. The goal is to assimilate Output and Log, since at the moment they are 99% the same. I'd like to have outputs be individual structs instead of methods on the Output struct.

I definitely like the internal refactoring here, but I'm wondering about the external API. I originally designed Output to put almost everything you can do with this crate in one place - no need to search through docs of 10 different structs, each with multiple constructors, to figure out what's available. There's a little bit of this with log_file and log_reopen, but most of everything is still on Output.

That does make it a bit messy, but I'd like to think the experience of searching through docs on Output, especially with all the examples, is an overall nice experience.

I don't think there's anything else which actually keeps all the functions still up and available in the same way, besides just keeping a single struct - and that doesn't really match the new paradigm. One possibility I was thinking of for a refactor like this which would retain some of the flatness would be having all constructor functions exported as public functions of the fern crate - so the docs wouldn't be open, but you'd be able to see all available ways to make loggers under "Functions" on the fern root docs. What would you think of that?

The one thing I'm mostly against here is having the API be 5+ different structs with 2-3 constructors each. It's nice from an internal organization perspective, but in my opinion would not be that great to look through. It requires users to care about what internal fern struct they're making, and I don't want to require that.

Ideally, I want someone looking at fern to be able to choose from a flat list of things to log to. They should not care whether "a logging file with a static name", "a logging file whose name changes based on time" and "a logging file whose name rotates based on size" are the same struct or different structs - we represent them differently right now, but that's an implementation detail. I'm OK exposing that, but understanding that choice shouldn't be part of the main API (and if we didn't expose it at all, that would almost be better?)


That's a lot of rambling from me about API design. Thoughts on this?

I'm open to almost anything here, but I do hold some opinions on what's user-friendly, enumerated above. I'd like contradicting opinions, but I also wanted to write all that to give background on why I've designed it the way I have so far.

src/builders.rs Outdated
#[cfg(feature = "chrono")]
{
self.chrono
.then(|| chrono::Local::now().format("%Y-%m-%d %H:%M:%S,%3f"))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this could easily become a standard-ish formatter, what would you think of matching the env_logger format, or at least, matching it as closely as we can given the enable features?

For apps which don't need custom configuration, env_logger seems to be fairly standard, and so at least from my perspective, that format (something like [2017-11-09T02:12:24Z ERROR main]) is "normal" for rust apps. It didn't used to be my personal favorite, and env_logger was not as popular when fern started, so this was not the format I included in the original docs. However, given that it's already fairly normalized now, I think it would make for a pretty safe choice?

@daboross
Copy link
Owner

Hope too many comments aren't bad! Well, if they are, that's what I've written anyways :p

@piegamesde
Copy link
Author

Thanks for having a look at this!

  • Remove the logging specific max_level (and thus, LogExt). It provides rather low benefit for the pain it causes. I can't really judge the performance gains it brings in the wild. Removing it would turn Dispatch into a glorified Log multiplexer, greatly reducing complexity.

Right - I'm down for option one. I believe I originally added max_level as part of a bigger plan to like, "optimize" the log chain during construction - thinking maybe fern could see a complex configuration, and through logical analysis, make it as simple as possible. That never really manifested, and while I still think it would be cool, it is not the direction the crate's been going in since then.

I tried removing max_level, but it turned out to be necessary in order to know which global log level to set (the alternative would be to always Trace). I think solved this by still keeping OutputInner for the Dispatch variant. That way, our logging tree has Dispatches as inner nodes and Logers as leaves, which I find rather beautiful.

I definitely like the internal refactoring here, but I'm wondering about the external API. I originally designed Output to put almost everything you can do with this crate in one place - no need to search through docs of 10 different structs, each with multiple constructors, to figure out what's available.

Personally, I kind of like the style of having a list of structs for each thing I want to do and then a list of constructors for each configuration how I want to do it. But I can see why you're against this.

I suggest adding one convenience method for each to Output, and additionally keep all structs in a documented separate module. Thus there is an easy way to get started while keeping the possibility to do deep custom configuration. I'm not sure if this will actually turn out to work well, but I'll try it out and we'll see.

@piegamesde
Copy link
Author

I've thought some more about the public output API. Maybe making all those structs public isn't that a great idea. However, I can now point down my problem with the current way of doing things: it's inconsistent.

See, it's really cool to be able to call .chain(io::stdout()) or .chain(some_file), but then there is also Output::file() but no Output::stdout() and no Output::panic(). So I have to search multiple places depending on the output that I want, which is exactly what should be avoided.

@piegamesde
Copy link
Author

I know this isn't part of this PR (I'll factor it out later, but at the moment it makes dogfooding easier), but why do we have a copy of the colored crate in colors? It feels like kind of the same thing in some ways, but more inferior and harder to use in others.

@daboross
Copy link
Owner

Just wanted to acknowledge that I've seen this, and haven't had time to review it in depth yet! Looking forward to it, just been busy.

A few comments just from reading this thread, without rereading the code:

I tried removing max_level, but it turned out to be necessary in order to know which global log level to set (the alternative would be to always Trace). I think solved this by still keeping OutputInner for the Dispatch variant. That way, our logging tree has Dispatches as inner nodes and Logers as leaves, which I find rather beautiful.

Right, I'd forgotten about that. If we can keep the transformation on Dispatch and nothing else, and keep the API consistent, that sounds good.

My one worry is that this could turn into a situation where chaining another dispatch is not as intuitive as chaining a "regular" logger, and that breaks some of the niceness of things.

There's also a potential of adding more non-Dispatch loggers in the future which chain things. I think the main one I've considered is a "dispatch-to-thread" logger which takes in one child logger, and handles records by formatting them to a String, sending that + metadata over a channel to a dedicated thread, and then giving that string + metadata to its child logger. That may or may not end up being a design we actually want, but things like that make me think it might be nice to leave room for non-Dispatch things as non-leaf nodes?

Personally, I kind of like the style of having a list of structs for each thing I want to do and then a list of constructors for each configuration how I want to do it. But I can see why you're against this.

I suggest adding one convenience method for each to Output, and additionally keep all structs in a documented separate module. Thus there is an easy way to get started while keeping the possibility to do deep custom configuration. I'm not sure if this will actually turn out to work well, but I'll try it out and we'll see.

I've thought some more about the public output API. Maybe making all those structs public isn't that a great idea. However, I can now point down my problem with the current way of doing things: it's inconsistent.

See, it's really cool to be able to call .chain(io::stdout()) or .chain(some_file), but then there is also Output::file() but no Output::stdout() and no Output::panic(). So I have to search multiple places depending on the output that I want, which is exactly what should be avoided.

Thanks for pointing that out - that's something I think I completely missed in designing the current interface.

I'm definitely interested in fixing this inconsistency. However, I'm not yet convinced that individual constructor methods are the way to do it - I think we could still have a solution which adds all possible outputs as methods in one place, such as on Output.

I've got two concerns with a solution which has individual constructors and Output methods:

First, if we only add a single convenience method to Output for each struct, but add more options in the struct constructors, I think this brings us back to an inconsistent interface wherein one has to look at each individual struct to figure out what all is actually possible.

Second, if we mirror everything, or even if we only mirror some of the individual methods in Output, where do the docs end up? I'd like to believe the detailed examples and descriptions of each logger are one of fern's strengths, and I'd ideally like to maintain each of those in one place, not split between two methods one wrapping the other.


I'm coming around somewhat to the idea of having just individual struct constructors. My main worry is still just exposing too many implementation details by how we split up the structures, and locking ourselves into a specific split even if another one comes up which makes more sense later.

I'd be down for a different compromise though - what if we had everything as individual constructors, but they all returned impl Log, rather than each individual struct?

That could let us change internal details in the future, if we want to, while keep existing constructors around. As an example, say we wanted to merge Stdout and Stderr into a Stream logger. We could then replace Stdout and Stderr with deprecated blank structs, and since Stdout::new and Stderr::new both return impl Log, they could both be changed to return a Stream.

Thanks so much for continuing to push this forward, and I hope my delayed responses are still useful! Like I said I haven't had a lot of time, but I'm happy to continue forward with this, and I'll see if I can carve out some time to look at code again soon.

@piegamesde
Copy link
Author

piegamesde commented Jul 24, 2021

My main worry is still just exposing too many implementation details by how we split up the structures, and locking ourselves into a specific split even if another one comes up which makes more sense later. […] an example, say we wanted to merge Stdout and Stderr into a Stream logger.

I think that having all structs opaque should give us enough room to work with: in your example, we could simply change the inner of Stdout to delegate to the new Stream (and also deprecate it).

Btw I've just tried to unify the structs using generics, because they seemed so similar. Sadly, the stdout/stderr ones stick out because they don't use a Mutex to guard their value. I tried factoring out that part into a closure, but it failed because it would require to be generic over lifetimes, which Rust cannot (easily) do. It should be possible in theory, if we stopped using lock() on Stdout and Stderr. I don't know what the implications of this are or if they are optional anyways. Using more trait objects instead of generics would help too.


I've been thinking a lot about possible API designs and how to keep them backwards compatible. One thing that always sticks out as odd is the line_separator option. One single option doesn't really benefit from a full-blown builder pattern like DateBased does. I don't really know where to put it. If I was designing the API from scratch with this requirement, I'd probably go for an Option<String> (inspired by Gtk-rs), but it would require a method signature change (also it has some other downsides). On the other hand, without that option things would be a lot easier.


Some other random findings I should write down before I forget them again:

  • The current usage of Into<Output> effectively results in Rust-y method overloading, which is interesting. I still need to decide on whether I like this as pattern or not.
  • One important limitation though is that it cannot be used together with blanket implementations, so one could not use this to make chain accept any Log implementation.
  • I'd like to add one design constraint to the new outputs: they should be usable as independent loggers. That way, fern would double as an inter-operable collection of generic log helpers. This could be achieved by impl Log for Output among other things, but exposing the outputs as individual structs is an equal way to achieve that goal.

My one worry is that this could turn into a situation where chaining another dispatch is not as intuitive as chaining a "regular" logger, and that breaks some of the niceness of things.

Suppose I made .chain for Dispatch-like loggers only, and added another method, call it fanout, add_output or chain_logger or whatever to add "normal" log implementations. Would you consider this "nice enough"? (Don't worry about backwards compatibility on this one for now)

@piegamesde
Copy link
Author

Picking this up again. I noticed I had kind of painted myself into a corner and things ground to a halt. I thus lifted my requirement of keeping backwards compatibility and then progress came back. This is obviously a draft, but please generate the docs and have a look.

  • First of all, I moved a lot of things in the modules around. I'm still not sure whether the logger module should be called output instead.
  • The new public API focuses mainly on methods and on the builder pattern:
    • Where possible, the only constructor is a method that returns impl Log + Into<Output>.
    • There is the common case where may set an alternative line separator. I implemented that feature using a trait, keeping the types as opaque as possible.
    • The Panic and Null loggers are zero-sized structs. I could equally well have made them methods instead. You decide.
    • The Write internally unifies a lot of implementations. However at the moment this means that stdout and stderr are doubly wrapped in a Mutex. I still need some idea to make the Mutex optional without reintroducing all that code duplication.
    • The DateBased is a more complex builder pattern, as before. The build method is implicit in the Into<Output> implementation (as before).
      • I do like the concept, but I'll probably add a build or into_log method that allows getting a Log type instead.
  • Btw I removed the line separator from the Sender logger. I think that should be part of the receiver.
  • The syslog module now contains method that explicitly creates the wrapper types (again, opaque)
  • The Output design has changed a little:
    • The methods on the implementation got moved to the logger and syslog modules.
    • The From implementations for non-loggers (i.e. io::StdOut or File) got removed. You now need to explicitly convert them to a Log using the methods provided. This is an attempt to increase consistency of the API at the cost of conciseness.
    • Output implements From<T> for as many logger types as possible, so that it can be used with chain. I added a convenience chain_logger but the ultimate goal is to remove that once we have trait specialization.

@daboross
Copy link
Owner

Awesome, thanks for being on top of this! Wanted to say I saw this, and I'm looking forward to reading your comments and code, and I'll reply in depth once I have. It may be more than a few days due to my current schedule and mental health, but I'll get to this ASAP.

@piegamesde
Copy link
Author

Sure, no need to rush. A quick note on the completeness state of the changes:

  • Documentation is a complete TODO
  • Tests run again, examples compile (except for doc examples / doc tests)
  • I have one API change planned based on experience from dog-fooding: fern::log_file will be merged into fern::logging::file. Same for log_reopen.
  • Once everybody's happy, we can talk about code formatting and making the history pretty. Also the formatter will probably not be part of the final changes (I'd like to iterate on it some more)

@daboross
Copy link
Owner

Hi! Sorry for the 6 month delay here.

I'm hoping to get a full review soon. If you want to, we can then go back to back-and-forth & get it merged, or otherwise I'll probably make additional changes myself & merge it. I want to get these changes in, but especially given how unresponsive I've been here, I don't want to require any more of your time than necessary.

@piegamesde
Copy link
Author

Thanks. Do it however suits you best. I am still interested in this and ready to invest some more time. I'm using my fork here and there as dependency for a while now, and I'm really happy with the changes, although it still is not finished. The logging and output handling is okay IMO, but the formatters and coloring will need some more work. I think the best way forward would be to take these into a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants