Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separating conserve library and utility functionality #173

Closed
wants to merge 44 commits into from

Conversation

WolverinDEV
Copy link
Contributor

@WolverinDEV WolverinDEV commented Aug 3, 2022

Aim

The aim is to separate library and the actual conserve command line utility.
Currently, progress output was vastly achieved with the conserve library therefore prevent other uses except for the conserve command line utility. The goal is provide a simple but yet powerful developer interface to interfere and monitor conserve process state.

Due to the initial aim, to provide a better way of logging this PR archives two things:

  1. Providing a more powerful developer interface (the main goal, described above)
  2. Implementing a logging system conserve utility (the terminal cli)

Attention:
This PR submits API breaking changes!

Providing a more powerful developer interface

Tapping into library functions as well as observe the current function progress has been achieved by so called Monitors.
These monitors can be passed as the last function argument. Depending on the function the monitor receives general progress notifications or other kind of events.

Implementing a logging system conserve utility

The logging system has been based on the tracing logging system and includes support for a logfile (fixing #104).
The actual output should not differ from the previous conserve utility output.

@WolverinDEV WolverinDEV changed the title [WIP] Using tracing for logging and implement file logging [WIP] Separating library functions from log/display utilities Aug 3, 2022
@WolverinDEV
Copy link
Contributor Author

@sourcefrog any thoughts on this one?
Turned out logging got into a more complex thing to do.

@sourcefrog
Copy link
Owner

Hi, yeah, I had not realized you were going to start right away. I can give you some guidance on the weekend.

@WolverinDEV
Copy link
Contributor Author

Ahh yeah all right.
I got some spare time yesterday so I thought I would give it a shot.

After some fiddling I came up with the concept of extracting the whole logging thing from the library and move it into the binary/into the responsibility of the library users. The prevents conflicts with the use of any other logging systems along with other libraries.

My approach would be passing some kind of statistic callbacks for progress logging.
May I push some first drafts this evening (UTC+2).

Copy link
Owner

@sourcefrog sourcefrog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without going line by line here are a few thoughts:

  • Please don't delete -D, I use it.
  • This would be easier to review as a few PRs, but I understand how you might need to write the whole thing first to understand the code or what you want to change. e.g. introducing better structured error codes could be its own PR.
  • Converting messages to info! is probably OK in general, but as I mentioned on the bug I'm not sure it's a good idea for e.g. file listings that might be consumed by other programs, or show_versions. Or for example json will not be useful if it's mixed with log prefixes.
  • The src/bin entry point should remain small and should configure things in the library. I don't have a goal to migrate lots of code into the binary, rather the reverse. But, making the library configurable by different binaries that e.g. don't want terminal output, would be good.

.gitignore Outdated Show resolved Hide resolved
@WolverinDEV
Copy link
Contributor Author

WolverinDEV commented Aug 4, 2022

This would be easier to review as a few PRs, but I understand how you might need to write the whole thing first to understand the code or what you want to change. e.g. introducing better structured error codes could be its own PR.

Yes, perspective changed a bit.
Initially I thought, piping output trough a logging system including several targets (terminal and file in particular) would do the job. But this seemed to be a bit inflexible especially if you want to use conserve as library and not as stand alone tool.

I'm not sure it's a good idea for e.g. file listings that might be consumed by other programs, or show_versions

I skipped answering that point since I agree on that one as well, this is still one of my concerns.
I'm playing with some ideas for that:

  1. General log only into a file (which kinda defeats the whole purpose of this)
  2. Add a flag to omit all message prefixes (I'm not really a fan of that)
  3. Using a different output stream for the log and process output (e.g. log into stderr and output into stdout)

Right not I'm not feeling well about any of these solutions.
As the PR title mentions it's WIP and seems to bring more issues than initially thought.

The src/bin entry point should remain small

Okey, I'll keep that in mind.

Edit:
Basically this PR includes multiple steps:

  1. Split-up library usage, process logging, and output logging
  2. Determine feasible output formats for the end user
  3. Implement proper logging targets

@sourcefrog
Copy link
Owner

I'm happy to talk about it more when you want! I would suggest perhaps starting with options to write trace to a log file that is not stdout, without removing -D. But, it can of course towards whatever you're energized by.

@WolverinDEV
Copy link
Contributor Author

WolverinDEV commented Aug 5, 2022

I'd thought about the system in the recent time and think, I came up with a pretty need system.
Currently I've mainly focused on the backup function (my test dummy :P). All other functions will be refactored to follow that pattern. I've also resolved the issue with logging to stdout and using a progress bar at the same time (I'm not a that big fan of the current solution, but there might be some clean-ups).

Using ProgressMonitors like I've shown in backup would be my way to go.
This easily allowed me to factor out the UI related code (ref here). There is still a lot work needed, but the basic concept is pretty much done. Lot's of polishing and cleanup is needed tought!

PS: Don't worry, you'll get your -D switch back.

@WolverinDEV WolverinDEV marked this pull request as draft August 6, 2022 09:38
@WolverinDEV WolverinDEV changed the title [WIP] Separating library functions from log/display utilities Separating library functions from log/display utilities Aug 6, 2022
@sourcefrog
Copy link
Owner

Thinking about this a bit more overnight: perhaps the monitor interface should look more like something that just accepts Event and Problem enums.

Then, the implementation used in testing can simply record the events/problems into a vec to inspect later.

A different implementation can report significant events to log events. Another monitor (or perhaps the same one with a flag set) can update nutmeg models.

pub enum Event {
    StartListBlocks,
    ListBlocks { block_count: usize },
    FinishListBlocks { block_count: usize },
    StartValidateBand { band_id: BandId },
}
pub enum Problem {
    ...
}
trait Monitor {
    fn report_event(&self, event: &Event);
}

This would still have, to some extent, parallel state machines in the application code and the monitor. But, the states are not really that complex: validate just proceeds through a few phases. So that would be fine.

@WolverinDEV
Copy link
Contributor Author

perhaps the monitor interface should look more like something that just accepts Event and Problem enums.

I had the same thought at the beginning, but didn't go for it.
The reason why I didn't is that I actually tried it (first draft I had in mind):

When using an enum to propagate all status events we could just pass a callback functions which takes one argument (the
progress enum). I've tried that pattern and imo DX wasn't really good. It just felt off, made code harder to understand and didn't look appealing to me. Maybe try it yourself and see how it goes, might just be me :)
Side note: A problem is only an event which would mean intuitively you rather have a ProbleEvent than a separate callback for that.

In terms of testing, your right.
Saving events in a Vec<_> would be way more easy.

@WolverinDEV
Copy link
Contributor Author

The more I think about it, the satisfied am I with the amount of callback methods currently supplied.
I'll think any experiment a bit more as well.

@WolverinDEV
Copy link
Contributor Author

Hey,
I've been pretty busy with my last exam for this semester but I had a couple of thoughts about the enum thing.

I intuitively used separate functions since I mainly use that kind of pattern when the state of a struct needs to be observed.
An example would be a server with rooms where each room has a room monitor which gets called on user join, leave, etc...
For such concept having extra methods works well because most likely the callback code will be a bit larger.

Adapting this approach for events, especially if these happen in a predefined order works but might not be the best.
Instead propagating the progress should just be seen as one "event" containing the current state.
These progress updates should not be used for error reporting.

Therefore I'll adjust the monitor style accordingly.

@WolverinDEV
Copy link
Contributor Author

@sourcefrog any notes on the changes made?

@sourcefrog
Copy link
Owner

Hi, I appreciate all the work you put into it.

At the moment this gets only a small amount of time from me because it's a spare time project and I'm trying to finish cargo-mutants on my weekends.

I'm not sure about merging it just as is. It is very big. It does at least indicate some interesting directions. I think the proliferation of monitor methods, the handling of bulk output, and the particular way that trace interacts with the nutmeg view is not quite what I'm comfortable with at the moment.

@sourcefrog
Copy link
Owner

Hi, so I finally came back to this over the break to try to work out how to merge or otherwise resolve it. I think overall this is a really good step. The monitor seems fine as it is.

The main thing I'd want to change is to have a concept of output to stdout that is not a log message: listings whether json or text of files, versions, etc. I'll try to build on this to put that back in...

@WolverinDEV WolverinDEV changed the title Separating library functions from log/display utilities Separating conserve library and utility functionality Jan 4, 2023
@WolverinDEV
Copy link
Contributor Author

The main thing I'd want to change is to have a concept of output to stdout that is not a log message

If I' not mistaken you're referring to "show_index_json" within the "show" command?
I agree, that piping the json output trought the logger is quite a crual task to to,
but I haven't really had any better idea. Maybe use for the conserve binary some kind of logging abstraction?

PS: I've updated the PRs description and tile to a more (imo) accurate one as it has quite changed over time.

@sourcefrog
Copy link
Owner

What might work for this kind of case:

  1. A library API like "write a list of versions as json to this Write."
  2. The CLI can call that, asking it to just write to stdout.

Both of these could then be extended:

  • There are a few cases of "produce a list of things, render them as text or json, or maybe verbose text": versions, file listings. (Maybe just two for now?) Possibly those should change to iterators of objects that impl Serialize and Display.
  • Maybe the CLI code that coordinates logging and so on should have a function that when called turns off trace to stdout and progress, to make sure that nothing interferes with this bulk output.
  • Perhaps really trace and progress should go to stderr rather than stdout, so that you could potentially redirect output but still see error messages. (This may be annoying if it breaks a lot of existing tests making assertions about stdout.)

@sourcefrog
Copy link
Owner

It occurs to me that it might be possible to separate out just the addition of the monitor concept from everything else. That would, at least, get the PR size down a bit, be a step forward itself, and also probably be possible to land without regressing any behavior by introducing log prefixes on bulk output and without needing to update so many tests...

@WolverinDEV
Copy link
Contributor Author

Well, along with the monitor come the logging/ui changes. A core part of the new monitor system is to separate the whole display part. Therefore we could not just leave the old UI stuff within the conserve library. A possible solution tough would be to replace the logging done via tracing within the binary by just println!. I suggest keeping tracing as the logging backend for the library. We could just change the tracing subscriber and print them as standard messages (without the fancy colouring) as well.

@sourcefrog
Copy link
Owner

08d26ce has a work-in-progress experiment with using the monitor pattern only for validate, including collecting a structured list of problems, which will make testing validation/corruption better.

A possible solution tough would be to replace the logging done via tracing within the binary by just println!. I suggest keeping tracing as the logging backend for the library. We could just change the tracing subscriber and print them as standard messages (without the fancy colouring) as well.

I'm not sure I'm following here. I agree with using tracing for reporting errors, warning, debug info, status messages and so on. I just think that file listings, version listings, and so on are a different kind of thing and probably should not go through tracing.

@sourcefrog
Copy link
Owner

I decided to instead merge #197 which builds on these ideas. Thanks for the thoughts and provocation!

@sourcefrog sourcefrog closed this Feb 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants