
Refactor logging into a separate thread (fix + enhancements) #642

Merged
merged 21 commits into ouch-org:main on Apr 1, 2024

Conversation

AntoniosBarotsis
Contributor

@AntoniosBarotsis AntoniosBarotsis commented Mar 11, 2024

See #632 for discussion. This PR does not implement the progress bar; it just tackles the (performance) issues related to logging that were discussed there.

Going to set this as a draft until the following checklist is complete, just in case anyone wants to take a look while it's still being worked on.

Checklist

  • Update all info logs
  • Update all warning logs (done with one exception, see the 4th point in Things to discuss below)
  • Consider changing (simplifying) the boilerplate needed to log a message
  • Fix log output (level is currently missing + colours)

Things to check before merging

  • The log parameters (level/text/accessibility etc) have been maintained correctly
  • The logs look exactly the same as they did before (colors and formatting)

Note that I am currently getting 2 failing tests locally. These are failing because the log output is not updated yet (point 4 of the checklist): the messages themselves are correct, but the [INFO] level prefix is missing.

Benchmarks

So far I've only tested decompression. I decided to compress the ouch repository for this (the folder is around 2 GB). I copied the contents of the ouch directory into an ouch2 folder and zipped both it and the original to test1.zip and test2.zip. The reason for the copy is that when the two archives get extracted in parallel, the filesystem deleting one copy and overwriting it with the other is not part of the benchmark (test1.zip -> ouch, test2.zip -> ouch2). Be sure to build ouch in release beforehand. I then ran the following: ouch d test1.zip test2.zip -y or cargo r -- d test1.zip test2.zip -y, first for both files and then for each file individually.

Here are the results:

| Version    | To extract    | Time                      |
| ---------- | ------------- | ------------------------- |
| ouch 0.5.1 | both          | 49sec 942ms               |
| ouch 0.5.1 | test1 / test2 | 31sec 466ms / 31sec 211ms |
| fork       | both          | 12sec 957ms               |
| fork       | test1 / test2 | 9sec 565ms / 9sec 352ms   |

Of course, it is worth pointing out that the performance gain is this big because there are a lot of files in the archive and thus a lot of logs; the gain would be negligible or non-existent in cases where, for instance, one large file is being decompressed, since there are not enough logs for this to speed anything up.
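For context, the core change boils down to roughly the pattern below. This is a simplified, illustrative sketch (the names and the batching details are made up, not the actual ouch code): worker threads only send over a channel, and a dedicated logging thread does the batched stdout writes.

```rust
use std::{io::Write, sync::mpsc, thread};

// Simplified sketch of the logging-thread idea: decompression threads call
// `sender.send(...)` (cheap) instead of locking stdout themselves.
fn spawn_logger() -> (mpsc::Sender<String>, thread::JoinHandle<()>) {
    let (sender, receiver) = mpsc::channel::<String>();
    let handle = thread::spawn(move || {
        let stdout = std::io::stdout();
        let mut out = stdout.lock();
        let mut batch: Vec<String> = Vec::with_capacity(10);
        // The iterator ends when every sender has been dropped.
        for message in receiver {
            batch.push(message);
            if batch.len() == 10 {
                // One locked write per 10 messages instead of one per log call.
                let _ = writeln!(out, "{}", batch.join("\n"));
                batch.clear();
            }
        }
        // Flush whatever is left once the channel closes.
        if !batch.is_empty() {
            let _ = writeln!(out, "{}", batch.join("\n"));
        }
    });
    (sender, handle)
}

fn main() {
    let (log_sender, logger) = spawn_logger();
    log_sender.send("[INFO] extracted some/file.txt".to_string()).unwrap();
    drop(log_sender); // close the channel so the logging thread can exit
    logger.join().unwrap();
}
```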

Things to discuss

  • warning logs currently ask for accessible even though it does not matter (fixed)
  • log_sender being needed in so many methods can get annoying (especially in testing), is it worth making it optional in some places? (Explained in my next comment.)
  • can macros be used to make any of this easier for the person trying to log things? (I think it is simple enough as is now.)
  • Currently only one warning macro is not updated. This is because a log_sender would be needed inside this FromIterator implementation. That obviously won't work, however we could consider making a normal function for this instead of using the trait implementation (see the sketch after this list); it seems that this is only really used here. Another idea would be to keep warnings outside of this logging thread system, as there should never be that many warnings, so performance would not really be a concern for them.
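Rough sketch of the "normal function instead of the trait impl" option (the names LoggerMessage and collect_paths are hypothetical, not the existing code):

```rust
use std::{path::PathBuf, sync::mpsc::Sender};

// Hypothetical message type for the logging thread.
enum LoggerMessage {
    Warning(String),
}

// Unlike a `FromIterator` implementation, a plain function can take the
// sender explicitly and forward warnings for the entries it skips.
fn collect_paths(
    entries: impl IntoIterator<Item = std::io::Result<PathBuf>>,
    log_sender: &Sender<LoggerMessage>,
) -> Vec<PathBuf> {
    entries
        .into_iter()
        .filter_map(|entry| match entry {
            Ok(path) => Some(path),
            Err(err) => {
                // The warning goes through the logging thread like any other log.
                let _ = log_sender.send(LoggerMessage::Warning(format!("skipping entry: {err}")));
                None
            }
        })
        .collect()
}
```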

@AntoniosBarotsis
Contributor Author

Update: the checklist is done! All info logs have been replaced, as well as all warning logs, with the one exception mentioned above:

Currently only one warning macro is not updated. This is because a log_sender would be needed inside this FromIterator implementation. That obviously won't work, however we could consider making a normal function for this instead of using the trait implementation; it seems that this is only really used here. Another idea would be to keep warnings outside of this logging thread system, as there should never be that many warnings, so performance would not really be a concern for them.

Not sure what's best for this, but I'm leaning more towards just moving away from the FromIterator implementation and using a normal function that can take a logger as an argument. This is the last thing I can think of changing before a proper code review on my end.

I also mentioned

log_sender being needed in so many methods can get annoying (especially in testing), is it worth making it optional in some places?

After revisiting this, I don't really think there's a good reason for it. My initial idea was just to make the tests a bit easier, but there is only one test that would benefit from this and the "boilerplate" is just 2 lines anyway.

Lastly, I don't seem to be getting any colors in my terminal for normal/accessible mode in either my fork or v0.5.1, am I supposed to? 😅

@marcospb19 changed the title from "Logging optimization + concurrent logging fix" to "Refactor logging into a separate thread (fix + enhancements)" on Mar 13, 2024
@marcospb19
Member

Lastly, I don't seem to be getting any colors in my terminal for normal/accessible mode in either my fork or v0.5.1, am I supposed to? 😅

Not on Windows 😛, does your terminal support it? You could try changing this check and see what happens.

ouch/src/utils/colors.rs

Lines 18 to 19 in e992da2

#[cfg(not(target_family = "unix"))]
pub static $name: &&str = &"";

Basically, I don't know which Windows terminals parse ANSI codes properly. crossterm is (as the name points out) cross-platform, so maybe we can use it to have colors on Windows too.
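For what it's worth, a tiny sketch of what using crossterm for this could look like via its Stylize trait (purely illustrative, and it assumes a crossterm dependency; not something this PR changes):

```rust
use crossterm::style::Stylize;

fn main() {
    // crossterm emits the escape sequences for us and the crate is
    // cross-platform, which is the appeal over the cfg(unix) gate above.
    println!("{}", "[WARNING] this would be yellow".yellow());
    println!("{}", "[INFO] and this would be green".green());
}
```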

Regarding the PR, let me try something then I'll get back to you.

@vrmiguel
Member

Awesome work @AntoniosBarotsis !

@AntoniosBarotsis
Contributor Author

I'll likely be busy until Sunday, so I'll check the colour fix then. I get colors from stuff like cargo commands, so they are supported; not sure if they work differently though.

@marcospb19
Member

Hey @AntoniosBarotsis is it OK for you if I make some edits on top of your branch changing how we set up the logging?

@AntoniosBarotsis
Contributor Author

Sure thing, go ahead! Feel free to also discuss any potential issues you see with the implementation as it currently is.

Member

@marcospb19 marcospb19 left a comment


I reviewed your code commit by commit, approved. ✔️

@AntoniosBarotsis
Contributor Author

Just let me know what you think we should do about that remaining warning log. Ideally I want to remove both those macros since they aren't really used anymore.

I'll also take a look at everything myself again on Sunday. In the meantime, go ham with any nitpicks you have and I'll take a look if there are any. Was fun working on this! :)

@marcospb19
Member

Was fun working on this! :)

Relatable, it was fun, so I spent more time on this than I expected. I found some problems with warning output and STDIN synchronization, but I think I've got them fixed now.

@AntoniosBarotsis can you review my commits? (I recommend commit-by-commit instead of the whole diff) And maybe check benchmarks with the latest commit.

@AntoniosBarotsis
Contributor Author

I'll take a detailed look tomorrow and check the benchmarks again. Could you briefly explain what the stdin issue was?

@marcospb19
Member

I'll take a detailed look tomorrow and check the benchmarks again. Could you briefly explain what the stdin issue was?

Yeah, running the following command:

ouch decompress archive.zip.gz

was not properly showing the warning.

I also suspect that multiple STDIN questions could land simultaneously from different threads if decompressing in parallel.

@AntoniosBarotsis
Contributor Author

Ah, I see: the issue was that I just forgot to log the warning when one was sent. It works fine now indeed, nice catch.

I also suspect that multiple STDIN questions could land simultaneously from different threads if decompressing in parallel.

So the way it worked before (with my changes) was that this piece of code would lock both stdout and stdin. The thinking was that other logs would be delayed but not block any work, since they would still be collected by the logging thread (which, now that I think of it, would cause some logs to disappear, since I join 10 messages into a string and then clear the buffer; if stdout was already locked, some logs would arrive between receiving those 10 and actually logging them). And since you can only ask for one input at a time, stdin would also be locked, blocking other threads that wanted to ask for user input, since they can't do any work anyway.

I don't think anything is wrong with too many threads trying to lock stdin at the same time, but we just need to make sure no logs are lost while everything is locked.
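To make that concrete, the locking idea is roughly this (a simplified sketch with made-up names and prompt text, not the code in the PR):

```rust
use std::io::{self, BufRead, Write};

// Hold both locks while asking, so no other thread can interleave output on
// stdout or consume the answer meant for this question.
fn ask_yes_no(question: &str) -> io::Result<bool> {
    let stdout = io::stdout();
    let stdin = io::stdin();
    let mut out = stdout.lock();
    let mut input = stdin.lock();

    write!(out, "{question} [Y/n] ")?;
    // Flush so the question is visible before we block waiting for the answer.
    out.flush()?;

    let mut answer = String::new();
    input.read_line(&mut answer)?;
    Ok(matches!(answer.trim(), "" | "y" | "Y" | "yes"))
}
```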

Everything else looks fine to me but would be great to get another pair of eyes to review these.

src/utils/logger.rs (outdated review thread, resolved)
@AntoniosBarotsis
Contributor Author

Ah yes, one thing I wanted to comment on: maybe instead of sending a None message to indicate a flush, we could have an enum for clarity (rough sketch below), but you're the maintainer, so if it makes sense to you as is, keep it 😅
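Something like this is what I had in mind (just a sketch; LoggerCommand and the receiver loop are illustrative, not the actual code):

```rust
use std::sync::mpsc::Receiver;

// Explicit commands instead of an Option where `None` means "flush".
enum LoggerCommand {
    Log(String),
    Flush,
}

// Illustrative receiver loop for the logging thread.
fn run_logger(receiver: Receiver<LoggerCommand>) {
    let mut buffer: Vec<String> = Vec::new();
    for command in receiver {
        match command {
            LoggerCommand::Log(message) => buffer.push(message),
            LoggerCommand::Flush => {
                for message in buffer.drain(..) {
                    println!("{message}");
                }
            }
        }
    }
}
```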

@marcospb19 marcospb19 marked this pull request as ready for review March 16, 2024 17:25
src/utils/logger.rs (outdated review thread, resolved)
@AntoniosBarotsis
Contributor Author

AntoniosBarotsis commented Mar 17, 2024

Reran my benchmarks on the same zip archives from before:

  • Both zips: 16sec 795ms (+3.8s)
  • zip1: 12sec 975ms (+3.4s)
  • zip2: 11sec 381ms (+2s)
  • Both in accessible mode: 15sec 870ms

I didn't run these enough times to average them out nicely, but they seem to be slightly slower than before.
Removing these brings it down to 14sec 202ms (+1.3s) for unzipping both files, so it's safe to say most of the slowdown is because of that additional lock. Of course, losing 2 seconds is preferable to the user not seeing the warning.

After a bit more fiddling around, I found that most of the performance drop comes from these lines that flush: removing those but keeping everything else as is brings it back to 13sec 423ms (+0.5s).

Note that in my test I am using -y, so the question itself is never asked; it just takes the "always yes" path, which means that locking and flushing is technically not necessary (for this scenario). At the same time, when asking is required, the performance of the thread asking the question is not relevant anymore, since waiting for the user to answer has to happen anyway.

Edit: I'm not entirely sure if the question itself flushes (and so also prints the warning), but even if not, it would make more sense to add a flush there instead of where it is now. I haven't pushed any changes because I'm not sure if you have any local uncommitted changes from yesterday.

@marcospb19
Member

Removing these brings it down to 14sec 202ms

Can you send me a new link to this piece of code? This one is not working.

@AntoniosBarotsis
Contributor Author

I don't know how I keep messing up these links 😅

Lines 7 and 9 from here (so the flushes). @marcospb19

@marcospb19
Member

marcospb19 commented Mar 24, 2024

Hey, it's taking me even longer to respond cause I got a new job ( 😃😃🎉🎉 ).


I don't know how I keep messing up these links 😅

Lines 7 and 9 from here (so the flushes). @marcospb19

This is the same as the other link, is there any chance you confused it and sent the wrong one? :v


About the performance regression, I think that's OK if it's just for this corner case of .zip and .7z, so I'd accept it as it is, but let me know if you think we should do something else.


I haven't pushed any changes because I'm not sure if you have any local uncommitted changes from yesterday.

I don't have uncommitted changes, do you want to make any additional changes? If so, feel free to push them.

@AntoniosBarotsis
Contributor Author

AntoniosBarotsis commented Mar 24, 2024

Hey, it's taking me even longer to respond cause I got a new job ( 😃😃🎉🎉 ).

Congrats!!! 😄

This is the same as the other link, is there any chance you confused it and sent the wrong one? :v

Ok, I am convinced something weird is happening, because the links point to my fork; when I click them I get this:
[screenshots]

In retrospect, the performance gain from delaying the flush until actually necessary is honestly not that noticeable, and I feel like I might miss some edge case if I try to do that, so we can leave it as it is now. I was so focused on trying to figure out where those extra few seconds of slowdown came from that I didn't actually consider whether fixing them would be that important relative to everything else.

I think with that, I don't have anything else to add. I could try to make a proper criterion benchmark so it can be reused in the future when/if you end up looking into optimisations or regressions again, but nevermind 😓: the benchmarks already present shouldn't show any difference since logs are not printed (?)

@marcospb19
Member

Ok I am convinced something weird is happening because the links point to my fork, when I click it I get this:

I'm also getting the same; however, the two links point to the same location, while in the original message you talk about different results, so I assume they should point to different locations.

I didn't actually consider if fixing them would be that important relatively to everything else.

hahahaha it happens! 😅

About the benchmarks, for .tar.zst and .tar.lz4 archives I think this PR improves overall performance, it's just the STDIN interaction piece that messes things up, so I think we're good here!

I'll check what's pending in the unresolved conversations to finish and merge this.

Member

@marcospb19 marcospb19 left a comment


Hartelijk dank! ("Thank you very much!")

Again, 😄.

@marcospb19 marcospb19 merged commit a14bbfc into ouch-org:main Apr 1, 2024
82 checks passed
@marcospb19
Member

WOOOO HOOOOOOOO

@AntoniosBarotsis
Contributor Author

AntoniosBarotsis commented Apr 1, 2024

Just want to remind you that that potential deadlock bug was not fixed; what I wrote in the review was a suggestion! @marcospb19

I'll take a look at this again in a few weeks (and make an issue if it's still there by then). I want to try and see if I can actually cause the bug and then fix it as well.


Successfully merging this pull request may close this issue: Run logging IO in a separate thread (improve multithread capabilities)