Speed up #441
Codecov Report
@@ Coverage Diff @@
## master #441 +/- ##
==========================================
- Coverage 88.48% 88.29% -0.20%
==========================================
Files 36 36
Lines 3535 3289 -246
==========================================
- Hits 3128 2904 -224
+ Misses 407 385 -22
3430c54 to e5eac65
Thanks a lot for the PR. I was not able to reproduce as large a difference, but it is still a huge improvement.
4dde336 to 13d7321
Hey man! You're right, some changes weren't necessary. I reverted those and tidied some expressions in a few places. This commit now speeds things up in 2 ways:
String allocation
- Use a `Cow` borrowed from the `Path` instead of an owned `String` in the `Name` struct, so allocating a new `String` only happens if and when needed.
- Use `to_ascii_lowercase` instead of `to_lowercase`, since it is faster and the keys we have in `Icons` only contain, and probably will always only contain, ASCII letters.
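The two allocation points above can be sketched roughly as follows. This is only an illustration under assumptions: the `Name` struct here is a simplified stand-in rather than lsd's actual definition, and `extension_key` is a hypothetical helper for an icon-table lookup.

```rust
use std::borrow::Cow;
use std::path::Path;

/// Simplified, hypothetical stand-in for the `Name` struct.
struct Name<'a> {
    name: Cow<'a, str>,
}

impl<'a> Name<'a> {
    fn new(path: &'a Path) -> Self {
        // `to_string_lossy` yields `Cow::Borrowed` for valid UTF-8 and only
        // allocates a `String` when invalid sequences have to be replaced.
        let name = path
            .file_name()
            .map(|n| n.to_string_lossy())
            .unwrap_or_else(|| path.to_string_lossy());
        Name { name }
    }

    /// Hypothetical helper: lower-cased extension used as a lookup key.
    /// `to_ascii_lowercase` only touches ASCII letters, which is enough
    /// when every key in the table is ASCII.
    fn extension_key(&self) -> Option<String> {
        let (_, ext) = self.name.rsplit_once('.')?;
        Some(ext.to_ascii_lowercase())
    }
}

fn main() {
    let name = Name::new(Path::new("/tmp/Photo.PNG"));
    println!("{} -> {:?}", name.name, name.extension_key());
}
```

The borrowed variant keeps the common case allocation-free; the only `String` created here is the lower-cased key, and only when a caller asks for it.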
Syscalls
On Linux, fetching the file info for all the files was quite slow, especially when the information isn't needed (as with `-R` and `--tree`). The syscalls are now deferred until they are needed when rendering the `Meta` entry.
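A minimal sketch of deferring the metadata syscall, assuming a hypothetical `LazyEntry` type (not lsd's `Meta`) and using `std::cell::OnceCell` from the standard library so the lookup runs at most once per file:

```rust
use std::cell::OnceCell;
use std::fs::{self, Metadata};
use std::io;
use std::path::PathBuf;

/// Hypothetical entry type: it stores the path up front and performs the
/// metadata syscall only the first time a caller asks for it.
struct LazyEntry {
    path: PathBuf,
    metadata: OnceCell<io::Result<Metadata>>,
}

impl LazyEntry {
    fn new(path: impl Into<PathBuf>) -> Self {
        LazyEntry { path: path.into(), metadata: OnceCell::new() }
    }

    /// True once the syscall has been made (whether it succeeded or not).
    fn is_loaded(&self) -> bool {
        self.metadata.get().is_some()
    }

    /// Fetches the metadata on first use and caches the result afterwards.
    fn metadata(&self) -> Option<&Metadata> {
        self.metadata
            .get_or_init(|| fs::symlink_metadata(&self.path))
            .as_ref()
            .ok()
    }
}

fn main() {
    let entry = LazyEntry::new(".");
    // Nothing has been fetched yet; a name-only listing could stop here.
    assert!(!entry.is_loaded());
    if let Some(meta) = entry.metadata() {
        println!("is_dir = {}", meta.is_dir());
    }
}
```

Caching the `io::Result` rather than just the success value means a failing lookup is also not retried on every access.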
15fd1af to 1610728
1f06f8f to 67b8913
This PR reduces syscalls, especially repeated ones: we should only get the metadata once per file. It also reduces allocation of extra data, especially when that data is not used.
// Check through libc if stdout is a tty. Unix specific so not on windows.
// Determine color output availability (and initialize color output (for Windows 10))
#[cfg(not(target_os = "windows"))]
let tty_available = unsafe { libc::isatty(io::stdout().as_raw_fd()) == 1 };
src/display.rs (outdated)
)
}

generate_counter!(DIR_COUNT, u32);
Do we have to use this here? I am thinking it would be better (simpler) to just have `usize` values created in `fn tree`.
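One way to read this suggestion, sketched under assumptions: the `Node` type and both functions below are hypothetical and not taken from `src/display.rs`. The counters are plain `usize` locals owned by `tree` and passed down by mutable reference, so no global counter macro is needed.

```rust
/// Hypothetical in-memory tree node, used only for this illustration.
struct Node {
    is_dir: bool,
    children: Vec<Node>,
}

/// Recursively counts directories and files, updating the caller's counters.
fn count(node: &Node, dirs: &mut usize, files: &mut usize) {
    if node.is_dir {
        *dirs += 1;
    } else {
        *files += 1;
    }
    for child in &node.children {
        count(child, dirs, files);
    }
}

/// The counters live here as ordinary locals instead of global state.
fn tree(root: &Node) -> (usize, usize) {
    let mut dirs = 0;
    let mut files = 0;
    count(root, &mut dirs, &mut files);
    (dirs, files)
}

fn main() {
    let file = || Node { is_dir: false, children: vec![] };
    let sub = Node { is_dir: true, children: vec![file(), file()] };
    let root = Node { is_dir: true, children: vec![sub, file()] };
    let (dirs, files) = tree(&root);
    println!("{dirs} directories, {files} files");
}
```

Local counters also make repeated calls independent of each other, which tends to simplify testing.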
src/display.rs (outdated)
let index = match flags.blocks.0.iter().position(|&b| b == Block::Name) {
    Some(i) => i,
    None => 0,
};
Suggested change:
- let index = match flags.blocks.0.iter().position(|&b| b == Block::Name) {
-     Some(i) => i,
-     None => 0,
- };
+ let index = flags.blocks.0.iter().position(|&b| b == Block::Name).unwrap_or(0);
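To show that the suggested one-liner behaves the same as the original `match`, here is a small self-contained sketch. The `Block` enum below is a reduced, hypothetical version with only three variants, and a plain slice stands in for `flags.blocks.0`.

```rust
/// Reduced, hypothetical version of the `Block` enum.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Block {
    Permission,
    Size,
    Name,
}

/// Original form: explicit `match` on the `Option` returned by `position`.
fn name_index_match(blocks: &[Block]) -> usize {
    match blocks.iter().position(|&b| b == Block::Name) {
        Some(i) => i,
        None => 0,
    }
}

/// Suggested form: `unwrap_or(0)` expresses the same fallback directly.
fn name_index(blocks: &[Block]) -> usize {
    blocks.iter().position(|&b| b == Block::Name).unwrap_or(0)
}

fn main() {
    let blocks = [Block::Permission, Block::Size, Block::Name];
    assert_eq!(name_index(&blocks), name_index_match(&blocks));
    println!("name column index: {}", name_index(&blocks));
}
```

Both forms return 0 when `Block::Name` is absent, so a caller cannot tell that case apart from `Name` being the first block.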
Codecov Report
@@ Coverage Diff @@
## master #441 +/- ##
==========================================
+ Coverage 80.71% 81.16% +0.44%
==========================================
Files 35 35
Lines 3449 3413 -36
==========================================
- Hits 2784 2770 -14
+ Misses 665 643 -22
Hi @0jdxt, thanks so much for working on this massive PR! I wanted to try to get this PR moving just now, but I found it mixes several functionality changes, and some of them break the original behaviour. For example:
How about breaking the PR down into several smaller ones, each containing a single functional change? That way we can make sure each one works as expected, and we can also do quicker reviews and merges.
I'm closing this PR for now as it has deviated quite a bit from master, but let me know if you are still interested in this and we can open it back up.
Whilst working on my other PR I found `--tree` to be quite slow compared to `tree`, so I analysed `cargo flamegraph` results. This led to reducing unnecessary allocations and system calls, producing a significant speed gain for the `--tree` and `-R` options.
Original flamegraph: (`--tree`)
After optimisations: (`--tree`)
After optimisations: (`-R`)
Originally, the main bottleneck was the system calls that retrieve the `uid` and `gid` properties, plus the subsequent processing of them into strings, even though the `-R` and `--tree` options do not need this information. On Unix, this information is now lazily loaded.
Now, in the optimised `--tree` graph, the main bottleneck is creating the rest of the `Meta` struct, while the sorting and display aren't too shabby. In the optimised `-R` graph, displaying and sorting the information into a grid is the biggest bottleneck. This suggests that, for future optimisations, more of the information and processing for the `Meta` struct may need to be lazily loaded, or perhaps the algorithms simply need optimising, on top of improving the grid/tree display performance.
Nevertheless, overall this PR has achieved the following speed gains, tested only on my Linux x86_64 machine with hyperfine and compared with the native equivalents `ls -R` and `tree`:
~ 5 000 directories, 56 000 files, 27G
[Benchmark table columns: `lsd --tree`, `lsd --tree` (opt.), `tree`, `lsd -R`, `lsd -R` (opt.), `ls -R`]