Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up #441

Closed
wants to merge 23 commits into from
Closed

Speed up #441

wants to merge 23 commits into from

Conversation

0jdxt
Copy link
Contributor

@0jdxt 0jdxt commented Oct 27, 2020

Whilst working on my other PR I found --tree to be quite slow compared to tree and so analysed cargo flamegraph results resulting in reducing unnecessary allocations and system calls, producing a significant speed gain for the --tree and -R options.

Original flamegraph: (--tree)

flamegraph-slow

After optimisations: (--tree)

flamegraph-fast

After optimisations: (-R)

flamegraph-r

Originally, the main bottleneck was system calls in order to retrieve uid and gid properties and the following processing into strings when the -R and --tree options do not need this information so now on unix, the information is lazy loaded.

Now, in the optimised --tree graph, we see the main bottleneck now is creating the rest of the Meta struct and that the sorting and display isn't too shabby. In the optimised -R graph, we see the displaying and sorting of information into a grid is the biggest bottleneck for this option. This would indicate for future optimisations, some information and processing may need to be lazy loaded for the Meta struct or perhaps simply the algorithms need optimising, on top of improving the grid/tree display performance.

Nevertheless, overall this PR has managed to make the following speed gains, tested only on my linux x86_64 machine with hyperfine, compared with native equivalents ls -R and tree:

~ 5 000 directories, 56 000 files, 27G

Command Mean [s] Min [s] Max [s] Relative
lsd --tree 2.374 ± 0.016 2.350 2.400 6.89 ± 0.15
lsd --tree (opt.) 0.537 ± 0.007 0.528 0.550 1.56 ± 0.04
tree 0.345 ± 0.007 0.338 0.360 1.00
Command Mean [s] Min [s] Max [s] Relative
lsd -R 2.495 ± 0.027 2.458 2.535 19.77 ± 0.33
lsd -R (opt.) 0.649 ± 0.011 0.639 0.668 5.14 ± 0.11
ls -R 0.126 ± 0.002 0.123 0.130 1.00

@codecov-io
Copy link

codecov-io commented Oct 27, 2020

Codecov Report

Merging #441 (3224f87) into master (7254816) will decrease coverage by 0.19%.
The diff coverage is 73.10%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #441      +/-   ##
==========================================
- Coverage   88.48%   88.29%   -0.20%     
==========================================
  Files          36       36              
  Lines        3535     3289     -246     
==========================================
- Hits         3128     2904     -224     
+ Misses        407      385      -22     
Impacted Files Coverage Δ
src/core.rs 0.00% <0.00%> (ø)
src/main.rs 9.09% <0.00%> (-0.91%) ⬇️
src/meta/permissions.rs 86.95% <ø> (+3.62%) ⬆️
src/meta/mod.rs 38.79% <29.62%> (-0.69%) ⬇️
src/meta/inode.rs 66.66% <40.00%> (-2.09%) ⬇️
src/flags.rs 91.66% <50.00%> (-8.34%) ⬇️
src/meta/symlink.rs 88.88% <63.63%> (+19.50%) ⬆️
src/display.rs 75.73% <68.88%> (+2.80%) ⬆️
src/meta/owner.rs 61.11% <75.00%> (-13.89%) ⬇️
src/meta/name.rs 94.14% <80.88%> (+0.36%) ⬆️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7254816...3224f87. Read the comment docs.

@0jdxt 0jdxt force-pushed the speed-up branch 2 times, most recently from 3430c54 to e5eac65 Compare October 28, 2020 22:30
Copy link
Member

@meain meain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR. I was not able to get this big a difference but a huge improvement however.

src/meta/name.rs Outdated Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
src/meta/name.rs Outdated Show resolved Hide resolved
src/meta/symlink.rs Show resolved Hide resolved
Copy link
Contributor Author

@0jdxt 0jdxt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey man! you're right some changes weren't necessary, I reverted some changes and tidied some expressions in some places. This commit now speeds things up in 2 ways:

String allocation

  • Use Cow to the Path instead of an owned String in the Name struct, so allocating a new String only happens if and when needed.
  • Use to_ascii_lowercase instead of to_lowercase since it is faster and the keys we have in Icons only contain, and probably will always only contain, ascii letters.

Syscalls

On linux fetching the file info for all the files was quite slow, especially when the information isnt needed (like -R and --tree). The syscalls have been deferred to when needed in rendering the Meta entry.

src/meta/symlink.rs Show resolved Hide resolved
@0jdxt 0jdxt force-pushed the speed-up branch 13 times, most recently from 15fd1af to 1610728 Compare November 12, 2020 22:28
@meain
Copy link
Member

meain commented Mar 28, 2021

@0jdxt Had to merge in #489 . You will have to rebase your PR on top of current master.
This looks mostly good to go otherwise.

@0jdxt 0jdxt force-pushed the speed-up branch 2 times, most recently from 1f06f8f to 67b8913 Compare March 28, 2021 14:23
this pr reduces syscalls, especially repeated ones. We should only get
the metadata once per file. Reduce allocating extra data, especially
when not used.
@0jdxt
Copy link
Contributor Author

0jdxt commented Mar 28, 2021

@0jdxt Had to merge in #489 . You will have to rebase your PR on top of current master.
This looks mostly good to go otherwise.

Rebased! should be good to go

// Check through libc if stdout is a tty. Unix specific so not on windows.
// Determine color output availability (and initialize color output (for Windows 10))
#[cfg(not(target_os = "windows"))]
let tty_available = unsafe { libc::isatty(io::stdout().as_raw_fd()) == 1 };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was used to check if lsd is outputting to a terminal or being piped to something else. This behaviour is not available with the new changes.

For example if we pipe lsd to less, we used to convert it to single line and remove the colors. But now it does not do that.

Screenshot

src/display.rs Outdated
)
}

generate_counter!(DIR_COUNT, u32);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to use this here? I am thinking it would be better(simpler) to just have usize values created in fn tree.

src/display.rs Outdated
Comment on lines 40 to 43
let index = match flags.blocks.0.iter().position(|&b| b == Block::Name) {
Some(i) => i,
None => 0,
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let index = match flags.blocks.0.iter().position(|&b| b == Block::Name) {
Some(i) => i,
None => 0,
};
let index = flags.blocks.0.iter().position(|&b| b == Block::Name).unwrap_or(0);

@codecov-commenter
Copy link

Codecov Report

Merging #441 (38a6933) into master (b6bd1d3) will increase coverage by 0.44%.
The diff coverage is 63.06%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #441      +/-   ##
==========================================
+ Coverage   80.71%   81.16%   +0.44%     
==========================================
  Files          35       35              
  Lines        3449     3413      -36     
==========================================
- Hits         2784     2770      -14     
+ Misses        665      643      -22     
Impacted Files Coverage Δ
src/core.rs 0.00% <0.00%> (ø)
src/display.rs 25.26% <0.00%> (-0.23%) ⬇️
src/main.rs 10.00% <ø> (+10.00%) ⬆️
src/meta/permissions.rs 36.95% <ø> (+1.53%) ⬆️
src/meta/owner.rs 16.66% <15.38%> (-25.01%) ⬇️
src/meta/mod.rs 18.81% <18.18%> (+0.88%) ⬆️
src/meta/symlink.rs 90.24% <50.00%> (+20.43%) ⬆️
src/app.rs 92.15% <57.14%> (-3.93%) ⬇️
src/meta/name.rs 94.85% <75.55%> (+1.05%) ⬆️
src/meta/date.rs 98.79% <95.45%> (+5.53%) ⬆️
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6bd1d3...38a6933. Read the comment docs.

@zwpaper
Copy link
Member

zwpaper commented Oct 6, 2021

Hi @0jdxt, thanks so much for working for this massive PR!

I want to try to make this PR fire just now, but I found it mixed up with several functionality changes, and some of it break the original function.

for example:

  1. Cargo.lock should not be added in .gitignore, check explains here
  2. termsize which used in the PR checking whether it is tty failed in pipeline
    image

how about break down the PR into several ones which contains one functionality only and we can make sure it work as expected, also we can do some quicker reviews and merges.

@meain
Copy link
Member

meain commented Sep 12, 2022

I'm closing this PR for now as it has deviated quite a bit from master, but let me know if you are still interested in this and we can open it back up.

@meain meain closed this Sep 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants