Skip to content

Use jwalk instead of walkdir: parallel walking for performance boost#40

Merged
bootandy merged 6 commits into
bootandy:masterfrom
AdminXVII:parallel-walk
Dec 4, 2019
Merged

Use jwalk instead of walkdir: parallel walking for performance boost#40
bootandy merged 6 commits into
bootandy:masterfrom
AdminXVII:parallel-walk

Conversation

@AdminXVII

Copy link
Copy Markdown
Contributor

Use a parallel walkdir implementation that is able to fetch more resources at the same time. This should greatly improve directory trees with lots of branching, at the cost of threading. Also make sure to allocate enough space as much as possible ahead of time to avoid reallocation. Lastly, remove on a vec needs to make an allocation each time, so prefer the retain method for repeated removal.

Some benchmark on my computer (galago pro 3):

Folder 1 (medium, shallow):
dust: 0.055011367s
du -sh: 0.245766881s

Folder 2 (small, shallow):
du -sh: 0.007830246s
dust: 0.025789946s

Folder 3 (large, deep):
dust: 3.421158887s
du -sh: 21.322613990s

@bootandy

bootandy commented Nov 25, 2019

Copy link
Copy Markdown
Owner

Hey thanks for this,

I have been considering using Rayon to parallelize this too.

Unfortunately it seems there is a bug merging the data on large directories.
Here is it running on one of my folders:

$ time target/release/dust /home/andy/dev/python/problems                     [23:54:01]
 137M ─┬ problems
  85M  ├─┬ hackerrank

compared to du (original dust agrees with du):

$ du -d 1 -h  /home/andy/dev/python/problems                                  [23:53:55]
15G	/home/andy/dev/python/problems

I am going to guess that the cumulative sum of file size is not being summed up correctly.

@AdminXVII

Copy link
Copy Markdown
Contributor Author

I'll investigate further, thanks for your time.

@AdminXVII

Copy link
Copy Markdown
Contributor Author

The problem was that JWalk does not take hidden directories in account by default.

On a side note, do you want me to add a CLI flag to ignore hidden directories? That could be a nice feature.

@bootandy

bootandy commented Dec 3, 2019

Copy link
Copy Markdown
Owner

You were correct about the hidden directories. Nice.

I have been running your branch vs the existing branch locally and I see very little difference in execution speed. I certainly can't get anywhere near the speed of du so I'd like to know how you benchmarked that.

Update: I have been able to get slightly superior performance to the existing dust by limiting the number of threads created to approximately match my cpu cores. I think the thing to do might be to incorporate something like: https://crates.io/crates/num_cpus to ensure we don't spawn too many threads.

I don't think we need a flag to ignore hidden directories, (I personally cant see the usecase, but if several people ask I'll add it). However, I would like an optional flag to control the aforementioned number of CPU cores we use (Personally I don't like it when a tool always grabs all your cpu cores).

I do like the commit: Use more rusty patterns and preallocate enough space I think that improves things.

I'll wait a week to see if you want to add anything and if not I'll merge and extend your work myself.

Cheers,

@AdminXVII

AdminXVII commented Dec 3, 2019

Copy link
Copy Markdown
Contributor Author

The performance surely depends on your hardware, I have a SSD with a 8 core intel. Maybe HDD and/or a lower number of cores is the reason you see the difference in speed.

For the benchmark used, I did

[parallel-walk][~/dev/dust]$ cargo run --release # dont benchmark cargo
[parallel-walk][~/dev/dust]$ sync; echo 3 > /proc/sys/vm/drop_caches
[parallel-walk][~/dev/dust]$ time ./target/release/dust ..
  57G ─┬ ..
  15G  ├─┬ redox
 5.7G  │ ├─┬ cookbook
 4.9G  │ │ └── recipes
 3.0G  │ └─┬ .git
 3.0G  │   └─┬ modules
 2.9G  │     └── rust
 6.1G  ├─┬ shellac-server
 5.4G  │ └─┬ target
 3.0G  │   └── debug
 6.1G  ├─┬ robotique
 6.1G  │ └─┬ target
 4.2G  │   └─┬ debug
 3.2G  │     └── deps
 5.8G  ├─┬ iced
 5.4G  │ └─┬ target
 4.7G  │   └─┬ debug
 2.7G  │     └── deps
 3.3G  ├─┬ planet-gen
 3.3G  │ └── target
 2.6G  └── ion
real    5.535719165s
[parallel-walk][~/dev/dust]$ sync; echo 3 > /proc/sys/vm/drop_caches
[parallel-walk][~/dev/dust]$ time du -sh ..
58G	..
real    21.578478589s

jwalk uses rayon, which states:
"By default, Rayon uses the same number of threads as the number of CPUs available. Note that on systems with hyperthreading enabled this equals the number of logical cores and not the physical ones."
So theorically, the number of threads should already match the number of cores.

I added the threads option to the CLI to specify the number of threads to use. Default to the number of logical cores.

@bootandy bootandy merged commit 79416fd into bootandy:master Dec 4, 2019
@bootandy

bootandy commented Dec 4, 2019

Copy link
Copy Markdown
Owner

thanks for your work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants