dua takes too much time and it constantly hangs #116

Closed
c02y opened this issue Dec 24, 2021 · 13 comments
Labels
question: Further information is requested

Comments

c02y commented Dec 24, 2021

Is there some kind of log I can check to find out why dua takes so long to complete? Most of the time I just kill it, since I don't know when it will finish.

gdu, by contrast, always takes only a few seconds.

Honestly, dua just hangs most of the time, roughly 8 out of 10 runs (across several months and several dua versions). While it hangs I cannot move my mouse in it and cannot quit it with Ctrl-c. I've tried both with and without -x.

[Screen recording: Peek 2021-12-24 14-59]

FYI:
Arch Linux
dua 2.14.7, installed from the Arch repo
gdu 5.12.1, installed using go install

Owner

Byron commented Dec 26, 2021

Thanks for posting and for the reproduction video. It's very strange to see dua hang like that.
I would have thought it's due to IO hangs, but gdu doesn't seem to have such a problem.

Maybe it's something else and dua hangs due to some other issue. Since dua is single-threaded, there is no synchronization going on at all. Instead jwalk does all the heavy lifting. Fortunately we can easily find out if it's coming from dua or from jwalk by running the jwalk/example/du program like this:

git clone https://github.com/Byron/jwalk
cd jwalk
cargo run --release --example du -- /

Note that my fork of jwalk changes du to not exclude hidden files, which is what dua does.

If it works, it's probably dua causing the hangs or something about the way it configures jwalk. Otherwise, it's most definitely something related to jwalk and we can try to solve the issue there.

Please let me know what you find.

Byron added the question (Further information is requested) label Dec 26, 2021
Author

c02y commented Dec 26, 2021

[Screen recording: Peek 2021-12-26 12-37]

BTW:

  1. I don't have any external drive mounted
  2. I don't have any other IO task running.

Owner

Byron commented Dec 26, 2021

Thanks for trying the experiment. This shows that despite being slow, it does complete. It's hard to imagine why dua wouldn't complete or would take so long. I believe dua also does cycle checks, which don't happen in the du example at all; since the example is slow too, they shouldn't be the source of the issue either.

These spurious errors about the OS being busy seem interesting, as I think they might be worth a retry, something dua doesn't currently do.
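
To make the retry idea concrete, here is a minimal sketch in Python (not dua's actual code, which is Rust). It assumes the "busy" errors surface as `EBUSY`, `EAGAIN` or `EINTR`; the real errors in the recording may differ. The helper name `lstat_with_retry` and its parameters are made up for illustration.

```python
import errno
import os
import time

# Assumption: "busy" errors show up as one of these errno values.
TRANSIENT_ERRNOS = {errno.EBUSY, errno.EAGAIN, errno.EINTR}


def lstat_with_retry(path, attempts=3, delay=0.05, stat_fn=os.lstat):
    """Call stat_fn(path), retrying a few times on transient 'busy' errors."""
    for attempt in range(1, attempts + 1):
        try:
            return stat_fn(path)
        except OSError as err:
            # Re-raise non-transient errors, or once the attempts are used up.
            if err.errno not in TRANSIENT_ERRNOS or attempt == attempts:
                raise
            time.sleep(delay * attempt)  # simple linear backoff
```

A retry like this only helps with errors that are returned; it does nothing for a call that blocks and never returns.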

Could you also run dua (without the TUI) to see if this improves reliability? It has its own loop to consume the walkdir results and maybe that changes things.

Let me CC @jessegrosjean to add more experience to this thread.

Author

c02y commented Dec 26, 2021

[Screen recording: Peek 2021-12-26 19-46]

BTW: it seems dua doesn't handle C-c/C-d/C-z correctly, as you can see in the gif. It sometimes freezes my whole tmux pane (dua i mode), and I cannot even kill it with the kill command.

Owner

Byron commented Dec 27, 2021

This is really interesting, as dua without the TUI doesn't meddle with signals at all. This means Ctrl+C sends a signal and the process aborts no matter what. If that's not happening, the process must be very, very stuck, probably on IO. In other words, aborting on a signal is automatic, and dua doesn't do anything to handle it because it doesn't have to.

Probably that's an important hint about what's going on here.
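
One way to test the "stuck on IO" idea on Linux is to look at the process state while it hangs. A process in state `D` (uninterruptible sleep) is blocked inside the kernel and generally does not act on signals until the blocking call returns. The sketch below is an illustration only: it assumes `/proc` is mounted, and the function names are made up.

```python
def parse_proc_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split after the LAST closing parenthesis.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def process_state(pid):
    """Read the current state of a process on Linux, e.g. 'R', 'S' or 'D'."""
    with open(f"/proc/{pid}/stat") as fh:
        return parse_proc_state(fh.read())
```

Calling `process_state(pid)` with the pid of the hung dua process and seeing `D` would be consistent with a blocking system call rather than a bug in dua's own loop.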

My hypothesis is that it will still get stuck even when using only a single thread. What happens if dua -t 1 / is invoked?

Lastly, if that indeed also gets stuck, maybe it's a problem with traversing special files in /dev that gdu might naturally avoid.

Thanks for your help

Author

c02y commented Dec 27, 2021

I just tried dua -t 1 /; it behaves exactly the same as dua a /:

  1. takes a long time to finish
  2. IO Errors at the end
  3. C-c cannot kill it when it is running

Owner

Byron commented Dec 27, 2021

Perfect, this truly means it's unrelated to threading (as jwalk falls back to a serial implementation then) and instead is related to trying to access special files which shouldn't be accessed or traversed.

gdu indeed handles sockets specifically, which dua does not, and probably neither does jwalk. Maybe this is where the blocking call happens. Interestingly, directories will only be opened for entries that appear to be one, so it's hard to imagine a socket posing as a directory to cause that. Otherwise only metadata calls are made, which leads to the next experiment.

dua -t 1 -A only checks the apparent size and skips checking the file's block size, which might make a difference (but probably won't: the metadata has already been retrieved, and there is no way not to retrieve it).
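
On the point about sockets and entries that "appear to be" directories: the entry type can be read from metadata alone, without opening the entry. The sketch below shows this in Python; it illustrates the idea and is not how dua, jwalk or gdu implement it. The names `entry_kind` and `should_descend` are hypothetical.

```python
import os
import stat


def entry_kind(path):
    """Classify a path from lstat() metadata alone, without opening it."""
    mode = os.lstat(path).st_mode  # lstat does not follow symlinks
    if stat.S_ISDIR(mode):
        return "dir"
    if stat.S_ISREG(mode):
        return "file"
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        return "device"
    return "other"


def should_descend(path):
    """Only real directories are candidates for traversal."""
    return entry_kind(path) == "dir"
```

With a check like this, sockets, FIFOs and device nodes are counted from their metadata and never opened.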

Author

c02y commented Dec 27, 2021

dua -t 1 -A / behaves the same as dua -t 1 /: I got exactly the 3 issues listed above, plus another one. It reports

  • 141.40 TB

at the end, which doesn't look right to me; I only have 1 TB, as you can see in the previous gifs.

Owner

Byron commented Dec 27, 2021

The difference in file size is due to the way it counts with -A; that's expected.
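
To illustrate how apparent size and disk usage can diverge, here is a small Python sketch using a sparse file. It assumes Linux, where `st_blocks` is counted in 512-byte units; the helper names are made up. For instance, on many x86_64 Linux systems the pseudo-file /proc/kcore reports an apparent size of roughly 128 TiB while occupying no disk blocks. Whether that explains the 141.40 TB figure here is only a guess.

```python
import os


def apparent_size(path):
    """Logical length of the file in bytes (st_size)."""
    return os.lstat(path).st_size


def disk_usage(path):
    """Bytes actually allocated; st_blocks is in 512-byte units on Linux."""
    return os.lstat(path).st_blocks * 512


def make_sparse_file(path, length):
    """Create a file of the given length without writing any data."""
    with open(path, "wb") as fh:
        fh.truncate(length)
```

On filesystems that support sparse files, a file made with `make_sparse_file(path, 1 << 30)` has an apparent size of 1 GiB but a disk usage close to zero.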

This outcome probably means that merely traversing the directory structure and querying metadata is causing the hangs.

Can you run gdu -i /foobar? I'm assuming this turns off the default ignore directories and replaces them with one that doesn't matter.

I'd expect the gdu invocation to block, which means dua should learn how to ignore a certain set of directories by default, on Linux at least.

Byron added a commit that referenced this issue Dec 27, 2021
On linux there are a few directories which shouldn't be traversed by
default as they may cause hangs and blocking.

With the new argument it's possible to specify absolute directories
to not enter during traversal, with a default set to avoid
problematic directories on linux right away.
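
The idea in this commit message, not entering a set of absolute directories during traversal, can be sketched in Python as follows. This is an illustration, not dua's Rust implementation, and the default list below is an assumption chosen for the example rather than a confirmed copy of dua's or gdu's defaults.

```python
import os

# Assumption: an illustrative default set, not necessarily dua's actual list.
DEFAULT_IGNORE_DIRS = ("/proc", "/dev", "/sys", "/run")


def total_size(root, ignore_dirs=DEFAULT_IGNORE_DIRS):
    """Sum apparent file sizes under root, never entering ignored dirs."""
    ignored = {os.path.abspath(d) for d in ignore_dirs}
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into ignored dirs.
        dirnames[:] = [
            name for name in dirnames
            if os.path.abspath(os.path.join(dirpath, name)) not in ignored
        ]
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # unreadable or vanished entry: skip it
    return total
```

Pruning before descent matters here: the ignored directories are never opened, so a blocking entry inside them cannot stall the walk.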
Owner

Byron commented Dec 27, 2021

A new release is also available which mirrors the same logic as gdu.

Does that work better?

Author

c02y commented Dec 27, 2021

Yeah, gdu -i /foobar hangs for a little while, and it ignores Ctrl-c as well.

And I just tested the new version of dua: it works fine now, like gdu. Thanks.

[Screen recording: Peek 2021-12-27 12-04]

c02y closed this as completed Dec 27, 2021
Owner

Byron commented Dec 27, 2021

Great to hear. Maybe one more thing: if dua turns out to be slower than gdu, it might be worth playing with the -t flag to see how many threads are actually beneficial. On my machine, for instance, performance is best with only 4 of the 10 possible threads.
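
A quick way to try several -t values is a small timing loop. The Python sketch below is one possible approach; only the -t flag comes from this thread, and the helper names are made up.

```python
import subprocess
import time


def time_command(cmd, runs=3):
    """Return the best wall-clock time in seconds over several runs of cmd."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        best = min(best, time.perf_counter() - start)
    return best


def sweep_threads(binary, path, thread_counts):
    """Time `<binary> -t N <path>` for each N and return {N: seconds}."""
    return {
        n: time_command([binary, "-t", str(n), path])
        for n in thread_counts
    }
```

For example, `sweep_threads("dua", "/", range(1, 11))` returns a mapping from thread count to best time. Taking the best of several runs reduces the effect of a cold filesystem cache on the first run.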

Author

c02y commented Dec 27, 2021

I tried -t 0 through 10; the best one is 4, and all the others took over 2.7 s.

>> time /tmp/dua -t 0 /
 670.61 GB /1464290 entries

________________________________________________________
Executed in    2.79 secs    fish           external
   usr time   22.94 secs  324.00 micros   22.94 secs
   sys time   12.23 secs   46.00 micros   12.23 secs

>> time /tmp/dua -t 4 /
 670.61 GB /1452660 entries

________________________________________________________
Executed in    2.28 secs    fish           external
   usr time    8.46 secs    0.00 micros    8.46 secs
   sys time    2.46 secs  292.00 micros    2.46 secs

BTW, gdu runs faster, but that's OK; I don't use this function frequently.

>> time gdu -ns /
625.2 GiB /

________________________________________________________
Executed in  910.33 millis    fish           external
   usr time    5.65 secs      0.00 micros    5.65 secs
   sys time    5.14 secs    252.00 micros    5.14 secs

dua vs gdu:

>> hyperfine "/tmp/dua -t 4 /" "gdu -ns /"
Benchmark 1: /tmp/dua -t 4 /
  Time (mean ± σ):      2.289 s ±  0.029 s    [User: 8.593 s, System: 2.432 s]
  Range (min … max):    2.264 s …  2.332 s    10 runs

Benchmark 2: gdu -ns /
  Time (mean ± σ):     758.9 ms ±  14.3 ms    [User: 4711.5 ms, System: 5071.2 ms]
  Range (min … max):   740.1 ms … 779.7 ms    10 runs

Summary
  'gdu -ns /' ran
    3.02 ± 0.07 times faster than '/tmp/dua -t 4 /'
