
Verification Buffer is Extremely Small #1330

Closed
lps-rocks opened this issue Jun 22, 2020 · 22 comments
Labels
needs clarification (More info is needed before work can proceed) · scope:core · type:perf (A code change that improves performance)

Comments

@lps-rocks

size_t const buflen = 1024 * 128; /* 128 KiB buffer */

The verification buffer is hard-coded and extremely small. This should be a user-configurable option to allow for faster hashing.

@liut commented Dec 17, 2020

I have some large files that are over 400 GB each, and verification consistently takes over 20 hours; I don't know why. This is running on 32 cores and 32 GB of RAM, with 10 TB of disk space. Any ideas?

@liut commented Dec 17, 2020

BTW, the transmission-daemon version is 3.0.

@GitHubGeek commented Jan 31, 2021

Similar issue here. Running on a beefy Xeon server with Ubuntu 20.04 Server; Transmission was compiled from source (090a4b5).

This 900+ GB torrent has been verifying for over 10 hours, with only about 240 GB verified so far. Nothing else was running on the server except Transmission.

The IOWAIT readings were very low during verification, a sign that Transmission was not reading from disk as fast as it could.

I feel this is a 3.0-specific issue, as past experience with 2.92 was very positive.

@liut commented Mar 12, 2021

BTW, here are my verification logs.

I added a log statement at line 120 in verify.c:
https://github.com/transmission/transmission/blob/master/libtransmission/verify.c#L120

tr_logAddTorDbg(tor, "Verified torrent piece #%d at %d seconds", pieceIndex, (int)(now - begin));

...
[2021-03-12 18:06:29.051] A_FOLDER Verified torrent piece #36 at 8 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:29.052] A_FOLDER Verified torrent piece #37 at 8 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:29.052] A_FOLDER Verified torrent piece #38 at 8 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:30.051] A_FOLDER Verified torrent piece #39 at 9 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:30.051] A_FOLDER Verified torrent piece #40 at 9 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:30.051] A_FOLDER Verified torrent piece #41 at 9 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:30.051] A_FOLDER Verified torrent piece #42 at 10 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:31.052] A_FOLDER Verified torrent piece #43 at 10 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:31.052] A_FOLDER Verified torrent piece #44 at 10 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:31.052] A_FOLDER Verified torrent piece #45 at 10 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:31.052] A_FOLDER Verified torrent piece #46 at 10 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:31.052] A_FOLDER Verified torrent piece #47 at 10 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:32.052] A_FOLDER Verified torrent piece #48 at 11 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:32.052] A_FOLDER Verified torrent piece #49 at 11 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:32.052] A_FOLDER Verified torrent piece #50 at 11 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:32.052] A_FOLDER Verified torrent piece #51 at 11 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:32.052] A_FOLDER Verified torrent piece #52 at 11 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:33.052] A_FOLDER Verified torrent piece #53 at 12 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:06:47.052] A_FOLDER Verified torrent piece #54 at 26 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:07:02.051] A_FOLDER Verified torrent piece #55 at 41 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:07:16.052] A_FOLDER Verified torrent piece #56 at 55 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:07:31.052] A_FOLDER Verified torrent piece #57 at 71 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:07:45.052] A_FOLDER Verified torrent piece #58 at 84 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:07:59.052] A_FOLDER Verified torrent piece #59 at 98 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:08:14.052] A_FOLDER Verified torrent piece #60 at 113 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:08:28.051] A_FOLDER Verified torrent piece #61 at 127 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:08:43.052] A_FOLDER Verified torrent piece #62 at 142 seconds (/app/libtransmission/verify.c:122)
[2021-03-12 18:08:57.051] A_FOLDER Verified torrent piece #63 at 157 seconds (/app/libtransmission/verify.c:122)
...

@liut commented Apr 14, 2021

I tracked the most time-consuming functions as follows:

verifyTorrent(tr_torrent* tor, bool* stopFlag)
 >> tr_torrentSetHasPiece(tr_torrent* tor, tr_piece_index_t pieceIndex, bool has)
  >> tr_cpPieceAdd(tr_completion* cp, tr_piece_index_t piece)
   >> tr_cpBlockAdd(tr_completion* cp, tr_piece_index_t piece, tr_block_index_t block)
    >> tr_bitfieldAdd(tr_bitfield* b, size_t nth)

tr_bitfieldAdd takes a few hundred milliseconds per call once a critical size is reached.

@ckerr (Member) commented Oct 14, 2021

I tracked the most time consuming functions as follows:

verifyTorrent(tr_torrent* tor, bool* stopFlag)
 >> tr_torrentSetHasPiece(tr_torrent* tor, tr_piece_index_t pieceIndex, bool has)
  >> tr_cpPieceAdd(tr_completion* cp, tr_piece_index_t piece)
   >> tr_cpBlockAdd(tr_completion* cp, tr_piece_index_t piece, tr_block_index_t block)
    >> tr_bitfieldAdd(tr_bitfield* b, size_t nth)

tr_bitfieldAdd takes a few hundred milliseconds per call once a critical size is reached.

This is almost certainly due to debugging assertions. Try compiling in release mode instead of debug mode and see if that makes any difference.

@ckerr ckerr added the needs clarification More info is needed before work can proceed label Oct 14, 2021
@liut commented Oct 19, 2021

@ckerr (Member) commented Oct 19, 2021

@liut could you pastebin verbose output of that build?

What I'm really interested in is the actual compiler flags being passed to g++/clang++. I want to see whether -DNDEBUG is among them, i.e. whether Transmission is being built with assertions compiled in or out.

@liut commented Oct 19, 2021

@ckerr I've added a patch.diff file to the gist above. I'm not sure this tr_logAddTorDbg function has anything to do with the DEBUG definition; I used --log-debug.

@ckerr (Member) commented Oct 19, 2021

@liut sorry if I phrased that badly. I don't need a log of the run right now; I wanted to see the build itself, e.g.

- RUN rm -rf build && mkdir build && cd build && cmake .. \
+ RUN rm -rf build && mkdir build && cd build && cmake -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON .. \

Also, you might try building in Release mode to see if that improves performance. That gives the compiler more aggressive optimization flags and compiles out the debug assertion checks:

- RUN rm -rf build && mkdir build && cd build && cmake .. \
+ RUN rm -rf build && mkdir build && cd build && cmake -DCMAKE_BUILD_TYPE=Release .. \

@ckerr ckerr added type:perf A code change that improves performance scope:core labels Oct 19, 2021
@liut commented Oct 25, 2021

Due to this verification issue, everything has been migrated to Deluge. Maybe there will be a chance to try again later.

@ckerr (Member) commented Jan 17, 2022

These assertions have been simplified in main. If the problem persists for anyone, please open a new issue with updated information. Thank you!

@ckerr ckerr closed this as completed Jan 17, 2022
@lps-rocks (Author) commented Jan 17, 2022

@ckerr Or how about we keep it open, considering the buffer is still hard-coded and abysmally small? It took 30 seconds of code searching to confirm this still hasn't been solved. Prematurely closing issues without addressing them does nothing but muddy the waters when someone opens a similar issue later on.

Even though this now uses a std::vector, there is zero dynamic resizing of it, so what's the point of using a vector at all? It just bloats memory for no good reason, since a vector takes more space than the fixed-length array we had before.

This is a good first step but still woefully inadequate. Memory is dirt cheap; there's no reason for buffers to be (a) stingy or (b) hard-coded.

auto buffer = std::vector<std::byte>(1024 * 256);

@lps-rocks (Author)

Does increasing that buffer (which should be on the stack, BTW) really affect speed?

Yes, it does, drastically.

@ckerr (Member) commented Jan 18, 2022

Does increasing that buffer (which should be on the stack, BTW) really affect speed?

Yes, it does, drastically.

I'm happy to spend more short-term memory use if the speed numbers warrant it. Do you have benchmarks?

@ckerr ckerr reopened this Jan 18, 2022
@ckerr (Member) commented Feb 11, 2022

This issue is currently blocked on some numbers. @lps-rocks can you run some tests on the speed difference of different buffer sizes on your system?

@ile6695 (Contributor) commented Feb 13, 2022

One interesting thing to check is whether increasing the buffer to some magic number, like the size of a file or the L3 cache size, would give the most benefit.
A large increase could be implemented by keeping the currently working buffer size as a fallback and catching the exception that std::vector::reserve throws when it fails to get more memory. reserve has the strong exception guarantee, so devices without enough contiguous memory could continue normally.

@ckerr (Member) commented Feb 14, 2022

If we use so much memory that we get a std::length_error thrown, we're doing something very wrong 😄

I'm fine with the idea of increasing the buffer size, but I would like some objective numbers showing which sizes give speedups and at what point we hit diminishing returns. We could just guess, but since some people in this thread have implied firsthand knowledge of different rates, it would be great to share that knowledge.

@ckerr (Member) commented Feb 18, 2022

I was in the neighborhood of this issue while coding #2626 and ran tests where the buffer size was doubled, quadrupled, and finally set to the piece size. I didn't see any change in speed; in fact, the larger buffer was slightly slower, though that is probably random chance.

It is intuitive that a larger buffer should speed things up, but that doesn't seem to bear out in practice, at least in my tests. Because of that, I'm going to close this issue.

If someone has a way to speed up the verify pass and the numbers to back it up, please open a new issue with details. Thank you!

@ckerr ckerr closed this as completed Feb 18, 2022
6 participants
@ckerr @liut @GitHubGeek @lps-rocks @ile6695 and others