
Mysteriously growing memory usage with many parallel requests #8933

Closed
justchen1369 opened this issue May 29, 2022 · 12 comments

justchen1369 commented May 29, 2022

When running curl compiled from the master branch with

./curl2 -w '%{url}\t%{http_code}\n' \
    -so /dev/null \
    --cookie "Sayonara=Dewa_mata" \
    -IZ --parallel-max 300 \
    'http://glencoe.mheducation.com/sites/[1960000000-1969999999]/'    > output2.tsv

memory usage consistently continues to grow, taking upwards of 100 MB of memory per 1 million requests, IIRC.
Running strings on a memory dump of the newly allocated space yields repeated instances of something like

/etc/ssl/certs
/etc/ssl/certs/ca-certificates.crt
Sayonara=Dewa_mata
.com
/dev/null/
7.83.1TP/1.
http://glencoe.mheducation.com/sites/1973909851/

There seems to be a memory leak somewhere?
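(For reference, the inspection described above can be reproduced roughly as follows. This is a sketch: the dump file here is synthetic, standing in for a real memory dump, which one could obtain with e.g. `gcore <pid>` from gdb; the `grep -aoE` invocation mimics strings(1) portably.)

```shell
# Create a stand-in "memory dump": printable strings separated by
# non-printable bytes, as a real process image would contain.
printf 'Sayonara=Dewa_mata\000\001\002Sayonara=Dewa_mata\000/etc/ssl/certs\000' > /tmp/fake-dump.bin

# Extract printable runs (like strings(1)) and count repeats,
# most frequent first -- repeated entries hint at per-request state
# that is being kept around.
grep -aoE '[[:print:]]{4,}' /tmp/fake-dump.bin | sort | uniq -c | sort -rn
```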

bagder (Member) commented May 29, 2022

Up to 300 parallel (TLS) transfers are likely to use quite a lot of peak memory, so I don't think 100MB sounds wrong.

If there were a leak, then surely the amount of memory used would keep growing? An even better way to verify would be to run a few tests with valgrind and see if it reports any leaks. I am not aware of any.

justchen1369 (Author) commented May 29, 2022

Memory usage continues to grow well beyond 100 MB; I've had to split the job up to avoid OOM.

justchen1369 (Author) commented May 29, 2022

Re: valgrind tests

something like this? (with --tool=massif)

==409013== 204,802 bytes in 2 blocks are still reachable in loss record 641 of 644
==409013==    at 0x4843839: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==409013==    by 0x48AD22B: Curl_preconnect (in /home/yay/curl/build/lib/libcurl.so)
==409013==    by 0x48AD4A5: multi_runsingle (in /home/yay/curl/build/lib/libcurl.so)
==409013==    by 0x48AECD9: curl_multi_perform (in /home/yay/curl/build/lib/libcurl.so)
==409013==    by 0x12D905: parallel_transfers (in /home/yay/search/curl)
==409013==    by 0x12E1EE: run_all_transfers (in /home/yay/search/curl)
==409013==    by 0x12E594: operate (in /home/yay/search/curl)
==409013==    by 0x125274: main (in /home/yay/search/curl)
==409013== 
==409013== 352,704 bytes in 501 blocks are still reachable in loss record 642 of 644
==409013==    at 0x4848A23: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==409013==    by 0x125B31: add_per_transfer (in /home/yay/search/curl)
==409013==    by 0x12766E: single_transfer (in /home/yay/search/curl)
==409013==    by 0x12E0E9: transfer_per_config (in /home/yay/search/curl)
==409013==    by 0x12E149: create_transfer (in /home/yay/search/curl)
==409013==    by 0x12D709: add_parallel_transfers (in /home/yay/search/curl)
==409013==    by 0x12DB59: parallel_transfers (in /home/yay/search/curl)
==409013==    by 0x12E1EE: run_all_transfers (in /home/yay/search/curl)
==409013==    by 0x12E594: operate (in /home/yay/search/curl)
==409013==    by 0x125274: main (in /home/yay/search/curl)

bagder (Member) commented May 30, 2022

"still reachable" is not a memory leak, and why the use of massif?

I tried the command line below and it reports no leaks for me, on Linux using OpenSSL 3.0.3:

valgrind ./src/curl -ZI 'https://site/#[1-600]' -o /dev/null --max-time 10 --parallel-max 300

justchen1369 (Author) commented May 31, 2022

It doesn't seem to be a memory leak, yeah; the problem is memory usage increasing without bound. This is especially apparent with millions of requests.

Someone reported that a range of 5 million URLs slowly grew to ~10 GB RSS.

bagder (Member) commented May 31, 2022

It really cannot grow "without bounds" when it isn't a leak.
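(To illustrate the distinction bagder is drawing: a leak checker only flags memory with no remaining references. A program that keeps appending live objects to a list shows zero leaks yet grows indefinitely. A minimal illustration of that pattern — hypothetical code, not curl's source:)

```python
# Minimal illustration (not curl's code): "still reachable" memory is
# not a leak, yet total usage grows as long as the program keeps
# appending live objects -- the pattern an unbounded per-transfer
# list would show under valgrind.
import sys

pending = []  # stands in for an ever-growing list of queued transfers

def add_transfer(url: str) -> None:
    # each queued transfer keeps its state alive and reachable
    pending.append({"url": url, "buffer": bytearray(1024)})

for i in range(10_000):
    add_transfer(f"http://example.invalid/sites/{i}/")

# Every object is reachable (no leak reported), but the footprint
# scales with the number of queued transfers, not with parallelism.
print(len(pending))  # 10000
```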

justchen1369 (Author) commented May 31, 2022

Sure. I don't know what's causing the excessive memory usage, however.

JustAnotherArchivist (Contributor) commented May 31, 2022

As a datapoint, I was running a very similar command to the above, and its memory usage kept growing until it OOM'd after 12 million URLs at about 28 GB RSS. So it certainly seemed to grow without bounds there. This was with a rather old version of curl though (7.74.0 from Debian Buster backports), and I'm not able to test that specific example with master now.

justchen1369 (Author) commented Jun 1, 2022

This is pure speculation, but is there a buffer somewhere that's not getting cleaned up?

Growing memory usage: https://asciinema.org/a/Pvco7kx7gj4cnHIkVAB51xir7

bagder (Member) commented Jun 1, 2022

> This is pure speculation

I think we need a more scientific debugging method.

bagder (Member) commented Jun 4, 2022

Does it reproduce if the URL is HTTP:// only? Does it reproduce if you do it against a local server? How many requests does it take to reproduce?

bagder added a commit that referenced this issue Aug 30, 2022
When doing a huge amount of parallel transfers, we must not add them to
the per_transfer list frivolously since they all use memory after all.
This was previously done without really considering millions or billions
of transfers. Massive parallelism would use a lot of memory for no good
purpose.

The queue is now limited to twice the parallelism number.

This makes the 'Qd' value in the parallel progress meter mostly useless
for users, but works for now for us as a debug display.

Reported-by: justchen1369 on github
Fixes #8933
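(The commit message describes an admission rule: rather than materializing a per-transfer state for every URL up front, new transfers are only queued while the pending list holds fewer than twice the parallelism limit. A rough sketch of that rule, with hypothetical names — not curl's actual source:)

```python
# Rough sketch (hypothetical names, not curl's source) of the queue cap:
# keep at most 2 * parallel_max transfers queued, creating further
# per-transfer states lazily as earlier ones complete.
from collections import deque

PARALLEL_MAX = 300

def run_parallel(urls, parallel_max=PARALLEL_MAX):
    urls = iter(urls)
    queue = deque()
    completed = 0
    peak_queued = 0
    while True:
        # admission rule: never queue more than twice the parallelism
        while len(queue) < 2 * parallel_max:
            url = next(urls, None)
            if url is None:
                break
            queue.append(url)  # stand-in for creating a per-transfer state
        if not queue:
            return completed, peak_queued
        peak_queued = max(peak_queued, len(queue))
        # stand-in for a batch of transfers finishing
        for _ in range(min(parallel_max, len(queue))):
            queue.popleft()
            completed += 1

done, peak = run_parallel(f"http://example.invalid/{i}/" for i in range(100_000))
print(done, peak)  # 100000 600 -- memory is now bounded by the cap
```

With this rule, memory held for queued transfers stays proportional to `parallel_max` instead of the total URL count, which matches the commit's note that `Qd` in the progress meter now reflects only this small window.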
bagder (Member) commented Aug 30, 2022

If you are able to, please apply #9389 and check if it fixes the problems. I believe it does.
