Excessive memory usage #67

Closed
hading opened this issue Mar 30, 2018 · 9 comments · Fixed by #94

Comments

@hading

hading commented Mar 30, 2018

This is with nginx 1.8.1 and mod_zip 1.1.6.

I'm trying to download a zip package of around 170,000 files totaling roughly 100GB.

I use the workaround I described in issue #42 because of the large number of files we sometimes deal with, but this doesn't seem related.

When I watch it in top, nginx just uses all the memory on the machine. I tried this with a machine with as much as 160GB of memory, and nginx used it all in less than a minute. This seems to happen while it is dealing with the manifest file; the second nginx server in the workaround above never gets hit before memory is exhausted.

I don't see anything in the configuration that I think would be causing this problem. I will attach a lightly scrubbed version:

nginx.conf.txt

From some other trials, the problem seems to be tied to the number of files going into the zip, not the overall size in bytes.

When I was developing the code that uses this (beginning of 2016), I experimented with downloads even larger than this (in terms of number of files), and they were fine, but I don't have good enough records to suggest what might be different.

Honestly I'm at a loss to think how nginx even could consume that much memory doing this, but I watched it happen.

@hading
Author

hading commented Mar 30, 2018

Let me add another piece of data: I was able to do a 40,000-file, 130GB zip on a server with 4GB memory and 4GB swap. So whatever is going on here is not scaling the way I expected. (And even that one used almost all of the server's resources, which still seems like a lot.)

@hading
Author

hading commented May 24, 2018

I've been working on the project that I have that uses this and was reminded of the issue.

I wonder if there is something about the particular file list (either just its raw size or some peculiarity in it) that is causing the Ragel parser to blow up, which seems possible based on a bit of searching.

I have an example file list that blows it up (on my 4GB mem/4GB swap machine), but I want to see how far I need to cut it down before it works, so I can produce a more minimal example. It is about 170,000 files (essentially the same list I mentioned initially).

@hading
Author

hading commented May 24, 2018

The file that blows up my machine has the following properties:

hding2$ wc manifest.txt.bad
   12500   86557 7531526 manifest.txt.bad

So it seems like parsing this 7.2MB file is taking over 8GB. Cutting it down to 12,250 lines, it can just barely do it. If I remember, I'll attach the file once the AWS links in it have expired in a week; it's just a lot of lines like:

- 95308 /long/aws/url?with_lots_of_query_parameters /long/path/to/zip/location

Does the parsing try to do the whole file at once rather than splitting into individual lines on '\r\n' and then parsing each individual line?
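To sketch what I mean (just a rough illustration of the line-at-a-time idea, not the module's actual code, and the names are made up):

```c
/* Rough sketch of line-at-a-time parsing (hypothetical, not mod_zip's code):
 * consume complete "\r\n"-terminated lines from the buffered manifest and
 * parse each one on its own, so parser state never spans the whole file. */
#include <stddef.h>
#include <string.h>

typedef int (*line_handler)(const char *line, size_t len, void *ctx);

/* Returns the number of bytes consumed; the caller keeps any trailing
 * partial line around until the next read completes it. */
static size_t feed_lines(const char *buf, size_t len, line_handler handle, void *ctx)
{
    const char *p = buf;
    const char *end = buf + len;

    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        if (nl == NULL) {
            break;                      /* incomplete line: wait for more data */
        }

        size_t line_len = (size_t)(nl - p);
        if (line_len > 0 && p[line_len - 1] == '\r') {
            line_len--;                 /* trim the \r of a \r\n terminator */
        }

        if (handle(p, line_len, ctx) != 0) {
            break;                      /* handler reported an error */
        }

        p = nl + 1;
    }

    return (size_t)(p - buf);
}
```

With something like this, any parser state only has to cover one line of the manifest at a time.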

@evanmiller
Owner

You can see the parser logic here:

https://github.com/evanmiller/mod_zip/blob/master/ngx_http_zip_parsers.rl

It's possible that the grammar is ambiguous and Ragel is blowing up with internal state (i.e. for backtracking). You might try recompiling the .rl file with other Ragel options, e.g. -G2, and seeing if that fixes things.
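Roughly something like this to regenerate the parser (assuming a standard Ragel install; the exact invocation may differ from what the build normally does):

```sh
# Regenerate the parser with goto-driven code generation (-G2)
# instead of the default tables, then rebuild nginx with the module.
ragel -G2 ngx_http_zip_parsers.rl -o ngx_http_zip_parsers.c
```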

@hading
Author

hading commented Jun 20, 2018

I think it's possible that what is going on here is that the URI field allows \r and \n, and that is forcing a lot of backtracking information to be kept.

I'm going to try regenerating things with that possibility removed and then try it on one of my test cases to see if there is an improvement.

@hading
Author

hading commented Jun 20, 2018

No, that was not enough. I'm not really familiar with Ragel, so I'll look at it some more to see if it provides any facilities that I think might help.

@nahuel

nahuel commented Jan 29, 2020

Hi @evanmiller, do you see another possible source of this problem besides the Ragel parser?

@dimaryaz
Contributor

dimaryaz commented Aug 5, 2022

I've figured out the bug; it's not the parser.

Every time mod_zip reads a chunk of input (about 8KB), it allocates a new buffer and copies all the old contents into it, but does not free the old buffer: https://github.com/evanmiller/mod_zip/blob/master/ngx_http_zip_module.c#L153-L161

This results in O(N²) memory usage, so even 10MB of input will take up GBs of memory.

Freeing the old buffer fixes it - though it's still inefficient in terms of CPU. Better yet, just use ngx_array_t, which doubles in size as needed. (It actually still leaks memory, but only O(N)...)

Here's a fix: quiltdata@1b6813a. If that looks reasonable, I'll send out a proper PR.
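To make the difference concrete, here is a simplified sketch of the two growth strategies, using plain libc allocation instead of nginx's pool API (not the actual module code; the names are invented):

```c
/* Simplified illustration of the two growth strategies (not the actual
 * mod_zip code; plain libc allocation instead of nginx pool allocation). */
#include <stdlib.h>
#include <string.h>

typedef struct {
    char   *data;
    size_t  len;
    size_t  cap;
} buf_t;

/* Buggy pattern: every ~8KB chunk triggers a fresh allocation of len + n
 * bytes plus a full copy, and (with a pool allocator) the old buffer is
 * never released, so total allocation grows as O(N^2). */
static int append_quadratic(buf_t *b, const char *p, size_t n)
{
    char *fresh = malloc(b->len + n);
    if (fresh == NULL) {
        return -1;
    }
    if (b->len > 0) {
        memcpy(fresh, b->data, b->len);
    }
    memcpy(fresh + b->len, p, n);
    /* b->data is deliberately not freed here, mirroring the leak */
    b->data = fresh;
    b->len += n;
    return 0;
}

/* Doubling pattern (the behavior described for ngx_array_t): capacity grows
 * geometrically, so the total bytes ever copied stay O(N). */
static int append_doubling(buf_t *b, const char *p, size_t n)
{
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 4096;
        while (cap < b->len + n) {
            cap *= 2;
        }
        char *bigger = realloc(b->data, cap);
        if (bigger == NULL) {
            return -1;
        }
        b->data = bigger;
        b->cap = cap;
    }
    memcpy(b->data + b->len, p, n);
    b->len += n;
    return 0;
}
```

With doubling, the total bytes ever copied are at most about 2N, so both memory and CPU stay linear in the input size.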

@evanmiller
Owner

@dimaryaz Excellent find! Yes, feel free to open a PR. An array seems like an unusual data structure for holding a string, but it seems like it will work.

dimaryaz added a commit to quiltdata/mod_zip that referenced this issue Aug 5, 2022
Currently, every time mod_zip reads 8KB, it allocates a new string, copies all existing data,
and doesn't free the old string. This results in GBs of memory use for a 10MB input.

Just use an nginx array which automatically doubles in size as needed.
(Still leaks memory, but not as much.)

Fixes evanmiller#67