Large uploads cause quadratic string concatenation #2471

hniksic · 2017-04-27T15:03:37Z

In our application that uses wfastcgi.py with IIS, we noticed intense slowdowns with uploads of large attachments. Although the server is idle and has sufficient free memory, it is unable to process e.g. a ~100MB uploaded file. While investigating this, we discovered that the process gets stuck in in wfastcgi.py before ever reaching our code.

The IIS FastCGI module splits attachment data into ~10KB chunks and sends them one by one to the fastcgi handler (implemented in wfastcgi.py). wfastcgi.py concatenates the chunks into a string object, using regular python string addition. For a 100MB file this requires creation of 10 thousand string objects of sizes increasing linearly from 10KB to 100MB, totaling in 465GB data to be copied to create them. For a 2GB file, the process takes too long to wait for it to finish.

Using += to concatenate a large list of strings is a known antipattern in Python and is fortunately easy to avoid by using BytesIO.

We have fixed the performance of wfastcgi.py uploads by applying a patch like the one in #2476. Since handle_response.__enter__ actually creates a BytesIO object out of the resulting string, using a BytesIO all along comes even more natural.

A more complete patch would probably make the equivalent change for read_fastcgi_data.

The text was updated successfully, but these errors were encountered:

zooba · 2017-04-27T15:29:15Z

Thanks for the analysis! This is likely the cause of problems we've seen in the past as well.

I agree with the proposed fix: res['wsgi.input'] and res['wsgi.data'] should be BytesIO instances by default, and if needed we should change wsgi.data to plain bytes in handle_response.__enter__.

zooba · 2017-04-28T18:35:47Z

This should be fixed with #2476, though we'll hold off publishing a new version of it for a little while in case we find any more problems.

If you're encountering this, I recommend downloading wfastcgi.py directly for now and replacing your existing one (either the file or by referencing the newer one from your web.config).

zooba added bug help wanted labels Apr 27, 2017

zooba added this to the 3.0 milestone Apr 27, 2017

orirmi mentioned this issue Apr 28, 2017

Efficiently buffer FastCGI input data. #2476

Merged

zooba closed this as completed Apr 28, 2017

rlokc mentioned this issue Oct 26, 2018

PyPi version of wfastcgi is outdated #4821

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Large uploads cause quadratic string concatenation #2471

Large uploads cause quadratic string concatenation #2471

hniksic commented Apr 27, 2017 •

edited

zooba commented Apr 27, 2017

zooba commented Apr 28, 2017

Large uploads cause quadratic string concatenation #2471

Large uploads cause quadratic string concatenation #2471

Comments

hniksic commented Apr 27, 2017 • edited

zooba commented Apr 27, 2017

zooba commented Apr 28, 2017

hniksic commented Apr 27, 2017 •

edited