# quote_from_bytes uses a lot of memory for larger bytestrings (#95865)
I think this is one source of the slowdown — Lib/urllib/parse.py, line 909 in 73d8ffe:

```python
return ''.join(map(quoter, bs))
```

It's noted above that a list comprehension can be faster than `map` here, since `str.join` materializes its argument into a sequence anyway. A benchmarking script to verify that this improves the situation:

```python
from pyperf import Runner, perf_counter
from itertools import repeat
from urllib.parse import _byte_quoter_factory

def join_listcomp(loops, size):
    quoter = _byte_quoter_factory('/')
    bs = size * b'A'
    t0 = perf_counter()
    for _ in repeat(None, loops):
        ''.join([quoter(char) for char in bs])
    t1 = perf_counter()
    return t1 - t0

def join_map(loops, size):
    quoter = _byte_quoter_factory('/')
    bs = size * b'A'
    t0 = perf_counter()
    for _ in repeat(None, loops):
        ''.join(map(quoter, bs))
    t1 = perf_counter()
    return t1 - t0

runner = Runner()
btf = runner.bench_time_func
for e in [2, 3, 4, 5, 6, 7, 8]:
    btf(f"join_listcomp 10**{e}", join_listcomp, 10**e)
    btf(f"join_map 10**{e}", join_map, 10**e)
```

My results from 3.11.0b4 on Windows: [results table not recovered]

---
It may also be possible to change the implementation of `quote_from_bytes` to be lazy rather than eager (a rough sketch of a lazy variant follows this list).

Potential benefits of laziness:
Drawbacks of laziness:
Benefits of eagerness:
Drawbacks of eagerness:
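As an illustration only (my sketch; the name `iter_quote_from_bytes` and the chunk size are made up, and nothing like this is quoted verbatim in the thread), a lazy variant could yield quoted pieces for the caller to stream:

```python
# Hypothetical sketch of a lazy API. _byte_quoter_factory is a private
# CPython helper (present in 3.11+); the function name is invented here.
from urllib.parse import _byte_quoter_factory

def iter_quote_from_bytes(bs, safe='/', chunk=1 << 16):
    quoter = _byte_quoter_factory(safe)
    for i in range(0, len(bs), chunk):
        # Quote one slice at a time: peak memory stays O(chunk), and a
        # caller can write each piece to a file or socket as it arrives.
        yield ''.join(map(quoter, bs[i:i + chunk]))
```

An eager caller would still pay the full cost via `''.join(iter_quote_from_bytes(bs))`, which is part of the trade-off.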
---
Why 90 is much more efficient: when you use 100, your base64 result happens to end with `=` padding, and `=` is not an always-safe byte, so you miss the fast path at Lib/urllib/parse.py, lines 903 to 904 in 759227f (`if not bs.rstrip(_ALWAYS_SAFE_BYTES + safe): return bs.decode()`). With 80, you happen to get the slow path again.
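To make the padding pattern concrete (my illustration, not from the thread): base64 output carries `=` padding exactly when the input length is not a multiple of 3, and of the three sizes only 90 MiB is divisible by 3.

```python
import base64

# 80 and 100 MiB leave a remainder mod 3, so their base64 output ends in
# '=' padding; 90 MiB does not, so every output byte is always-safe.
for mib in (80, 90, 100):
    n = mib * 2**20
    tail = base64.b64encode(b"A" * n)[-4:]
    print(f"{mib} MiB: n % 3 = {n % 3}, base64 ends with {tail!r}")
```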
---

#95872 reduced some CPU time, but memory consumption could be improved. One approach would be to work in chunks, so less memory is caught up in pointers. I'm thinking of something like:

```python
CHUNK = ...
chunks = [''.join(map(quoter, bs[i:i+CHUNK]))
          for i in range(0, len(bs), CHUNK)]
return ''.join(chunks)
```

Estimate of memory used here: one term grows with CHUNK (the list of per-byte string pointers built while joining a single chunk) and another grows with len(bs)/CHUNK (per-string overhead for the entries of `chunks`). Total: roughly a*CHUNK + b*len(bs)/CHUNK for some constants a and b. To minimize, choose CHUNK near sqrt(len(bs)).
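Here is one way to observe that pointer cost directly (my own measurement sketch, not from the thread; exact numbers vary by build and version):

```python
# Compare peak traced allocations of the flat join vs. sqrt-sized chunks.
# _byte_quoter_factory is a private CPython helper (3.11+).
import tracemalloc
from math import isqrt
from urllib.parse import _byte_quoter_factory

def peak_bytes(func):
    tracemalloc.start()
    func()
    _, high = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return high

quoter = _byte_quoter_factory('/')
bs = b'\xff' * 10**7  # every byte takes the quoted ('%FF') path

def flat():
    return ''.join(map(quoter, bs))

def chunked():
    c = isqrt(len(bs))
    return ''.join([''.join(map(quoter, bs[i:i + c]))
                    for i in range(0, len(bs), c)])

print("flat peak:   ", peak_bytes(flat))
print("chunked peak:", peak_bytes(chunked))
```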
---
I benchmarked these:

```python
from math import isqrt as _isqrt

def f1(bs, quoter):
    return ''.join(map(quoter, bs))

def f2(bs, quoter):
    N = len(bs)
    chunksize = _isqrt(N)
    chunks = [''.join(map(quoter, bs[i:i+chunksize]))
              for i in range(0, N, chunksize)]
    return ''.join(chunks)
```

Result: the chunked `f2` pays off only for large inputs, so we could implement a cutoff:

```diff
--- a/Lib/urllib/parse.py
+++ b/Lib/urllib/parse.py
@@ -29,6 +29,7 @@
 from collections import namedtuple
 import functools
+from math import isqrt as _isqrt
 import re
 import types
 import warnings
@@ -906,7 +907,14 @@ def quote_from_bytes(bs, safe='/'):
     if not bs.rstrip(_ALWAYS_SAFE_BYTES + safe):
         return bs.decode()
     quoter = _byte_quoter_factory(safe)
-    return ''.join(map(quoter, bs))
+    if (N := len(bs)) < 200_000:
+        return ''.join(map(quoter, bs))
+    else:
+        chunksize = _isqrt(N)
+        chunks = [''.join(map(quoter, bs[i:i+chunksize]))
+                  for i in range(0, N, chunksize)]
+        return ''.join(chunks)
 
 def urlencode(query, doseq=False, safe='', encoding=None, errors=None,
               quote_via=quote_plus):
```

@iforapsy would something like this help you?

@orsenthil is listed as urllib expert in the devguide. Is this sort of large-string (200 MB) constant-factor optimization worth including in the module, or would it be better to let clients do the chunking (which might be equally effective)? A client-side version is sketched below.
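For reference, a minimal sketch of the client-side alternative (`quote_in_chunks` is a hypothetical helper, not part of urllib); it relies only on quoting being byte-local, so quoted slices concatenate to the quoted whole:

```python
from urllib.parse import quote_from_bytes

def quote_in_chunks(bs, safe='/', chunk=1 << 16):
    # quote_from_bytes maps each byte independently, so splitting the
    # input at arbitrary boundaries cannot change the result.
    return ''.join(quote_from_bytes(bs[i:i + chunk], safe)
                   for i in range(0, len(bs), chunk))
```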
---

Sorry for the late response. It's taken a while to set up my program that calls the AWS API to test the impact of the patch. From my limited attempts, it's not a complete fix, but it does help. For some of the inputs that I tried, there's less than a 1% decrease in peak memory usage according to /usr/bin/time.
---

This issue is similar to the …

---

Linked commit: "…on large input values. Based on Dennis Sweeney's chunking idea."
---

Bug report

When passed a bytestring that is over a hundred mebibytes (MiB), the `urllib.parse.quote_from_bytes` function uses much more memory and CPU than one would expect.

repro.py:
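(The original script was not preserved in this copy; below is a minimal sketch consistent with the description, assuming the data is base64-encoded before quoting and reading peak memory from `resource` on Unix rather than from /usr/bin/time.)

```python
# Hypothetical reconstruction of repro.py, not the original script.
import base64
import resource
from urllib.parse import quote_from_bytes

data = b"\x00" * (100 * 2**20)    # note 1: 100 MiB of input
encoded = base64.b64encode(data)  # ~133 MiB; ends with '==' padding
quoted = quote_from_bytes(encoded)
print("peak RSS (KiB on Linux):",
      resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
```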
I use /usr/bin/time to track how much CPU and memory is used. The function ends up at one point needing ten times the size of the bytestring to quote it (i.e. 1.31 GiB). It also takes several seconds to return; I expect it to return in under a second. Fortunately, there's no memory leak, as the interpreter does return the memory after the function returns.

Interestingly, if I reduce 100 to 90 in the line marked "note 1", the function returns in half a second and uses only 250 MiB, which is much more in line with my pre-bug expectations.
This function consuming so much memory affects the AWS SDK for Python, boto3, as a lot of AWS APIs are called with URL-encoded parameters. boto3/botocore calls `urllib.parse.urlencode` to do that encoding, which ends up calling the problematic `quote_from_bytes`. Sample stack trace: [not preserved in this copy].

Your environment
Python 3.8.10 on Ubuntu 20.04 running on a t3.large EC2 instance. I have also been able to reproduce it with Python 3.10.6 and 3.11.0rc1+. I also reproduced it on Windows 10 running Python 3.9.13.