-
-
Notifications
You must be signed in to change notification settings - Fork 33.1k
gh-139938: Fewer assignments in html.escape #139939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
IIUC your screenshot correctly, the proposed is actually slower when |
I ran the benchmark you provided locally, and increased the number of runs to 50, the results:
|
It would be useful to add a wider range of string sizes. Also, is there a reason to not use D = {"&": "&", "<": "<", ">": ">"}
DQ = {"&": "&", "<": "<", ">": ">", "'":""", '\'': "'"}
TD = str.maketrans(D)
TDQ = str.maketrans(DQ)
def escape(sx:str, quote=True) -> str:
t = TDQ if quote else TD
return sx.translate(t) For string of length For small N, the choice of translate vs replace won't really matter. |
The micro-benchmarks are less pronounced:
The Benchmark scriptimport pyperf
def escape_ref(s, quote=True):
s = s.replace("&", "&")
s = s.replace("<", "<")
s = s.replace(">", ">")
if quote:
s = s.replace('"', """)
s = s.replace('\'', "'")
return s
def escape_new(s, quote=True):
s = (
s.replace("&", "&")
.replace("<", "<")
.replace(">", ">")
)
if quote:
return s.replace('"', """).replace('\'', "'")
return s
D = {"&": "&", "<": "<", ">": ">"}
DQ = {"&": "&", "<": "<", ">": ">", '"':""", '\'': "'"}
TD = str.maketrans(D)
TDQ = str.maketrans(DQ)
def escape_alt(sx, quote=True):
t = TDQ if quote else TD
return sx.translate(t)
test_html_strings = [
"Hello, world!",
"1234567890",
"Simple text with spaces",
"Hello, &world!",
"12345<67890",
"Simple text with spaces&",
"<b>bold</b>",
"<i>italic</i>",
"<u>underline</u>",
"<p>paragraph</p>",
"<div>content</div>",
"<div><script>alert('x')</script></div>",
"<a href='https://example.com'>link</a>",
"<img src='x' onerror='alert(1)'>",
"<b><i>nested</i> bold</b>",
]
def add_cmdline_args(cmd, args):
cmd.append(args.implementation)
if __name__ == "__main__":
runner = pyperf.Runner(add_cmdline_args=add_cmdline_args)
runner.argparser.add_argument(
"implementation", choices=["ref", "new", "alt"]
)
args = runner.parse_args()
if args.implementation == "new":
func = escape_new
elif args.implementation == "alt":
func = escape_alt
else:
func = escape_ref
for case in test_html_strings:
runner.bench_func(f"escape[{case}]", func, case)
runner.bench_func(f"escape[{case}, quote=False]", func, case, False) |
Just to add a bit of context to my motivation for this, html.escape is used by django (among others) in its html escaping logic, so this function may be called on the order of billions of times per day. The impact of this is just to remove several bytecode instructions per call, meaning it should be more noticeable when escaping small strings; the work of escaping longer strings is dominated by the actual |
I think we should leave such improvements to the JIT instead. I think this is where it would shine (where we could reuse the same variable, though I don't know if it's already the case [@Fidget-Spinner is it a JIT feature to be able to optimize such calls?]).
Well, the problem is that it's not just this function that could slow down the process. And even if it were called billions of times per day, calls for strings that are already a bit longer don't see any improvements (e.g., " It's usually not used as a single instruction, and is likely put inside other more complex logic, which itself may or may also be slower. I'm sorry but I think this change is not significant enough. What would be interesting is to benchmark a production server to see if this is indeed a bottleneck. |
For speeding this up, looking at the That means best case (no escapes found, quote=True) scan the string 5 times never copy and worst case scan the string 5 times and allocate a new string + copy 5 times (each Unfortunately I don't think this is worth moving to a C implementation that does that specifically (it'll be a lot of one-off code in a security conscious path). |
An alternative would be to rebuild the string from scratch and perform a |
Not at the moment but with TOS caching (I forgot the issue/PR but it's by Mark Shannon) it should become just register moves which is nearly free and some refcounting which is the expensive part. I'm not on a computer right now but can you examine if the original bytecode says LOAD_FAST_BORROW instead of LOAD_FAST? If so then it's going to have no refcounting at all in the JIT in the future and then it will just be register to register moves without refcounting which are basically free on modern CPUs |
Yeah so I checked and it is indeed |
One minor correction: when I say in the future for the JIT, I don't know how far into the future that may be. So if you really need this to be optimized right now, I'd recommend proceeding with it. |
On my machine, the benchmarking code below shows approximately a 1.2x speedup in the default case of
escape(s, quote=True)
. Thequote=False
case remains more or less unchanged on my machine. The underlying logic in the function should be unaltered.Here's the code I ran to benchmark:
The output on my mac is as follows:

The disassembled bytecode after the change is 6 instructions shorter.