-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Use html.escape to replace _quote_html in http.server #70772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
In http.server, _quote_html is used to escape content for BaseHTTPServer. The function has already been implemented by html.escape. |
It's BaseHTTPRequestHandler, not BaseHTTPServer. |
Removing the redundant _quote_html() function seems like a good idea to me. I wonder if the code should be using the html.escape(..., quote=False) option though, because it does not need to encode quote signs. It might be good to add a test. It looks like there may not be anything testing the HTML encoding. |
At first I also want to use html.escape(..., quote=False) since the spec only asks to escape quote signs in attribute. But after some search on Google, there are articles recommends escaping quote in content too: https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet |
I add two tests for html escaping. One for the error message and the other for display name in SimpleHTTPRequesthandler. |
make some change to test for html escape in SimpleHTTPRequestHandler |
Thanks for the tests. I left a couple comments. About encoding quotes: Personally I don’t see much value unless you are encoding an attribute value, in which case I would prefer to use xml.sax.saxutils.quoteattr(). Encoded quotes would only become useful if the “error_message_format” attribute was modified. A more practical downside is that if “error_content_type” is set to say text/plain, we are adding two somewhat common characters that will get messed up. E.g. the “explain” string for 429 Too Many Requests will include the double-quoted "rate limiting". And an apostrophe could easily be given in a custom error message, e.g. “Can't write a clean error message”. |
Thanks for your review. Set quote to False when html.escape is OK to me. But except for the usage in BaseHTTPRequestHandler, I think we should also set quote to False for the usage in SimpleHTTPRequestHandler since they do not appear in attribute either. And I left a reply to your comment. How do you think about these? |
Yeah I would be happy if you want to change to html.escape(quote=False) in list_directory() as well. BTW I am getting Undelivered Mail replies from the review comments. I guess they might be getting rejected due to looking like spam (they tend to be classified as spam by default for me, something about how the server sends them). |
OK. You left the comment preferring using ascii or utf-8 to encode the filename in test. But actually the response SimpeHTTPRequestHandler send is explicitly encoded in filesystemdefaultencoding. So in such a case, am I doing right? |
Oops. You have already left new comment. No notify either. :( I like the idea that extract the actual encoding from response header. I will update my patch then. |
The uploaded patch is updated. Hope you have time to review. |
Thanks for the reviews. I have updated the patch. |
I left a couple notes on minor style nits (which I can fix when I commit this). But perhaps we should fix the business with tempdir_name and absolute URL paths (bpo-26609) first. |
New changeset bf44913588b7 by Martin Panter in branch 'default': |
Thanks for the patch Xiang |
Happy to see it works. Thanks for your reviews too. :) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: