Fix LP:#1074996 #122
In Python 3, urlopen expects a byte stream for the POST data. This patch encodes the data as UTF-8 before transmission. It is essentially a hack, since a proper fix would let the user specify an encoding scheme of their choice.
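For readers following along, here is a minimal sketch of the str-to-bytes step described above. It is not the actual patch; the helper name `build_post_request` and the example URL are made up for illustration, and only standard-library calls are used.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_post_request(url, fields):
    # Hypothetical helper, not part of the library under discussion.
    # urlencode() returns a str; Python 3's urlopen() needs bytes for POST data.
    body = urlencode(fields).encode("utf-8")
    return Request(url, data=body)

# Building the Request does not touch the network.
req = build_post_request("http://example.com/submit", [("name", "value")])
```

Passing `req` to `urllib.request.urlopen` would then send the bytes as the POST body.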
I'd accept a patch that properly encodes the data to a user-defined encoding and sets the Content-Type header appropriately.
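One possible shape for such a patch, sketched under assumptions: the function name `encode_form`, its `encoding` parameter, and the choice to advertise the encoding through a `charset` parameter on the header are all illustrative, not something the maintainers specified.

```python
from urllib.parse import urlencode

def encode_form(fields, encoding="utf-8"):
    # Hypothetical helper: percent-encode the form values using the
    # caller's chosen encoding, then turn the resulting str into bytes.
    body = urlencode(fields, encoding=encoding).encode("ascii")
    # Assumption: the encoding is advertised via a charset parameter.
    headers = {
        "Content-Type": "application/x-www-form-urlencoded; charset=" + encoding
    }
    return body, headers

body, headers = encode_form([("city", "Zürich")], encoding="iso-8859-1")
```

The returned pair could be handed to `urllib.request.Request(url, data=body, headers=headers)`.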
In that case, can you suggest a preferred way? Would adding an optional parameter (default: ISO-8859-1) to submit_form be okay? I just found out that ISO-8859-1 is the default for an HTTP POST request when nothing is explicitly specified.
I admit that this is really tricky because of the signature of the user defined OTOH, if users provide their own urlopen function, I think we can assume that they know how to deal with encoding the form data, so it should be enough to handle only the default case. So, yes, add an optional parameter that only gets passed through into While taking a look, I also noticed that the docstring is outdated. It would be nice if you could clean that up while you're at it anyway. Thanks for giving it a try.
@scoder I have added an optional parameter to Most browsers do not set any headers on a direct POST request, and I have decided to do the same. I am unsure about changes to the docstring and have left it untouched for now. Please find the new commit in the same branch, phanimahesh:patch-2.
@scoder Have you reviewed the diff?
gentle ping @scoder
The fix is simpler: Note: it is unrelated to the character encoding that
The character encoding was selected to match the default expected value for a POST request. But from the specification on how to select the form submission encoding in general, I agree that it must be UTF-8. And @scoder specifically requested that the user should be able to specify the encoding here, hence I had to introduce an optional parameter. In the code, the GET request is handled separately. Should the URL encoding be specified there, at
There are two character encodings here. One encoding is used to convert the "urlencoded" Unicode string (which never contains non-ASCII characters) into bytes to be accepted as Another encoding is unrelated to the issue. It is the character encoding used before "percent encoding" is applied. It is the form submission encoding. As I understand it, @scoder is talking about the second character encoding. Changing it, configuring it doesn't fix GET request is unrelated to the issue.
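The distinction between the two encodings can be illustrated with the standard library alone. This is only a sketch of the point made above; the field name `q` and the sample value are arbitrary.

```python
from urllib.parse import urlencode

fields = [("q", "é")]

# Form submission encoding: applied BEFORE percent-encoding.
# It changes which %XX escapes appear in the string.
as_utf8 = urlencode(fields, encoding="utf-8")
as_latin1 = urlencode(fields, encoding="iso-8859-1")

# Either way the result is a str made of ASCII characters only,
# so converting it to bytes for the request body is independent
# of the form submission encoding chosen above.
utf8_bytes = as_utf8.encode("ascii")
latin1_bytes = as_latin1.encode("ascii")
```

Because the urlencoded string is pure ASCII, encoding it as ASCII, UTF-8, or ISO-8859-1 yields the same bytes; only the first step depends on the form submission encoding.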