Fixes of unicode troubles #13
when message is encoded into form markup. Especially when sending AX/SReg responses with non-ascii data.
…ed form values. The .toFormMarkup() method that generates a <form> HTML structure had a bug when the form field values contained UTF-8 encoded strings with characters outside the 7-bit ASCII space. If the lxml implementation of the ElementTree API was in use these values would result in a ValueError being raised (ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters). If the stdlib implementation of ElementTree was used these characters were silently replaced by their XML character reference equivalents (&#XXX;). This patch generates the form using Unicode values for everything and then serializes the form to a UTF-8 encoded string ensuring that the final form is what is expected and constant regardless of the ElementTree API implementation.
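The approach described in this commit message (keep every value as Unicode while the tree is built, then serialize once to UTF-8) can be sketched roughly as follows. This is a minimal illustration using the stdlib ElementTree, not the library's actual `.toFormMarkup()` code; the function name `to_form_markup` and its parameters are hypothetical.

```python
from xml.etree import ElementTree


def to_form_markup(action_url, fields, submit_text=u"Continue"):
    """Build a <form> from text values and return it as UTF-8 bytes.

    Hypothetical helper: the name and signature are assumptions made
    for illustration, not the python-openid API.
    """
    form = ElementTree.Element(u"form")
    form.attrib[u"action"] = action_url
    form.attrib[u"method"] = u"post"
    form.attrib[u"accept-charset"] = u"UTF-8"

    for name, value in sorted(fields.items()):
        # Decode UTF-8 byte strings up front so the tree only holds text.
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        ElementTree.SubElement(
            form, u"input",
            {u"type": u"hidden", u"name": name, u"value": value})

    ElementTree.SubElement(
        form, u"input", {u"type": u"submit", u"value": submit_text})

    # Serialize exactly once, at the end, to a UTF-8 encoded byte string.
    return ElementTree.tostring(form, encoding="utf-8")


markup = to_form_markup(
    u"https://example.com/endpoint",
    {u"openid.sreg.fullname": u"Zden\u011bk"})
```

Because the non-ASCII characters are encodable in UTF-8, the serializer writes them as raw UTF-8 bytes instead of falling back to `&#XXX;` character references.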
In generating the argument dictionary the .toPostArgs() method (apparently)
assumed that values were all Unicode objects and called
``value.encode('utf-8')`` on them unconditionally. However, the values appear
to be a mixed set of Unicode objects and UTF-8 encoded strings (most being of
the latter group).
Calling .encode('utf-8') on a string will implicitly decode the string into a
Unicode object before encoding it to the selected encoding. This automatic
decoding happens using the ``sys.getdefaultencoding()`` encoding which is by
default 'ascii'. The original call therefore works only as long as the values
are 7-bit ASCII and breaks when they contain higher bit characters.
The patch ensures that the resulting values in the returned dictionary are
UTF-8 encoded strings regardless of whether the input values were Unicode
objects or UTF-8 strings.
Conflicts: openid/message.py
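The normalization this commit message describes could look roughly like the sketch below: encode only the values that are still Unicode, and pass byte strings through untouched so that no implicit ASCII decode is ever triggered. The helper names `to_utf8` and `to_post_args` are hypothetical and this is not the patched `.toPostArgs()` itself.

```python
def to_utf8(value):
    """Return value as a UTF-8 encoded byte string.

    Hypothetical helper for illustration. Byte strings are assumed to
    be UTF-8 already and are returned unchanged; calling .encode() on
    them is what triggered the implicit ASCII decode in Python 2.
    """
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


def to_post_args(args):
    """Normalize a mixed dict of Unicode and UTF-8 values to UTF-8 bytes."""
    return dict((to_utf8(key), to_utf8(val)) for key, val in args.items())


post_args = to_post_args({
    u"openid.sreg.fullname": u"Zden\u011bk",  # Unicode object
    b"openid.mode": b"id_res",                # already a byte string
})
```

In Python 2 `bytes` is an alias for `str`, so the same check distinguishes `str` from `unicode` there.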
…ling stores is same as other extension requests and responses.
Conflicts: openid/extensions/ax.py
…ode pages." That caused serious trouble with the encoding of all other pages. Unescaping is a much easier solution. This reverts commit 5e757a7.
The only problem was that StringIO has to be used instead of cStringIO, because cStringIO does not support unicode strings. Apart from this well-hidden bug, the previous solution was correct. Revert "Revert "Fix error in encoding which occured on discovery of some unicode pages."" This reverts commit 2b5235a.
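As a rough illustration of the buffer issue mentioned in this commit message, the sketch below wraps already-decoded page text in a text buffer that accepts Unicode. It assumes `io.StringIO` (available in Python 2.6+ and Python 3) rather than the Python 2 `StringIO` module the commit refers to, and the function name `iter_page_lines` is hypothetical.

```python
import io


def iter_page_lines(page_text):
    """Yield lines from a discovered page held in a Unicode-safe buffer.

    Hypothetical helper for illustration; byte input is assumed to be
    UTF-8 and is decoded before it reaches the buffer.
    """
    if isinstance(page_text, bytes):
        page_text = page_text.decode("utf-8")
    # io.StringIO stores text, so non-ASCII characters survive intact.
    for line in io.StringIO(page_text):
        yield line


lines = list(iter_page_lines(u"<title>\u010cesk\xfd</title>\n<body/>"))
```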
I had some trouble with the fix that helps with parsing unicode pages, but I was finally able to get it right in commit 08382e5.
Any news on whether these commits are going to be merged?
Not yet
other than openid namespace. See added test for example. Also fixed tests for associate requests with session_type and not assoc_type.
I should write some tests for this as well.
- Add test for assoc_type in openid2
Is this issue related to mine? Sometimes I receive an error saying "OpenID authentication failed: return_to does not match return URL", and then I see that the expected URL is the same as the received one, except that the received one is a unicode string.
I have not encountered that behaviour yet. The problems I found were related to the inability to parse HTML containing HTML entities under some conditions. Are you sure that your URLs are not different? A typo is very difficult for a human to see :) You could also trace the comparison function to see where it fails. A difference in string type should not be a problem for the comparison function.
Well, I just copied the strings from the error message into the Python interpreter and their comparison was fine. I agree with you, it should work in any case. This issue is really hard to reproduce. I will keep an eye on it and let you know; maybe the error is in the error message :)
OK, now I know what happens and I am going to create an issue with a description :)
What exactly needs to be done for this issue to proceed?
Someone needs to get the attention of the project maintainers on the mailing list or by private email.
I have made the first attempts to contact the maintainers. We will see in a few days.
I contacted lillialexis 15 days ago over GitHub and got no response. I wonder if we should join forces and create a fork.
I have already thought of cleaning up the commits and creating a reasonable fork for this change, so I can also create a new branch as a continuation of openid/master if needed.
I have cleaned up this problem in a separate branch.
I created a duplicate of this pull request with cleaned-up commits. I will leave this pull request open for a while. If nobody from openid responds, I will create a new master branch under my profile to hold all the necessary commits.
I am closing this one, as I replaced it with pull request #15 a while ago.
There are several errors caused by the library not expecting unicode strings.