Skip to content
This repository was archived by the owner on Jul 24, 2023. It is now read-only.

Conversation

@ziima
Copy link
Contributor

@ziima ziima commented Nov 25, 2010

There is several error caused by not expecting unicode strings in library.

  • incorrect encoding of data in responses with AX/Sreg extensions
  • error raised by discovering page with unicode and html entities in html attribute values

Vlastimil Zima and others added 14 commits July 26, 2010 09:26
when message is encoded into form markup. Especially when sending
AX/SReg responses with non-ascii data.
…ed form values.

The .toFormMarkup() method that generates a <form> HTML structure had a bug
when the form field values contained UTF-8 encoded strings with characters
outside the 7-bit ASCII space.

If the lxml implementation of the ElementTree API was in use these values
would result in a ValueError being raised (ValueError: All strings must be XML
compatible: Unicode or ASCII, no NULL bytes or control characters). If the
stdlib implementation of ElementTree was used these characters were silently
replaced by their XML character reference equivalents (&#XXX;).

This patch generates the form using Unicode values for everything and then
serializes the form to a UTF-8 encoded string ensuring that the final form is
what is expected and constant regardless of the ElementTree API
implementation.
In generating the argument dictionary the .toPostArgs() method (apparently)
assumed that values were all Unicode objects and called
``value.encode('utf-8')`` on them unconditionally. However, the values appear
to be a mixed set of Unicode objects and UTF-8 encoded strings (most being of
the latter group).

Calling .encode('utf-8') on a string will implicitly decode the string into a
Unicode object before encoding it to the selected encoding. This automatic
decoding happens using the ``sys.getdefaultencoding()`` encoding which is by
default 'ascii'. The original call therefore works only as long as the values
are 7-bit ASCII and breaks when they contain higher bit characters.

The patch ensures that the resulting values in the returned dictionary are
UTF-8 encoded strings regardless if the input values were Unicode objects or
UTF-8 strings.
Conflicts:
	openid/message.py
…ling

stores is same as other extension requests and responses.
…ling

stores is same as other extension requests and responses.
Conflicts:
	openid/extensions/ax.py
…ode pages."

That caused serious troubles with encoding of all other pages. Unescaping is much easier solution.

This reverts commit 5e757a7.
This reverts commit 4679afa, reversing
changes made to 2b5235a.
Only problem was to use StringIO instead of cStringIO beacuse it does not support unicode strings.
Except this well hidden bug was previous solution correct.

Revert "Revert "Fix error in encoding which occured on discovery of some unicode pages.""

This reverts commit 2b5235a.
@ziima
Copy link
Contributor Author

ziima commented Dec 10, 2010

I had some troubles with fix that helped parsing unicode pages, but finally I was able to make it right in commit 08382e5.

@rochacon
Copy link

Any news if this commits are going to be merged?

@ziima
Copy link
Contributor Author

ziima commented Jan 17, 2011

Not yet

ziima added 2 commits January 31, 2011 11:08
other than openid namespace. See added test for example.

Also fixed tests for associate requests with session_type and not assoc_type.
@ziima
Copy link
Contributor Author

ziima commented Feb 10, 2011

I should write some tests for this as well.

@krvss
Copy link

krvss commented Mar 28, 2011

is this issue related to mine: sometimes I receive an error saying "OpenID authentication failed: return_to does not match return URL" and then I see expected url is the same with received, but received is a unicode one?

@ziima
Copy link
Contributor Author

ziima commented Mar 28, 2011

I have not yet encountered that behaviour. Problems I found was related to inability of parsing HTML with HTML entities under some conditions.

Are you sure that your URLs are different? Typo is very difficult to see for human :) You may also try to trace comparing function to see when it fails. Difference in string type should not be problem for comparing function.

@krvss
Copy link

krvss commented Mar 28, 2011

Well I've just copied the strings from error message to Python interpreter and their comparison was fine. I do agree with you, it should work in any case. This issue is really hard to reproduce. I will keep an eye on this, will let you know, may be there is an error is in error message :)

@krvss
Copy link

krvss commented Mar 31, 2011

Ok, now I know what happens and going to create an issue with description :)

@domenkozar
Copy link

What exactly needs to be done for this issue to proceed?

@buriy
Copy link

buriy commented Apr 28, 2011

one needs to get attention of the project maintainers in the mailing list or private email.

@ziima
Copy link
Contributor Author

ziima commented Jun 2, 2011

First attempts to contact maintainers. We will see in few days.

@domenkozar
Copy link

I contacted lillialexis 15 days ago over github and got no response. I wonder if we should join forces together and create a fork.

@ziima
Copy link
Contributor Author

ziima commented Jun 3, 2011

I have already though of cleaning commits and create reasonable fork for this change, so I can create also new branch as continuation of openid/master if needed.

@ziima
Copy link
Contributor Author

ziima commented Jun 6, 2011

I have clean this problem in separate branch
https://github.com/vzima/python-openid/commits/unicode

@ziima
Copy link
Contributor Author

ziima commented Jun 6, 2011

Created duplicate of pull request with cleaned commits
#15

I will leave this pull request opened for a while. If nobody from openid respond I will create new master branch under my profile to hold all necessary commits.

@ziima
Copy link
Contributor Author

ziima commented Jul 14, 2011

I am closing this one as I replaced it with pull request #15 a while ago.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants