Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python 3: unrestrictedTraverse checks for str #674

Closed
reinhardt opened this issue Jul 19, 2019 · 16 comments
Closed

Python 3: unrestrictedTraverse checks for str #674

reinhardt opened this issue Jul 19, 2019 · 16 comments

Comments

@reinhardt
Copy link

@reinhardt reinhardt commented Jul 19, 2019

unrestrictedTraverse checks whether path is an instance of str:

https://github.com/zopefoundation/Zope/blob/master/src/OFS/Traversable.py#L175

From the comment and the way it works with python 2 it seems that unicode strings are forbidden without being explicitly rejected. Everything that is not a binary string is just assumed to be a sequence of binary strings.

However, with python 3, str is the unicode type, so now you have to pass unicode for it to work. That seems unintended.

I guess the solution is to replace str with six.binary_type, or am I missing something?

reinhardt added a commit to collective/collective.solr that referenced this issue Jul 23, 2019
Workaround for zopefoundation/Zope#674
Fixes TypeError: 'int' object is not subscriptable
@ale-rt

This comment has been minimized.

Copy link
Member

@ale-rt ale-rt commented Jul 23, 2019

six.binary_type will not work on Python3, you need something like the safe_nativestring function defined here: https://github.com/plone/Products.CMFPlone/blob/bd98bd05a3eaf1c5b9579a7fb49da9c37f6721fd/Products/CMFPlone/utils.py#L525

@reinhardt

This comment has been minimized.

Copy link
Author

@reinhardt reinhardt commented Jul 23, 2019

In what way do you think it won't work?
Note that I'm not suggesting to convert the string, only to fix the isinstance check.

@ale-rt

This comment has been minimized.

Copy link
Member

@ale-rt ale-rt commented Jul 23, 2019

Because on Python3:

>>> isinstance(b"", str)
False
@reinhardt

This comment has been minimized.

Copy link
Author

@reinhardt reinhardt commented Jul 23, 2019

No, I mean, why will this not work:

if isinstance(path, six.binary_type):
@ale-rt

This comment has been minimized.

Copy link
Member

@ale-rt ale-rt commented Jul 23, 2019

six.binary_type is bytes in Python3 and thus it will work on Python2 only.
On both Python2 and Python3 path is expected to be a str instance.

reinhardt added a commit to collective/collective.solr that referenced this issue Jul 23, 2019
any more.
See also zopefoundation/Zope#674
@reinhardt

This comment has been minimized.

Copy link
Author

@reinhardt reinhardt commented Jul 23, 2019

If it is expected to be str in both then it is effectively expected to be something different in Python 3 than in Python 2. Can you explain the reasoning behind this?

@ale-rt

This comment has been minimized.

Copy link
Member

@ale-rt ale-rt commented Jul 23, 2019

Can you explain the reasoning behind this?

Not really, at least in a few sentences :)

@reinhardt

This comment has been minimized.

Copy link
Author

@reinhardt reinhardt commented Jul 23, 2019

OK, but you're saying it's correct like this? So you could also put it like this then:

if six.PY2 and isinstance(path, six.binary_type) or six.PY3 and isinstance(path, six.text_type):

which is of course longer but doesn't look like a mistake so much.

Also, the comment "Unicode paths are not allowed" is wrong, isn't it? In Python 3 unicode paths are just fine IIUC.

@ale-rt

This comment has been minimized.

Copy link
Member

@ale-rt ale-rt commented Jul 23, 2019

I think it is correct for this to be a native string and there is no issue with the code and I do not feel it being a mistake.

Also, the comment "Unicode paths are not allowed" is wrong, isn't it? In Python 3 unicode paths are just fine IIUC.

Yes the comment is not uptodate. It should be "paths should be native strings" or something like that.

@reinhardt

This comment has been minimized.

Copy link
Author

@reinhardt reinhardt commented Jul 23, 2019

OK, thanks. I don't really get why this is necessary or even sensible, but I'll close it for now. I'll reopen in case I encounter any actual issues.

@reinhardt reinhardt closed this Jul 23, 2019
@d-maurer

This comment has been minimized.

Copy link
Contributor

@d-maurer d-maurer commented Jul 23, 2019

@reinhardt

This comment has been minimized.

Copy link
Author

@reinhardt reinhardt commented Jul 24, 2019

@reinhardt Note that this is equivalent to if isinstance(path, str):.

That's exactly my point. I come from the Plone world where the philosophy is apparently the opposite of what you describe for Zope 4. We (I?) avoid concepts like native strings and think about the semantics to decide what type to use. Is the operation about text, or should we consider it binary data? Unfortunately str is text in one case and binary in the other, but six encapsulates that nicely. That's what my longer version of that if statement illustrates. It seems much clearer to me this way because I'm used to dealing with six's text_type and binary_type abstractions.

Anyway, thanks for the additional input! I have a better understanding of what's going on now. I'll keep my eyes open when working at the boundaries between Plone and Zope.

@d-maurer

This comment has been minimized.

Copy link
Contributor

@d-maurer d-maurer commented Jul 24, 2019

@reinhardt

This comment has been minimized.

Copy link
Author

@reinhardt reinhardt commented Jul 24, 2019

six assigns str to its binary_type on python 2. When I say "binary" I mean it in this sense - UTF-8 encoded characters mostly. "Text" is a higher level representation (unicode on python 2). So I don't think there is a misunderstanding, or if there is then it comes from six. But it's likely just an question of terminology.

See https://github.com/benjaminp/six/blob/master/six.py#L40

Btw., if you limit to ASCII then you have a degenerate case that's not really interesting. You can put ASCII characters in all of those types.

@d-maurer

This comment has been minimized.

Copy link
Contributor

@d-maurer d-maurer commented Jul 24, 2019

@reinhardt

This comment has been minimized.

Copy link
Author

@reinhardt reinhardt commented Jul 25, 2019

"UTF-8 encoded characters" are obviously text, no?

I don't think it helps to talk about what's obvious or not. The break in compatibility that python 3 introduced has created a bit of a mess, and one of the challenges is merely to talk about text/unicode/strings/characters etc. and how you want to handle them. You need some new terminology, partly because of these types that are called the same but do not behave exactly the same. So you can call all this "text", but then you simply need another word for stuff like "unicode in python 2 and str in python 3", or you have to spell it out all the time. I've "grown up" with the six terminology and I quite like it.

And of course it makes sense on some level that both of the str types have this name, but because of the break there are also a lot of situations where it does not make so much sense.

Not sure at what point this turned into a debate on principles. ;-) As I said, right now I don't have any issues with the above line and I don't need to fully understand why it works like this, so all's fine.

ale-rt added a commit that referenced this issue Oct 28, 2019
Refs #674
ale-rt added a commit that referenced this issue Oct 28, 2019
Refs #674
@ale-rt ale-rt mentioned this issue Oct 28, 2019
ale-rt added a commit that referenced this issue Oct 28, 2019
Refs #674
dataflake added a commit that referenced this issue Oct 29, 2019
* Update comment

Refs #674

* Python2 is not a thing anymore

* - change comment so they can be the same on master and 4.x [ci skip]
dataflake added a commit that referenced this issue Oct 29, 2019
* Update comment

Refs #674

* - change comment so it can be the same on master and 4.x [ci skip]
ale-rt referenced this issue in plone/plone.restapi Oct 31, 2019
ale-rt added a commit that referenced this issue Oct 31, 2019
The method ``unrestrictedTraverse`` raises an error when
the argument ``path`` is not something it can work with

Closes #674
dataflake added a commit that referenced this issue Nov 23, 2019
* Make unrestrictedTraverse more pedantic about the path

The method ``unrestrictedTraverse`` raises an error when
the argument ``path`` is not something it can work with

Closes #674

* - fix lint issue

* - restore original control flow
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.