now works with 2to3, passes all the tests in py25-27,31,32 #231

Closed
wants to merge 5 commits into from

9 participants

@puzzlet

This branch enables Werkzeug to work with 2to3, passing all the tests in py25-27 and py31-32. (Werkzeug's web page says it supports Python 2.4, but it already depends on newer libraries like hashlib, so I thought 2.5 would be fair.)

Small incompatibilities that might affect Python 2 users:

  • MapAdapter.match() and MapAdapter.dispatch() now have separate arguments for the path: path and path_info.
    • This is due to the newer WSGI standard for Python 3. WSGI servers give environ['PATH_INFO'] to applications as bytestrings decoded as latin1 (hence unicode strings). Web applications, on the other hand, internally just use "strings" (which are also unicode strings) to pass their path information around. So there are two kinds of (unicode) string that specify web locations: one is a bytes-like representation and the other is the normal form. To distinguish them, we name them path_info and path respectively.
    • This should cause no problems in the average case, since most code (including tutorials) passes path_info (in the older sense) as a positional argument, and path_info (in the newer sense) is used almost exclusively inside Werkzeug itself.
  • Some functions may now be grumpier about whether they are passed str or unicode arguments.
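
The difference between the two kinds of string can be sketched like this. It is only an illustration of the latin1 round trip described above, not code from this branch: the helper names wsgi_to_path and path_to_wsgi are hypothetical, and UTF-8 is assumed as the application's URL charset.

```python
def wsgi_to_path(path_info, charset="utf-8"):
    """Turn a Python 3 WSGI PATH_INFO value (bytes shown as latin1 text) into real text."""
    return path_info.encode("latin1").decode(charset)


def path_to_wsgi(path, charset="utf-8"):
    """Inverse: encode real text, then re-decode the bytes as latin1 for the environ."""
    return path.encode(charset).decode("latin1")


# Roughly what a Python 3 WSGI server stores for the request path /caf%C3%A9
environ = {"PATH_INFO": "/caf\xc3\xa9"}
path = wsgi_to_path(environ["PATH_INFO"])
```

Here environ["PATH_INFO"] plays the role of path_info (newer sense) and path is the normal form an application would pass around.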
@untitaker

I think you should explicitly catch NameError here (and in similar cases)
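
The diff line this comment refers to is not shown here, so the following is only a guess at the kind of block meant: a probe for a builtin that exists on Python 2 only. The name text_type is hypothetical.

```python
try:
    text_type = unicode  # Python 2: the builtin exists
except NameError:  # Python 3: `unicode` is gone, `str` is the text type
    text_type = str
```

Catching NameError specifically, rather than using a bare except, keeps unrelated errors from being swallowed by the fallback.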

@untitaker
Collaborator

Neat. I think it would be useful if you added a comment to each try-except block you're introducing, to indicate which branch runs on which Python version.

try:
    import json
except ImportError:
    import simplejson as json

becomes

try:
    import json
except ImportError:
    import simplejson as json  # Python < 2.6
@untitaker

Why did you do this?

Because 2to3 would incorrectly "fix" ptr.next into next(ptr). It seemed easier to change the name than to change 2to3's behaviour.

@puzzlet

This is also a workaround to avoid 2to3 "fixing" self.itervalues() into self.values().

@soulseekah soulseekah referenced this pull request
Closed

Python 3 port available #212

@mitsuhiko
Owner

Just to leave some feedback here: I will not pull a port that uses 2to3 and I will not pull a port that requires me to maintain two source trees. I made the first mistake once with Jinja2, where the Python 3 version just randomly breaks because I don't use Python 3 myself.

I was looking into that particular port and I think most of it can be accomplished without 2to3 assuming we drop support for 2.5 which should be possible.

@puzzlet

@mitsuhiko If you're thinking of a single source that runs on both py2 and py3, you have to make sure every user of the library declares from __future__ import unicode_literals, because we cannot use u"" literals in the code, which leaves every string ambiguous in py2. This is just one example; the whole string logic becomes more unpredictable than it is with 2to3.

There's no other way to support Python 3 than the two you just mentioned. "Randomly breaks" means there is no py3 user base to fix errors, but I think there are enough Python 3 users around here, and we can handle this with proper tests and knowledge.

And I'm not using Python 2 either.

@kennethreitz
Collaborator

@puzzlet if we require 3.3, you can use all the u'' you need. Quite appropriate since Armin wrote the PEP.

@mangecoeur

@puzzlet you can also use the "six" compatibility library which provides b() and u() functions
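
For readers who have not seen those helpers, here is a rough sketch of the idea behind them. It is a simplified reimplementation for illustration, not six's actual source; with six installed you would use six.b and six.u directly.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def b(s):
        # A plain literal is text on Python 3; encode it to get bytes.
        return s.encode("latin-1")

    def u(s):
        # A plain literal is already text on Python 3.
        return s
else:
    def b(s):
        # A plain literal is already a byte string on Python 2.
        return s

    def u(s):
        # Decode escapes such as \u00e9 the way a u"" literal would.
        return s.decode("unicode_escape")


header = b("Content-Type")
label = u("caf\u00e9")
```

The point is that the source file contains only plain literals, and the helper decides at runtime which string type they become.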

@puzzlet

@mitsuhiko If I port this with six, would you pull it?

@mangecoeur

@puzzlet @mitsuhiko it may be interesting to follow Django's guide on supporting both python 2 and 3 https://docs.djangoproject.com/en/dev/topics/python3/ - they use six and aim to write python 3 code that remains compatible with python 2 (rather than the other way round, or using 2to3)

@yegle

@mitsuhiko
The u"unicode literal" syntax is not available in Python 3 before 3.3, and Python 2.5 doesn't support the except Exception as e statement. Should we drop support for 2.5, 3.1 and 3.2 altogether?
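
For reference, the usual workaround when one source has to run on both 2.5 and 3.x is to avoid naming the exception in the except clause at all and fetch it from sys.exc_info() instead. A minimal sketch (the parse_port function is made up for illustration):

```python
import sys


def parse_port(value):
    """Return the port as an int, or an error message if it cannot be parsed."""
    try:
        return int(value)
    except ValueError:
        # Valid on 2.5 (which lacks "as e") and on 3.x (which lacks "ValueError, e").
        error = sys.exc_info()[1]
        return "invalid port: %s" % (error,)
```

It works everywhere but is noisier than except ... as e, which is part of the argument for dropping 2.5.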

@untitaker
Collaborator

@yegle Which support for 3.1, 3.2? The tests already fail for 2.5

@yegle

@untitaker
I mean, @mitsuhiko said he would consider dropping support for 2.5. My suggestion is that we drop 3.1 and 3.2 along with it, since they don't support u"unicode" literals.

@kennethreitz
Collaborator

I personally think that's the only sane option.

@untitaker
Collaborator

Wouldn't from __future__ import unicode_literals work out, since you then can use literals like in Python 3 across 2.7, 3.2 and 3.3?

@puzzlet

@untitaker That means we drop support for 2.6 (still the most popular version, I think), plus we require library users to use from __future__ import unicode_literals to avoid confusion.

@kennethreitz
Collaborator

Using unicode_literals mostly sucks, and should be avoided when possible.

@puzzlet

@kennethreitz And which one do you think is the sane option?

@kennethreitz
Collaborator

@puzzlet 2.6, 2.7, 3.3.

@untitaker
Collaborator
@mangecoeur

@untitaker The point, I think, is not whether it's supported by the Python version, but that it forces people who use the library to also import unicode_literals in their files, which they may forget or simply not want to do. Existing users, for instance, would have to change all their current code. The strategy used to make the library compatible with py2 and 3 should be transparent to users.

@puzzlet

Work on a py2-py3 polyglot (using six) is in progress: https://github.com/puzzlet/werkzeug/tree/feature/py3-six

@SimonSapin

from __future__ import unicode_literals only changes the syntax of the module (.py file) it is used in. It does not have any effect on whether users need to use it as well. (And they might want to use it even if Werkzeug does not.) Even within the same package one could have only some modules using it. (Although I would not advise this because of the cognitive load of keeping context in mind.)
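
A small way to see the per-module scope, as a sketch: two source strings are compiled as if they were separate modules, one with the future import and one without. The helper literal_type is hypothetical and exists only for this demonstration.

```python
WITH_FUTURE = 'from __future__ import unicode_literals\nvalue = "abc"\n'
WITHOUT_FUTURE = 'value = "abc"\n'


def literal_type(source):
    """Compile `source` as its own module and report the type name of its literal."""
    namespace = {}
    # dont_inherit=True keeps this file's own __future__ flags out of the compiled code.
    code = compile(source, "<module>", "exec", 0, True)
    exec(code, namespace)
    return type(namespace["value"]).__name__


types = (literal_type(WITH_FUTURE), literal_type(WITHOUT_FUTURE))
# Python 2.6/2.7: ('unicode', 'str'); Python 3: ('str', 'str')
```

On Python 2 only the module that contains the import gets unicode literals; the other one is unaffected, which is the behaviour described above.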

Also, just because Werkzeug supports both Python 2.x and 3.x with the same code base does not mean that any app that uses it has to do the same. (Non-library code often picks just one language version to target.)

That said, unicode_literals does prevent "native string" literals (bytes in 2.x, unicode in 3.x), which can be useful in WSGI code. I’m not especially opposed to supporting only 2.6, 2.7 and 3.3+.

@untitaker
Collaborator

Does that mean we might need a utility function that converts bytestrings or unicode strings to a specified stringtype, in order to allow any kind of stringtype as input?
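
One possible shape for such a utility, as a sketch only. The names to_bytes, to_unicode and to_native are chosen for illustration and are not taken from this pull request; UTF-8 as the default charset is also an assumption.

```python
import sys

PY2 = sys.version_info[0] == 2
text_type = unicode if PY2 else str  # `unicode` is only looked up on Python 2


def to_bytes(value, charset="utf-8"):
    """Return `value` as a byte string, encoding text if necessary."""
    if isinstance(value, bytes):
        return value
    return value.encode(charset)


def to_unicode(value, charset="utf-8"):
    """Return `value` as text, decoding a byte string if necessary."""
    if isinstance(value, text_type):
        return value
    return value.decode(charset)


def to_native(value, charset="utf-8"):
    """Return the `str` type of the running Python: bytes on 2.x, text on 3.x."""
    if PY2:
        return to_bytes(value, charset)
    return to_unicode(value, charset)
```

With helpers like these, a function can accept either string type and normalise at its boundary, e.g. to_native(environ_value) before handing a value to WSGI code.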

@untitaker
Collaborator

JFTR, @RonnyPfannschmidt seems to have worked on this, or something related: https://github.com/RonnyPfannschmidt/werkzeug/tree/python3-runtests

IIRC @mitsuhiko will merge Ronny's changes.

Why isn't this mentioned anywhere?

@puzzlet

Currently @RonnyPfannschmidt's tree fails in most of the tests, contrary to his recent tweet: https://twitter.com/ronnypfannsch/status/304266297607352320

puzzlet@box /tmp/werkzeug (python3-runtests u=) $ python3.3 run-tests.py 2>&1 | tail
  File "/tmp/werkzeug/werkzeug/urls.py", line 145, in iri_to_uri
    path = _quote(path.encode(charset), safe="/:~+%")
  File "/tmp/werkzeug/werkzeug/urls.py", line 38, in _quote
    assert isinstance(s, str), 'quote only works on bytes'
AssertionError: quote only works on bytes

----------------------------------------------------------------------
Ran 258 tests in 0.784s

FAILED (failures=72, errors=135)
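
The assertion in that traceback looks like a check written for Python 2, where str is the byte string type. A small illustration of why such a check rejects encoded input on Python 3 (this is not the code from werkzeug/urls.py, just the same isinstance test in isolation):

```python
path = u"/caf\u00e9"
encoded = path.encode("utf-8")

# On Python 2 the result of .encode() is a `str`; on Python 3 it is `bytes`,
# so an `isinstance(s, str)` guard only accepts it on Python 2.
passes_str_check = isinstance(encoded, str)

# Checking against `bytes` behaves the same on 2.6+ and 3.x.
passes_bytes_check = isinstance(encoded, bytes)
```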
@untitaker
Collaborator

He did have a python3 branch. Maybe he was talking about that, @puzzlet

@RonnyPfannschmidt
Collaborator

@puzzlet the failures are to be expected, the tests just run, not pass

there is more work needed to fix the uri/iri bytes vs unicode problems correctly, which is one of the reasons why my port is going so slow

@RonnyPfannschmidt
Collaborator

@untitaker that branch is abandoned, i made the uri/iri handling incorrect there

@untitaker
Collaborator

@RonnyPfannschmidt Could you document your work and what needs to be done, so others can help? (Do you even need help?)

@untitaker
Collaborator
22:48:54      ronny_ | untitaker, there is 2 areas where i could use help - 1. discussion
                     | about proper api changes to deal with bytes vs native strings vs
                     | unicode
22:49:28      ronny_ | and 2. reviewing my changes
@untitaker
Collaborator

About bytes vs unicode, I think functions should generally (with exceptions) accept both. I'm not sure about the return value; it would probably depend on what the function is for.

Maybe werkzeug could have a helper that takes a bytestring or unicode string and always returns unicode, using UTF-8 as the encoding when it needs to decode a bytestring. This could also take the form of a decorator that converts all bytestring args and kwargs to unicode. Its usefulness might be limited, but at least it would look clean and readable.
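
A sketch of what that helper and decorator might look like. The names as_text, text_arguments and join_path are hypothetical, nothing here is existing Werkzeug API, and note that on Python 2 plain str arguments count as bytestrings and would be decoded too.

```python
import functools


def as_text(value, charset="utf-8"):
    """Decode byte strings; pass every other value through unchanged."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


def text_arguments(func):
    """Decorator: decode all bytes positional and keyword arguments before the call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args = [as_text(arg) for arg in args]
        kwargs = dict((key, as_text(val)) for key, val in kwargs.items())
        return func(*args, **kwargs)
    return wrapper


@text_arguments
def join_path(base, name):
    # Inside the function both arguments are guaranteed to be text.
    return base + u"/" + name
```

A call such as join_path(b"/static", name=u"logo.png") then works no matter which string type the caller used, at the cost of hard-coding one charset for every argument.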

But then, I am probably not the most competent person on such things. I do maintain a library that runs on Py2 and 3, but I did it just wrong there. It only works because the stdlib json module does all the critical work.

EDIT: My ideas above don't seem to be applicable, at least for the werkzeug.urls module, where URIs are always passed around as bytestrings, and IRIs as unicode strings.

@mitsuhiko
Owner

Newer port is now in master.

@mitsuhiko mitsuhiko closed this