This branch enables Werkzeug to work with 2to3, passing all the tests in py25-27 and py31-32. (Werkzeug's web page says it supports Python 2.4, but it already depends on newer libraries like hashlib, so I thought 2.5 would be fair.)
Small incompatibilities that might affect Python 2 users:
now works with 2to3, passes all the test in py25-27,31,32
I think you should explicitly catch NameError here (and in similar cases)
Neat. I think it would be useful if you added a comment to each try-except block you're introducing, to indicate which branch runs on which Python versions.
import simplejson as json
import simplejson as json # Python < 2.6
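A minimal sketch of the style the two review comments above ask for: each fallback catches one specific exception and says which Python versions take which branch. The `text_type` name is hypothetical and only there for illustration.

```python
# Each fallback names the exception it expects and the versions involved.
try:
    import json  # Python >= 2.6 (standard library)
except ImportError:
    import simplejson as json  # Python < 2.6

try:
    text_type = unicode  # Python 2: the builtin exists
except NameError:
    text_type = str  # Python 3: `unicode` is gone, `str` is the text type
```

Catching `ImportError` and `NameError` explicitly, instead of using a bare `except:`, keeps unrelated errors from being swallowed by the compatibility shim.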
Why did you do this?
because 2to3 would incorrectly "fix" ptr.next into next(ptr). It seemed easier for me to change the name than 2to3's behaviour.
This is also a workaround to avoid 2to3 "fixing" self.itervalues() into self.values().
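A rough sketch of the renaming workaround described above, assuming a linked-list style structure. The `Node` class, the `walk` function and the `next_node` attribute are hypothetical names, not the ones used in the branch.

```python
class Node(object):
    """Singly linked list node; `next_node` is a hypothetical name."""

    def __init__(self, value, next_node=None):
        self.value = value
        # Named `next_node` rather than `next`, so a tool that rewrites
        # `.next` for the iterator protocol has nothing to match here.
        self.next_node = next_node


def walk(head):
    """Yield every value in the list, starting from `head`."""
    ptr = head
    while ptr is not None:
        yield ptr.value
        ptr = ptr.next_node
```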
specify exceptions and versions in backward-compat codes
Merge remote-tracking branch 'upstream/master' into feature/2to3
replace old py2 method file() -> open()
suppress unclosed file warning in Python 3.2
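The two commits above can be illustrated with a small sketch: `open()` replaces the Python 2-only `file()` builtin, and a `with` block closes the handle so Python 3.2 has no unclosed file to warn about. The `read_text` helper and the temporary file are assumptions made for the example, not code from the branch.

```python
import os
import tempfile


def read_text(path):
    # open() exists on Python 2 and 3; the file() builtin is Python 2 only.
    # The with-block closes the handle deterministically. (On Python 2.5
    # this needs `from __future__ import with_statement`.)
    with open(path) as handle:
        return handle.read()


# Write a scratch file, read it back, then clean up.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as handle:
    handle.write("hello")
content = read_text(path)
os.remove(path)
```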
Just to leave some feedback here: I will not pull a port that uses 2to3 and I will not pull a port that requires me to maintain two source trees. I made the first mistake once with Jinja2, where the Python 3 version just randomly breaks because I don't use Python 3 myself.
I was looking into that particular port, and I think most of it can be accomplished without 2to3, assuming we drop support for 2.5, which should be possible.
@mitsuhiko If you're thinking of a single source that runs on both py2 and py3, you have to make sure every user of the library declares from __future__ import unicode_literals, because we cannot use u"" literals in the code, which leaves every string ambiguous in py2. This is just a single example; the whole string-handling logic becomes more unpredictable than it is with 2to3.
from __future__ import unicode_literals
There's no way to support Python 3 other than the two options you just mentioned. "Randomly breaks" means there is a lack of py3 users who can fix errors, but I think there are enough Python 3 users around here, and we can handle this with proper tests and knowledge.
And I'm not using Python 2 either.
@puzzlet if we require 3.3, you can use all the u'' you need. Quite appropriate since Armin wrote the PEP.
@puzzlet you can also use the "six" compatibility library which provides b() and u() functions
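A rough sketch of the idea behind the `u()` and `b()` helpers mentioned above, written without importing six so that it stays self-contained. The function bodies are a simplified stand-in and an assumption, not six's actual implementation.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def u(s):
        # Plain literals are already text on Python 3.
        return s

    def b(s):
        # Encode a native-string literal to bytes.
        return s.encode("latin-1")
else:
    def u(s):
        # Decode a byte-string literal to unicode on Python 2.
        return s.decode("unicode_escape")

    def b(s):
        # Plain literals are already bytes on Python 2.
        return s
```

With helpers like these, `u("text")` and `b("data")` replace the `u""` and `b""` prefixes, so a single source file parses on versions that lack one prefix or the other.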
@mitsuhiko If I port this with six, would you pull it?
@puzzlet @mitsuhiko it may be interesting to follow Django's guide on supporting both python 2 and 3 https://docs.djangoproject.com/en/dev/topics/python3/ - they use six and aim to write python 3 code that remains compatible with python 2 (rather than the other way round, or using 2to3)
u"unicode literal" is not available before Python 3.3, and Python 2.5 doesn't support the except Exception as e statement. Should we drop support for 2.5, 3.1 and 3.2 altogether?
except Exception as e
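For context, a commonly used workaround when one source file has to run on both Python 2.5 and 3.x is to fetch the active exception from `sys.exc_info()` instead of binding it in the `except` clause. The sketch below shows that pattern; `parse_port` is a hypothetical function used only for illustration.

```python
import sys


def parse_port(value):
    """Return `value` as an int, or an error message string."""
    try:
        return int(value)
    except ValueError:
        # `except ValueError, e` is Python 2 only and
        # `except ValueError as e` needs 2.6+; sys.exc_info()
        # works on every version.
        e = sys.exc_info()[1]
        return "invalid port: %s" % e
```

The cost is readability, which is one argument for dropping 2.5 rather than working around it.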
@yegle Which support for 3.1, 3.2? The tests already fail for 2.5
I mean, as @mitsuhiko said, he would consider dropping support for 2.5. My suggestion is that we drop 3.1 and 3.2 altogether as well, since they don't have u"unicode" literal support.
I personally think that's the only sane option.
Wouldn't from __future__ import unicode_literals work out, since you then can use literals like in Python 3 across 2.7, 3.2 and 3.3?
@untitaker That means we drop support for 2.6 (still the most popular version, I think), plus we require library users to use from __future__ import unicode_literals to avoid confusion.
Using unicode_literals mostly sucks, and should be avoided when possible.
@kennethreitz And which one do you think is the sane option?
@puzzlet 2.6, 2.7, 3.3.
@puzzlet 2.6 has support for unicode_literals.
@untitaker The point, I think, is not whether it's supported by the Python version, but that it forces people who use the library to also import unicode_literals in their own files, which they may forget or simply not want to do. For instance, existing users would have to change all their current code. The strategy used to make the library compatible with py2 and py3 should be transparent to users.
Work on a py2/py3 polyglot port (using six) is in progress: https://github.com/puzzlet/werkzeug/tree/feature/py3-six
from __future__ import unicode_literals only changes the syntax of the module (.py file) it is used in. It does not have any effect on whether users need to use it as well. (And they might want to use it even if Werkzeug does not.) Even within the same package one could have only some modules using it. (Although I would not advise this because of the cognitive load of keeping context in mind.)
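A small sketch of the per-module scope described above: two source strings are compiled as separate modules, and only the one containing the future import is affected. `literal_type` and the two source constants are hypothetical names for this example.

```python
WITH_FUTURE = "from __future__ import unicode_literals\nvalue = 'abc'\n"
WITHOUT_FUTURE = "value = 'abc'\n"


def literal_type(source):
    """Compile `source` as its own module; return the type name of `value`."""
    namespace = {}
    # dont_inherit=True keeps this file's own future flags out of the
    # compiled code, so each source string stands alone like a .py file.
    code = compile(source, "<module>", "exec", 0, True)
    exec(code, namespace)
    return type(namespace["value"]).__name__


with_future = literal_type(WITH_FUTURE)
without_future = literal_type(WITHOUT_FUTURE)
```

On Python 2.6 and 2.7 the first result is `unicode` and the second is `str`; on Python 3 both are `str`, because the import is a no-op there. Either way, code that imports such a module never has to declare the future import itself.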
Also, just because Werkzeug supports both Python 2.x and 3.x with the same code base does not mean that any app that uses it has to do the same. (Non-library code often picks just one language version to target.)
That said, unicode_literals does prevent "native string" literals (bytes in 2.x, unicode in 3.x), which can be useful in WSGI code. I'm not especially opposed to supporting only 2.6, 2.7 and 3.3+.
Does that mean we might need a utility function that converts bytestrings or unicode strings to a specified stringtype, in order to allow any kind of stringtype as input?
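A minimal sketch of what such a utility might look like, assuming UTF-8 as the default charset. `to_unicode`, `to_bytes` and `to_native` are hypothetical names, not an existing Werkzeug API.

```python
import sys

PY3 = sys.version_info[0] >= 3
# `unicode` is only evaluated on Python 2, where the builtin exists.
text_type = str if PY3 else unicode  # noqa: F821


def to_unicode(value, charset="utf-8"):
    """Return text, decoding byte strings with `charset`."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


def to_bytes(value, charset="utf-8"):
    """Return bytes, encoding text with `charset`."""
    if isinstance(value, text_type):
        return value.encode(charset)
    return value


def to_native(value, charset="utf-8"):
    """Return the native string type: bytes on 2.x, unicode on 3.x."""
    if PY3:
        return to_unicode(value, charset)
    return to_bytes(value, charset)
```

Functions that accept either string type could call one of these on entry and work with a single type internally.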
JFTR, @RonnyPfannschmidt seems to have worked on this, or something related: https://github.com/RonnyPfannschmidt/werkzeug/tree/python3-runtests
IIRC @mitsuhiko will merge Ronny's changes.
Why isn't this mentioned anywhere?
Currently @RonnyPfannschmidt's tree fails most of the tests, contrary to his recent tweet: https://twitter.com/ronnypfannsch/status/304266297607352320
```
puzzlet@box /tmp/werkzeug (python3-runtests u=) $ python3.3 run-tests.py 2>&1 | tail
  File "/tmp/werkzeug/werkzeug/urls.py", line 145, in iri_to_uri
    path = _quote(path.encode(charset), safe="/:~+%")
  File "/tmp/werkzeug/werkzeug/urls.py", line 38, in _quote
    assert isinstance(s, str), 'quote only works on bytes'
AssertionError: quote only works on bytes
Ran 258 tests in 0.784s
FAILED (failures=72, errors=135)
```
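The assertion in this traceback appears to fail because, on Python 3, `path.encode(charset)` returns `bytes` while `str` is the text type, so `isinstance(s, str)` is false for exactly the input the message asks for. A minimal sketch of a check that means "byte string" on both major versions follows; `check_quotable` is a hypothetical name, and this is not the fix that was applied in any of the branches.

```python
def check_quotable(s):
    # `bytes` is an alias of `str` on Python 2.6+ and a distinct type on
    # Python 3, so this check means "byte string" on both.
    assert isinstance(s, bytes), 'quote only works on bytes'
    return s


encoded = check_quotable(u"/caf\xe9".encode("utf-8"))
```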
He did have a `python3` branch. Maybe he was talking about that, @puzzlet.
@puzzlet the failures are to be expected, the tests just run, not pass
there is more work needed to fix the uri/iri bytes vs unicode problems correctly, which is one of the reasons why my port is going so slow
@untitaker that branch is abandoned, i made the uri/iri handling incorrect there
@RonnyPfannschmidt Could you document your work and what needs to be done, so others can help? (Do you even need help?)
22:48:54 ronny_ | untitaker, there is 2 areas where i could use help - 2. discussion
| about propper api changes to deal with bytes vs native strings vs
22:49:28 ronny_ | and 2. reviewing my changes
About bytes vs unicode: I think functions should generally (with exceptions) accept both. I'm not sure about the return value; that would probably depend on what the function is for.
Maybe Werkzeug could have a helper that takes a bytestring or unicode string and always returns unicode, using UTF-8 as the encoding when it needs to decode a bytestring. This could also take the form of a decorator that converts all bytestring args and kwargs to unicode. Its usability might be limited then, but at least it looks clean and readable.
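A rough sketch of the decorator idea, assuming UTF-8 for decoding. `accepts_text`, `_to_unicode` and `greet` are hypothetical names and not part of Werkzeug.

```python
import functools


def _to_unicode(value, charset="utf-8"):
    """Decode byte strings with `charset`; pass everything else through."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


def accepts_text(func):
    """Decode byte-string positional and keyword arguments as UTF-8."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args = [_to_unicode(arg) for arg in args]
        kwargs = dict((key, _to_unicode(val)) for key, val in kwargs.items())
        return func(*args, **kwargs)
    return wrapper


@accepts_text
def greet(name, greeting=u"Hello"):
    return u"%s, %s!" % (greeting, name)
```

Note that on Python 2 `bytes` is `str`, so every plain string argument would be decoded, which is one way the usability could end up limited.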
But then, I am probably not the most competent person on such things. I do maintain a library that runs on Py2 and 3, but I did it the wrong way there. The reason it actually works is that the stdlib json module does all the critical work.
EDIT: My ideas above don't seem to be applicable, at least for the werkzeug.urls module, where URIs are always passed around as bytestrings, and IRIs as unicode strings.
Newer port is now in master.