IPEP 4: Python 3 Compatibility #2440

Closed
bfroehle opened this Issue · 19 comments

6 participants

@bfroehle

See IPEP 4: Python 3 Compatibility.

[Meta] Python 2 and 3 compatibility without 2to3... "2and3"

We've discussed using a Python 2 & 3 compatible syntax in our source in the past, and have included changes from several of the 2to3 fixers. C.f.,

With the release of Python 3.3 in the next few days (which again allows unicode literals u"..."), porting our code to a jointly compatible syntax should be more feasible.

Now there are some headaches, but they all can be worked around. I've been playing with this on a branch bfroehle/py3k but it's an ugly mess at the moment.

If we are going to seriously consider this undertaking, I think we should first agree upon the parameters and break the process up into a lot of smaller chunks. In addition, since these changes will possibly conflict with existing pull requests, we should probably also decide on appropriate timing.

I expect this proposal to be rather controversial, so let me attempt to outline exactly what would be required... it's probably more than you expect.

The Bite-Sized Pieces

  • six: Take much of the six module and add it to IPython.utils.py3compat. Or decide to abandon part of py3compat and just ship a copy of six.
  • exec: In Python 3 exec becomes a function. This necessitates replacing each call exec code in globals, locals with exec_(code, globals, locals) where exec_ is a wrapper defined in six.
  • execfile: Add a Python3 compatible definition of execfile in ipython.py
  • long: Replace long literals (0L) with explicit calls to long (long(0)) or decide if the literal was actually necessary in the first place.
  • dict: Apply the 2to3 dict fixer (iteritems -> items, etc) with a lot of manual oversight.
  • print: Apply the 2to3 print fixer. (This is already partially applied).
  • import: Apply the 2to3 import fixer to differentiate between absolute and relative imports.
  • funcattrs: Work around changes to attribute names on function objects (f.func_code -> f.__code__), possibly by getattr(f, _func_code) where _func_code is defined in six or py3compat.
  • imports: The builtin module was renamed from __builtin__ to builtins. In addition many other modules were renamed (cPickle -> pickle, StringIO.StringIO -> io.StringIO, ConfigParser.ConfigParser -> configparser.ConfigParser, urllib2, etc). This means lots of try: import ...; except ImportError: ... or perhaps using six.moves.
  • unicode/basestring: unicode and basestring no longer exist. Use compatibility wrappers from six. Also functions like os.getcwdu are just os.getcwd now. Lastly, raw unicode strings are not allowed, but this can be worked around like u'' + r'...'
  • metaclasses: The method to indicate a metaclass has changed. The 2and3 compatible syntax is obscure, but easy... just call the metaclass directly to instantiate the class (_FormatterABC = abc.ABCMeta("_FormatterABC", (object,), {}))
  • reraise: The method of raising an exception with a traceback changed. Use six.reraise.
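
A few of the items above (renamed modules, string types, function attributes, metaclasses) can be sketched in one file that is meant to run unchanged on Python 2.7 and 3.3. This is only an illustration, not the contents of py3compat or six: the names PY3, string_types, is_string, code_of and PlainFormatter are hypothetical, and only the _FormatterABC line is taken from the list.

```python
import abc
import sys

PY3 = sys.version_info[0] >= 3

# imports: try the Python 3 module name first, then fall back to Python 2.
try:
    import configparser
except ImportError:
    import ConfigParser as configparser

# unicode/basestring: pick the base string type once, use it everywhere.
if PY3:
    string_types = (str,)
else:
    string_types = (basestring,)  # noqa: F821 (only evaluated on Python 2)

# funcattrs: look up the code object through a version-dependent name.
_func_code = "__code__" if PY3 else "func_code"


def is_string(obj):
    """True for any text-like string on either major version."""
    return isinstance(obj, string_types)


def code_of(func):
    """Return the code object of a plain Python function."""
    return getattr(func, _func_code)


# metaclasses: call the metaclass directly instead of using either syntax.
_FormatterABC = abc.ABCMeta("_FormatterABC", (object,), {})


class PlainFormatter(_FormatterABC):
    def __call__(self, obj):
        return repr(obj)
```

Every version check sits at module level, so the rest of the code only sees is_string, code_of and an ordinary base class.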

A taste (Python 2.7 & Python 3.3)

The following snippet shows me running ipython in Python 2.7 and 3.3 with the same source:

[bfroehle@highorder ipython (py3k)]$ python ipython.py 
Python 2.7.3 (default, Aug  1 2012, 05:14:39) 
Type "copyright", "credits" or "license" for more information.

IPython 0.14.dev -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: quit

[bfroehle@highorder ipython (py3k)]$ python3 ipython.py 
WARNING: IPython History requires SQLite, your history will not be saved
Python 3.3.0rc3 (default, Sep 27 2012, 08:35:53) 
Type "copyright", "credits" or "license" for more information.

IPython 0.14.dev -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: quit
@minrk
Owner

This proposal is valuable and interesting, but I just don't think it's the time for it. When Python 3.3 is a reasonable base py3 version, then I think it's appropriate, but the necessary shenanigans like six's u("fake unicode literal") are just too gross to consider doing this while we support Python 3.2.

And these code-wide changes are indeed quite painful. The existing 2to3 fixes in 0.14 were the source of very nearly 100% of the headaches putting 0.13.1 together.

With Python 3 adoption being so incredibly slow, I think we can be fairly aggressive with advancing our py3 base dependency, but not quite yet.

@bfroehle

@minrk I agree about the fake unicode literal --- that is way too gross to support. One counter argument is that we could allow running in place on Python 3.3 while still enabling some select 2to3 fixers (remove unicode literals, callable, etc) on installation.

When I started preparing the specs for what this compatibility would require I expected the task to be fairly straightforward. As you can see above, it's actually quite involved. There are a lot more pieces to the puzzle than I naively expected.

Another difficulty, which I didn't mention above is the treatment of the modules in IPython.external. I'd rather not spend any effort bringing them up to par.

Personally, I'm actually leaning against this proposal, but perhaps others want to chime in.

@minrk
Owner

we could allow running in place on Python 3.3 while still enabling some select 2to3 fixers (remove unicode literals, callable, etc) on installation.

That's a clever idea, I hadn't thought of it.

Personally, I'm actually leaning against this proposal, but perhaps others want to chime in.

Indeed, and thanks for enumerating what is involved, it makes it a lot easier to discuss what we should do, and when it makes sense to take these steps.

I expect it's a feasible project when our support baseline is Python 2.7, ≥ 3.3, but probably too much of a mess for 2.6 / 3.2.

@fperez
Owner

Kudos for laying this out in detail, indeed it helps us reason about the problem.

The fact that it's so invasive lends weight to the argument of not doing it, but I like a lot your idea of running in-place on 3.3 while using 2to3 for the rest. The reason is that I think 3.3 is going to be the first version of py3 that realistically will cause major shifting in the scientific community. It's certainly the first version of Python I've seen in the 3.x series that I actually care about, as it's starting to do some interesting/useful stuff while allowing for a more natural transition (unicode literals).

So I'd like to propose that as our plan moving forward: @bfroehle, since you know the details better, could you update the above with that as the target instead? If you prefer to keep this discussion for reference with the original plan unchanged, we can always close this as "won't fix" and open one more narrowly targeted at 3.3.

There is a big benefit to being able to run in-place as it will make our development cycle much more natural, and at some point the domino effect of the 3 transition will begin to kick in. So I find the idea of 3.3 in-place to be really, really appealing, as it puts us in a great position for that process.

@takluyver
Owner

I think I'd lean towards not doing this until we drop support for 3.2, because we'd still need to keep the 2to3 machinery, and the dependency on distribute, in place. I don't find the 2to3 translation a major obstacle in development - I guess it's a bigger gain if you're used to using setup.py develop, but I've never got into that habit.

For reference, 3.3 will only get into Ubuntu for 13.04, so we should be supporting 3.2 at least up to then, and I'd expect to support it at least until 14.04, the next LTS release.

Another alternative we could consider is the approach taken in Django - use __future__.unicode_literals, and wrap the cases where both versions require the native str type. I understand that those cases are relatively rare, so it shouldn't be nearly as ugly as the u() wrappers. Since @jstenar's recent work, we're already using unicode_literals in some key places.
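
As a rough sketch of that alternative: with unicode_literals enabled, bare literals are text on both versions, and only the rare call that insists on the native str type needs a wrapper. The helper name native_str and the ASCII default are assumptions made for this illustration; it is not an existing IPython or Django function.

```python
from __future__ import unicode_literals

import sys

PY3 = sys.version_info[0] >= 3


def native_str(text, encoding="ascii"):
    """Return text as the native str type.

    Hypothetical helper: bytes on Python 2, unchanged text on Python 3.
    """
    if PY3:
        return text
    return text.encode(encoding)


# With unicode_literals every bare literal is text on both versions.
greeting = "hello"

# type() is one of the rare places that wants a native str for the name.
Dynamic = type(native_str("Dynamic"), (object,), {})
```

The __future__ import has to be the first statement in the module, so each file opts in individually.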

@bfroehle bfroehle was assigned
@bfroehle

@takluyver As an aside, I think you'll be able to install Python 3.3 in Ubuntu 12.10 ... it just won't be the default Python 3 version until 13.04.

@takluyver
Owner
@bfroehle

I've bundled up the sum of my wisdom on this matter into IPEP 4: Python 3 Compatibility.

@asmeurer

My opinion is worth, well, whatever it's worth. Probably not much, as I'm just an occasional contributor. But I just wanted to chime in that maintaining a common code base is going to be a bit annoying from the developer side. This is doubly true if you are still running 2to3 as well. The main benefit of using 2to3 is that 99% of the time, you can just write your code as you would for Python 2, and when it gets to Python 3, it just works (maybe that percent is a bit smaller if you use strings a lot, but it's still quite high). To write for Python 2 and 3 at the same time, you have to remember a lot of little rules, which no one will remember (and new contributors will not even know about). And given that IPython's test coverage is still poor (unless I am mistaken, in which case, please correct me), little mistakes will slip through, and no one will notice until they try the certain behavior in Python 3.

The idea about applying them automatically is an interesting one. How would this work? Obviously, requiring the user to do it would be dumb (just use 2to3 if you are going to do that). Assumedly, then, you would require the developer to do it. How do you test this? I guess you could make sure that your transformers are idempotent, and apply it against PRs to see if they change. But making transformers idempotent is a lot harder than just making the transformers, and in some cases it might be impossible. So you may have to either trust that the developer did it correctly, or try to write not only transformers, but tests that they have been applied.

The issue of writing your own 2to3 fixers is that messing with the ast is tricky stuff. If anything, I would try to get the six project to do this, and not attempt it alone. The transformers themselves will require some people with a good knack for finding and testing corner cases (and by the way, you'll want to make sure that they are indeed well tested). You're also going to want to make it very clear what changes it does make each time, at least in the beginning, so you can be sure that it makes only those changes that are correct.

But my point is that what Fernando said, "it will make our development cycle much more natural," I believe to be false. This will make natural development wrong, and correct development unnatural, because now all developers have to constantly keep in mind to use Python 2&3-isms, in the worst case, or if you go the completely automated route, it will add a new very complicated side to IPython that will be used solely for development.

Once again, take my opinion for whatever you feel it is worth. But if it is worth anything, I don't think this is a good idea. 2to3-based development works quite well. It can be slow, especially on a large code base (imho, 2to3 core should have been written in C). The inability to do "setup.py develop" is indeed annoying, but as far as developing goes (which assumedly, "setup.py develop" is chiefly intended for developers), it is far less annoying than trying to write code that runs in both Python 2 and Python 3 at once.

@minrk
Owner

As I have put it, I think we will do this or something similar, but I don't think it's time (consensus may be otherwise). Since there really is no such thing as a natural or pleasant way to have a 2/3 compatible codebase (as illustrated by IPEP 4), we should do this when we consider IPython to be a Python 3 project, with legacy support for Python 2, and prioritize Python 3 idioms (e.g. futures everywhere). My guess is 0.15 will be the first real candidate for this.

Many thanks for enumerating the steps involved!

@asmeurer

There's also 3to2. I don't know if it's a usable project or not, though.

@takluyver
Owner

I think Fernando is right about it making development easier. IPython has to do a lot of string handling, and the bytes/unicode distinction is absolutely critical. With the present setup, my experience is that correct code requires a mental model of Python 2, Python 3, and the changes made by 2to3. With this change, the developer would need to think of Python 2 & 3, but not what 2to3 is doing.

Increasingly it seems to me that we're effectively writing Python 3 code in a way that works on Python 2 - e.g. using io.open(), which is open() backported from Python 3.
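
A small illustration of that point, assuming nothing beyond the standard library: io.open takes an explicit encoding and deals in text on Python 2.6+ as well as Python 3, so the same lines behave the same way on both. The file and variable names are made up for the example.

```python
import io
import os
import tempfile

text = u"na\u00efve caf\u00e9"  # u"..." literals need Python 2.x or >= 3.3

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Text mode with an explicit encoding: write and read unicode text.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with io.open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()
    # Binary mode: the same file seen as raw bytes.
    with io.open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)
```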

Aaron, I wouldn't call test coverage poor, although it could certainly be better. ShiningPanda says it's hitting 45% of lines, and our local tests have more dependencies installed, like PyQt and matplotlib.

As the proposal stands, I wouldn't go for it until we drop Python 3.2 support, so we can cut 2to3 out of the loop completely. But I'd be interested in using from __future__ import unicode_literals, and supporting 3.2 natively as well. This is the same approach that Django has used, so it's clearly feasible for large codebases.

@bfroehle

I spent a bit of time today attacking this problem from the other direction, namely an ipython3.py script which hacks the regular import mechanism to run our 2to3 fixers on the fly. For speed, I'm caching the 2to3 converted code in __pycache__/NAME.ipy2to3.py.

It seems to break unpredictably -- sometimes it attempts to load non-converted code for reasons I don't understand -- but running `git clean -xfd` to purge all the `__pycache__` files brings it back to life.

Things like the qtconsole and notebook are probably a lost cause (because they spawn new python processes).

Check it out if you like bfroehle/ipython@ipython3. Oh, and it requires Python 3.3 because I was reading those docs to write the code and it doesn't seem to be backward compatible. Python 3.2 compatibility is probably not very difficult, but I haven't put any effort into it.

$ unset PYTHONPATH
$ unset PYTHONSTARTUP
$ git clean -xfd
$ python3.3 ipython3.py
WARNING: IPython History requires SQLite, your history will not be saved
Python 3.3.0rc3 (default, Sep 27 2012, 08:35:53) 
Type "copyright", "credits" or "license" for more information.

IPython 0.14.dev -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
---------------------------------------------------------------------------

In [1]: import IPython

In [2]: IPython.__file__
Out[2]: '/home/bfroehle/src/ipython/IPython/__pycache__/__init__.ipy2to3.py'

Edit: I fixed some stale cache issues by adding sys.path_importer_cache.clear() and added Python 3.2 support.
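
For readers curious what such an import hook looks like, here is a minimal sketch using the importlib API of current Python 3 releases. It is not the code on the ipython3 branch: transform_source is a naive text replacement standing in for real 2to3 fixers, the class names and legacy_demo_mod are invented for the example, and instead of caching converted files it recompiles from source on every import, which avoids stale caches at the cost of speed.

```python
import importlib
import importlib.abc
import importlib.machinery
import importlib.util
import os
import sys
import tempfile


def transform_source(source):
    """Hypothetical stand-in for a 2to3 fixer: a naive text replacement."""
    return source.replace("xrange(", "range(")


class TransformingLoader(importlib.machinery.SourceFileLoader):
    def get_code(self, fullname):
        # Always recompile from source, bypassing the bytecode cache, so a
        # stale cached file can never shadow the converted code.
        path = self.get_filename(fullname)
        source = importlib.util.decode_source(self.get_data(path))
        return compile(transform_source(source), path, "exec", dont_inherit=True)


class TransformingFinder(importlib.abc.MetaPathFinder):
    """Swap in TransformingLoader for one top-level package or module only."""

    def __init__(self, prefix):
        self.prefix = prefix

    def find_spec(self, fullname, path=None, target=None):
        if fullname != self.prefix and not fullname.startswith(self.prefix + "."):
            return None  # leave every other import to the normal machinery
        spec = importlib.machinery.PathFinder.find_spec(fullname, path)
        if spec is not None and isinstance(
            spec.loader, importlib.machinery.SourceFileLoader
        ):
            spec.loader = TransformingLoader(spec.loader.name, spec.loader.path)
        return spec


def demo():
    """Import a module containing Python 2 only code through the hook."""
    finder = TransformingFinder("legacy_demo_mod")
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "legacy_demo_mod.py"), "w") as f:
            f.write("values = list(xrange(3))\n")
        sys.meta_path.insert(0, finder)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return importlib.import_module("legacy_demo_mod").values
        finally:
            sys.meta_path.remove(finder)
            sys.path.remove(tmp)
            sys.path_importer_cache.pop(tmp, None)
            sys.modules.pop("legacy_demo_mod", None)
```

Calling demo() imports a file that uses xrange on Python 3 and returns its values list. Restricting the finder to one prefix keeps the rest of the import system untouched.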

@bfroehle

@asmeurer Thanks for your long and persuasive comment. I played around with 3to2 a while ago and found it mostly broken. I doubt things have improved much on that front.

You bring up a good point regarding the difficulty in ensuring that the 2to3 fixers are idempotent. Thankfully I think we'd be able to skirt around this issue by using only a very limited set of fixers for Python 3.2 support --- ideally no more than one which strips the u prefix from unicode literals --- or perhaps not even that one if we went with from __future__ import unicode_literals like Django.
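
To make the idempotency concern concrete, here is a sketch of a u-prefix stripper plus the check that running it twice changes nothing more. It is a hypothetical tokenize-based function written for this example, not an lib2to3 fixer, and it is meant to be run under Python 3 as a development tool.

```python
import io
import tokenize


def strip_u_prefix(source):
    """Remove the u/U prefix from string literals, leaving comments alone."""
    lines = io.StringIO(source).readlines()
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Walk backwards so earlier column offsets stay valid after each deletion.
    for tok in reversed(tokens):
        if tok.type == tokenize.STRING and tok.string[:1] in ("u", "U"):
            row, col = tok.start  # rows are 1-based, columns 0-based
            line = lines[row - 1]
            lines[row - 1] = line[:col] + line[col + 1:]
    return "".join(lines)


def is_idempotent(fixer, source):
    """True if applying fixer to already-fixed source leaves it unchanged."""
    once = fixer(source)
    return fixer(once) == once


sample = "name = u'abc'  # u'not a literal'\nother = U\"x\" + u'y'\n"
converted = strip_u_prefix(sample)
```

A check like is_idempotent could be run against each pull request: if the fixer still changes a file that is supposed to be converted already, either the file or the fixer needs attention.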

@takluyver
Owner

@vsajip has pointed out that another option is to use his uprefix project, which is small enough that we could include a copy in IPython.external. That provides an import hook so u'foo' strings will work for Python 3.2.

@bfroehle

Let's give Python 3 compatibility a rest for a while... there isn't enough interest, and it causes a lot more headaches than it's worth.

Also, I'm pretty happy with my Runtime 2to3 Conversion project as a stopgap measure.

@bfroehle bfroehle closed this
@vsajip

Let's give Python 3 compatibility a rest for a while... there isn't enough interest, and it causes a lot more headaches than it's worth.

That's a real shame. In my experience with porting pip, virtualenv, Django and other projects to run in a single codebase, the single code base approach is the least troublesome, as long as you don't have to support Python 2.5 or earlier. Since IPython requires 2.6+, I can't see any big issues other than a lack of confidence prompted by the level of invasiveness of the changes (and the level of confidence in the test suite).

OTOH, if you use 2to3, that's a trickier path, since 2to3 (while great at what it does) doesn't go the whole distance.

@takluyver
Owner

@vsajip - to be clear, we already have Python 3 compatibility working, and the full test suite passing, using 2to3. We've already dealt with most of the trickiness on that route. What we're leaving for now is making the same thing work without 2to3.

Maybe it would have been better to do it without 2to3 in the first place, but having already gone down the 2to3 route, the costs of changing probably outweigh the benefits at the moment. In a year or two, it will likely be worth doing.

@vsajip

That seems fair enough. The 2to3 approach was the officially blessed path at the beginning, and only more recently has it become clear that the single codebase approach is better if <= 2.5 compatibility isn't needed.

@minrk minrk added this to the 2.0 milestone