python 2.6, python 3 in single codebase #616

Closed
josef-pkt opened this Issue Jan 11, 2013 · 13 comments

@josef-pkt
Member

This week there was a discussion on scipy devel and scikit-learn mailing lists about writing/converting the code to run without 2to3 on python >= 2.6 with six as a compatibility layer.
http://packages.python.org/six/

similar: instructions for developers writing compatible code
https://docs.djangoproject.com/en/dev/topics/python3/
see also the changes that Pauli introduced in scipy to convert the code to this.

We don't have any problems with 2to3 right now, but we might go this route at a future time.

Main advantage:
We can edit the source in py 2 and in py 3.
Currently when I make changes in statsmodels in the py 3 code, then I have to translate it back to py 2.5 and make sure 2to3 does the correct translation.

Note: requires dropping support for py 2.5, which everybody else (in SciPy land) has already dropped or decided to drop.

@josef-pkt
Member

see #1019 for links to how pandas converted to common py2 py3 codebase

A quick file comparison between current source and py33 converted source:

  • a few iterators: list(map...), list(zip...) and xrange
  • print statements in docstrings
  • int/long in foreign
  • some imports: dotted imports are trivial, but in some cases we need to import from different locations
  • a few str, basestring. What do we do with our current py3 compat?
  • maybe a few cases, like variable names, where we need to decide whether to support unicode, strings or binary names
  • several modules are completely unchanged
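The categories in the list above can be illustrated with a minimal sketch of code that runs unchanged on Python 2.6+ and Python 3 without 2to3. The names `PY3`, `string_types`, `integer_types`, `squares_and_pairs` and `is_name` are hypothetical and only for illustration; they are not taken from statsmodels, although six exposes tuples with similar names.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    string_types = (str,)
    integer_types = (int,)
else:  # only evaluated on Python 2, where these names exist
    string_types = (basestring,)  # noqa: F821
    integer_types = (int, long)  # noqa: F821


def squares_and_pairs(values, names):
    # Wrapping in list(...) gives a real list on both versions:
    # map/zip return lists on Python 2 and iterators on Python 3.
    squares = list(map(lambda v: v * v, values))
    pairs = list(zip(names, values))
    return squares, pairs


def is_name(obj):
    # Replaces isinstance(obj, basestring), which fails on Python 3.
    return isinstance(obj, string_types)


squares, pairs = squares_and_pairs([1, 2, 3], ["a", "b", "c"])
```

Whether variable names should also accept bytes is a separate decision that this sketch does not settle.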

examples, from what I have seen before, require mainly changes of the print statements

possible problems where we don't have sufficient test coverage

Shouldn't take long to make the switch

@vincentarelbundock
Member

I could do a first pass on this using python-modernize: https://github.com/mitsuhiko/python-modernize

This does a lot of work automatically, and I can do some adjustments manually.

One strategy would be to go folder by folder, with one commit per folder. I can list the types of changes made on the pull request page.

How is Travis working these days? At least this could tell me if things still work using 2to3, and then we could test what happens after removing that crutch.

@josef-pkt
Member

Is it still possible to use 2to3 after python-modernize has been run over the code?
My thought was removing 2to3 processing right away when starting the conversion.

Does python-modernize change docstrings?

From what I have seen: changing print in docstrings and adding the dot to the imports are the tedious parts. I would do all the other parts manually, maybe based on the suggestions of python-modernize.

Last point: I initially thought to wait a few weeks with this, to see if we need a 0.5.1.
One worry I have is that we run into merge and rebase conflicts, so this needs checking against current PRs.
The only one I remember that touches existing code more extensively is GEE. However, in many modules the changes to the code will be very small, so this is not a problem there.

@vincentarelbundock As a trial run to see how extensive the changes are, you could create a (throw-away) branch and just run python-modernize on the entire code.

@rgommers
Member

numpy code was converted over some time by applying individual 2to3 fixers, so 2to3 had to do less each time. I assume python-modernize will work the same. You won't break 2to3 by converting code to be 2-and-3-compatible code.

@josef-pkt
Member

I guess 2to3 will still work (in almost all cases) to create a working py3 version, but it will do redundant changes.
zip -> list(zip) and similar even if we are happy with the iterator zip of py3, I guess

My guess was it will take two solid working days to do the conversion, except for 2 or 3 edge cases, so I wouldn't drag out the dual stage (common code and 2to3)

@vincentarelbundock
Member

2to3 should still work after python-modernize is applied.

modernize does not change docstrings, so this will have to be done afterwards, using a different tool or by hand

I have limited experience with python-modernize, but I've noticed that it can sometimes go really crazy, adding tons of repeated "from __future__ import print_function" for example. I haven't found that applying it to the whole codebase was much more informative than what you posted up there.

Yes, it's unlikely to take too much time, but I will not have 2 straight working days to do this, so I thought an incremental strategy could work for me. But of course, if you have time, by all means :)

@jseabold
Member

You could likely do all of the docstring print conversions with one awk/sed command.

@josef-pkt
Member

There are not supposed to be any print statements in the main code. If there are still any, outside of an if debug block, then they need to be converted to warnings or deleted.

More work for print is in the examples. There we still need something that is readable in both python 2 and python 3.

We are missing a way to check docstrings. Automatic, regex based changes to the docstring print statement will do most of the work but might leave some broken examples.
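A regex-based conversion of docstring print statements could look roughly like the sketch below. It is an assumption about how such a pass might be written, not the command anyone in this thread actually ran, and `convert_docstring_prints` is a hypothetical name. It only rewrites simple `print x` lines that follow a doctest prompt.

```python
import re

# Matches doctest lines such as ">>> print x" or "... print x, y".
# The lookahead requires the first character after "print " to be
# something other than whitespace or "(", so "print(x)" and
# "print (x)" are left alone.
_PRINT_STMT = re.compile(
    r"^([ \t]*(?:>>>|\.\.\.)[ \t]+)print[ \t]+(?=[^ \t(\n])(.+?)[ \t]*$",
    re.MULTILINE,
)


def convert_docstring_prints(text):
    """Rewrite simple doctest print statements as print() calls."""
    return _PRINT_STMT.sub(r"\1print(\2)", text)


example = ">>> print res.params\n>>> print(res.bse)\n"
converted = convert_docstring_prints(example)
```

Forms such as `print >>f, x` or `print (a + b) * 2` are not handled correctly by this pattern, which is the kind of broken example that would still need a manual check or a doctest run.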

@josef-pkt
Member

I think it's time to switch to python 3.3
Minimize usage of six; when in doubt, use python 3.3 behavior (mainly for the new iterators)
Ralf is just throwing away six.moves scipy/scipy#2803
so that we have very little code that is not "native" python3

check if we can use np.fromiter to replace things like np.array(list(zip(...))) as permanent solution

@josef-pkt
Member

to verify: are six.map six.zip ... the iterator versions ?
and especially range

@rgommers
Member

+1, >= 3.3 makes sense and makes your life easier

@josef-pkt
Member

@rgommers I didn't mean dropping 3.2 (in case I'm misunderstood)
I meant that anything we get via "from compat import ..." or "from six import ..." should be a function with the python 3 behavior
range with py 2 behavior won't exist anymore because range = xrange
(I'm still collecting notes, and haven't understood yet what six.moves is supposed to do.)
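A compat module along those lines could be sketched as follows: it exports `range`, `zip` and `map` with Python 3 (lazy) behavior on both major versions. The layout is an assumption for illustration and not the statsmodels compat code. To my knowledge, `six.moves.range`, `six.moves.zip` and `six.moves.map` resolve to the same Python 2 objects (`xrange`, `itertools.izip`, `itertools.imap`) used here.

```python
import sys

if sys.version_info[0] >= 3:
    import builtins

    # The Python 3 builtins are already lazy.
    range = builtins.range
    zip = builtins.zip
    map = builtins.map
else:  # Python 2: swap in the lazy equivalents
    import itertools

    range = xrange  # noqa: F821
    zip = itertools.izip
    map = itertools.imap

# None of these build a list up front; callers that need a list
# must ask for one explicitly with list(...).
pairs = zip("ab", range(2))
doubled = map(lambda v: 2 * v, range(3))
pairs_is_list = isinstance(pairs, list)
pairs_list = list(pairs)
doubled_list = list(doubled)
```

With this in place, code that needs the old Python 2 list behavior writes `list(range(n))`, which reads the same on both versions.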

@josef-pkt josef-pkt closed this in 5931ec4 Apr 10, 2014