This week there was a discussion on the scipy devel and scikit-learn mailing lists about writing or converting code so that it runs without 2to3 on python >= 2.6, with six as a compatibility layer.
similar: instructions for developers writing compatible code
see also the changes that Pauli introduced in scipy to convert the code to this.
We don't have any problems with 2to3 right now, but we might go this route at a future time.
We can edit the source in py 2 and in py 3.
Currently when I make changes in statsmodels in the py 3 code, then I have to translate it back to py 2.5 and make sure 2to3 does the correct translation.
Note: requires dropping support for py 2.5, which everybody else (in SciPy land) has already dropped or decided to drop.
see #1019 for links to how pandas converted to common py2 py3 codebase
A quick file comparison between current source and py33 converted source:
examples, from what I have seen before, require mainly changes of the print statements
possible problems where we don't have sufficient test coverage
Shouldn't take long to make the switch
I could do a first pass on this using python-modernize: https://github.com/mitsuhiko/python-modernize
This does a lot of work automatically, and I can do some adjustments manually.
One strategy would be to go folder by folder, with one commit per folder. I can list the types of changes made on the pull request page.
How is Travis working these days? At least this could tell me if things still work using 2to3, and then we could test what happens after removing that crutch.
Is it still possible to use 2to3 after python-modernize has been run over the code?
My thought was removing 2to3 processing right away when starting the conversion.
Does python-modernize change docstrings?
From what I have seen: changing print in docstrings and adding the dot to the imports are the tedious parts. I would do all the other parts manually, maybe based on the suggestions of python-modernize.
Last point: I initially thought we should wait a few weeks with this, to see if we need a 0.5.1.
One worry I have is that we will run into merge and rebase conflicts, so this needs checking against the current PRs.
The only one I remember that touches existing code more extensively is GEE. However, in many modules the changes to the code will be very small, so this is not a problem there.
@vincentarelbundock As a trial run to see how extensive the changes are, you could create a (throw-away) branch and just run python-modernize on the entire code.
numpy code was converted over some time by applying individual 2to3 fixers, so 2to3 had to do less each time. I assume python-modernize will work the same. You won't break 2to3 by converting code to be 2-and-3-compatible code.
I guess 2to3 will still work (in almost all cases) to create a working py3 version, but it will do redundant changes.
zip -> list(zip) and similar even if we are happy with the iterator zip of py3, I guess
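To make the zip point concrete, here is a minimal sketch (plain Python 3, not statsmodels code) of why 2to3 wraps zip in list(): the py3 zip is a one-shot iterator, so the wrapper is only needed where the code indexes, takes len(), or iterates twice.

```python
# Python 3 semantics: zip returns a one-shot iterator, not a list.
pairs = zip([1, 2, 3], "abc")

first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # nothing left on the second pass

# Wrap in list() only where indexing, len(), or repeated iteration is needed.
pairs_list = list(zip([1, 2, 3], "abc"))
```

Everywhere else (a single for loop, dict(zip(...)), and so on) the bare iterator is enough, which is why the automatic list(zip(...)) rewrite is redundant in those places.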
My guess was it will take two solid working days to do the conversion, except for 2 or 3 edge cases, so I wouldn't drag out the dual stage (common code and 2to3)
2to3 should still work after python-modernize is applied.
modernize does not change docstrings, so this will have to be done after using a different tool or by hand
I have limited experience with python-modernize, but I've noticed that it can sometimes go really crazy, adding tons of repeated "from __future__ import print_function" for example. I haven't found that applying it to the whole codebase was much more informative than what you posted up there.
Yes, it's unlikely to take too much time, but I will not have 2 straight working days to do this, so I thought an incremental strategy could work for me. But of course, if you have time, by all means :)
You could likely do all of the docstring print conversions with one awk/sed command.
There are not supposed to be any print statements in the main code. If there are still any outside of an if debug block, then they need to be converted to warnings or deleted.
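A minimal sketch of what converting a stray print to a warning could look like. The function name and message are hypothetical, not taken from the statsmodels source.

```python
import warnings


def check_convergence(converged):
    """Hypothetical helper: report non-convergence without printing."""
    if not converged:
        # Was a print statement; a warning can be filtered or turned
        # into an error by the caller, and it is valid on py2 and py3.
        warnings.warn("optimization did not converge", RuntimeWarning)
    return converged
```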
More work for print is in the examples. There we still need something that is readable in both python 2 and python 3.
We are missing a way to check docstrings. Automatic, regex based changes to the docstring print statement will do most of the work but might leave some broken examples.
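As a rough sketch of the regex-based approach (in Python rather than awk/sed, and only an assumption about what such a pass might look like): rewrite doctest lines that use print as a statement, and leave alone anything that already has a parenthesis after print.

```python
import re

# Matches a doctest line such as ">>> print x" or "... print x, y" where
# print is used as a statement (the argument does not start with "(").
PRINT_STMT = re.compile(
    r"^([ \t]*(?:>>>|\.\.\.)[ \t]+)print[ \t]+(?!\()(.+?)[ \t]*$",
    re.MULTILINE,
)


def convert_docstring_prints(text):
    """Hypothetical helper: turn 'print x' into 'print(x)' in doctest lines."""
    return PRINT_STMT.sub(r"\1print(\2)", text)
```

Lines like "print (a + b) * c" or a trailing-comma "print x," are exactly the cases where this stays wrong or changes the output, so the result would still need to be checked by hand or by running the doctests.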
I think it's time to switch to python 3.3
Minimize usage of six, when in doubt, use python 3.3 behavior (mainly for new iterators) *)
Ralf is just throwing away six.moves in scipy/scipy#2803
so that we have very little code that is not "native" python3
check if we can use np.fromiter to replace things like np.array(list(zip(...))) as permanent solution
to verify: are six.map six.zip ... the iterator versions ?
and especially range
+1, >= 3.3 makes sense and makes your life easier
@rgommers I didn't mean dropping 3.2 (in case I'm misunderstood)
I meant that whatever we get from "from compat or six import ......" should be functions with the python 3 behavior
range with py 2 behavior won't exist anymore because range = xrange
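A minimal sketch of what such a compat module could look like, assuming we write our own instead of relying on six. The module layout and the lrange helper are hypothetical; the point is only that the exported names always have the python 3 (lazy) behavior.

```python
import itertools

try:
    import __builtin__ as builtins  # Python 2
except ImportError:
    import builtins  # Python 3

# Export names with Python 3 behavior on both major versions:
# on py2 these resolve to xrange / izip / imap, on py3 to the builtins.
range = getattr(builtins, "xrange", builtins.range)
zip = getattr(itertools, "izip", builtins.zip)
map = getattr(itertools, "imap", builtins.map)


def lrange(*args):
    """Hypothetical helper for the few places that really need a list."""
    return list(range(*args))
```

With this, code that needs a real list has to ask for it explicitly (lrange or list(...)), and everything else is already "native" python 3.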
(I'm still collecting notes, and haven't understood yet what six.moves is supposed to do.)