Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single codebase python 2 and 3 support #154

Merged
merged 85 commits into from
Oct 29, 2016
Merged

Single codebase python 2 and 3 support #154

merged 85 commits into from
Oct 29, 2016

Conversation

tomkooij
Copy link
Member

@tomkooij tomkooij commented Jul 7, 2016

... because it's 2016!

This implements single codebase python 2/3 support by using __future__ imports and the six package.
Some changes:

  • Pytables couldn't store bytestring in vlarrays VLStringAtom (bug): fixed in pytables 3.2.3.
  • Pickling classes is incompatible between python 2 and 3. (sigh!). Almost fixed/worked-around in pytables 3.2.3. Will be fixed in 3.2.4. 3.3.
  • Python 3 has explicit difference between bytestrings and strings. b''and .decode() needed in various places.
  • Because VLStringAtom stores bytestrings, blobs and station paths (s_index) need to be encoded before writing to h5 and decoded after reading from h5.
  • octal 0777 is 0o777 in python 3.

This passes all tests on python 2 and 3, but it needs an unreleased patch to pytables. Most of the work was in fixing pytables anyway (which resulted in the recent pytables release). Unfortunately, it will require another small patch. With pytables-3.2.4 we will be able to read coriska.h5 files saved on python 2.7 in python3.

Currently this is based on v1.0.0, I haven't merged current master yet.

ToDo:

  • fix if six.PY2 (patching builtins.open)
  • treat warnings as errors on python3 (tables.VLStringAtom deprecation)
  • check bytes vs string comparisons on python3 (python -bb)
  • review zip/range iterators/lists
  • review encode()/decode(): which encoding? default? ascii? latin1? utf-8?
  • sphinx seems a requirement on py3, Investigate in py2/py3, fix. not able to reproduce.
  • pip install python setup.py develop on windows breaks on progressbar2 on py3 (Install fails in setup.py (from package requiring progressbar2) on python 3.5 wolph/python-progressbar#81, pip install does work)
  • revert pytables installation for travis to conda when 3.2.4 is released (a87d220)

Ready for production? NO!
Although this passes all tests, it is bound to break. It cannot be merged with master before pytables-3.2.4 is released and conda packages are available.
I do intend to use python3 from now. LiOs will have to stay on 2.7 for now.

pip install sphinx fails on python3
urllib.urlopen returns bytestring on python 3
Use six to alias python3 urllib changes
unittest: assertCountItems (py3) -> assertItemsEqual (py3)
zip() -> list(zip())
Use iterdecode to decode bytestrings (python3 urlopen)
process_events: blobs are vlarray of VLStringAtom() which
return bytes not str. Add decode() at zlib.decompress()
coincidence_queries.py and time_deltas.py:
s_path is a vlarray of VLStringAtom() which returns bytes
not str. Add decode().
analysis subpackage passes tests on py2 and py3
This reverts commit e16c88d.
Relative imports are fine.
These string should be ascii only, so encoding should be irrelevant.
Set encoding to utf-8 to prevent errors on systems with non-standard
default encoding.
_store_reconstruction() calls min(event['n1'], ...)
This fails on python 3 becuase MagicMock is **unordered** on Python 3.
events is created as a mock of a dictionary (numpy.array) which always
returns '42.' on every __getitem__
sapphire.corsika.blocks.RunHeader object in tables.node_attr
needs updated PyTables (3.2.4?) that uses 'latin1' encoding for
pickles after UnicodeDecodeError.
Some tests involving patch __builtin__.open still fail on py3
patch.object(six.moves.builtins, 'open') as mock_file does not work.
patch.object(six.moves.builtins, 'open') as mock_file does not work.
Implementation of random.randint is version specific. We need to
patch and mock it.
unittest.TestCase.assertEqual cannot compare (nan, nan) tuples.
unittest.TestCase.assertEqual cannot compare (nan, nan) tuples.
@153957
Copy link
Member

153957 commented Oct 28, 2016

I agree, I would say; either make it a context manager, or maybe add a 'finish()' method, like in some other classes, or both.

@tomkooij
Copy link
Member Author

I'll change close() to finish(). And make it a contextmanager now that I have the files open...

@tomkooij
Copy link
Member Author

tomkooij commented Oct 28, 2016

The interactive testrunner sapphire.tests.run_tests() fails on the relative imports of esd_load_data in test_publicdb and test_esd.

Absolute imports break python setup.py test... EDIT: should be fixed in 32de60d

relative imports either break imports from outside package (i.e. make
test) or import from within the package.
@tomkooij
Copy link
Member Author

I've done some CORSIKA stuff (running groundparticles sims locally and on stoomboot) and fixed a small problem.

Although it still lacks extensive testing, I have removed the don't merge yet label. This is safe for production without breaking everything at once. The relatively untested bits (corsika package) are not for the average user anyway.

@153957
Copy link
Member

153957 commented Oct 29, 2016

Thank you Tom! Very well done!
Both improving/fixing SAPPHiRE and PyTables!

@153957 153957 merged commit 9f1241a into master Oct 29, 2016
@davidfokkema
Copy link
Member

Damn. Does this mean I'll have to learn Python 3? Nice work, Tom!

@tomkooij
Copy link
Member Author

Thanks!

@davidfokkema
Copy link
Member

Seriously though, is everyone switching over to Python 3? Only SAPPHiRE? Or also publicdb, etc.?

@153957
Copy link
Member

153957 commented Oct 31, 2016

First SAPPHiRE since it it was users use, they are now free to choose between 2 and 3.

The systems on which publicdb/datastore/PySPARC/station-software.. etc are used are systems over which we have more control, so we can choose to convert them entirely to Python 3, without keeping Python 2 support.

@153957
Copy link
Member

153957 commented Oct 31, 2016

The trouble with the publicdb/datastore/station-software is that those have less test coverage than SAPPHiRE, so it will be harder to determine if switching to Python 3 breaks things.

@tomkooij
Copy link
Member Author

tomkooij commented Nov 1, 2016

For datastore/publicdb it would be better to change to python 3 when doing a big update. Since these don't seem to be necessary at the moment, we better stick to 2.7. We still have a few years...

I have been almost exclusively using python 3 since july (with sapphire on this branch) and I see no reason turning back. As we are slowly starting to use python 3 in programming classes, it is nice to be able to use python 3 for HiSPARC as well.

Next year, we can let the LiO's use Python 3.

@davidfokkema
Copy link
Member

The problem with 'we have a few years' is that the end-of-life of python 2 more or less coincides with Kasper's rush to finish his thesis. He won't have time to work on it then, and we might not be around. Already, I find it very hard to find some time to work on HiSPARC stuff, and Arne is finishing his thesis while having moved on to another job. Oh well...

@153957 153957 deleted the py2to3 branch November 1, 2016 12:19
@153957
Copy link
Member

153957 commented Nov 1, 2016

Ok, so @kaspervd better start working on converting datastore/publicdb/PySPARC/... to Python 3 😉.

I wonder how difficult it will be for some of those, for most you can probably get a long way with simply using 2to3.

@kaspervd
Copy link

kaspervd commented Nov 1, 2016

Noted and added to my to-do list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants