Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python2 3 #81

Closed
wants to merge 17 commits into from
Closed

Python2 3 #81

wants to merge 17 commits into from

Conversation

borrob
Copy link
Collaborator

@borrob borrob commented Jan 20, 2019

With the official support for python2 ending within a year (https://www.python.org/dev/peps/pep-0373/#id4), I totally agree with @sebastic: it's time to add python3 compatibility - see issue #23 .

I prepared a merge request for Stetl to work with both python2 and python3. All of the nosetests and all of the basic examples are working under both versions. Is there more to check?

A very unfortunate change I had to make is the use of 'custom' filters that are not included in the stetl library. For instance in basic example 12: I added stetl=False in the etl.cfg file. This was necessary for chaining the steps together in stetl/factory.py, line 23. Not ideal at all since it migth break other tools using stetl.

Please: any suggestions, comments or ideas are welcome. I assume there will be some discussion and changes before it will be merged.

@justb4
Copy link
Member

justb4 commented Jan 20, 2019

First quick response: very welcome PR @borrob (and @sebastic for first starting this).! (Was actually thinking about this, this week, as all my current projects are in Python3). I expected indeed problems around 'factory' methods and imports. Maybe there is some way to get rid of the 'stetl=False' config requirement (maybe the same as in other imports with try/except?

The current Travis failure is mainly some minor flake8 errors like empty lines with spaces, code style. You can solve them by running flake8 in the root of the Stetl GH project. (And always run flake8 before commits.) I will need to review the rest in more detail, to be continued.

@borrob
Copy link
Collaborator Author

borrob commented Jan 20, 2019

Thanks for the quick feedback. A good one about flake8. I didn't realize that yet. Luckily it seems easy to fix!

I feel there should be an better way to deal with the stetl=False thing, but I thought this was a good point to ask for some feedback. Perhaps expanding the idea on the try/catch and try with and without the stetl prefix might work better.

All my work is also in python3, so that is why I started to work on getting Stetl there as well. I think we can improve on clarity and coding style when we can concentrate on either python 2 or 3, but I guess we want python3 support and we're not yet ready to drop python2.

@justb4
Copy link
Member

justb4 commented Jan 21, 2019

Think we should aim for Python3-only. Python2 EOL clock is literally ticking. This project is too small (in terms of number of developers) and specialized to maintain both Py2 and Py3 forever. Like with Django (only Python 3 since Django 2) we/I can bring out a last Stetl version (1.3) that supports Python2, maybe a 1.4 that supports both, and move to Python3 starting with Stetl v2.0. This is an aim, there may be practical issues in particular specific Stetl Components (Plugins) written (like I do) for Python2. What helps is that the preferred way to run Stetl is via Docker (the Dockerfile needs modification there as well.).

I am trying to test your changes in two virtualenvs. Will let you know of progress. Think we both have other work as well, so don't feel rushed! Hopefully some feedback from another main Stetl dev @fsteggink here...

@fsteggink
Copy link
Collaborator

+1

@borrob
Copy link
Collaborator Author

borrob commented Mar 11, 2019

The code is a bit ugly, but that is to enable both Py2 and Py3. The sooner we can drop Py2, the sooner we can clean it up, but that is for the project board/maintainer to decide ;). I hope we can fix this in time, so NLExtract can benefit from the work.

@justb4
Copy link
Member

justb4 commented Mar 12, 2019

Rethinking and seeking feedback: I think we should first bring out a last Python 2 version 1.3, make a branch of it for quickfix/maintenance, and then move to Python3 in Stetl v2.0 and not bother with any version supporting both Py2 and Py3. So that would mean your PR just needs to focus on Py3. Would that make the PR easier? Thoughts?

@sebastic
Copy link
Contributor

sebastic commented Mar 12, 2019

Moving to Python3-only certainly makes life easier, no need to deal with six for Python 2 compatibility.

With Python 2 going EOL in 2020, having a Stetl branch for Python 2 in maintenance mode makes sense.

You still need to be careful/conservative with the Python 3 features you require, targeting compatibility with Python 3.3 (perhaps 3.5) or newer should be a good baseline.

CentOS/RHEL:

  • 6 has 3.4.8 in EPEL
  • 7 has 3.4.9 & 3.6.6 in EPEL

Debian:

  • oldold stable (wheezy) has 3.2.3
  • old stable (jessie) has 3.4.2
  • stable (stretch) has 3.5.3
  • testing (buster) has 3.7.2

Ubuntu:

  • precise has 3.4.2
  • trusty has 3.4.0
  • xenial has 3.5.1
  • bionic has 3.6.7

@justb4
Copy link
Member

justb4 commented Mar 12, 2019

@sebastic good point! Was testing @borrob 's branch with Python 3.7.2 in a venv, so yes we need to pick a baseline. 3.3, 3.5? open for suggestions.

@borrob I see most of the PR code dealing with handling imports of Stetl packages.
I realize the current setup is a bit confusing (at least for me now).
The style in Py3 I see e.g. in Django core is to use the component/product name as top-package, and relative imports from within the same package.

e.g. in a sub-package like stetl/inputs/fileinput.py:

from stetl.component import Config
from stetl.input import Input
from stetl.util import Util, etree

and within same package relative imports:

from .packet import Packet

Also here for discussion.
It would probably mean a lot of changes to Stetl .cfg files....

@borrob
Copy link
Collaborator Author

borrob commented Mar 14, 2019

The python version list sorted differently:

3.2.3 wheezy
3.4.0 trusty
3.4.2 jessie, precise
3.4.8 RHEL6
3.4.9 RHEL 7
3.5.1 xenial
3.5.3 stretch
3.6.6 RHEL7
3.6.7 bionic
3.7.2 buster

So 3.4.2 would be a good starting point (just leaving out wheezy and trusty). If you would start using stetl on a new project and thereby picking a 'new' machine, you would get either stretch or RHEL7 and thus allowing us to start with 3.5. I'm not sure if this thought train is valid though.
If we compare it to others:

+1 for the idea of keeping a maintenance branch around for python 2 and focus new work on python 3.

My personal preference is to stick to 'full' import, but I'm happy to follow other guidelines if there are other preferences.

With respect to the config files: perhaps change it to specify the full path (from inputs.fileinput.JsonFileInput to stetl.inputs.fileinput.JsonFileInput:

[input_geojson_file]
class = stetl.inputs.fileinput.JsonFileInput
file_path = input/cities.geojson

Users can then still specify there own classes (that don't start with stetl). Passing it on as a suggestion.

@justb4 : the changes are mainly about the importing, but there are some changes on catching exceptions, a couple of things with iterators, and off course the change in the config file with stetl=False (that I mentioned in the first post of the MR).

@justb4
Copy link
Member

justb4 commented Mar 15, 2019

@sebastic and @borrob thanks for all investigations! My votes go for:

  • use 3.4.2 as baseline and e.g. for Docker image
  • +1 for full path in Stetl .cfg files, e.g. class = stetl.inputs.fileinput.JsonFileInput

For the latter: we then can do away with the Stetl = False hack IMO?

imports? I would like to follow the most common style as in Django, Flask, Celery etc
I must admit, also having worked too long with Py2, that I lack deep knowledge in this area, especially for relative imports. Can I leave this to your judgements?

Celery: uses full imports all over, and within a package relative 'dot' style, like in this random file:

from celery.utils.functional import maybe_list
from celery.utils.log import get_logger
from celery.utils.serialization import jsonify, strtobool
from celery.utils.time import rate

from . import state as worker_state
from .request import Request

Django: quick walkthrough, similar style as Celery, like in this random file.

Update: tried 'Celery/Django' import styles above on @borrob 's python23 branch in 3.4.2. venv, examples/basics: works, also after python setup.py install!

  • using stetl. style in .cfg files (~global replace class = by class = stetl.)
  • no stetl. prefixing in Factory.py
  • no Stetl = False hack in mycomponent example
  • no ImportError fallbacks needed

So i.s.o.

try:
    from component import Component
    from util import Util
except ImportError:
    from stetl.component import Component
    from stetl.util import Util

this:

from .component import Component
from .util import Util

@borrob
Copy link
Collaborator Author

borrob commented Mar 15, 2019

Great! I think we have a plan.

@borrob
Copy link
Collaborator Author

borrob commented Mar 16, 2019

Perhaps it's best to close this MR?

I'll start work on setting up a new branch with docker and convert to python3 (thereby dropping support for python2).

@justb4
Copy link
Member

justb4 commented Mar 18, 2019

Whatever you like: you can also do additional commits on this PR because of the interesting discussions, and then squash them together in a single commit. This makes it easier to review full changes.

In any case we need some coordinated actions:

  • @justb4 : release v1.3 as last Py2 supported from current master
  • @justb4 : make branch/tag for v1.3
  • ditto for Stetl Docker Image version 1.3
  • Docker-related: some projects use the Docker Image latest version: pin these to 1.3..
  • all, lead: @borrob : work/review on this (or new) PR to support Py3 only (3.4.2), also migration guidance doc
  • @justb4 :merge PR
  • @justb4 : bring out v2.0 as Py3-only Stetl
  • all: migrate Stetl projects like NLExtract to v2.0

@justb4 justb4 mentioned this pull request Mar 18, 2019
@justb4 justb4 added this to the Version 2.0 milestone Mar 18, 2019
@borrob
Copy link
Collaborator Author

borrob commented Mar 19, 2019

I already started working on a new branch to solely support py3 (3.4.2). All the basic examples are working, all nose tests pass, and all flake8 checks validate too.

I still want to see if there is more clean-up work, prepare the setup.py, check the docker image, check the documentation, ...

@justb4
Copy link
Member

justb4 commented Mar 20, 2019

@borrob good to hear! I have some time this week to work on this. I can start with the v1.3 release work. You can merge (mainly text, version info, no code changes) that into your branch as starting point ok?

I can clone your branch and test, possibly make PRs for your branch then. ok?

@justb4 justb4 mentioned this pull request Mar 20, 2019
justb4 added a commit that referenced this pull request Mar 20, 2019
@justb4
Copy link
Member

justb4 commented Mar 20, 2019

Ok, Stetl 1.3 released: PyPi and DockerHub plus 1.3 Tag+Branch made!

justb4 added a commit to justb4/stetl that referenced this pull request Mar 20, 2019
…fix 10_jinja2 ex, docker-run examples script
@justb4
Copy link
Member

justb4 commented Mar 20, 2019

Tip: to verify a Stetl source branch, in top/root-dir, I do:

  • make a virtualenv e.g. for Python 3.4.2 (I use pyenv on MacOSX Homebrew)
  • install dependencies, GDAL may be tricky, usually pip install gdal==<version> with v from gdalinfo --version
  • flake8 (verify syntax/coding standards)
  • nose2 (verify all tests ok)
  • python setup.py install (installs stetl command and Stetl code in site-packages for venv)
  • cd examples/basics; ./runall.sh > runall.log 2>&1 - run all basic examples from installed version, capture logs
  • inspect runall.log for errors or strange outputs

And with Docker (see later):

  • docker build -t geopython/stetl:2.0 .
  • build Docker Image:
  • cd examples/basics; ./runall-docker.sh > runall-docker.log 2>&1 - run all basic examples from Docker Image
  • inspect runall-docker.log for errors or strange outputs

Will add this to CONTRIBUTING.txt.

borrob added a commit to borrob/stetl that referenced this pull request Mar 20, 2019
geopython#23 geopython#81 Dockerfile building/working, path fix 10_jinja2 ex, docker-run examples script
@borrob
Copy link
Collaborator Author

borrob commented Mar 20, 2019

@justb4 : can you open a 'project board' on github for this conversion-work? That might make it easier to track all the tasks and todos.

@justb4
Copy link
Member

justb4 commented Mar 21, 2019

@justb4 : can you open a 'project board' on github for this conversion-work? That might make it easier to track all the tasks and todos.

Done: see Project 1, though this is the first time I work with Cards (usually worked with issue-only Projects). Tried to add what is dangling/done.

Ok? I think most of the work has already been done, but we should be aware of the last (often nasty) 10%...

@borrob
Copy link
Collaborator Author

borrob commented Mar 21, 2019

Great. I think this is a helpful tool to keep track of these last, nasty 10%.

@justb4
Copy link
Member

justb4 commented Mar 22, 2019

@borrob Good. You should have access to the repo/project.

@justb4
Copy link
Member

justb4 commented Mar 22, 2019

@borrob did you work on https://github.com/geopython/stetl/projects/1#card-19171400 ? Looks like borrob/stetl master is even with geopython/stetl master, but the borrob/stetl python3 branch still needs to merge-in geopython/stetl master (did you do a checkout python3 first?). e.g.
https://github.com/borrob/stetl/blob/python3/stetl/version.py should be 1.3 (and best be changed to 2.0rc1 later)

@borrob
Copy link
Collaborator Author

borrob commented Mar 22, 2019

Working on it right now. Yesterday I just did a quick update to get my master branch in line with the upstream stetl.

@borrob
Copy link
Collaborator Author

borrob commented Mar 22, 2019

I merged 1.3 into the python3 branch, made some smaller changes, and bumped the version to 2.0rc1. I think the code is about ready, that we can move the card prepare python3 to 'done', and prepare a PR for the official geopython/stetl repo (so we can start with the card for the PR). Any last issues to address, or can I start preparing the PR?

@justb4
Copy link
Member

justb4 commented Mar 23, 2019

@borrob yes, let's move ahead! I've just tested borrob:python3 on my fork. Found no issues. Seen you have Card access, so whenever you find time you can change Card states, and prepare the PR for geopython/stetl .

@justb4
Copy link
Member

justb4 commented Apr 11, 2019

No need for this PR..We're all done, Stetl v2 supporting Py3-only available on PyPi and DockerHub, now into maintenance mode, closing, thanks all @sebastic @borrob !

@justb4 justb4 closed this Apr 11, 2019
@fsteggink
Copy link
Collaborator

Wonderful, many thanks ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants