New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Porting to Python 3 #6062

Open
Eric-Arellano opened this Issue Jul 2, 2018 · 12 comments

Comments

7 participants
@Eric-Arellano
Copy link
Contributor

Eric-Arellano commented Jul 2, 2018

Hello, I'm an intern from Foursquare working with @mateor and @iantabolt to migrate internally to Python 3, and one of our blockers is not being able to import Pants within Python 3.

I'm hoping to be able to spearhead migrating Pants to Python 3 that is backwards compatible with 2.7.

Scope

  • Convert source code to Python 3.6+
  • Maintain library compatibility with Python 2.7

"Attempting to write Python 3 code that’s compatible with Python 2 is much more rewarding than the opposite" (Django authors)

Porting guides

Official Python guide
Python-future library, tools and imports to maintain interopable codebase

Tasks

  • Run futurize --stage1 for automatic and safe fixes; code still Python 2
  • Add future library, which provides cross-compatibility functions and tools (expanded six)
  • Run futurize --stage2 for automatic yet unsafe changes. Still Python 2, although using Python 3 semantics in most places.
  • Get all unit tests to pass on both Python 2 and Python 3.
  • Get all integration tests to pass on both Python 2 and Python 3.
  • Switch to Python3 under-the-hood.
  • Release universal library to PyPi etc.

@mateor mateor self-assigned this Jul 2, 2018

@mateor

This comment has been minimized.

Copy link
Member

mateor commented Jul 2, 2018

Hi @stuhood - thanks for the interest. Foursquare will work with Eric on the patch but technical advice and review from the Pants community makes a tremendous difference.

Stu and I chatted about this before opening the issue - he suggested that @benjyw, @CMLivingston and @kwlzn would be interested in the discussion.

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Jul 2, 2018

Also pinging @jsirois on this, because he should be able to answer questions about some of the CI infrastructure changes needed.

The first few steps look reasonable to me though.

@kwlzn

This comment has been minimized.

Copy link
Member

kwlzn commented Jul 3, 2018

awesome to have you working on this @Eric-Arellano! based on what I've seen floating around for Python 3.7.x perf improvements (with 3.7.0 just released last week and 3.7.1 expected in July), its possible that this effort could lead to a significant runtime speedup once pants can run seamlessly on 3.7.x. so it'd be super interesting to post some early benchmark numbers for various pants invocations (e.g. time ./pants list ::) once we get to an initially running state (even in a PR branch).

one quick thought on the path forward for both 2.7 + 3.x testing is that running the entirety of CI against both interpreter versions will effectively ~double the CI runtime (which is already quite lengthy at around an ~hour and a half). given that added cost, it could make sense to hasten a jump to 3.x as the "recommended" python version for pants with a strategy that could look something like:

  1. get all at major consumers onboard for an eager switch to 3.x (potentially dangling the aforementioned perf #'s as carrot) as soon as this is dogfoodable
  2. announce deprecation of python 2.7 support as of some two-major-versions-away target release, after which python 2.7 execution will fail (we could even assert a required minimum version at this point for safety)
  3. run CI in a dual-mode config for the next two major releases - possibly as sidecar (e.g. different github org) like we've done in the past for osx to avoid CI taking 3h per run
  4. kill 2.7 in CI after the deprecation release and recommend/assert a requirement for 3.x (ideally 3.7) as the defacto interpreter going forward

from then on, we could then begin to utilize 3.x features (like e.g. data classes) in pants' codebase with the need for 2.7 compatibility behind us. we could also peel away the six/future compatibility layer in favor of idiomatic 3.x code at that point as well.

In terms of blockers, I believe there is at least one known issue to solve related to pants' execution interpreter vs target interpreters for working with python targets (which is currently biting us for 2.x -> 3.x IIRC) - so it'd be good to sync up with @CMLivingston on that so we can all be working strategically to unblock the initial dogfooding.

@Eric-Arellano

This comment has been minimized.

Copy link
Contributor

Eric-Arellano commented Jul 3, 2018

@kwlzn I am completely on board with that - great strategy.

Python and most key libraries are dropping official support by January 1 2020 (new versions of Numpy, Scikit, Django, Pylint, etc already dropped support in newest versions). Hence, I think a lot of organizations are this year finally dealing with Python3 migration, and it is much more tractable to urge them to upgrade now.

Once we drop Python2, removing future and six should fortunately be quite easy. They are designed to be Python3 first, and for the most part future seems to act as a no-op when using Python 3. Theoretically we will only need to delete the imports and then we can start using fully Python 3 features (type hints, data classes, and f-strings are the most alluring).

@benjyw

This comment has been minimized.

Copy link
Contributor

benjyw commented Jul 3, 2018

Ah, this is awesome! We've been circling this for a while, glad someone's finally going in for a bite!

I agree with @kwlzn that there is no need for long-term support for running Pants on 2.7. The interpreter Pants runs on is sort-of an implementation detail, not parts of its API. Although, as you've discovered, the fact that your own code (specifically, custom pants plugins) can use Pants as a library muddies those waters... So an eager switch is 💯.

stuhood added a commit that referenced this issue Jul 3, 2018

Run `futurize --stage1` to make safe changes for python 3 compatibili…
…ty. (#6063)

### Problem
First stage of [Python 3 port](#6062)

### Solution
Stage 1 of the futurize tool only makes safe changes, and it keeps the code base at Python 2. See http://python-future.org/futurize.html#forwards-conversion-stage1. Stage 2 will later convert this to actual Python 3 code.

This command was ran on
- src/python
- test/python
- contrib
- examples
- pants-plugins
- build-support/bin/check-header-helper.py

(not ran on testprojects)

stuhood added a commit that referenced this issue Jul 6, 2018

Add future lib and port src/base (#6067)
### Description
Implements stage 2 of [porting plan](#6062), which is adding the future library.

Also ports the `src/pants/base` and `tests/python/pants_test/base` as a demo of how this tool works.

At this stage, we aren't testing things yet with the Python 3 interpreter, even though technically it's Python 3 code. The goal for now is to make sure things still work on Python 2. 

#### FYI - how the backports work
The `future` library adds several modules: `future`, `builtins`, `past`, and individual ports like `queue` (instead of `Queue`). 

These added names all correspond exactly to Python 3 code. When running Python 3, it will simply use the native Python 3 version. When using Python 2, it will use the library implementation, which is semantically equivalent to Python 3.

#### FYI - using old `moves` interface
Typically `futurize` will try to use this idiom for certain backports

```python
from future import standard_library
standard_library.install_aliases()

import [backports]
```

This however does not work in our codebase, because all imports must appear above code with iSort, so the `install_aliases()` line gets moved below the relevant imports. 

Instead, we are using the legacy `future.moves` library. Once we remove Python 2 support we'll want to convert this type of import `from future.moves.itertools import x` to `from itertools import x`.

stuhood added a commit that referenced this issue Jul 6, 2018

Port util metaprogramming files to Python3 (#6072)
Part of #6062 

## Problem
`ABCMeta` expects `unicode` on Python3, but `bytes` on Python2. 

## Solution
We use feature detection to try with idiomatic Py3 and fall back on Py2. This is [preferred over version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection) (i.e. checking which Python interpreter is used).

Note that the source of this file, [`twitter.commons.lang`](https://github.com/twitter/commons/blob/master/src/python/twitter/common/lang/__init__.py#L179), did not have to do this because they don't use `from __future__ import unicode_literals`.

stuhood added a commit that referenced this issue Jul 9, 2018

Port majority of pants/util (#6073)
Ports everything `util/` except for `objects.py` and `meta.py` (see #6072), `process_handler.py`, and `tarutil.py`.

Part of #6062

stuhood added a commit that referenced this issue Jul 10, 2018

Port backend/native (#6084)
Part of porting project #6062

stuhood added a commit that referenced this issue Jul 10, 2018

Port backend/python (#6086)
Part of porting project #6062

CI fails for me on mac (we discussed this in slack), so hopefully this works since I haven't tested it..

This was referenced Jul 10, 2018

stuhood added a commit that referenced this issue Jul 11, 2018

Port fs to python 3 compatibility (#6091)
Part of #6062

P.S. Let me know if these PRs are too small - I've been keeping it one directory per PR so that we can use `git bisect` / debug easier and make it easier to do code review. This means we'll eventually have 54 PRs in total ([spreadsheet tracking this all](https://docs.google.com/spreadsheets/d/1i4xFUrgejVFp6NSwT9aGBQWyjbiDIuqMqGEIxcSMCC0/edit#gid=0))
@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Jul 31, 2018

I like @kwlzn's proposed steps above in #6062 (comment) , and have a few minor tweaks.

  1. get all at major consumers onboard for an eager switch to 3.x (potentially dangling the aforementioned perf #'s as carrot) as soon as this is dogfoodable
  2. announce deprecation of python 2.7 support as of some two-major-versions-away target release, after which python 2.7 execution will fail (we could even assert a required minimum version at this point for safety)
  3. run CI in a dual-mode config for the next two major releases - possibly as sidecar (e.g. different github org) like we've done in the past for osx to avoid CI taking 3h per run
  4. kill 2.7 in CI after the deprecation release and recommend/assert a requirement for 3.x (ideally 3.7) as the defacto interpreter going forward

The tweaks:
0. We should begin running pants' unit tests in both Python 2 and Python 3 ASAP. These are relatively cheap and valuable.
2.1. I'll send out mail to pants-devel@ proposing items 1 and 2 to give people time to object.
3. I'm not sure that we'll be able to dual-mode all of the integration tests for CI, but I think that we should add some integration test decorators similar to the @ensure_daemon and @ensure_resolver decorators, and use them liberally. Then...
3.5. Swapover integration tests from py2 to py3 by default in CI.

I'll look at 0, 2.1, and 3 this week and next.

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Aug 8, 2018

#6289 was green on its last run, so I'll land that ASAP.

The Py3 shard is reporting that 104 out of 217 unit test targets is passing, which IMO is pretty excellent. gist here: https://gist.github.com/stuhood/c4c1bd4ba824ef82ab2ab365830ac569

I'm not sure what sort of automated checks we could do on that shard to help us track the fact that it is increasing, but certainly any PRs that are intended to improve Python3 compatibility should be diffing the summary from the end of the shard.

Thanks @Eric-Arellano !

stuhood added a commit that referenced this issue Aug 8, 2018

Switch to Py2 and Py3 shards. (#6289)
### Problem

In order to avoid backsliding as changes are made in service of #6062, we should begin running a travis shard that uses python 3 even before it has gone completely green.

### Solution

Repurpose one of the unit test shards to run with python 3, with abbreviated logging and `|| exit 0`. This should allow us to track the total failure count under python 3, and attempt to push it down.
@illicitonion

This comment has been minimized.

Copy link
Contributor

illicitonion commented Aug 10, 2018

#6323 makes the py3 shard green, and gives us a specific list of unit tests which are known to be broken in py3.

I would suggest we move forwards by taking those targets in turn, fixing them, and removing them from the list of broken tests so that they don't regress :)

@illicitonion

This comment has been minimized.

Copy link
Contributor

illicitonion commented Aug 10, 2018

Merged #6323 :)

@jsirois

This comment has been minimized.

Copy link
Member

jsirois commented Aug 17, 2018

Noting two some pants_requirement work that will need some mods as we switch over to python 3 only: #6358 #6359.

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this issue Aug 27, 2018

Switch to Py2 and Py3 shards. (pantsbuild#6289)
### Problem

In order to avoid backsliding as changes are made in service of pantsbuild#6062, we should begin running a travis shard that uses python 3 even before it has gone completely green.

### Solution

Repurpose one of the unit test shards to run with python 3, with abbreviated logging and `|| exit 0`. This should allow us to track the total failure count under python 3, and attempt to push it down.
@Eric-Arellano

This comment has been minimized.

Copy link
Contributor

Eric-Arellano commented Nov 5, 2018

Update: all unit tests are now passing on both Python 2 and Python 3! The next stage is getting all integration tests passing for Python 3.

As discussed in Slack, our overall CI strategy will be to run Python 3 for all integration shards, and then run a nightly Python 2 job to check for regressions.

Since some integration tests still fail on Python 3, here is the path forward:

  1. Identify all failing tests.
  2. Switch existing shards to run Python 3 and blacklist currently failing tests.
  3. Add new Py2 shard to run the blacklisted tests.
  4. Setup the nightly Py2-only job.

@Eric-Arellano Eric-Arellano referenced this issue Nov 8, 2018

Open

Add type hints to codebase #6742

0 of 6 tasks complete
@Eric-Arellano

This comment has been minimized.

Copy link
Contributor

Eric-Arellano commented Dec 20, 2018

Update: all tests—integration and unit—are now passing Python 3!

The next step is getting Pants to work with Python 3 under-the-hood. To test this out, apply the following diff:

diff --git a/build-support/virtualenv b/build-support/virtualenv
index d34e285fa..0f8d617a8 100755
--- a/build-support/virtualenv
+++ b/build-support/virtualenv
@@ -14,10 +14,10 @@ fi
 source ${REPO_ROOT}/build-support/common.sh

 if [[ -z "${PY}" ]]; then
-  if which python2.7 >/dev/null; then
-    PY=`which python2.7`
+  if which python3 >/dev/null; then
+    PY=`which python3`
   else
-    die 'No python2.7 interpreter found on the path.  Python will not work!'
+    die 'No python3 interpreter found on the path.  Python will not work!'
   fi
 fi

diff --git a/pants.ini b/pants.ini
index ba51a2631..a60b2948b 100644
--- a/pants.ini
+++ b/pants.ini
@@ -372,7 +372,7 @@ verify_commit: False
 # We only support pants running under 2.7 for now with 3.3+ support to be added later.
 # Any example or test targets that are meant to test interpreters outside pants own
 # acceptable set should specify an explicit compatibility constraint.
-interpreter_constraints: ["CPython>=2.7,<3"]
+interpreter_constraints: ["CPython>=2.7,<3", "CPython>=3.6"]
 interpreter_cache_dir: %(pants_bootstrapdir)s/python_cache/interpreters
 resolver_cache_dir: %(pants_bootstrapdir)s/python_cache/requirements

There are several issues newly surfaced, which will hopefully be closed out soon.

@cosmicexplorer cosmicexplorer added this to In progress in python 3 migration Jan 13, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment