# Why your Python 2 code has no future

![CPython EoL](Py2_EoL.png)

* Python 3 introduced a few incompatible language changes. Most Python 2 programs will need some adjustment to run under Python 3.
* Python 3 is a _massive_ improvement of the language, though. Switching is worth the effort. And you don't have a choice, anyway ...

# It's not only Python itself

<img src="Packages_EoL.png" width="800"/>
<a href="https://python3statement.org/">https://python3statement.org/</a>

# Why can't I just stay on Python 2?

* Systems will be updated and obsolete packages will be dropped.
* You'll be hit by bugs that won't be fixed
* You want to be able to collaborate with others without backporting to obsolete software.

The Python 3 ecosystem has been mature for a few years. The longer you write Python 2 code, the more code you will have to convert. Plus you miss out on a lot of nice improvements.

# The obvious stumbling blocks

`print()` is now a function, which adds options such as `flush`:

In [12]:
%%python2
import sys

print "Hello World!"
for i in range(2):
    print "Computed", i, "iterations"
    sys.stdout.flush()

Hello World!
Computed 0 iterations
Computed 1 iterations


In [13]:
for i in range(2):
    print("Computed", i, "iterations", flush=True)

Computed 0 iterations
Computed 1 iterations
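
Besides `flush`, the function form also accepts `sep`, `end`, and `file` keyword arguments. A minimal sketch; the `io.StringIO` buffer and the message texts are made up here just to capture and show the output:

```python
import io
import sys

# sep and end replace the default " " separator and "\n" terminator
print("a", "b", "c", sep="-", end="!\n")

# file redirects the output to any object with a write() method
print("warning: something happened", file=sys.stderr)

# Writing into an in-memory buffer instead of the terminal
buffer = io.StringIO()
print("Computed", 3, "iterations", sep="|", file=buffer)
captured = buffer.getvalue()  # 'Computed|3|iterations\n'
```

In Python 2 the same effects needed special syntax (`print >>sys.stderr, ...`, a trailing comma) or manual string joining.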


# How many times have you been bitten by integer division?

True division is now the standard division:

In [13]:
%%python2
import numpy as np

def volume_sphere(r):
    return 4/3 * np.pi * r**3

print "Volume unit sphere", volume_sphere(1)
print "4/3 evaluates to", 4/3

Volume unit sphere 3.14159265359
4/3 evaluates to 1


In [14]:
import numpy as np

def volume_sphere(r):
    return 4/3 * np.pi * r**3

print(f"Volume unit sphere {volume_sphere(1)}")
print("4/3 evaluates to ", 4 / 3)
print("Integer division still available with // operator: 4//3 = ", 4 // 3)

Volume unit sphere 4.1887902047863905
4/3 evaluates to  1.3333333333333333
Integer division still available with // operator: 4//3 =  1


(N.B. Syntax highlighting after a cell magic was only fixed in IPython 7.x.)
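
One detail to keep in mind when replacing `/` with `//`: `//` is floor division, so it rounds toward negative infinity rather than toward zero, and it returns a float if either operand is a float. Python 2's `/` on integers rounded the same way. A short sketch with arbitrary numbers:

```python
floor_pos = 7 // 2        # 3
floor_neg = -7 // 2       # -4, not -3: rounds toward negative infinity
trunc_neg = int(-7 / 2)   # -3: true division, then truncation toward zero
float_floor = 7.0 // 2    # 3.0: a float operand gives a float result

# divmod returns quotient and remainder in one call
quotient, remainder = divmod(7, 2)  # (3, 1)
```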

# You can get these in Python 2

In [11]:
%%python2

from __future__ import print_function, division

print("Hello World!")
print("1/2 is", 1/2)

Hello World!
1/2 is 0.5


# range -> xrange -> range

* Python 2 introduced `xrange` to avoid the creation of very long lists in memory
* In Python 3 `range` returns a range object instead of a list.
* Python 3 drops `xrange`

In [19]:
%%python2

print range(10)
print xrange(10)
for i in xrange(10):
    print i,

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
xrange(10)
0 1 2 3 4 5 6 7 8 9


In [22]:
print(range(10))
for i in range(10):
    print(i, end=" ")
xrange(10)

range(0, 10)
0 1 2 3 4 5 6 7 8 9 

NameError: name 'xrange' is not defined
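
A range object is lazy but still behaves like a sequence: it supports `len`, indexing, slicing, and membership tests without ever building the list. A small sketch (the bounds are arbitrary):

```python
r = range(0, 10**9, 2)    # no list of half a billion elements is created

length = len(r)           # 500000000
tenth = r[10]             # 20
last = r[-1]              # 999999998
has_value = 123456 in r   # True, computed arithmetically, not by scanning
sub = r[5:8]              # range(10, 16, 2), slicing returns another range

# Convert explicitly when a real list is needed
as_list = list(range(5))  # [0, 1, 2, 3, 4]
```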

# Built-ins and methods return iterators instead of lists

In [37]:
%%python2

z = zip(range(5), range(5, 0, -1))
print z, type(z)

[(0, 5), (1, 4), (2, 3), (3, 2), (4, 1)] <type 'list'>


In [38]:
z = zip(range(5), range(5, 0, -1))
print(z)
print(list(z))
print(list(z))

<zip object at 0x7fae4087be48>
[(0, 5), (1, 4), (2, 3), (3, 2), (4, 1)]
[]


This is more memory efficient but prevents easy slicing/indexing.

`map()` and `filter()` behave the same way. `.keys()`, `.values()`, and `.items()` return view objects, which can be iterated repeatedly but not indexed.
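
If you need indexing or several passes, convert to a list once; if you only need the first few items, `itertools.islice` avoids building the whole list. A minimal sketch with made-up data:

```python
from itertools import islice

pairs = zip(range(5), range(5, 0, -1))
first_two = list(islice(pairs, 2))  # [(0, 5), (1, 4)]
remaining = list(pairs)             # [(2, 3), (3, 2), (4, 1)], the iterator resumes

# Materialize once, then index freely
squares = list(map(lambda x: x * x, range(4)))  # [0, 1, 4, 9]
third = squares[2]                              # 4

# Dictionary views are not one-shot: they track the dictionary
d = {"a": 1, "b": 2}
keys = d.keys()
d["c"] = 3
keys_now = list(keys)  # ['a', 'b', 'c']
```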

In [43]:
import numpy as np
H = np.eye(2)
beta = np.ones(2)
V = np.eye(2)
r = np.zeros(2)

# The good stuff — Matrix multiplication

Imagine you need to implement something like this

$$S = (H\beta -r)^T (HVH^T)^{-1} (H\beta - r)$$

Python 3.5+ has a matrix multiplication operator that makes code much easier to read.

In [46]:
# Borrowed from PEP 465
import numpy as np
from numpy.linalg import inv, solve

# Using dot function:
S = np.dot((np.dot(H, beta) - r).T,
           np.dot(inv(np.dot(np.dot(H, V), H.T)), np.dot(H, beta) - r))

# Using dot method:
S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)

# With the @ operator, the direct translation of the above formula becomes:

S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
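
The `@` operator is not tied to NumPy: any class can support it by defining `__matmul__`. The `Matrix` class below is a hypothetical toy, written only to show the hook, and is not meant as a replacement for NumPy arrays:

```python
class Matrix:
    """Toy dense matrix backed by a list of rows (illustration only)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Called for `self @ other`
        if not isinstance(other, Matrix):
            return NotImplemented
        if len(self.rows[0]) != len(other.rows):
            raise ValueError("inner dimensions do not match")
        columns = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in columns]
             for row in self.rows]
        )

    def __eq__(self, other):
        return isinstance(other, Matrix) and self.rows == other.rows

    def __repr__(self):
        return f"Matrix({self.rows})"


A = Matrix([[1, 2], [3, 4]])
B = Matrix([[0, 1], [1, 0]])
identity = Matrix([[1, 0], [0, 1]])

product = A @ B          # Matrix([[2, 1], [4, 3]])
unchanged = A @ identity # equal to A
```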

# Refactoring this matrix multiplication

In [45]:
# Easy to follow refactoring
# Original version
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

# (1) Avoid repeated computation of Hβ-r
trans_coef = H @ beta - r
S = trans_coef.T @ inv(H @ V @ H.T) @ trans_coef

# (2) solve(A, B) more numerically stable than dot(inv(A), B)
S = trans_coef.T @ solve(H @ V @ H.T, trans_coef)

# Nicer string formatting - f-strings

No more confusion about which variable gets printed where in a string (starting with Python 3.6).

In [48]:
value = 4 * 20
# Unnecessarily verbose
print("The value is {value}".format(value=value))

# Gets confusing with many variables to print
print("The value is {}".format(value))

# Finally, easy print formatting
print(f"The value is {value}")

# The {} can contain Python expressions
a, b = 7, 13
print(f"The sum of the values is {a + b}")

The value is 80
The value is 80
The value is 80
The sum of the values is 20
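
f-strings accept the same format specifications as `str.format`, placed after a colon, plus conversions such as `!r`. A short sketch; the variable names and numbers are made up for illustration:

```python
value = 4 * 20
pi_approx = 3.14159265
name = "S"

fixed = f"{pi_approx:.3f}"     # '3.142', three decimal places
padded = f"{value:>6d}"        # '    80', right-aligned in 6 characters
grouped = f"{10000000:,}"      # '10,000,000', thousands separator
percent = f"{0.8333:.1%}"      # '83.3%'
debug = f"{name!r} = {value}"  # "'S' = 80", !r applies repr()
```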


# Advanced unpacking

In [51]:
# Already possible in Python 2
a, _, _, _, b = range(5)
print(a, b)

# Now you can do this
a, *rest, b = range(5)
print(a, rest, b, sep=", ")  # Note another feature of the new print

# rest can go anywhere
*rest, a, b = range(5)
print(a, b)

0 4
0, [1, 2, 3], 4
3 4
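
Since Python 3.5, the star syntax also works inside list, tuple, set, and dict literals, and several starred arguments can appear in one function call. A minimal sketch; `total` and the dictionaries are hypothetical examples:

```python
head = [0, 1]
tail = (8, 9)

# Splice several iterables into one literal
combined = [*head, *range(2, 4), *tail]  # [0, 1, 2, 3, 8, 9]

# Merge dictionaries; later entries win on duplicate keys
defaults = {"H0": 70, "Om0": 0.3}
overrides = {"Om0": 0.27}
params = {**defaults, **overrides}       # {'H0': 70, 'Om0': 0.27}


def total(*values):
    return sum(values)


# More than one starred argument per call
result = total(*head, *tail)             # 18
```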


# Get first and last line of a file

In [52]:
with open("file.txt", "r") as f:
    first, *_, last = f.readlines()
    
print(first)
print(last)

Step 1: Use Python 3

Step 10: Profit!
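
The unpacking version is compact, but `readlines()` and `*_` both hold every line in memory. For large files, one possible alternative is to read the first line and keep only the most recent line while iterating. In this sketch an `io.StringIO` with made-up contents stands in for the open file:

```python
import io
from collections import deque

# Stand-in for open("file.txt"); the contents are invented for illustration
f = io.StringIO(
    "Step 1: Use Python 3\n"
    "Step 2: Port your code\n"
    "Step 10: Profit!\n"
)

first = next(f)               # consume only the first line
last = deque(f, maxlen=1)[0]  # iterate over the rest, keeping one line at a time
```

Like the unpacking version, this assumes the file has at least two lines; with a single line the deque is empty and indexing it raises `IndexError`.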



# Keyword only arguments

In [57]:
from astropy import cosmology

# This is bad style in Python! 
my_cosmo = cosmology.FlatLambdaCDM(70, 0.27, 0, 3.04, 0, 0.04)

# Explicit is better than implicit. API changes will break this code. Better:
my_cosmo = cosmology.FlatLambdaCDM(70, Om0=0.27, Tcmb0=0, Neff=3.04, m_nu=0, Ob0=0.04)

# Python 3 introduces keyword only arguments
def myfunction(a, b, *, kwname=None):
    print(f"a+b = {a+b}")
    if kwname is not None:
        print(f"kwname is {kwname}")

myfunction(2, 3, kwname="test")
myfunction(2, 3, "fail")

a+b = 5
kwname is test


TypeError: myfunction() takes 2 positional arguments but 3 were given
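
Keyword-only arguments do not need a default value: without one, the caller must pass them, and must pass them by name. A minimal sketch with a hypothetical `rescale` function:

```python
def rescale(values, *, factor, offset=0.0):
    """factor is keyword-only and required; offset is keyword-only and optional."""
    return [factor * v + offset for v in values]


scaled = rescale([1, 2, 3], factor=2)  # [2.0, 4.0, 6.0]

# Passing factor positionally is rejected
try:
    rescale([1, 2, 3], 2)
except TypeError as err:
    positional_error = str(err)

# Leaving out the required keyword is rejected as well
try:
    rescale([1, 2, 3])
except TypeError as err:
    missing_error = str(err)
```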

# Dictionary improvements

Since Python 3.6, dictionaries preserve key insertion order, much like `OrderedDict`. This is an implementation detail of CPython 3.6 and guaranteed by the language from Python 3.7 on. The same holds for `**kwargs`.

In [58]:
%%python2
x = {str(i):i for i in range(5)}
print x

{'1': 1, '0': 0, '3': 3, '2': 2, '4': 4}


In [59]:
x = {str(i):i for i in range(5)}
x

{'0': 0, '1': 1, '2': 2, '3': 3, '4': 4}
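
A short sketch of what insertion order means in practice, and of one remaining difference from `OrderedDict`. The `record` function and the keys are made up for illustration:

```python
from collections import OrderedDict


def record(**kwargs):
    # kwargs arrive in the order the caller wrote them
    return list(kwargs)


kw_order = record(z=1, a=2, m=3)  # ['z', 'a', 'm']

d = {"b": 1, "a": 2}
d["b"] = 10                       # updating a key keeps its position
after_update = list(d)            # ['b', 'a']
del d["b"]
d["b"] = 3                        # a re-inserted key goes to the end
after_reinsert = list(d)          # ['a', 'b']

# Plain dicts ignore order when compared, OrderedDict does not
plain_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}                 # True
ordered_equal = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)     # False
```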

## They are also much more space efficient.

In [62]:
%%python2
from collections import defaultdict
import sys

d = defaultdict(float)
for i in xrange(10000000):
    d[i] = i
print sys.getsizeof(d)

402653472


In [68]:
from collections import defaultdict
import sys

d = defaultdict(float)
for i in range(10_000_000):  #  <- Notice the underscores as digit separators.
                             #  New in Python 3.6
    d[i] = i
print(sys.getsizeof(d) / 402653472)

0.8333329955739212


There are also many performance improvements, especially in Python 3.6+.

# How to transition your code

* If you maintain a software package for other users and want to keep your code compatible with Python 2 and 3, you're not my target audience. You probably already know about [six](https://pypi.org/project/six/). You'll still have a lot of `if/then` in your code. Set a deadline to drop Python 2 support.

* If it's just your own code, transition asap. Tools like [2to3](http://python3porting.com/2to3.html) go a long way to automate the transition.

* How do I know that my code still works after translating to Python 3? Write [unit tests](https://github.com/joergdietrich/Code-Coffee-2018-05-08)!
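
As a rough idea of what such a test could look like, here is a sketch using the standard `unittest` module on a `math`-based variant of the `volume_sphere` function from the division slide. The test names are invented for illustration:

```python
import math
import unittest


def volume_sphere(r):
    return 4 / 3 * math.pi * r**3


class TestVolumeSphere(unittest.TestCase):
    def test_unit_sphere(self):
        # Fails under Python 2 integer division, where 4/3 evaluates to 1
        self.assertAlmostEqual(volume_sphere(1), 4.1887902047863905)

    def test_scales_with_radius_cubed(self):
        self.assertAlmostEqual(volume_sphere(2), 8 * volume_sphere(1))


# Run the tests programmatically; in a project you would use a test runner instead
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestVolumeSphere)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A test like `test_unit_sphere` pins down the numerical result, so a silent change in division semantics shows up as a failure rather than as a wrong number in a paper.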

These slides borrow heavily from:

https://python-3-for-scientists.readthedocs.io/en/latest/python3_features.html

https://www.asmeurer.com/python3-presentation/python3-presentation.pdf

https://github.com/arogozhnikov/python3_with_pleasure
