# Python review of concepts

This review points out useful aspects of Python that you may have glossed over. It assumes you already know Python fairly well.

## Python as a language

### Why Python? 

- Huge community - especially in data science and ML 
- Easy to learn 
- Batteries included 
- Extensive 3rd party libraries 
- Widely used in both industry and academia 
- Most important “glue” language bridging multiple communities

In [None]:
import __hello__

### Versions 

- Only use Python 3 (current release version is 3.8, container is 3.7) 
- Do not use Python 2

In [None]:
import sys

In [None]:
sys.version

### Multi-paradigm 

#### Procedural

In [None]:
x = []
for i in range(5):
    x.append(i*i)
x

#### Functional

In [None]:
list(map(lambda x: x*x, range(5)))

#### Object-oriented 

In [None]:
class Robot:
    def __init__(self, name, function):
        self.name = name
        self.function = function
        
    def greet(self):
        return f"I am {self.name}, a {self.function} robot!"

In [None]:
fido = Robot('roomba', 'vacuum cleaner')

In [None]:
fido.name

In [None]:
fido.function

In [None]:
fido.greet()

### Dynamic typing 

#### Complexity of a + b 

In [None]:
1 + 2.3

In [None]:
type(1), type(2.3)

In [None]:
'hello' + ' world'

In [None]:
[1,2,3] + [4,5,6]

In [None]:
import numpy as np

np.arange(3) + 10

### Several Python implementations! 

- CPython
- PyPy
- IronPython 
- Jython

### Global interpreter lock (GIL) 

- Applies to CPython (PyPy also has a GIL; Jython and IronPython do not)
- Threads vs processes 
- Avoid threads in general 
- Performance not predictable

In [None]:
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

In [None]:
def f(n):
    x = np.random.uniform(0,1,n)
    y = np.random.uniform(0,1,n)
    count = 0
    for i in range(n):
        if x[i]**2 + y[i]**2 < 1:
            count += 1
    return count*4/n

In [None]:
n = 100000
niter = 4

In [None]:
%%time

[f(n) for i in range(niter)]

In [None]:
%%time

with ThreadPoolExecutor(4) as pool:
    xs = list(pool.map(f, [n]*niter))
xs

In [None]:
%%time

with ProcessPoolExecutor(4) as pool:
    xs = list(pool.map(f, [n]*niter))
xs

## Coding in Python

In [None]:
import this

### Coding conventions 

- PEP 8 
- Avoid magic numbers 
- Avoid copy and paste 
- Extract common functionality into functions

[Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/)

### Data types 

- Integers
    - Arbitrary precision
    - Integer division operator
    - Base conversion
    - Check if integer

In [None]:
import math

In [None]:
n = math.factorial(100)

In [None]:
n

In [None]:
f'{n:,}'

In [None]:
h = math.sqrt(3**2 + 4**2)

In [None]:
h

In [None]:
h.is_integer()
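The cells above cover arbitrary precision and the integer check. The remaining two bullets, integer division and base conversion, might be sketched as follows. All variable names here are illustrative only.

```python
# Floor division and remainder. Note that // rounds toward negative infinity.
quotient = 7 // 2           # 3
remainder = 7 % 2           # 1
neg_quotient = -7 // 2      # -4, not -3
pair = divmod(7, 2)         # (3, 1)

# Base conversion: int -> string
as_bin = bin(10)            # '0b1010'
as_oct = oct(10)            # '0o12'
as_hex = hex(255)           # '0xff'
padded = f"{10:08b}"        # '00001010' (no prefix, zero-padded to 8 digits)

# Base conversion: string -> int
from_bin = int('1010', 2)   # 10
from_hex = int('ff', 16)    # 255

# Two different "is it an integer?" questions
is_int_type = isinstance(3, int)   # True: the type is int
is_whole = (3.0).is_integer()      # True: a float holding a whole number
```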

- Floats 
    - Checking for equality 
    - Catastrophic cancellation 
- Complex

In [None]:
x = np.arange(9).reshape(3,3)
x = x / x.sum(axis=0)
λ = np.linalg.eigvals(x)

In [None]:
λ[0]

In [None]:
λ[0] == 1

In [None]:
math.isclose(λ[0], 1)

In [None]:
def var(xs):
    """Returns variance of sample data."""
    
    n = 0
    s = 0
    ss = 0

    for x in xs:
        n +=1
        s += x
        ss += x*x

    v = (ss - (s*s)/n)/(n-1)
    return v

In [None]:
xs = np.random.normal(1e9, 1, int(1e6))

In [None]:
var(xs)

In [None]:
np.var(xs)
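The one-pass formula in `var` subtracts two huge, nearly equal numbers, which is where catastrophic cancellation comes from. One common remedy is Welford's algorithm, which updates a running mean and a running sum of squared deviations. The sketch below is not part of the original notebook. The names `var_welford` and `data` are chosen for illustration, and the function uses the sample variance (dividing by n-1) to match `var`.

```python
import statistics

def var_welford(values):
    """Sample variance using Welford's one-pass update."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for value in values:
        count += 1
        delta = value - mean
        mean += delta / count
        m2 += delta * (value - mean)
    return m2 / (count - 1)

# Small data set with a large offset; the exact sample variance is 30.
data = [1e9 + d for d in (4.0, 7.0, 13.0, 16.0)]
stable = var_welford(data)
reference = statistics.variance(data)  # standard library, for comparison
```

Note that `np.var` divides by n by default. Pass `ddof=1` for the sample variance.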

- Boolean 
    - What evaluates as False? 

In [None]:
stuff = [[], [1], {},'', 'hello', 0, 1, 1==1, 1==2]
for s in stuff:
    if s:
        print(f'{s} evaluates as True')
    else:
        print(f'{s} evaluates as False')

- String 
    - Unicode by default 
    - b, r, f strings

In [None]:
u'\u732b'
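A short sketch of how the `b`, `r` and `f` prefixes differ. The variable names are illustrative.

```python
# str is a sequence of Unicode code points; bytes is a sequence of raw bytes.
text = 'caf\u00e9'
raw_bytes = text.encode('utf-8')         # b'caf\xc3\xa9'
lengths = (len(text), len(raw_bytes))    # (4, 5): one character takes two bytes
round_trip = raw_bytes.decode('utf-8') == text

# b prefix: a bytes literal; indexing yields integers
ascii_bytes = b'abc'
first_byte = ascii_bytes[0]              # 97

# r prefix: backslashes are kept literally, which is handy for regex patterns
escaped = '\n'                           # one character (newline)
raw = r'\n'                              # two characters (backslash, n)

# f prefix: expressions are evaluated and formatted in place
value = 42
message = f'{value} in hex is {value:#x}'
```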

String formatting

- Learn to use the f-string.

In [None]:
import string

In [None]:
char = 'e'
pos = string.ascii_lowercase.index(char) + 1
f"The letter {char} has position {pos} in the alphabet"

In [None]:
n = int(1e9)
f"{n:,}"

In [None]:
x = math.pi

In [None]:
f"{x:8.2f}"

In [None]:
import datetime
now = datetime.datetime.now()
now

In [None]:
f"{now:%Y-%m-%d %H:%M}"

### Data structures 

- Immutable - string, tuple
- Mutable - list, set, dictionary 
- Collections module 
- heapq 

In [None]:
import collections

[x for x in dir(collections) if not x.startswith('_')]
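A minimal tour of a few commonly used containers from `collections`, plus `heapq`. The example data is made up for illustration.

```python
from collections import Counter, defaultdict, deque
import heapq

# Counter: tally hashable items
counts = Counter(['a', 'b', 'a', 'c', 'b', 'a'])
top_two = counts.most_common(2)          # [('a', 3), ('b', 2)]

# defaultdict: missing keys get a default value automatically
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)

# deque: fast appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)
last = queue.pop()                       # 3

# heapq: a min-heap built on top of a plain list
heap = [5, 1, 4, 2]
heapq.heapify(heap)
heapq.heappush(heap, 0)
smallest = heapq.heappop(heap)           # 0
next_two = heapq.nsmallest(2, heap)      # [1, 2]
```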

### Functions 

- \*args, \*\*kwargs 
- Care with mutable default values 
- First class objects 
- Anonymous functions 
- Decorators

In [None]:
def f(*args, **kwargs):
    print(f"args = {args}") # in Python 3.8, you can just write f'{args = }'
    print(f"kwargs = {kwargs}")

In [None]:
f(1,2,3,a=4,b=5,c=6)

In [None]:
def g(a, xs=[]):
    xs.append(a)
    return xs

In [None]:
g(1)

In [None]:
g(2)
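The second call to `g` returns `[1, 2]` because the default list is created once, when the function is defined, and then shared across calls. A common fix is to use `None` as a sentinel. `g_safe` is a hypothetical name for the corrected version.

```python
def g_safe(a, xs=None):
    """Append a to xs, creating a fresh list when none is supplied."""
    if xs is None:
        xs = []          # a new list on every call
    xs.append(a)
    return xs

first_call = g_safe(1)   # [1]
second_call = g_safe(2)  # [2], not [1, 2]
```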

In [None]:
h = lambda x, y, z: x**2 + y**2 + z**2

In [None]:
h(1,2,3)

In [None]:
from functools import lru_cache

In [None]:
def fib(n):
    print(n, end=', ')
    if n <= 1:
        return n
    else:
        return fib(n-2) + fib(n-1)

In [None]:
fib(10)

In [None]:
@lru_cache(maxsize=100)
def fib_cache(n):
    print(n, end=', ')
    if n <= 1:
        return n
    else:
        return fib_cache(n-2) + fib_cache(n-1)

In [None]:
fib_cache(10)
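`lru_cache` is a decorator from the standard library. Writing your own shows that a decorator is just a function that takes a function and returns a replacement, which works because functions are first-class objects. The sketch below uses a hypothetical `count_calls` decorator and `add` function.

```python
from functools import wraps

def count_calls(func):
    """Wrap func so that the number of calls is recorded."""
    @wraps(func)                    # preserve func's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0               # functions are objects and can carry attributes
    return wrapper

@count_calls                        # equivalent to: add = count_calls(add)
def add(a, b):
    return a + b

results = [add(1, 2), add(3, 4)]
```

After these two calls, `add.calls` is 2 and `add.__name__` is still `'add'` thanks to `wraps`.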

### Classes 

- Key idea is encapsulation into objects  
- Everything in Python is an object 
- Attributes and methods 
- What is self? 
- Special methods - double underscore methods 
- Avoid complex inheritance schemes - prefer composition 
- Learn “design patterns” if interested in OOP

In [None]:
(3.0).is_integer()

In [None]:
'hello world'.title()

In [None]:
class Student:
    def __init__(self, first, last):
        self.first = first
        self.last = last
        
    @property
    def name(self):
        return f'{self.first} {self.last}'    

In [None]:
s = Student('Santa', 'Claus')

In [None]:
s.name
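Special (double underscore) methods let your objects work with built-in syntax such as `+`, `==` and `repr()`. A minimal sketch follows, using a made-up `Vec2` class.

```python
class Vec2:
    """A tiny 2D vector used to illustrate special methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Used by repr() and when the object is displayed in a notebook
        return f'Vec2({self.x}, {self.y})'

    def __eq__(self, other):
        # Used by ==
        if not isinstance(other, Vec2):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        # Used by +
        if not isinstance(other, Vec2):
            return NotImplemented
        return Vec2(self.x + other.x, self.y + other.y)

total = Vec2(1, 2) + Vec2(3, 4)
```

Note that defining `__eq__` without `__hash__` makes instances unhashable, so they cannot be used as dictionary keys or set members.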

### Enums

Use enums for readability when you have a discrete set of constants.

In [None]:
from enum import Enum

In [None]:
class Day(Enum):
    MON = 1
    TUE = 2
    WED = 3
    THU = 4
    FRI = 5
    SAT = 6
    SUN = 7

In [None]:
for day in Day:
    print(day)
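Enum members can be looked up by value or by name, and they are not interchangeable with their underlying values. The sketch below uses a separate, hypothetical `Color` enum so that it does not clash with `Day`.

```python
from enum import Enum

class Color(Enum):
    RED = 1
    GREEN = 2

member_name = Color.RED.name        # 'RED'
member_value = Color.RED.value      # 1
by_value = Color(2)                 # Color.GREEN
by_name = Color['GREEN']            # Color.GREEN
is_singleton = Color(1) is Color.RED
equals_int = Color.RED == 1         # False: a plain Enum member is not an int
```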

### NamedTuple

In [None]:
from collections import namedtuple

In [None]:
Student = namedtuple('Student', ['name', 'email', 'age', 'gpa', 'species'])

In [None]:
abe = Student('Abraham Lincoln', 'abe.lincoln@gmail.com', 23, 3.4, 'Human')

In [None]:
abe.species

In [None]:
abe[1:4]

### Data Classes

Simplifies creation and use of classes for data records. 

Note: A namedtuple serves a similar function but is immutable.

In [None]:
from dataclasses import dataclass

In [None]:
@dataclass
class Student:
    name: str
    email: str
    age: int
    gpa: float
    species: str = 'Human'

In [None]:
abe = Student('Abraham Lincoln', 'abe.lincoln@gmail.com', age=23, gpa=3.4)

In [None]:
abe

In [None]:
abe.email

In [None]:
abe.species

**Note**

The type annotations are informative only. Python does *not* enforce them.

In [None]:
Student(*'abcde')

### Imports, modules and namespaces 

- A namespace is basically just a dictionary 
- LEGB 
- Avoid polluting the global namespace

In [None]:
import builtins  # __builtin__ is only available as an alias inside IPython

[x for x in dir(builtins) if x[0].islower()][:8]

In [None]:
x1 = 23

def f1(x2):
    print(locals())
    # Inside g: x1 is global (G), x2 is enclosing (E), x3 is local (L)
    def g(x3):
        print(locals())
        return x3 + x2 + x1 
    return g

In [None]:
x = 23

def f2(x):
    print(locals())
    def g(x):
        print(locals())
        return x 
    return g

In [None]:
g1 = f1(3)
g1(2)

In [None]:
g2 = f2(3)
g2(2)
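Two small sketches for the bullets above. The first shows that a module namespace really is a dictionary. The second shows how `nonlocal` rebinds a name in the enclosing (E) scope. `make_counter` and `increment` are hypothetical names.

```python
import math

# A module's namespace is exposed as a plain dict.
math_namespace = vars(math)
namespace_type = type(math_namespace).__name__    # 'dict'
same_object = math_namespace['pi'] is math.pi     # True

def make_counter():
    count = 0                 # lives in the enclosing scope of increment
    def increment():
        nonlocal count        # rebind the enclosing name, do not create a local
        count += 1
        return count
    return increment

counter = make_counter()
first, second = counter(), counter()
```

Without the `nonlocal` declaration, `count += 1` would raise `UnboundLocalError`, because assignment makes `count` local to `increment`.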

### Loops 

- Prefer vectorization unless using numba 
- Difference between continue and break 
- Avoid infinite loops 
- Comprehensions and generator expressions

In [None]:
import string

In [None]:
{char: ord(char) for char in string.ascii_lowercase}
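The cell above is a dictionary comprehension. The sketch below shows the list, set and generator forms, followed by the difference between `continue` and `break`. All names are illustrative.

```python
squares_list = [i * i for i in range(5)]    # list comprehension, built eagerly
remainders = {i % 3 for i in range(10)}     # set comprehension
squares_gen = (i * i for i in range(5))     # generator expression, lazy
first_square = next(squares_gen)            # values are produced on demand
total = sum(i * i for i in range(5))        # no intermediate list is built

seen = []
for i in range(10):
    if i % 2 == 0:
        continue      # skip the rest of this iteration, move on to the next i
    if i > 6:
        break         # leave the loop entirely
    seen.append(i)
```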

### Iterations and generators 

- The iterator protocol
    - `__iter__` and `__next__`
    - iter()
    - next()
- What happens in a for loop
- Generators with `yield` and `yield from`

In [None]:
class Iterator:
    """A silly class that implements the Iterator protocol and Strategy pattern.
    
    start = start of range (inclusive)
    stop = end of range (exclusive)
    func = function applied to each value in the range
    """
    def __init__(self, start, stop, func):
        self.start = start
        self.stop = stop
        self.func = func
        
    def __iter__(self):
        self.n = self.start
        return self
    
    def __next__(self):
        if self.n >= self.stop:
            raise StopIteration
        else:
            x = self.func(self.n)
            self.n += 1
            return x

In [None]:
sq = Iterator(0, 5, lambda x: x*x)

In [None]:
list(sq)
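A `for` loop is roughly shorthand for the following use of `iter()` and `next()`. This is a sketch of the protocol, not the interpreter's actual implementation, and the variable names are illustrative.

```python
values = [1, 2, 3]

iterator = iter(values)          # calls values.__iter__()
collected = []
while True:
    try:
        item = next(iterator)    # calls iterator.__next__()
    except StopIteration:        # signals that the iterator is exhausted
        break
    collected.append(item)       # the body of the for loop
```

This is why any object with `__iter__` and `__next__`, such as `Iterator` above, can be used directly in a `for` loop.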

### Generators

Like functions, but lazy.

In [None]:
def cycle1(xs, n):
    """Cycles through values in xs n times."""
    
    for i in range(n):
        for x in xs:
            yield x

In [None]:
list(cycle1([1,2,3], 4))

In [None]:
for x in cycle1(['ann', 'bob', 'stop', 'charles'], 1000):
    if x == 'stop':
        break
    else:
        print(x)

In [None]:
def cycle2(xs, n):
    """Cycles through values in xs n times."""
    
    for i in range(n):
        yield from xs

In [None]:
list(cycle2([1,2,3], 4))

Because they are lazy, generators can be used for infinite streams.

In [None]:
def fib():
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

In [None]:
for n in fib():
    if n > 100:
        break
    print(n, end=', ')

You can even slice infinite generators with `itertools.islice`. More on this when we cover functional programming.

In [None]:
import itertools as it

In [None]:
list(it.islice(fib(), 5, 10))