# Python for AI/ML: Review of concepts

This review highlights useful aspects of Python that you may have glossed over. It assumes you already know Python fairly well.

## Acknowledgments \& Credits

This lesson is adapted largely from the excellent curriculum materials by Cliburn Chan (2021) at <https://github.com/cliburn/bios-823-2021/> under the MIT License.

## Python as a language

### Why Python? 

Modules, executable notebooks, and tutorials exist in several programming languages widely used (or increasingly popular) in the life sciences (e.g., R, Julia, Rust). Why did we choose to focus on Python here? 
- Huge community - especially in data science and ML 
- Easy to learn 
- Batteries included 
- Extensive 3rd party libraries 
- Widely used in both industry and academia 
- Most important “glue” language bridging multiple communities

### Versions 

- Only use **Python 3**. Recent versions of ML frameworks (TensorFlow, PyTorch) require at least Python 3.8. (TensorFlow 1.x supports Python up to 3.7; TensorFlow 2.2+ supports Python 3.8.)
- Do not use Python 2.
- The container has Python 3.11:

In [None]:
import sys
sys.version
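
A script that depends on a minimum interpreter version can check `sys.version_info`, which compares like a tuple. The following is a minimal sketch; `MIN_VERSION` is a name chosen for illustration, and the 3.8 floor simply mirrors the framework note above.

```python
import sys

# Hypothetical minimum version, mirroring the framework requirement noted above.
MIN_VERSION = (3, 8)

# sys.version_info compares element by element, like a tuple.
if sys.version_info < MIN_VERSION:
    raise RuntimeError(f"Python {MIN_VERSION[0]}.{MIN_VERSION[1]}+ is required")

version_string = f"{sys.version_info.major}.{sys.version_info.minor}"
print(version_string)
```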

### Multi-paradigm 

#### Procedural

In [None]:
x = []
for i in range(5):
    x.append(i*i)
x

#### Functional

In [None]:
list(map(lambda x: x*x, range(5)))

#### Object-oriented 

In [None]:
class Robot:
    def __init__(self, name, function):
        self.name = name
        self.function = function
        
    def greet(self):
        return f"I am {self.name}, a {self.function} robot!"

In [None]:
fido = Robot('roomba', 'vacuum cleaner')

In [None]:
fido.name

In [None]:
fido.function

In [None]:
fido.greet()

### Dynamic typing 

#### Complexity of a + b 

In [None]:
1 + 2.3

In [None]:
type(1), type(1.0), type(2.3)

In [None]:
'hello' + ' world'

In [None]:
[1,2,3] + [4,5,6]

In [None]:
import numpy as np

np.arange(3) + 10

In [None]:
np.array([1,2,3]) + np.array([4,5,6])
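
What `a + b` does depends on the operand types: roughly, Python tries `a.__add__(b)` first and, if that returns `NotImplemented`, falls back to `b.__radd__(a)`. The sketch below uses a hypothetical `Vec2` class (not part of the lesson) to show how a user-defined type hooks into `+`.

```python
class Vec2:
    """Hypothetical 2D vector used only to illustrate operator dispatch."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for `self + other`.
        if isinstance(other, Vec2):
            return Vec2(self.x + other.x, self.y + other.y)
        # Returning NotImplemented lets Python try other.__radd__(self).
        return NotImplemented

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"


v = Vec2(1, 2) + Vec2(3, 4)
print(v)

# Incompatible operand types raise TypeError instead of guessing.
try:
    [1, 2, 3] + "abc"
except TypeError as err:
    print(f"TypeError: {err}")
```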

## Coding in Python

### Coding conventions 

- PEP 8
- Avoid magic numbers
- Avoid copy and paste
    - Extract common functionality into functions

[Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/)

### Data types 

#### Integers  
- Arbitrary precision 
- Integer division operator 
- Base conversion 
- Check if integer 

In [None]:
import math

In [None]:
n = math.factorial(100)

In [None]:
n

In [None]:
f'{n:,}'

In [None]:
sys.getsizeof(64), sys.getsizeof(2**31), sys.getsizeof(2**60)

In [None]:
sys.getsizeof(n)
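
The cells above cover arbitrary precision. The other bullets (integer division, base conversion, checking for integers) can be sketched with built-ins only; the variable names are illustrative.

```python
import math

# Integer division: // floors the result; divmod returns quotient and remainder.
q, r = divmod(17, 5)
floor_neg = -17 // 5          # floors toward negative infinity, not toward zero

# Base conversion: integer -> string, and string -> integer.
as_binary = bin(10)
as_hex = hex(255)
from_binary = int("1010", 2)
from_hex = int("ff", 16)

# Checking for integers: isinstance tests the type, is_integer tests a float's value.
is_int_type = isinstance(3, int)
float_is_whole = (3.0).is_integer()

# Arbitrary precision: the integer grows as needed instead of overflowing.
n_digits = len(str(math.factorial(100)))
```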

#### Floats
- Checking for equality 
- Roundoff error accumulation  

In [None]:
h = math.sqrt(3**2 + 4**2)

In [None]:
h

In [None]:
h.is_integer()

In [None]:
h == 5

In [None]:
x = np.arange(9).reshape(3,3)
x

In [None]:
x.sum(axis=0)

In [None]:
x = x / x.sum(axis=0)
z = np.linalg.eigvals(x)

In [None]:
z

In [None]:
z[0] == 1

In [None]:
z[0], z[1], z[2]

In [None]:
math.isclose(z[0], 1)
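
The same issue shows up with plain decimal fractions. A small sketch of why `==` is fragile for floats, and of how `math.isclose` and `math.fsum` help:

```python
import math

total = 0.1 + 0.2
exact_equal = (total == 0.3)              # False: 0.1 and 0.2 are not exact in binary
close_enough = math.isclose(total, 0.3)   # True: compares with a relative tolerance

# Near zero a relative tolerance alone never matches; pass abs_tol as well.
near_zero_default = math.isclose(1e-12, 0.0)
near_zero_abs = math.isclose(1e-12, 0.0, abs_tol=1e-9)

# Roundoff accumulates in a plain sum; math.fsum tracks the lost low-order bits.
acc = sum([0.1] * 10)
acc_exact = math.fsum([0.1] * 10)
```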

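Sample variance:
$$\operatorname{var}(X)=\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N-1}; \quad \mu=\frac{1}{N}\sum_{i=1}^{N}x_i$$

Short-cut formula that avoids first calculating the sample mean $\mu$. It needs only one pass over the data, but it subtracts two large, nearly equal numbers, so it can lose precision badly, as the example below shows:
$$\operatorname{var}(X)=\frac{\sum_{i=1}^{N}x_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N}x_i\right)^2}{N-1}$$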

In [None]:
def var(xs):
    """Returns variance of sample data."""
    
    n = 0
    s = 0
    ss = 0

    for x in xs:
        n += 1
        s += x
        ss += x*x

    v = (ss - (s*s)/n)/(n-1)
    return v

In [None]:
xs = np.random.normal(1e9, 1, int(1e6))
xs, type(xs[0])

In [None]:
var(xs)

In [None]:
np.var(xs, ddof=1)  # ddof=1 uses the N-1 denominator, matching var() above
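
One common way to keep a single pass without the cancellation problem is Welford's update, which tracks a running mean and a running sum of squared deviations. This is a sketch of that technique, not the lesson's method; `var_welford` and `samples` are illustrative names, and the standard library's `statistics.variance` serves as the reference.

```python
import random
import statistics


def var_welford(xs):
    """Single-pass sample variance using Welford's update."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)


random.seed(0)
# Large mean, small spread: the case that breaks the short-cut formula.
samples = [random.gauss(1e9, 1) for _ in range(10_000)]

v_stable = var_welford(samples)
v_reference = statistics.variance(samples)
print(v_stable, v_reference)
```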

#### Boolean 
- What evaluates as False? 

In [None]:
stuff = [[], [1], {},'', 'hello', 0, 1, 1==1, 1==2]
for s in stuff:
    if s:
        print(f'{s} evaluates as True')
    else:
        print(f'{s} evaluates as False')

#### String 
- Unicode by default 
- b, r, f strings

In [None]:
u'\u732b'

In [None]:
s = 'ACGT'
type(s), list(s)

In [None]:
s = b'ACGT'
type(s), list(s)

In [None]:
r"C:\Users\Name\Documents\file.txt"

In [None]:
print("String with newline\nThis is after the newline")
print(r"String with newline\nThis is after the newline")

String formatting

- Learn to use the f-string.

In [None]:
import string

In [None]:
char = 'e'
pos = string.ascii_lowercase.index(char) + 1
f"The letter {char} has position {pos} in the alphabet"

In [None]:
n = int(1e9)
f"{n:,}"

In [None]:
x = math.pi

In [None]:
f"{x:8.2f}"

In [None]:
import datetime
now = datetime.datetime.now()
now

In [None]:
f"{now:%Y-%m-%d %H:%M}"

### Data structures 

- Immutable - string, tuple 
- Mutable - list, set, dictionary 
- Collections module 
- heapq 

In [None]:
import collections

[x for x in dir(collections) if not x.startswith('_')]

In [None]:
d = {'a': 1, 'b': 'foo', 'c': 1e-6}
d

In [None]:
kw = {'b': 'bar', 'z': 5}
{**d, **kw}

In [None]:
kw2 = kw
kw2['z'] = 10
kw
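
The cell above changes `kw` through `kw2` because assignment binds a second name to the same dictionary. A short sketch of the difference between an alias, a shallow copy, and a deep copy (the `settings` dictionary is made up for illustration):

```python
import copy

settings = {"b": "bar", "z": 5, "nested": [1, 2]}

alias = settings                  # same object; no copy is made
shallow = settings.copy()         # new dict, but nested objects are shared
deep = copy.deepcopy(settings)    # fully independent copy

alias["z"] = 10                   # visible through `settings`
shallow["nested"].append(3)       # also visible: the inner list is shared
deep["nested"].append(99)         # not visible: deep owns its own list
```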

In [None]:
l = [1,2,3]
l

In [None]:
[*l]
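
The bullet list mentions the `collections` module and `heapq` without showing them in use. A brief sketch of a few commonly used pieces; the example data are made up.

```python
import heapq
from collections import Counter, defaultdict, deque

# Counter: tally hashable items.
base_counts = Counter("ACGTTGCAAT")

# defaultdict: missing keys are created with the factory's default value.
positions = defaultdict(list)
for i, base in enumerate("ACGA"):
    positions[base].append(i)

# deque: fast appends and pops at both ends; maxlen gives a sliding window.
window = deque(maxlen=3)
for i in range(5):
    window.append(i)

# heapq: maintain a min-heap inside a plain list.
heap = [5, 1, 4, 2]
heapq.heapify(heap)
smallest = heapq.heappop(heap)
top_two = heapq.nlargest(2, [5, 1, 4, 2])
```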

### Functions 

- \*args, \*\*kwargs 
- Care with mutable default values 
- First class objects 
- Anonymous functions 
- Decorators

In [None]:
def f(*args, c=1, **kwargs):
    print(f"f() c = {c}")
    print(f"f() args = {args}")
    print(f"f() kwargs = {kwargs}")
    return g(*args, **kwargs)

def g(*args, **kwargs):
    print(f"g() args = {args}")
    print(f"g() kwargs = {kwargs}")

In [None]:
f(1,2,3,a=4,b=5,c=6)

In [None]:
l = [4,5,6]
k = {'c':10, 'a': 4, 'b': 5}
f(l, k)

In [None]:
f(*l, **k)

In [None]:
def g(a, xs=[]):
    xs.append(a)
    return xs

In [None]:
g(1)

In [None]:
g(2)
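
The second call returns `[1, 2]` because the default list is created once, when the function is defined, and then shared across calls. The usual idiom is to default to `None` and build the list inside the function. In this sketch `append_to` is an illustrative name.

```python
def append_to(a, xs=None):
    """Append a to xs, creating a fresh list when xs is not given."""
    if xs is None:
        xs = []          # a new list on every call
    xs.append(a)
    return xs


first = append_to(1)
second = append_to(2)
```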

In [None]:
h = lambda x, y, z: x**2 + y**2 + z**2

In [None]:
h(1,2,3)

In [None]:
class VerboseFunc:
    def __init__(self, func):
        self.func = func

    def __call__(self, *args):
        print(f'called with args = {args} for function {str(self.func)}')
        return self.func(*args)


In [None]:
verbose_prod = VerboseFunc(lambda x, y: x * y)
verbose_prod(4,5)
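
The wrapper class above is applied by hand. The same idea is more often written as a function decorator applied with `@`. This is a minimal sketch with illustrative names `verbose` and `prod`, using `functools.wraps` to preserve the wrapped function's metadata.

```python
import functools


def verbose(func):
    """Decorator that reports each call before delegating to func."""
    @functools.wraps(func)            # keep func's name and docstring
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__} with args={args}, kwargs={kwargs}")
        return func(*args, **kwargs)
    return wrapper


@verbose                              # same as: prod = verbose(prod)
def prod(x, y):
    """Return the product of x and y."""
    return x * y


result = prod(4, 5)
```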

### Classes 

- Key idea is encapsulation into objects  
- Everything in Python is an object 
- Attributes and methods 
- What is self? 
- Special methods - double underscore methods 
- Avoid complex inheritance schemes - prefer composition 
- Learn “design patterns” if interested in OOP

In [None]:
(3.0).is_integer()

In [None]:
'hello world'.title()

In [None]:
class Student:
    def __init__(self, first, last):
        self.first = first
        self.last = last

    @property
    def name(self):
        return f'{self.first} {self.last}'

    @staticmethod
    def is_student(obj):
        return isinstance(obj, Student)

    @classmethod
    def fromlist(cls, l):
        print(cls)
        while len(l) < 2:
            l = l + ['Nameless']
        return cls(*l)

In [None]:
s = Student('Santa', 'Claus')

In [None]:
s.name

In [None]:
Student.is_student(s)

In [None]:
s2 = Student.fromlist(['Santa','Claus'])
type(s2), s2.name

In [None]:
class GraduateStudent(Student):
    def __init__(self, first, last, program='Masters'):
        super().__init__(first,last)
        self.program = program

    @property
    def name(self):
        return super().name + f' ({self.program} program)'


In [None]:
s3 = GraduateStudent.fromlist(['Santa','Claus','PhD'])

In [None]:
s3.name

In [None]:
Student.is_student(s3), GraduateStudent.is_student(s3)

In [None]:
GraduateStudent.fromlist(['John']).name
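
Two bullets above have no example yet: special (double underscore) methods and composition. A sketch of both with a hypothetical `Course` class, which *has* a list of students rather than inheriting from `list`:

```python
class Course:
    """Hypothetical class that holds students (composition) instead of subclassing list."""

    def __init__(self, title):
        self.title = title
        self._students = []           # the composed object

    def enroll(self, name):
        self._students.append(name)

    # Special methods let built-ins and operators work on Course objects.
    def __len__(self):                # len(course)
        return len(self._students)

    def __contains__(self, name):     # name in course
        return name in self._students

    def __iter__(self):               # for name in course
        return iter(self._students)

    def __repr__(self):               # repr(course), and the notebook display
        return f"Course({self.title!r}, n_students={len(self)})"


course = Course("Python review")
course.enroll("Santa Claus")
course.enroll("John Nameless")
print(course)
```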

### Enums

Use enums for readability when you have a discrete set of constants.

In [None]:
from enum import Enum

In [None]:
class Day(Enum):
    MON = 1
    TUE = 2
    WED = 3
    THU = 4
    FRI = 5
    SAT = 6
    SUN = 7

In [None]:
from collections.abc import Iterable
type(Day), isinstance(Day, Iterable)

In [None]:
for day in Day:
    print(day)

In [None]:
[str(day) for day in Day]
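
Besides iteration, enum members can be looked up by name or by value, and each member is a singleton. This sketch uses a shortened, hypothetical `Weekday` enum so that it does not overwrite `Day`.

```python
from enum import Enum


class Weekday(Enum):
    MON = 1
    TUE = 2
    WED = 3


by_name = Weekday["TUE"]                      # lookup by member name
by_value = Weekday(3)                         # lookup by value
same_member = Weekday.MON is Weekday["MON"]   # members are singletons
label = f"{by_name.name}={by_name.value}"

# An unknown value raises ValueError instead of returning a bogus constant.
try:
    Weekday(9)
except ValueError:
    unknown_rejected = True
```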

### NamedTuple

In [None]:
from collections import namedtuple

In [None]:
Student = namedtuple('Student', ['name', 'email', 'age', 'gpa', 'species'])

In [None]:
abe = Student('Abraham Lincoln', 'abe.lincoln@gmail.com', 23, 3.4, 'Human')

In [None]:
abe.species

In [None]:
abe[1:4]
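
A namedtuple is still a tuple, so its fields cannot be reassigned. To change a field you build a new tuple with `_replace`. The `Point` type in this sketch is made up for illustration.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])   # hypothetical record type
p = Point(1, 2)

# Fields cannot be reassigned.
try:
    p.x = 10
except AttributeError:
    mutated = False
else:
    mutated = True

p2 = p._replace(x=10)     # returns a new tuple; p is unchanged
as_dict = p._asdict()     # field names mapped to values
px, py = p                # unpacks like a plain tuple
```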

### Data Classes

Simplifies creation and use of classes for data records. 

Note: A namedtuple serves a similar function but is immutable.

In [None]:
from dataclasses import dataclass

In [None]:
@dataclass
class Student:
    name: str
    email: str
    age: int
    gpa: float
    species: str = 'Human'

In [None]:
abe = Student('Abraham Lincoln', 'abe.lincoln@gmail.com', age=23, gpa=3.4)

In [None]:
abe

In [None]:
abe.email

In [None]:
abe.species

**Note**

The type annotations are informative only. Python does *not* enforce them.

In [None]:
Student(*'abcde')
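
If the types matter, one option is to check them yourself in `__post_init__`, which a dataclass calls at the end of the generated `__init__`. Passing `frozen=True` additionally makes instances read-only. This is a sketch with a hypothetical `Record` class, not a general validation scheme.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Record:
    """Hypothetical record with an explicit runtime type check."""
    name: str
    age: int

    def __post_init__(self):
        # Annotations are not enforced, so check explicitly where it matters.
        if not isinstance(self.age, int):
            raise TypeError(f"age must be int, got {type(self.age).__name__}")


ok = Record("Abe", 23)

try:
    Record("Abe", "twenty-three")
except TypeError as err:
    type_error = str(err)

try:
    ok.age = 24                 # frozen=True makes instances read-only
except FrozenInstanceError:
    frozen = True
```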

### Imports, modules and namespaces 

- A namespace is basically just a dictionary 
- Name lookup order: **L**ocal, **E**nclosing, **G**lobal, **B**uilt-in (LEGB)
- Avoid polluting the global namespace

In [None]:
import builtins
[x for x in dir(builtins) if x[0].islower()][:8]

In [None]:
x1 = 23

def f1(x2):
    print(locals())
    # Inside g: x1 is global (G), x2 is enclosing (E), x3 is local (L)
    def g(x3):
        print(locals())
        return x3 + x2 + x1 
    return g

In [None]:
x = 23

def f2(x):
    print('enclosing locals:', locals())
    def g(x):
        print('enclosed locals:', locals())
        return x 
    return g

In [None]:
g1 = f1(3)
g1(2)

In [None]:
g2 = f2(3)
g2(2)
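
Reading a name follows the LEGB order, but assigning to a name creates a new local unless you say otherwise. A sketch of `nonlocal`, which rebinds a variable in the enclosing scope; `make_counter` is an illustrative name.

```python
count = 0                      # global (G); untouched by the code below


def make_counter():
    count = 0                  # enclosing (E) scope for the inner function

    def increment():
        nonlocal count         # rebind the enclosing variable, not a new local
        count += 1
        return count

    return increment


counter = make_counter()
values = [counter(), counter(), counter()]
```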

### Loops 

- Prefer vectorization unless using [Numba](https://numba.pydata.org) 
- Difference between continue and break 
- Avoid infinite loops 
- Comprehensions and generator expressions

In [None]:
import string

In [None]:
{char: ord(char) for char in string.ascii_lowercase}
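
Two more bullets are quick to illustrate: generator expressions (a lazy counterpart to list comprehensions) and the difference between `continue` and `break`.

```python
squares_list = [i * i for i in range(5)]     # list comprehension: built eagerly
squares_gen = (i * i for i in range(5))      # generator expression: lazy
total = sum(i * i for i in range(5))         # no intermediate list is created

seen = []
for i in range(10):
    if i % 2 == 0:
        continue        # skip the rest of this iteration
    if i > 6:
        break           # leave the loop entirely
    seen.append(i)
```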

### Iterators and generators 

- The iterator protocol
    - `__iter__` and `__next__`
    - iter()
    - next()
- What happens in a for loop
- Generators with `yield` and `yield from`

In [None]:
class Iterator:
    """A silly class that implements the Iterator protocol and Strategy pattern.
    
    start = start of range to apply func to
    stop = end of range to apply func to
    """
    def __init__(self, start, stop, func):
        self.start = start
        self.stop = stop
        self.func = func
        
    def __iter__(self):
        self.n = self.start
        return self
    
    def __next__(self):
        if self.n >= self.stop:
            raise StopIteration
        else:
            x = self.func(self.n)
            self.n += 1
            return x

In [None]:
sq = Iterator(0, 5, lambda x: x*x)

In [None]:
list(sq)
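
A `for` loop is roughly shorthand for calling `iter()` once and then `next()` repeatedly until `StopIteration` is raised. Here is that expansion written out by hand, for illustration only:

```python
items = ["a", "b", "c"]

collected = []
iterator = iter(items)              # calls items.__iter__()
while True:
    try:
        item = next(iterator)       # calls iterator.__next__()
    except StopIteration:
        break                       # a for loop ends silently at this point
    collected.append(item)
```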

### Generators

Generators look like functions but are lazy: each value is computed only when the consumer asks for it.

In [None]:
def cycle1(xs, n):
    """Cycles through values in xs n times."""
    
    for i in range(n):
        for x in xs:
            yield x

In [None]:
list(cycle1([1,2,3], 4))

In [None]:
for x in cycle1(['ann', 'bob', 'stop', 'charles'], 1000):
    if x == 'stop':
        break
    else:
        print(x)

In [None]:
def cycle2(xs, n):
    """Cycles through values in xs n times."""
    
    for i in range(n):
        yield from xs

In [None]:
list(cycle2([1,2,3], 4))

Because they are lazy, generators can be used for infinite streams.

In [None]:
def fib():
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

In [None]:
for n in fib():
    if n > 100:
        break
    print(n, end=', ')

You can even slice infinite generators. More on this when we cover functional programming.

In [None]:
import itertools as it

In [None]:
list(it.islice(fib(), 5, 10))