$
\newcommand{\nc}{\newcommand} 
\nc{\t}{\text}
\nc{\tb}{\textbf}
\nc{\ti}{\textit}
\nc{\x}{\boldsymbol{x}}
\nc{\y}{\boldsymbol{y}}
\nc{\z}{\boldsymbol{z}}
$

# $$\textbf{Python Programming for Machine Learning} $$ 

## $$\textbf{Advanced Python}$$

#### $$\text{Winter Semester 2022/23}$$

#### $$\text{Sergej Dogadov and Panagiotis Tomer Karagianis}$$

<center>
<img src='images/pyt.png' width=450>

## $$\textbf{Inheritance with Python}$$
<hr>

* $\textbf{Inheritance} \text{ is when a class reuses code written in another class.}$

In [1]:
# First we need to define a Parent class
class Person: 
    
    def __init__(self, first_name, last_name): # constructor
        
        self.first_name = first_name
        self.last_name = last_name
        
    def __repr__(self): # object representation
        return f'Person: {self.first_name} {self.last_name}'
    
    def __call__(self):
        # invoked when the object itself is called like a function: me()
        return self.talk()
    
    def talk(self): # method
        return f'Hello. My name is {self.first_name} {self.last_name}. '

In [2]:
me = Person('Sergej', 'Dogadov')

# __repr__ function is called
me

Person: Sergej Dogadov

In [3]:
# Calling the object as a function
me() 

# or me.talk() explicit call

'Hello. My name is Sergej Dogadov. '

## $$\textbf{Child class}$$
<hr>

In [4]:
class Student(Person): # Child class
    
    def __init__(self, first_name, last_name, mat_number, university):
        
        Person.__init__(self, first_name, last_name) # Parent constructor
        self.mat_number = mat_number
        self.university = university
        self.modules = []
        self.credits = 0 # ECTS
        self.notes = [] # [3, 4, 1, 2, 2]
        
    def __repr__(self):
        info = f'{self.first_name} {self.last_name}\nStudent: {self.university} {self.mat_number}'
        
        if len(self.modules) > 0:      
            classes = ', '.join(self.modules)
            info += f'\nCredits: {self.credits} ECTS in {classes} ' + \
                f'avg note: {sum(self.notes)/len(self.notes):0.1f}'
        return  info
        
    def talk(self):
        # parent's method call
        return super().talk() + f"I'm studying at {self.university}. " + \
            f"My matriculation number is {self.mat_number}"
    
    def exam(self, module_name: str, credit: int, note: float) -> None:
        """
            Adds exam info and credits 
        """
        self.modules += [module_name]
        self.credits += credit
        self.notes += [note]

## $$\textbf{Student class usage}$$
<hr>

In [5]:
#object creation
anna = Student('Anna', 'Mustermann', 4345325, 'TU Berlin')

# function talk is invoked
anna.talk() # or just anna()

"Hello. My name is Anna Mustermann. I'm studying at TU Berlin. My matriculation number is 4345325"

In [6]:
anna.exam('CS', 6, 1.7)
anna.exam('BIO', 12, 2.7)

print(anna)

Anna Mustermann
Student: TU Berlin 4345325
Credits: 18 ECTS in CS, BIO avg note: 2.2
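The parent/child relationship can also be checked at runtime. The sketch below uses trimmed-down stand-ins (`MiniPerson` and `MiniStudent` are hypothetical names, not part of the lecture code) to illustrate that a child instance also counts as an instance of its parent, and that an overridden method can extend the parent's result via `super()`.

```python
class MiniPerson:
    def __init__(self, name):
        self.name = name

    def talk(self):
        return f"Hello. My name is {self.name}. "


class MiniStudent(MiniPerson):
    def __init__(self, name, university):
        super().__init__(name)  # equivalent to MiniPerson.__init__(self, name)
        self.university = university

    def talk(self):
        # extend, rather than replace, the parent's behaviour
        return super().talk() + f"I'm studying at {self.university}."


anna = MiniStudent("Anna", "TU Berlin")
print(isinstance(anna, MiniPerson))         # True
print(issubclass(MiniStudent, MiniPerson))  # True
print(anna.talk())
```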


## $$\textbf{Multiple class inheritance}$$
<hr>

In [7]:
class Employee:
    
    def __init__(self, company, position, salary):
        
        self.company = company
        self.position = position
        self.salary = salary
        
    def __repr__(self):
        return f"Employee: {self.company} as {self.position}"

In [8]:
class HiWi(Student, Employee):

    def __init__(self, first_name, last_name, mat_number, university, salary):
        
        Student.__init__(self, first_name, last_name, mat_number, university)
        Employee.__init__(self, university, 'HiWi', salary)
        
    def __repr__(self):
        
        info = Student.__repr__(self) # Note: super().__repr__() would also resolve to Student, the first base class in the MRO
        info += f'\nPosition: {self.position} with salary {self.salary}$'
        return info + f'\n{Employee.__repr__(self)}'

In [9]:
me = HiWi('Sergej','Dogadov', 123456, 'TU Berlin', 1000)
me.exam('BIO', 6, 2.7)
me.exam('PyML', 3, 1.7)
me

Sergej Dogadov
Student: TU Berlin 123456
Credits: 9 ECTS in BIO, PyML avg note: 2.2
Position: HiWi with salary 1000$
Employee: TU Berlin as HiWi
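With several base classes, Python looks up methods along the method resolution order (MRO). The following is a minimal sketch with hypothetical stand-in classes (`StudentSketch`, `EmployeeSketch`, `HiWiSketch`) showing why `super()` reaches only the first base class and the second one is addressed by name.

```python
class StudentSketch:
    def describe(self):
        return "student"


class EmployeeSketch:
    def describe(self):
        return "employee"


class HiWiSketch(StudentSketch, EmployeeSketch):
    def describe(self):
        # super() follows the MRO, so it resolves to StudentSketch here
        first = super().describe()
        # the second base class has to be addressed explicitly
        second = EmployeeSketch.describe(self)
        return f"{first} and {second}"


mro_names = [cls.__name__ for cls in HiWiSketch.__mro__]
print(mro_names)
print(HiWiSketch().describe())
```

The order of the base classes in the class statement determines the MRO, so swapping them would make `super()` resolve to `EmployeeSketch` instead.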

## $$\textbf{Advanced Python}$$
<hr>  


<center>
<img src='images/apyt.png' width=350>

### $$\textbf{Python – Database Manager (dbm)}$$
<hr> 

* $\text{Write a class to handle data with the} \textit{ dbm } \text{database package.}$

In [10]:
import dbm, os, pickle

class DBM(object):

    def __init__(self, name, folder='./db'):
        os.makedirs(folder, exist_ok=True)
        self.folder = folder
        
        self.filepath = os.path.join(self.folder, name)

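    def add_(self, key, value):
        # 'c': open for reading and writing, creating the file if needed
        with dbm.open(self.filepath, 'c') as db:
            db[key] = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)

    def get_(self, key):
        # 'r': open read-only; returns None for a missing key
        with dbm.open(self.filepath, 'r') as db:
            try:
                res = db[key]
                return pickle.loads(res)
            except KeyError:
                return None
            
    def delete_(self, key):
        # raises KeyError if the key does not exist
        with dbm.open(self.filepath, 'c') as db:
            del db[key]
            
    def keys_(self):
        # keys are returned as bytes, e.g. b'student:123'
        with dbm.open(self.filepath, 'r') as db:
            return db.keys()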

### $$\textbf{Init a new database to handle students}$$
<hr>  

In [11]:
tub = DBM("TU Berlin")

In [12]:
mat_number = 123
anna = Student('Anna', 'Mustermann', mat_number, 'TU Berlin')

tub.add_(f"student:{mat_number}", anna) # key can contain table information

In [13]:
tub.get_('student:123')

Anna Mustermann
Student: TU Berlin 123

In [17]:
mat_number = 456
sergej = HiWi('Sergej','Dogadov', mat_number, 'TU Berlin', 1000)

tub.add_(f"hiwi:{mat_number}", sergej)

In [18]:
tub.get_('hiwi:456') 

Sergej Dogadov
Student: TU Berlin 456
Position: HiWi with salary 1000$
Employee: TU Berlin as HiWi

In [19]:
tub.keys_()

[b'student:123', b'hiwi:456']
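The same round trip can be tried without the wrapper class. This is a minimal sketch that stores plain dictionaries in a temporary folder; the file name `demo_db` and the records are made up for illustration. It shows that keys come back as bytes and that values have to be unpickled explicitly.

```python
import dbm
import os
import pickle
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo_db")  # hypothetical file name

    # 'c' creates the database file if it does not exist yet
    with dbm.open(path, "c") as db:
        db["student:1"] = pickle.dumps({"name": "Anna", "credits": 18})
        db["student:2"] = pickle.dumps({"name": "Ben", "credits": 6})

    with dbm.open(path, "r") as db:
        stored_keys = sorted(db.keys())       # keys come back as bytes
        anna = pickle.loads(db["student:1"])  # values must be unpickled
        has_missing = b"student:99" in db     # membership test for a missing key

print(stored_keys)
print(anna)
print(has_missing)
```

Note that unpickling instances of custom classes such as `Student` only works if the class definition is available when the value is loaded.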

### $$\textbf{Generators}$$
<hr>  

$\t{Generators are special functions whose execution can be paused and resumed.} $

In [20]:
# Infinite counter as a generator
def counter():
    
    print('Counter initialized')
    
    n = 0
    while True:
        
        yield n # like return, but execution resumes here on the next call
        n += 1

In [21]:
# calling the function returns a generator object; the body has not run yet
counter()

<generator object counter at 0x7fe34c49eb30>

In [22]:
import sys
sys.getsizeof(counter()) # small constant size, although it can yield infinitely many numbers

112

### $$\textbf{Generators cont'd}$$
<hr>  


* $\t{Getting the next element}$

In [23]:
cnt_gen = counter()
next(cnt_gen)

Counter initialized


0

In [24]:
next(cnt_gen)

1

* $\t{Iterating over the generator object}$

In [25]:
for i in counter():
    if i < 5:
        print(i)
    else:
        break # stop the loop, otherwise the generator would keep yielding forever

Counter initialized
0
1
2
3
4
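An alternative to the explicit `break` is `itertools.islice`, which takes a fixed number of items from any iterator. The sketch below assumes a silent variant of the counter (`quiet_counter` is a hypothetical helper, identical to `counter` except for the missing `print`).

```python
from itertools import islice


def quiet_counter():
    """Infinite counter without the print call (hypothetical helper)."""
    n = 0
    while True:
        yield n
        n += 1


# take only the first five values from the infinite generator
first_five = list(islice(quiet_counter(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```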


### $$\textbf{Generator expressions (comprehension syntax in parentheses)}$$
<hr>  


In [16]:
gen_obj = (x**2 for x in range(100_000) if x % 10 == 0)
gen_obj

<generator object <genexpr> at 0x7fa860195200>

In [17]:
next(gen_obj), next(gen_obj), next(gen_obj)

(0, 100, 400)

In [18]:
gen_list = [x**2 for x in range(100_000) if x % 10 == 0]

In [19]:
sys.getsizeof(gen_obj), sys.getsizeof(gen_list)

(112, 87616)

* $\t{In which use cases could generators be useful?}$
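One possible answer, relevant for machine learning, is streaming data in mini-batches so that the full dataset never has to be materialized as a list of batches. The following is only a sketch; `batches` is a hypothetical helper name.

```python
def batches(samples, batch_size):
    """Yield consecutive lists of at most batch_size items."""
    batch = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the last, possibly smaller batch
        yield batch


result = list(batches(range(7), batch_size=3))
print(result)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Because `samples` is only iterated once and never indexed, it can itself be a generator, for example one that reads lines from a large file.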

### $$\textbf{Decorator to benchmark a function}$$
<hr>  

In [20]:
# Define a decorator which wraps a custom function

def benchmark(func):
    
    from time import time # time() returns the current time in seconds
    
    def wrapper(*args, **kwargs):
      
        start = time() # start: seconds elapsed since the Unix epoch (1970-01-01 00:00 UTC)
        res = func(*args, **kwargs)
        end = time() # end of the measurement
        
        ms = (end - start) * 1000
        print(f"Elapsed time: {ms:0.6f} ms")

        return res
    
    return wrapper

In [21]:
@benchmark
def sum_up(n, step=1):
    cnt = 0
    for i in range(n):
        if i % step == 0:
            cnt += i
    return cnt 

In [22]:
#execute the fn
res = sum_up(10_000, step=5)
print(res)

Elapsed time: 0.549555 ms
9995000
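A decorated function normally loses its own name and docstring, because the name now refers to `wrapper`. The variant below is a sketch, not the lecture's implementation: it assumes `functools.wraps` to preserve the metadata and `time.perf_counter` as the clock, and the names `benchmark_wraps` and `sum_multiples` are hypothetical.

```python
from functools import wraps
from time import perf_counter


def benchmark_wraps(func):
    @wraps(func)  # copy __name__, __doc__ etc. from func to wrapper
    def wrapper(*args, **kwargs):
        start = perf_counter()  # clock intended for measuring short intervals
        res = func(*args, **kwargs)
        ms = (perf_counter() - start) * 1000
        print(f"{func.__name__} took {ms:0.6f} ms")
        return res
    return wrapper


@benchmark_wraps
def sum_multiples(n, step=1):
    """Sum all i in range(n) that are divisible by step."""
    return sum(i for i in range(n) if i % step == 0)


total = sum_multiples(10_000, step=5)
print(total, sum_multiples.__name__)
```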


### $$\textbf{Caching}$$
<hr>  

$\t{If a function is being executed many times and }$

 * $\t{it takes a long time to return the results,}$ 

 * $\t{it produces the same results for the same inputs.}$

$\t{then we might cache the results to improve performance.} $

In [23]:
from functools import lru_cache

@lru_cache(maxsize=10) # keep up to 10 most recently used results
def sum_up(n, step=1):
    cnt = 0
    for i in range(n):
        if i % step == 0:
            cnt += i
    return cnt 

In [24]:
@benchmark
def run(*args, **kwargs):
    res = sum_up(*args, **kwargs)
    print(sum_up.cache_info())
    return res

### $$\textbf{Running the cached function}$$
<hr>  

In [25]:
run(10_000, step=20)

CacheInfo(hits=0, misses=1, maxsize=10, currsize=1)
Elapsed time: 0.583887 ms


2495000

In [26]:
run(10_000, step=25)

CacheInfo(hits=0, misses=2, maxsize=10, currsize=2)
Elapsed time: 0.566483 ms


1995000
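Both calls above use different arguments, so each one is counted as a miss. A cache hit only occurs when the same arguments are passed again, which the following sketch illustrates (`slow_sum` is a hypothetical stand-in for the cached function).

```python
from functools import lru_cache


@lru_cache(maxsize=10)
def slow_sum(n, step=1):
    # hypothetical stand-in for an expensive, deterministic function
    return sum(i for i in range(n) if i % step == 0)


slow_sum(10_000, step=20)  # first call: computed, counted as a miss
slow_sum(10_000, step=20)  # identical arguments: served from the cache
info = slow_sum.cache_info()
print(info)
```

The cache key is built from the arguments as they were passed, so arguments must be hashable, and a positional call such as `slow_sum(10_000, 20)` is stored separately from the keyword form.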

### $$\textbf{Combinatorics}$$
<hr>  

* $\t{All unique combinations (without replacement)}$

In [27]:
import itertools as it

lst = [1, 0, 2]

[i for i in it.combinations(lst, r=2)]

[(1, 0), (1, 2), (0, 2)]

* $\t{All combinations with replacement}$

In [28]:
[i for i in it.combinations_with_replacement(lst, r=3)]

[(1, 1, 1),
 (1, 1, 0),
 (1, 1, 2),
 (1, 0, 0),
 (1, 0, 2),
 (1, 2, 2),
 (0, 0, 0),
 (0, 0, 2),
 (0, 2, 2),
 (2, 2, 2)]

 * $\t{All permutations}$

In [29]:
[ i for i in it.permutations(lst)]

[(1, 0, 2), (1, 2, 0), (0, 1, 2), (0, 2, 1), (2, 1, 0), (2, 0, 1)]

* $\t{Cartesian product } A \times B$

In [30]:
[ i for i in it.product(lst, lst)]

[(1, 1), (1, 0), (1, 2), (0, 1), (0, 0), (0, 2), (2, 1), (2, 0), (2, 2)]
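The lengths of the four results follow the usual counting formulas. As a quick sanity check, this sketch compares them with `math.comb` and `math.factorial` for the same three-element list.

```python
import itertools as it
from math import comb, factorial

items = [1, 0, 2]
n = len(items)

n_comb = len(list(it.combinations(items, r=2)))
n_comb_rep = len(list(it.combinations_with_replacement(items, r=3)))
n_perm = len(list(it.permutations(items)))
n_prod = len(list(it.product(items, items)))

print(n_comb == comb(n, 2))              # n choose r
print(n_comb_rep == comb(n + 3 - 1, 3))  # (n + r - 1) choose r
print(n_perm == factorial(n))            # n!
print(n_prod == n * n)                   # |A| * |B|
```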

### $$\textbf{Scikit-learn package}$$
<hr>  

<br>
<center>
<img src='./images/sklearn.png'  width='700'>
    

### $$\textbf{Standard Scaler}$$
<hr>  

In [70]:
import numpy as np

# conda install -c anaconda scikit-learn
from sklearn.preprocessing import StandardScaler


$$z_i = \frac{ x_i - \mu }{\sigma}, \quad \mu = \frac{1}{N}\sum_{i=1}^N x_i, \quad \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - \mu)^2} $$

In [71]:
x = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])

scaler = StandardScaler(with_mean=True, with_std=True)
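x = x.reshape(-1, 1) # reshape into a column vector: the scaler expects shape (n_samples, n_features)
print(x.shape)

scaler.fit(x); # "training": estimates mean and standard deviation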

(10, 1)


In [72]:
# mu
print(scaler.mean_)

# sigma
print(scaler.scale_)

[0.9]
[0.83066239]


In [73]:
z = scaler.transform(x) # append .squeeze() to get a flat array instead of a column vector
z, z.mean(), z.std()

(array([[-1.08347268],
        [-1.08347268],
        [-1.08347268],
        [-1.08347268],
        [ 0.12038585],
        [ 0.12038585],
        [ 0.12038585],
        [ 1.32424438],
        [ 1.32424438],
        [ 1.32424438]]),
 0.0,
 1.0)
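The numbers above can be reproduced directly from the standardization formula. The sketch below uses only the standard library (an assumption made to keep it dependency-free); note that the population standard deviation is used, which divides by $N$ rather than $N - 1$.

```python
from math import isclose
from statistics import fmean, pstdev

xs = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]

mu = fmean(xs)      # arithmetic mean
sigma = pstdev(xs)  # population standard deviation (divides by N, not N - 1)
zs = [(x - mu) / sigma for x in xs]

print(mu, sigma)
# after scaling, the data has (numerically) zero mean and unit standard deviation
print(isclose(fmean(zs), 0.0, abs_tol=1e-12), isclose(pstdev(zs), 1.0))
```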

### $$\textbf{Min Max Scaler}$$
<hr>  

In [83]:
from sklearn.preprocessing import MinMaxScaler

feature_range = (-10, 10)
min_max_scaler = MinMaxScaler(feature_range, clip=False) # clip=False: values outside the fitted range are not clipped
#MinMaxScaler?

In [84]:
x = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])[:, None] # added an axis to be a col-vector
print(x.shape)

min_max_scaler.fit(x); # fit returns the scaler itself; the semicolon suppresses its repr

(10, 1)


In [85]:
x = np.array([-10, 0, 0, 0, 1, 1, 1, 2, 2, 2])[:, None]

min_max_scaler.transform(x).squeeze()

array([-110.,  -10.,  -10.,  -10.,    0.,    0.,    0.,   10.,   10.,
         10.])
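The value $-110$ follows from the linear map that was fitted on the range $[0, 2]$: each unit of input corresponds to $20 / 2 = 10$ units of output, so $-10$ lands $100$ below the lower bound $-10$. A minimal pure-Python sketch of that map (`min_max_scale` is a hypothetical helper, not the scikit-learn implementation):

```python
def min_max_scale(values, fit_min, fit_max, feature_range=(0.0, 1.0)):
    """Map [fit_min, fit_max] linearly onto feature_range, without clipping."""
    low, high = feature_range
    scale = (high - low) / (fit_max - fit_min)
    return [low + (v - fit_min) * scale for v in values]


fitted_on = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
new_data = [-10, 0, 0, 0, 1, 1, 1, 2, 2, 2]

# minimum and maximum come from the data used for fitting, not from new_data
scaled = min_max_scale(new_data, min(fitted_on), max(fitted_on), feature_range=(-10, 10))
print(scaled)
```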

### $$\textbf{Zen of Python}$$
<hr>  

In [86]:
# Hints on how to write beautiful code
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


# $$ \textbf{Thank you for your attention.}$$