## The Standard Library: "batteries included" philosophy

* operating system interface: os, shutil
* file wildcards: glob
* command line arguments: sys.argv
* error output redirection: sys.stderr
* regular expressions: re 
* math: math, random, statistics
* sql query language: sqlite3
* comma separated value format: csv
* data interchange format: json
* internet access: urllib.request, smtplib
* dates and times: datetime
* data compression: zlib, gzip, bz2, lzma, zipfile and tarfile
* profiling performance: timeit, profile, pstats 
* quality control: doctest, unittest
* ...


## Simple text processing using str.format()

* manipulate string using arguments: by position, by name 
* accessing arguments' attributes and items
* conversion flags: '!s' calls str(), '!r' calls repr(), '!a' calls ascii()
 * to inspect string content
* manipulate string alignment and fill char
* manipulate string: conversions
 * decimal ('d', default), binary ('b'), character ('c'), octal ('o'), hex ('x' or 'X'), exponent ('e'), fixed point ('f'), percentage ('%')
 
 
## Paths

* the module os.path allows path manipulation 
* the module glob finds all the pathnames matching a Unix shell-style pattern

In [1]:
# using string arguments
print('{0} {1} {2}'.format('access','by','position','again','can be repeat'))
print('{} {} {}'.format('access','by','position','again','can be repeat'))
print('{3} {1} {2}'.format('access','by','position','again','can be repeat'))
print('{0} {4} {4} {4}'.format('access','by','position','again','can be repeat'))

print("\n")
print('Sissa gps coordinates: {latitude}, {longitude}'.format(latitude='45.680243',longitude='13.774218')) 
print('Ictp gps coordinates: {latitude}, {longitude}'.format(**{'latitude':45.704175, 'longitude':13.719495})) 
print('Miramare gps coordinates: {0[0]}, {0[1]}'.format((45.703465, 13.707386)))

print("\n")
print('Real part of {0} is {0.real}'.format(2+1j))

access by position
access by position
again by position
access can be repeat can be repeat can be repeat


Sissa gps coordinates: 45.680243, 13.774218
Ictp gps coordinates: 45.704175, 13.719495
Miramare gps coordinates: 45.703465, 13.707386


Real part of (2+1j) is 2.0


In [28]:
%reset -f 
import datetime
now = datetime.datetime.now() 
print(str(now) ,'\n ->',type(str(now)))
print(repr(now),'\n ->',type(repr(now))) # repr returns a string containing a printable representation of an object.

2017-12-19 11:26:17.517214 
 -> <class 'str'>
datetime.datetime(2017, 12, 19, 11, 26, 17, 517214) 
 -> <class 'str'>


In [41]:
%reset -f
unicode_c = u'あ'    # unicode char
print(str(unicode_c)  ,'\t ->',type(str(unicode_c)))
print(repr(unicode_c) ,'\t ->',type(repr(unicode_c)))  # returns a printable representation of an object as a string
print(ascii(unicode_c),'\t ->',type(ascii(unicode_c))) # as repr, but escape the non-ASCII characters in the string

あ 	 -> <class 'str'>
'あ' 	 -> <class 'str'>
'\u3042' 	 -> <class 'str'>


In [2]:
# string format
sf = {\
   'PURPLE':'\033[95m',\
   'CYAN':'\033[96m',\
   'DARKCYAN':'\033[36m',\
   'BLUE':'\033[94m',\
   'GREEN':'\033[92m',\
   'YELLOW':'\033[93m',\
   'RED':'\033[91m',\
   'BOLD':'\033[1m',\
   'UNDERLINE':'\033[4m',\
   'END':'\033[0m'}

print('\nnice string formatting:')
us = sf['UNDERLINE']+'underlined'+sf['END']
bs = sf['BOLD']+'bold'+sf['END']
bbs = sf['BLUE']+bs
print(us)
print(bs)
print(bbs)
print("\n{r}str():{e}{} \t {r}repr():{e}{} \t {r}ascii():{e}{}".format('readable',\
                                                                       'unambiguous',\
                                                                       'uses escape sequence for non-ascii',\
                                                                       r=sf['RED'],e=sf['END']))
# conversion flags 
# !s -> str()
# !r -> repr()
# !a -> ascii()
print('\nstr(): {!s}\nrepr(): {!r}\nascii(): {!a}'.format(us,us,us))
print('\nstr(): {!s}\nrepr(): {!r}\nascii(): {!a}'.format('ë','ë','ë'))


nice string formatting:
[4munderlined[0m
[1mbold[0m
[94m[1mbold[0m

[91mstr():[0mreadable 	 [91mrepr():[0munambiguous 	 [91mascii():[0muses escape sequence for non-ascii

str(): [4munderlined[0m
repr(): '\x1b[4munderlined\x1b[0m'
ascii(): '\x1b[4munderlined\x1b[0m'

str(): ë
repr(): 'ë'
ascii(): '\xeb'


In [6]:
print('{:^<20}'.format('left aligned'))
print('{:_>20}'.format('right aligned'))
print('{:.^20}'.format('centered'))
print('{:^20}'.format('centered'))

left aligned^^^^^^^^
_______right aligned
......centered......
      centered      


In [4]:
print('{:+f}; {:+f}'.format(1, -1)) # always show the sign
print('{: f}; {: f}'.format(1, -1)) # a leading space for positive numbers
print('{:-f}; {:-f}'.format(1, -1)) # show only the minus sign (default)

+1.000000; -1.000000
 1.000000; -1.000000
1.000000; -1.000000


In [47]:
print("int: {0:d};  hex: {0:x};  oct: {0:o};  bin: {0:b}".format(42))
# with 0x, 0o, or 0b as prefix:
print("          hex: {0:#x};  oct: {0:#o};  bin: {0:#b}".format(42)) 
# percentage & digit of resolution after the decimal point
print('{:.9%}'.format(0.123456789110123456789))

int: 42;  hex: 2a;  oct: 52;  bin: 101010
          hex: 0x2a;  oct: 0o52;  bin: 0b101010
12.345678911%
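
The presentation types listed earlier but not exercised in the cells above ('c', 'e', 'f') can be sketched as follows. The numeric values are arbitrary illustrations, not taken from the original notebook.

```python
# remaining presentation types (values are illustrative only)
as_char  = '{:c}'.format(97)           # integer -> unicode character
as_exp   = '{:.3e}'.format(12345.678)  # exponent notation, 3 digits after the point
as_fixed = '{:.2f}'.format(3.14159)    # fixed point, 2 digits after the point
as_width = '{:08.3f}'.format(3.14159)  # fixed point, zero-padded to a total width of 8

print(as_char, as_exp, as_fixed, as_width)
```

The precision (the part after the dot) and the width (the part before it) are independent: the width counts every character, including the decimal point.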


## Simple path manipulation

* os.path: Common pathname manipulations
* glob: Unix style pathname pattern expansion

In [15]:
# retrieve the absolute path and navigate by joining relative paths
import os.path 

currentPath = os.path.abspath('.')                # I am here 
print("\n",os.path.dirname(currentPath),sep="") 
print(os.listdir(currentPath),sep="") 

currentPath = os.path.join(currentPath,'modules') # -> cd modules
print("\n",os.path.dirname(currentPath),sep="") 
print(os.listdir(currentPath),sep="") 

currentPath = os.path.join(currentPath,'..')      # -> cd ..
print("\n",os.path.dirname(currentPath),sep="") 
print(os.listdir(currentPath),sep="") 


/home/jaky/Documents/MHPC/python-advanced-programming
['.git', '.gitignore', 'python_lecture_2.ipynb', 'exercises', '.ipynb_checkpoints', 'python_lecture_1.ipynb', 'figures', 'modules', '.README.md', 'README.md']

/home/jaky/Documents/MHPC/python-advanced-programming/python-lectures
['test_relative_paths.py', 'myFirstPackage', 'parse_input.py']

/home/jaky/Documents/MHPC/python-advanced-programming/python-lectures/modules
['.git', '.gitignore', 'python_lecture_2.ipynb', 'exercises', '.ipynb_checkpoints', 'python_lecture_1.ipynb', 'figures', 'modules', '.README.md', 'README.md']


In [1]:
%reset -f
import glob
import os.path

mdFile = glob.glob('*.md')                                         # finds markdown files
print(mdFile)
print(os.path.isdir\
     (os.path.join(os.path.abspath('.'),mdFile[0]))) # is a markdown file a dir?

['README.md']
False
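
Besides abspath, join and dirname, os.path can also take a path apart. A minimal sketch follows; the path is used only as a string, so nothing needs to exist on disk.

```python
import os.path

# path treated purely as a string: nothing is read from disk
path = os.path.join('modules', 'myFirstPackage', 'myFirstModule.py')

folder, filename = os.path.split(path)         # directory part / last component
stem, extension = os.path.splitext(filename)   # name / suffix
parent = os.path.normpath(os.path.join(folder, '..'))  # collapse the '..'

print(folder)
print(filename, stem, extension)
print(parent)
```

os.path.normpath only rewrites the string; unlike os.path.abspath it does not consult the current working directory.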


## Parsing inputs from command line to a script

* a script is a file that is executed directly by the interpreter
* every .py file has a global variable \__name\__ (set to '\__main\__' when the file is run as a script)
 * so the same file can hold a standalone script and definitions for other programs (i.e., a module)


* sys.argv is a list storing the command line arguments of a python call (sys.argv\[0\] is the script name)
* getopt.getopt(args, shortopts\[, longopts\]): parses a list of strings into (option, value) pairs plus the remaining arguments
### [modules/parse_input_using_getopt_and_sys.py](modules/parse_input_using_getopt_and_sys.py)
```python
import getopt, sys

if __name__ == "__main__":
    # getopt.getopt(args, shortopts[, longopts]) 
    # args is the list of strings to parse (without the script name)
    # short options are preceded by -
    #  -> if an argument is needed, append : to the letter
    # long options are preceded by --
    #  -> if an argument is needed, append = to the name
    argv = sys.argv[:]
    opts,args = getopt.getopt(argv[1:],\
                              'ab:c:',\
                              ['just-a-flag',\
                               'set-this-param-to=',\
                               'same-of-a']) 

    print("sys.argv -> {}\n".format(argv))
    print("opts -> {} \nargs -> {}\n".format(opts,args) )
```

In [8]:
!python modules/parse_input_using_getopt_and_sys.py\
                                                    -a -b 1 -c1\
                                                    --set-this-param-to=cond1 --just-a-flag\
                                                    arg1 arg2 ... argN

sys.argv -> ['modules/parse_input_using_getopt_and_sys.py', '-a', '-b', '1', '-c1', '--set-this-param-to=cond1', '--just-a-flag', 'arg1', 'arg2', '...', 'argN']

opts -> [('-a', ''), ('-b', '1'), ('-c', '1'), ('--set-this-param-to', 'cond1'), ('--just-a-flag', '')] 
args -> ['arg1', 'arg2', '...', 'argN']



* the argparse module is an alternative to getopt
 * arguments are declared one at a time (easier to maintain) 
 * the help message is built from the help string of each argument
 
### [modules/parse_input_using_argparse.py](modules/parse_input_using_argparse.py)
```python
import argparse

if __name__ == "__main__":

    # nargs=N  : exactly N arguments (N is an int), collected in a list
    #       '?': 0 or 1 argument
    #       '*': 0 or more arguments
    #       '+': 1 or more arguments
    parser = argparse.ArgumentParser(description=\
            'here you can describe what this program does')
    parser.add_argument('-a','--same-of-a',help='-a needs no argument',nargs='?') # -a is the same as --same-of-a, expects 0 or 1 argument
    parser.add_argument('-b',help='-b needs 1 arg',nargs=1) # -b expects 1 argument
    parser.add_argument('-c',help='-c needs 2 args',nargs=2) # -c expects 2 arguments
    parser.add_argument('-d',help='-d accepts all arguments',nargs='*') # -d expects 0 or more arguments
    parser.add_argument('--set-this-param-to=',help='instructions for --set-this-param-to',nargs=1) # expects 1 arg; the trailing '=' is getopt syntax and here becomes part of the option name
    parser.add_argument('this-setting',help='setting needs no arguments',nargs='?') # optional positional argument
    args = parser.parse_args()
    print("{}\n".format(args) )

```  

In [68]:
!python modules/parse_input_using_argparse.py\
                                                -a -b 1 -c 1 2\
                                                --set-this-param-to=cond1 this-setting

Namespace(b=['1'], c=['1', '2'], d=None, same_of_a=None, set_this_param_to==['cond1'], this-setting='this-setting')



In [71]:
!python modules/parse_input_using_argparse.py -a 1 -d 1 2 3 4 5

Namespace(b=None, c=None, d=['1', '2', '3', '4', '5'], same_of_a='1', set_this_param_to==None, this-setting=None)



In [48]:
!python modules/parse_input_using_argparse.py -d 1 2 3 4 5 -a 1 

Namespace(b=None, c=None, d=['1', '2', '3', '4', '5'], same_of_a='1', set_this_param_to==None, this-setting=None)



In [49]:
# no help was explicitly set, however... 
!python modules/parse_input_using_argparse.py --help

usage: parse_input_using_argparse.py [-h] [-a [SAME_OF_A]] [-b B] [-c C C]
                                     [-d [D [D ...]]]
                                     [--set-this-param-to= SET_THIS_PARAM_TO=]
                                     [this-setting]

here you can describe what this program does

positional arguments:
  this-setting          setting needs no arguments

optional arguments:
  -h, --help            show this help message and exit
  -a [SAME_OF_A], --same-of-a [SAME_OF_A]
                        -a needs no argument
  -b B                  -b needs 1 arg
  -c C C                -c needs 2 args
  -d [D [D ...]]        -d accepts all arguments
  --set-this-param-to= SET_THIS_PARAM_TO=
                        instructions for --set-this-param-to
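
A variation on the script above, as a sketch rather than a replacement: options can be typed and can act as true flags. The option names -v/--verbose, -n/--repeat and the positional files are hypothetical, and the argument list is passed explicitly so the example does not depend on sys.argv.

```python
import argparse

parser = argparse.ArgumentParser(description='sketch of typed options')
# store_true: a real flag that takes no argument
parser.add_argument('-v', '--verbose', action='store_true', help='flag without argument')
# type=int converts the string; default is used when the option is absent
parser.add_argument('-n', '--repeat', type=int, default=1, help='an integer option')
# no trailing '=' in the name: argparse understands --name=value by itself
parser.add_argument('--set-this-param-to', help='option with one argument')
parser.add_argument('files', nargs='*', help='zero or more positional arguments')

# parse an explicit list instead of the real command line
args = parser.parse_args(['-v', '-n', '3', '--set-this-param-to=cond1', 'a.txt', 'b.txt'])
print(args.verbose, args.repeat, args.set_this_param_to, args.files)
```

Without nargs, an option stores a single value instead of a one-element list, and dashes in long option names become underscores in the attribute names of the resulting Namespace.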


## Modules and Packages

* definitions in python files can be reused (i.e., imported) 
 * those files are called modules
 * every python file is potentially a module
 * every module can be run as a script
* global variable names are global within the module only
* a module has a name, i.e. the file name without the suffix .py
* the module's name is assigned to the global variable \__name\__
 * when a module is run as a script, \__name\__ == "\__main\__"
* --
* a package is a module that contains other modules: any folder containing .py files can be a package
 * folders containing an \__init\__.py file (even an empty one) are packages for python 
 * the \__init\__.py file prevents folders with a common name from unintentionally hiding valid modules that occur later in the module search path
* --
* subpackages and submodules are packages and modules contained in other packages

## Structure of a python Package

```txt
package/
    __init__.py
    module1.py
    module2.py
    subpackage/
        __init__.py
        submodule1.py
        submodule2.py
```
* --
* packages structure the module namespace using "dotted module names"
 * A.B: the submodule (or subpackage) named B inside the package named A
* \__init\__.py can be an empty file, execute initialization code for the package, or set the \__all\__ variable
 * the \__all\__ variable is a list of module names that are imported by 
 ```python
from package import * 
```


In [51]:
!tree modules/myFirstPackage/

modules/myFirstPackage/
├── biggerModule.py
├── import_usefulSubModule_from_subPackage1.py
├── __init__.py
├── __init__.pyc
├── myFirstModule.py
├── myFirstModule.pyc
├── __pycache__
│   ├── biggerModule.cpython-35.pyc
│   ├── __init__.cpython-35.pyc
│   └── myFirstModule.cpython-35.pyc
├── runModuleAsScript.py
├── subPackage_1
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── __pycache__
│   │   ├── __init__.cpython-35.pyc
│   │   └── usefulSubModule.cpython-35.pyc
│   ├── usefulSubModule.py
│   └── usefulSubModule.pyc
├── subPackage_2
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── needs_subPackage_1.py
│   ├── needs_subPackage_1.pyc
│   └── __pycache__
│       ├── __init__.cpython-35.pyc
│       └── needs_subPackage_1.cpython-35.pyc
└── subPackage_3
    ├── __init__.py
    ├── __init__.pyc
    ├── needs_subPackage_1.py
    ├── needs_subPackage_1.pyc
    └── [0

## .pyc files and \__pycache\__ folder

* what are .pyc files?
 * a .pyc contains the compiled bytecode of a Python source file
 * the interpreter loads the .pyc instead of recompiling when it is up to date
* a .pyc is created
 * only for an imported .py (not for the script being run)
 * again whenever the modification time recorded in it differs from that of the .py
* why .pyc?
 * the execution time does not change...
 * ...but the loading time is shorter for a .pyc file 
 * small scripts may spend more time in loading/compiling than in executing
* \__pycache\__ was introduced in python3.2 to store the .pyc files
 * different python implementations and versions store their files there with different tags (e.g., "cpython-35")
 * a .pyc file in the same folder as the .py is the python 2 layout; python 3 loads such a file only when the .py is missing
* -- 
* these files/folders can be safely deleted
* there is no need to store them then...
 * ...add them to .gitignore 
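
Where the bytecode of a source file ends up can be queried without importing anything. A small sketch, assuming a CPython interpreter with the default cache layout; the file name is only a string and need not exist.

```python
import importlib.util
import sys

# where would the bytecode of this source file be cached?
cached = importlib.util.cache_from_source('myFirstModule.py')
tag = sys.implementation.cache_tag   # implementation/version tag used in the file name

print(cached)   # by default something like __pycache__/myFirstModule.<tag>.pyc
print(tag)
```

The tag is what lets several interpreters share one \__pycache\__ folder without overwriting each other's files.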

### [modules/myFirstPackage/myFirstModule.py](modules/myFirstPackage/myFirstModule.py)
``` python
def f0():
    return f1() # f1 is in scope

def f1():
    return __name__
```

In [52]:
%reset -f
from modules.myFirstPackage.myFirstModule import * # this means f0 and f1
print('\nList imported modules:',end=" ")
%who

print("\n",f0(),sep="")


List imported modules: f0	 f1	 

modules.myFirstPackage.myFirstModule


In [5]:
%reset -f
import modules.myFirstPackage.myFirstModule as m1
print('\nList imported modules:',end=" ")
%who

print("\n",m1.f0(),sep="") # __name__ != "__main__" 


List imported modules: m1	 

modules.myFirstPackage.myFirstModule


## Run module as a script

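* when a module is run as a script, its \__name\__ is set to "\__main\__"; imported modules keep their own name

### [modules/myFirstPackage/runModuleAsScript.py](modules/myFirstPackage/runModuleAsScript.py)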
``` python
import myFirstModule as m

def f0():
    return __name__

def f1():
    return m.__name__

if __name__ == "__main__": 
    print("this module's name is "+f0())
    print("sys module's name is "+f1())
```

In [6]:
! python modules/myFirstPackage/runModuleAsScript.py

this module's name is __main__
sys module's name is myFirstModule


## \__all\__ list 

* when a module defines \__all\__, "from module import \*" imports only the names listed there 

### [modules/myFirstPackage/biggerModule.py](modules/myFirstPackage/biggerModule.py)

```python
__all__=['f0'] 

def f0():
    return 

def f1():
    return 

def f2():
    return 
```

In [7]:
%reset -f 
from modules.myFirstPackage.biggerModule import * # only the names listed in __all__ are imported
print('\nList imported modules:',end=" ")
%who



List imported modules: f0	 


## Intra-package references

* relative imports use a module's \__name\__ to determine module's position in the package hierarchy 

### [modules/myFirstPackage/subPackage_1/usefulSubModule.py](modules/myFirstPackage/subPackage_1/usefulSubModule.py)
```python
def usefulFunction():
    return "I can use this usefulFunction"
```
### [modules/myFirstPackage/import_usefulSubModule_from_subPackage1.py](modules/myFirstPackage/import_usefulSubModule_from_subPackage1.py)
```python
import subPackage_1.usefulSubModule as sm

if __name__ == "__main__": 
    print(sm.usefulFunction())
```

In [56]:
!ls modules/myFirstPackage/
print()
!ls modules/myFirstPackage/subPackage_1/
print()
!python modules/myFirstPackage/import_usefulSubModule_from_subPackage1.py

biggerModule.py				    myFirstModule.py	  subPackage_1
import_usefulSubModule_from_subPackage1.py  myFirstModule.pyc	  subPackage_2
__init__.py				    __pycache__		  subPackage_3
__init__.pyc				    runModuleAsScript.py

__init__.py  __init__.pyc  __pycache__	usefulSubModule.py  usefulSubModule.pyc

I can use this usefulFunction


## Issue while importing top packages 

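* a relative import fails in a file run as a script (ValueError in python 2, ImportError in recent python 3 versions)
 * scripts cannot use relative imports because their \__name\__ is "\__main\__", which carries no package information
 * relative imports make sense only inside a package
* place the scripts that run the modules outside the package directory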

### [modules/myFirstPackage/subPackage_2/needs_subPackage_1.py](modules/myFirstPackage/subPackage_2/needs_subPackage_1.py)
```python
from ..subPackage_1 import usefulSubModule as s

def f0():
    return s.usefulFunction()

if __name__ == "__main__":
    print(f0())
```

### [modules/test_relative_paths.py](modules/test_relative_paths.py)
```python
import myFirstPackage.subPackage_2.needs_subPackage_1 as s

if __name__ == "__main__":
    print(s.f0())
```

In [57]:
!python modules/myFirstPackage/subPackage_2/needs_subPackage_1.py

Traceback (most recent call last):
  File "modules/myFirstPackage/subPackage_2/needs_subPackage_1.py", line 1, in <module>
    from ..subPackage_1 import usefulSubModule as s
ValueError: Attempted relative import in non-package


In [58]:
!python modules/test_relative_paths.py

I can use this usefulFunction


## PYTHONPATH 

* environment variable that augments the default search path for module files
* the format is the same as the shell’s PATH
 * one or more directory pathnames separated by os.pathsep
* the search path can be manipulated from within a Python program as the variable sys.path 
 * using the modules os and sys, a submodule can add the folder containing its top-level package to sys.path and then import sibling subpackages with absolute names
 
### [modules/myFirstPackage/subPackage_3/needs_subPackage_1.py](modules/myFirstPackage/subPackage_3/needs_subPackage_1.py)
```python
import os,sys
file_path = os.path.abspath(__file__)
this_subpackage_path = os.path.split(file_path)[0]
this_package_path = os.path.join(this_subpackage_path,'../..')
sys.path.insert(0, this_package_path)

from myFirstPackage.subPackage_1 import usefulSubModule as s

def f0():
    return s.usefulFunction()
if __name__ == "__main__":
    print(f0())
```

In [None]:
!python modules/myFirstPackage/subPackage_3/needs_subPackage_1.py

## Functional programming paradigm 

* an iterable is any object that can return its elements one at a time
 * iterables are objects that define an \__iter\__ or \__getitem\__ method
 * you can traverse an iterable using an iterator, obtained with iter()
 * iterators are objects that define a \__next\__ method
* a container (e.g., list, tuple) is an iterable that stores its elements (the reverse is not true)
* ex: range(0,10**10) is iterable but does not store its elements: 
 * it does not contain 10**10 integers in memory!
 * it computes the values one after the other
* --
* this paradigm is based on three functions
 * filter(cond,seq): keeps the elements of a sequence that satisfy a condition
 * map(fun,seq): applies a function to all elements of a sequence
 * functools.reduce(fun,seq): applies a function of two arguments cumulatively to the items of a sequence
* filter and map return iterators (not lists or tuples)
* functools.reduce returns a single value
* advantage: with functions free of side effects, each element can be processed independently of the others

In [18]:
%reset -f
import functools
l  = list(range(10))               # 0,1,...,9
l2 = list(filter(lambda x: x%2,l)) # 1,3,5,7,9
l3 = list(filter(lambda x: x<5,l)) # 0,1,2,3,4
l4 = list(map(lambda x: x%2,l))  # 10 elements 1 or 0
l5 = list(map(lambda x: x<5,l))  # 10 elements True or False
lp1 = list(map(lambda x: x+1,l)) # 10 elements l+1
l6 = list(map(lambda x,y: x<y,l,lp1)) # 10 elements all True
l7 = functools.reduce(lambda x,y: x+y,l) # x is the running total, y the next element

print('l:       \n  {}'.format(l))
print('l%2:     \n  {}'.format(l2))
print('l<5:     \n  {}'.format(l3))
print('el%2:    \n  {}'.format(l4))
print('el<5:    \n  {}'.format(l5))
print('el<el+1: \n  {}'.format(l6))
print('sum(el): \n  {}'.format(l7))

l:       
  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
l%2:     
  [1, 3, 5, 7, 9]
l<5:     
  [0, 1, 2, 3, 4]
el%2:    
  [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
el<5:    
  [True, True, True, True, True, False, False, False, False, False]
el<el+1: 
  [True, True, True, True, True, True, True, True, True, True]
sum(el): 
  45
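
The iterator protocol and the laziness of range and map described in the bullets of this section can be observed directly. A minimal sketch; the list contents are arbitrary.

```python
import sys

numbers = [10, 20, 30]        # a container: iterable, stores its elements
it = iter(numbers)            # calls numbers.__iter__() and returns an iterator
first = next(it)              # calls it.__next__()
rest = list(it)               # consumes what is left
print(first, rest)

big = range(10**10)           # the 10**10 integers are not stored
small_object = sys.getsizeof(big) < 1000
print(small_object, big[5])

lazy = map(lambda x: x * x, numbers)   # nothing is computed yet
head = next(lazy)                      # computes the first square only
tail = list(lazy)                      # computes the remaining ones
print(head, tail)
```

Wrapping map or filter in list(), as in the cell above, forces all values to be computed at once.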


## Comprehensions

* comprehensions are expressions that build a sequence from other sequences
* comprehensions consist of: 
 * input sequence
 * variable representing members of the input sequence
 * optional predicate expression
 * output expression producing an output sequence
* comprehensions can be made with: 
 * list, nested list, set, dictionaries
 * generators (or generator comprehension)

In [65]:
%reset -f
l1 = [0,1,2,3,4,5]
l2 = [2,4,2,4,2,4]
l1_p_l2 = [x+y for x,y in zip(l1,l2)] # list comprehension
l1_p_l2_ = [x+y for x,y in zip(l1,l2) if x+y>4] # with predicate
print(l1_p_l2)
print(l1_p_l2_)

[2, 5, 4, 7, 6, 9]
[5, 7, 6, 9]


In [48]:
%reset -f
l1 = [1,2,3]
l2 = [0,2,4]
# list comprehension
v12 = [1 if i==j else 0 for i,j in zip(l1,l2)] # i==j vector 1 or 0
# list comprehension can be nested
m12 = [[1 if i==j else 0 for i in l1]\
                         for j in l2] # i==j matrix 1 or 0
print("\nv12[i]=1 if l1[i]=l2[i]\n -> ", v12)
print("\nm12[i][j]=1 if l1[i]=l2[j]\n -> ", m12)


v12[i]=1 if l1[i]=l2[i]
 ->  [0, 1, 0]

m12[i][j]=1 if l1[i]=l2[j]
 ->  [[0, 0, 0], [0, 1, 0], [0, 0, 0]]


In [51]:
%reset -f
names = ['Mickey','Minnie','Donald','Daisy','Goofy','Pluto']
# set comprehension
m_char_set = {name for name in names if name[0]=='M'} # M* characters
print(m_char_set)

{'Minnie', 'Mickey'}


In [67]:
%reset -f
names = {'Mickey':'mouse','Minnie':'mouse',\
         'Donald':'duck','Daisy':'duck',\
         'Goofy':'dog','Pluto':'dog'}
# dictionary comprehension
ducks_char_dict = {name: kind for name, kind in names.items() if kind=='duck'}
print(ducks_char_dict)

{'Donald': 'duck', 'Daisy': 'duck'}
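
Comprehensions of different kinds can be combined. The sketch below reuses the character names from the cells above; computing name lengths and grouping by species are illustrations that are not part of the original notebook.

```python
names = {'Mickey': 'mouse', 'Minnie': 'mouse',
         'Donald': 'duck', 'Daisy': 'duck',
         'Goofy': 'dog', 'Pluto': 'dog'}

# dictionary comprehension: key and value are separated by ':'
name_length = {name: len(name) for name in names}

# a set comprehension nested inside a dictionary comprehension
by_species = {kind: {n for n, k in names.items() if k == kind}
              for kind in set(names.values())}

print(name_length['Mickey'])
print(by_species['duck'] == {'Donald', 'Daisy'})
```

Curly braces with key: value pairs produce a dictionary, curly braces with single values produce a set.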


## Generator comprehension

* generators are iterators that: 
 * can be iterated only once 
 * do not store values
 * generate values on the fly  

In [75]:
%reset -f
squares = (x*x for x in range(10))
print("squares is",squares,"of type", type(squares))
print("1st loop over squares ->",[i for i in squares])
print("2nd loop over squares ->",[i for i in squares])

squares is <generator object <genexpr> at 0x7f633aebe780> of type <class 'generator'>
1st loop over squares -> [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
2nd loop over squares -> []
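
That generators do not store their values shows up in their memory footprint. A rough sketch: the size n is arbitrary, and sys.getsizeof reports only the size of the list or generator object itself, not of the integers it refers to.

```python
import sys

n = 100000
as_list = [x * x for x in range(n)]   # all values stored at once
as_gen = (x * x for x in range(n))    # values produced on demand

generator_is_smaller = sys.getsizeof(as_list) > sys.getsizeof(as_gen)
total = sum(as_gen)                   # consumes the generator without building a list
leftover = sum(as_gen)                # exhausted: nothing left to add

print(generator_is_smaller, total == sum(as_list), leftover)
```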


## next function

* the built-in function next(iterator) retrieves the next item from an iterator by calling its \__next\__() method

In [102]:
%reset -f
squares = (x*x for x in range(3))
print(next(squares))
print(next(squares))
print(next(squares))
print([i for i in squares])     

0
1
4
[]
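
Calling next() on an exhausted iterator raises StopIteration unless a default value is supplied. A short sketch:

```python
squares = (x * x for x in range(2))
first, second = next(squares), next(squares)
print(first, second)

try:
    next(squares)                   # nothing left
    exhausted = False
except StopIteration:
    exhausted = True
print('exhausted:', exhausted)

fallback = next(squares, None)      # a default value suppresses the exception
print(fallback)
```

A for loop catches StopIteration internally, which is why looping over an exhausted generator simply does nothing.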


## yield keyword

* can be used only in the body of a function definition
* causes the function to return a generator when called
* yield vs return
 * return transfers the control of execution back to the point where the function was called, for good
 * yield implies that the transfer of control is temporary and voluntary
 * functions yield when they expect to regain control in the future
 * functions with these capabilities are called generators
* yield is the counterpart of next(): 
 * next resumes the generator, runs it up to the following yield and retrieves the yielded item
 * yield hands one item to the caller and suspends the function, keeping its local state
 
## Subroutines vs coroutines

* functions that return can also be referred to as subroutines
 * subroutines return the whole result at once
* functions that yield can also be referred to as coroutines (or generators in python)
 * coroutines yield the result one value at a time
* coroutines should be preferred when
 * the values to return are too many to be stored
 * there is no need to read those values twice

In [113]:
%reset -f
def odd_numbers(numbers):
    odds = []
    for x in numbers: 
        if x%2!=0:
            odds.append(x)
    return odds # odds contains ALL odd numbers

odds = odd_numbers(range(10))       
print("odds is",odds,"of type", type(odds))
print("1st loop over odds ->",[i for i in odds])
print("2nd loop over odds ->",[i for i in odds])

odds is [1, 3, 5, 7, 9] of type <class 'list'>
1st loop over odds -> [1, 3, 5, 7, 9]
2nd loop over odds -> [1, 3, 5, 7, 9]


In [114]:
%reset -f
def odd_numbers(numbers):
    for x in numbers:
        if x%2!=0:
            yield x # x is one odd number 
            
odds = odd_numbers(range(10))  
print("odds is",odds,"of type", type(odds))
print("1st loop over odds ->",[i for i in odds])
print("2nd loop over odds ->",[i for i in odds])

odds is <generator object odd_numbers at 0x7f635006a150> of type <class 'generator'>
1st loop over odds -> [1, 3, 5, 7, 9]
2nd loop over odds -> []
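
Because a generator keeps its local state between two yields, it can describe a sequence with no end. A sketch with a hypothetical fibonacci generator, using itertools.islice to take a finite number of values:

```python
from itertools import islice

def fibonacci():
    """Endless generator: each next() resumes right after the yield."""
    a, b = 0, 1
    while True:
        yield a          # hand one value to the caller and pause here
        a, b = b, a + b  # executed only when the next value is requested

first_eight = list(islice(fibonacci(), 8))
print(first_eight)
```

The same function written with return would have to decide in advance how many values to compute and store.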


## Python scopes and namespaces


* a namespace is a mapping from names to objects
* a namespace can be implemented in different ways
 * currently most are implemented as dictionaries
* there is no relation between names in different namespaces
* a.b: b is an attribute of the object a
 * attributes can be read-only or writable
 * attributes may be deleted with del statement
* --
* namespaces are created at different times 
* namespaces can have different lifetimes
 * built-in names are contained in a namespace created at interpreter start up and never deleted
 * the global namespace for a module is created when the module is read in and normally lasts until the interpreter quits
* the local namespace for a function is created at the function call
 * and forgotten when the function exits (i.e., returns or raises an exception it does not handle)
* --
* a scope is a textual region of the program where a namespace is directly accessible, i.e., where an unqualified reference to a name is looked up in that namespace
* scopes are determined statically but used dynamically
 * names are searched in this order: local names, non-local non-global names (enclosing functions), global names, built-in names 

In [170]:
%reset -f
# scope names
def scope_test():
    def do_local():
        var = "local"

    def do_nonlocal():
        nonlocal var
        var = "change"

    def do_global():
        global var
        var = "changes var value"

    var = "unchanged"
    do_local()
    print("local assignment leaves var", var)
    do_nonlocal()
    print("nonlocal assignment makes var", var)
    do_global()
    print("global assignment does not", var,"var value in current scope")
    
scope_test()
print("global assignment", var,"in global scope")

local assignment leaves var unchanged
nonlocal assignment makes var change
global assignment does not change var value in current scope
global assignment changes var value in global scope
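
The cell above shows where assignments go. Reading a name follows the search order local, enclosing, global, built-in, as in the sketch below; all function and variable names are hypothetical.

```python
name = 'global'

def outer():
    name = 'enclosing'
    def inner_local():
        name = 'local'      # found in the local scope
        return name
    def inner_enclosing():
        return name         # not local: found in the enclosing function
    return inner_local(), inner_enclosing()

def reads_global():
    return name             # neither local nor enclosing: found in the module

def reads_builtin():
    return len('abc')       # len is found last, in the built-in namespace

results = outer() + (reads_global(), reads_builtin())
print(results)
```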


## Object Oriented Programming in Python

* a class is a means to bundle data and functionalities together 
 * each class instance has attributes attached to it for maintaining its state
 * each class instance has methods for modifying its state
 * creating a new class creates a new type of object
* objects can contain arbitrary amounts and kinds of data
* classes are created at runtime and can be modified after creation
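
That classes can be modified after creation can be sketched with a hypothetical Point class: a method attached to the class later is immediately available to instances that already exist.

```python
class Point(object):           # hypothetical class, for illustration only
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(3, 4)

# the class is an ordinary object: a method can be attached after creation
def norm2(self):
    return self.x ** 2 + self.y ** 2

Point.norm2 = norm2
squared_norm = p.norm2()       # existing instances see the new method

p.label = 'added later'        # instances accept new attributes as well
print(squared_norm, type(p).__name__, p.label)
```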


## Differences from C++ 

* compiled vs interpreted language
 * C++ programs are compiled, and much is determined statically at compile-time 
 * python programs are interpreted at run-time 
* static vs dynamic typed language
 * a variable's type is fixed by its declaration in C++; in python a name can be rebound to objects of different types
* memory management 
 * python is a language, implementations may differ
 * in CPython everything is an object and all objects live on the heap
 * only references to objects (the names) are stored on the stack
* built-in types can be used as base classes for extension
* all methods in python are effectively virtual
* private members don't exist in python
 * nothing makes it possible to enforce data hiding, any name can be accessed and rebound from outside
 * classes are not usable to implement pure abstract data types
 * it is all based upon conventions (e.g., a leading underscore, or name mangling, for non-public attributes)
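
Two of the points above can be sketched in a few lines: privacy rests on naming conventions only, and built-in types can serve as base classes. Account and Stack are hypothetical classes.

```python
class Account(object):            # hypothetical class
    def __init__(self):
        self._balance = 10        # single underscore: "internal" by convention only
        self.__pin = 1234         # double underscore: mangled to _Account__pin

acc = Account()
print(acc._balance)               # still accessible from outside
print(hasattr(acc, '__pin'))      # the unmangled name does not exist
print(acc._Account__pin)          # the mangled name does

class Stack(list):                # a built-in type used as base class
    def push(self, item):
        self.append(item)

s = Stack()
s.push(1)
print(s, len(s))
```

Name mangling only avoids accidental clashes in subclasses; it is not an access control mechanism.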

## Class definition syntax
```python
class ClassName(<base_class>):
    <statement-1>
    .
    <statement-N>
```

* when a class definition is entered a new namespace is created and used as local scope
 * all assignments to local variables go into this new namespace
 * inside a method, self.name refers to the attribute name of the instance (looked up in the instance first, then in the class) 
* classes support two operations: 
 * ClassName.name: attribute references (as for namespaces)
 * inst = ClassName(): class instantiation to create a new instance of the class  
* base_class: the class from which ClassName inherits all attributes 
* object: the base class of all classes in python
 * in python 2.x inheriting from object creates a "new-style" class (with perks such as super() and properties)
 * in python 3.x all classes inherit from object by default
 * for compatibility with python 2.x it is suggested to explicitly inherit from object

## Special attributes

* \__init\__(self\[,...\]): automatically invoked right after class instantiation to initialize the new instance (it plays the role of the constructor)
* \__del\__(self): automatically invoked when the instance is about to be destroyed (it plays the role of the destructor) 
* \__doc\__: docstring of the class
* \__class\__: reference to the type of the current instance
* \__str\__(self): called by str(obj) and print(obj)
* \__repr\__(self): called by repr(obj); it substitutes \__str\__(self) when \__str\__ is not defined
* \__bases\__: tuple with all base classes of a class

In [101]:
%reset -f
class myClass(object):
    """
    a docstring creates __doc__ attribute
    here you can place the documentation
    """
    i=1
    def __init__(self):
        self.j = 2
        print("\ncreate this class\n")
    def __del__(self):
        print("\ndelete this class\n")
    def f1(self):
        print("f1 was called")
    def __str__(self): # automatically called by print
        return self.__doc__

def do_stuff():        
    o = myClass()
    o.f1()
    print("docstring is",o.__doc__)
    print("print(o) calls o.__str__()",o)
    print(o.__class__.i == o.i)
    print(o.j)
    try: 
        print(o.__class__.j)
    except AttributeError as err:
        print(err)

do_stuff()    


delete this class


create this class

f1 was called
docstring is 
    a docstring creates __doc__ attribute
    here you can place the documentation
    
print(o) calls o.__str__() 
    a docstring creates __doc__ attribute
    here you can place the documentation
    
True
2
type object 'myClass' has no attribute 'j'

delete this class
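
The difference between \__str\__ and \__repr\__, and the content of \__bases\__, can be seen on a small hypothetical class (Celsius is not part of the original notebook):

```python
class Celsius(object):                 # hypothetical class
    """Temperature in degrees Celsius."""
    def __init__(self, degrees):
        self.degrees = degrees
    def __str__(self):                 # readable form, used by print() and str()
        return '{:.1f} C'.format(self.degrees)
    def __repr__(self):                # unambiguous form, used by repr()
        return 'Celsius({!r})'.format(self.degrees)

t = Celsius(21.5)
print(str(t))
print(repr(t))
print([t])                             # containers display their items with __repr__
print(Celsius.__bases__, t.__class__ is Celsius, Celsius.__doc__)
```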



## Method objects

* the first argument in a method definition refers to the object itself (i.e., the C++ this)
 * using self for this argument is just a (very common) convention 
 * this argument is not passed explicitly when the method is called on an instance: python passes it automatically
* a method defined with no arguments cannot be called on class instances
 * it can be used just through the class itself (e.g., ClassName.method() or obj.\__class\__.method())
* methods (functions in general) can be assigned to a name

In [148]:
%reset -f
class dumb(object):
    def __init__(justaname,x): justaname.x=x
    def __str__(justaname): return str(justaname.x)

d = dumb(1)
print(d)

1


In [187]:
%reset -f
class dumb(object):
    def f0(self): return "f0 was invoked"      
    def f1(): return "f1 can be invoked inside the class"
    f2 = f1()
    def f3(self): return f1()
    
d = dumb()
print(d.f0())
try: 
    print(d.f1())
except TypeError as err:
    print("      err: ",err)   
    print("f1() is not callable outside of the class")
print(d.f2)
try: 
    print(d.f3())
except NameError as err:
    print("      err: ",err)   
    print("NO WAY: f1() is not callable outside of the class")
print(dumb.f1())

f0 was invoked
      err:  f1() takes 0 positional arguments but 1 was given
f1() is not callable outside of the class
f1 can be invoked inside the class
      err:  name 'f1' is not defined
NO WAY: f1() is not callable outside of the class
f1 can be invoked inside the class


In [188]:
%reset -f
class dumb(object):
    def f0(self): return "f0 was invoked"          
    
d = dumb()    
f01=d.f0    # f01 refers to the bound method d.f0 (not called yet)
f02=d.f0()  # f02 stores the value returned by calling d.f0
for i in range(3):
    print(f01)
for i in range(3):
    print(f02)    

<bound method dumb.f0 of <__main__.dumb object at 0x7f1c246a3828>>
<bound method dumb.f0 of <__main__.dumb object at 0x7f1c246a3828>>
<bound method dumb.f0 of <__main__.dumb object at 0x7f1c246a3828>>
f0 was invoked
f0 was invoked
f0 was invoked
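
The rules above can be sketched in one small cell. The class `Greeter` and its methods are hypothetical names used only for illustration. The sketch assumes Python 3 and shows that calling a method on an instance is the same as calling it through the class with the instance passed by hand, and that `@staticmethod` is the usual way to write a method that takes no `self` but can still be called from instances.

```python
class Greeter(object):  # hypothetical class, for illustration only
    def __init__(self, name):
        self.name = name

    def hello(self):
        # 'self' is filled in automatically when called on an instance
        return "hello from " + self.name

    @staticmethod
    def shout(text):
        # no 'self' argument: callable from the class and from its instances
        return text.upper()

g = Greeter("g")
via_instance = g.hello()        # the instance is passed implicitly
via_class = Greeter.hello(g)    # the instance is passed explicitly
print(via_instance == via_class)                   # True
print(Greeter.shout("static"), g.shout("static"))  # STATIC STATIC
```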


## Class and Instance variables

* instance variables are for data unique to each instance
* class variables are shared by all instances of the class

In [189]:
class num(object):
    kind='num'                                   # kind is the same for all instances
    def __init__(self,value=0): self.value=value # value is specific to each instance
    def __str__(self): return "defined "+str(self.value)+" as "+self.kind

_0 = num(0)
_1 = num(1)
print(_0)
print(_1)

defined 0 as num
defined 1 as num
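
A short sketch of what "shared" means in practice. The class `Num` and its attribute `history` are hypothetical. The example assumes Python 3 and shows two common surprises: a mutable class variable is one object seen by every instance, and assigning through an instance creates a new instance attribute instead of changing the class variable.

```python
class Num(object):  # hypothetical class, for illustration only
    kind = 'num'    # class variable
    history = []    # class variable: ONE list shared by all instances

    def __init__(self, value=0):
        self.value = value           # instance variable
        self.history.append(value)   # mutates the shared list

n0 = Num(0)
n1 = Num(1)
print(n0.history is n1.history)  # True: same list object
print(Num.history)               # [0, 1]

n0.kind = 'special'              # creates an instance attribute on n0 only
print(n0.kind, n1.kind, Num.kind)   # special num num

Num.kind = 'number'              # changes the class variable
print(n0.kind, n1.kind)          # special number
```

If each instance needs its own list, create it inside `__init__` (for example `self.history = []`).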


## Overload operators

Math operators
* \__add\__(self, other): +
* \__mul\__(self, other): *
* \__sub\__(self, other): -
* \__mod\__(self, other): %
* \__truediv\__(self, other): /
* -- 
Comparison operators
* \__eq\__(self, other): ==
* \__ne\__(self, other): !=
* \__lt\__(self, other): <
* \__le\__(self, other): <=
* \__gt\__(self, other): >
* \__ge\__(self, other): >= 
* --
Others
* \__len\__(self): len 
* \__bool\__(self): invoked by the if statement or by the bool function  
* \__nonzero\__(self): the Python 2 equivalent of \__bool\__(self)
* \__iter\__(self): invoked by loops
* \__contains\__(self, value): invoked by in
* \__getitem\__(self, key): invoked by reading obj\[key\]
* \__setitem\__(self, key,value): invoked by writing obj\[key\]
* \__delitem\__(self, key): invoked by deleting obj\[key\]

In [1]:
%reset -f
class mynum(object):
    def __init__(self,value):
        self.value = value
    def __eq__(self,other):
        return self.value==other.value
    def __len__(self):
        return 1
    def __bool__(self): 
        print("__bool__ was invoked",end=" ")
        return True
    __nonzero__=__bool__ # compatible with python2
a = mynum(1)
b = mynum(2)
print(a==b) #1==2: False
try:
    print(a+b) #1+2: Undefined!
except TypeError as err:
    print(err)
print("a is one number, len(a) is",len(a))    
if a: print("by if")

False
unsupported operand type(s) for +: 'mynum' and 'mynum'
a is one number, len(a) is 1
__bool__ was invoked by if


In [5]:
%reset -f
class mylist(object):
    def __init__(self,l=None): self.list=[] if l is None else l # avoid a mutable default argument
    def __len__(self): return len(self.list)
    def __setitem__(self,key,value): self.list[key]=value
    def __getitem__(self,key): return self.list[key]
    def __delitem__(self,key): del self.list[key]
    def __iter__(self): return self.list.__iter__()
    def __contains__(self,value): return value in self.list
    def __str__(self): return str(self.list)
    def append(self,*args): self.list.extend(args)

l = mylist([0,1,2]) # use __init__
print('l is',type(l))
l.append(3,4,5) # use append
print(l)    # use __str__
print(l[3]) # use __getitem__
l[3]=-3     # use __setitem__
print(l[3]) 
del l[3]    # use __delitem__
for i in l: # use __iter__
    print(i,end=';')
print("\nlen(l):",len(l)) # use __len__

l is <class '__main__.mylist'>
[0, 1, 2, 3, 4, 5]
3
-3
0;1;2;4;5;
len(l): 5
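
The math operators listed above are not exercised by the two cells, so here is a minimal sketch of `__add__`, `__eq__` and `__lt__`. The class `Number` is a hypothetical name. Returning `NotImplemented` for unsupported operand types is the convention from the Python data model: it lets Python try the other operand or raise `TypeError` itself.

```python
class Number(object):  # hypothetical class, for illustration only
    def __init__(self, value):
        self.value = value

    def __add__(self, other):            # invoked by self + other
        if not isinstance(other, Number):
            return NotImplemented        # let Python raise TypeError
        return Number(self.value + other.value)

    def __eq__(self, other):             # invoked by self == other
        if not isinstance(other, Number):
            return NotImplemented
        return self.value == other.value

    def __lt__(self, other):             # invoked by self < other
        if not isinstance(other, Number):
            return NotImplemented
        return self.value < other.value

    def __str__(self):
        return "Number({})".format(self.value)

a = Number(1)
b = Number(2)
print(a + b)           # Number(3)
print(a == Number(1))  # True
print(a < b)           # True
try:
    a + 1              # int is not supported by __add__
except TypeError as err:
    print(err)
```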


## Important remarks 

* attributes can be defined outside of the class statement
 * users can add new attributes that were not defined by developers
* attributes can be overridden!
 * developers should avoid name conflicts
 * users should use data attributes with care
* a method is an attribute: it can also be defined outside of the class statement
 * a method defined inside the class is the one found through an instance, regardless of a same-named function defined outside
 * inside a class body, a later definition replaces an earlier one with the same name
 * data attributes set on an instance shadow method attributes with the same name 

In [274]:
%reset -f 
class empty(object): 
    pass # this class is empty
         # still a user could add new attributes 

e = empty()
pre_obj_attr = dir(e)
e.attr1=1         # user is defining a new attribute 
e.attr2='attr2'   # user is defining a new attribute
post_obj_attr = dir(e)
for i in post_obj_attr:       # loop over attribute names...
    if i not in pre_obj_attr: # ...and check if they are new
        print('e.'+i,"=",getattr(e,i))  # getattr avoids eval

e.attr1 = 1
e.attr2 = attr2


In [249]:
%reset -f 
def f(self): return "f() is defined outside"  # methods can be defined outside the class statement
class dumb(object): 
    f_in=f # f is assigned to f_in
    pass
def g(self): return "g() is defined outside"  # methods can be defined outside the class statement

d = dumb()
d.g = g
print(d.f_in())  # self is passed automatically: f_in is a class attribute, so it is bound to d
try:
    print(d.g()) # this raises a TypeError
except TypeError as err:
    print("you need to pass the object to g():",end="\n\t")
    print("error:",err)          
print(d.g(d))    # self must be passed by hand: g is stored on the instance d, so it stays a plain function

f() is defined outside
you need to pass the object to g():
	error: g() missing 1 required positional argument: 'self'
g() is defined outside


In [226]:
%reset -f 
def f(self): return "outside" 
class dumb(object):    
    def f(self): return "inside" # d.f() finds this method, not the function f defined outside 

d = dumb()
print(d.f()) # invokes the inside definition

inside


In [227]:
%reset -f 
class dumb(object):        
    def f(self): return 1 
    def f(self): return 2 # last inside definitions override previous definitions
    
d = dumb()
print(d.f())

2


In [208]:
%reset -f 
class dumb(object):    
    method_or_string='something else'
    def method_or_string(self): return 'y'  # last inside definitions override previous definitions
    def int_or_fun(self): pass       
    int_or_fun=10                         # last inside definitions override previous definitions
        
d = dumb()
print("type(d.method_or_string) is....",type(d.method_or_string))
print("type(d.int_or_fun) is..........",type(d.int_or_fun))

type(d.method_or_string) is.... <class 'method'>
type(d.int_or_fun) is.......... <class 'int'>


In [248]:
%reset -f 
class dumb(object):    
    def f(self): self.f="overridden"  # data attributes override method attributes

d = dumb()
for i in range(4):
    if i==0: print("~~~~f is a method~~~~")
    if i==1: print("~~~~f is a data~~~~")
    try:
        print(type(d.f),end=": ")
        d.f()
        print("no error")
    except TypeError as err:
        print(err)        

~~~~f is a method~~~~
<class 'method'>: no error
~~~~f is a data~~~~
<class 'str'>: 'str' object is not callable
<class 'str'>: 'str' object is not callable
<class 'str'>: 'str' object is not callable
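
The `d.g(d)` case above can be avoided. The sketch below assumes Python 3 and uses the standard-library `types.MethodType` to bind an outside function to a single instance. The class `Holder` and the attribute `x` are hypothetical. Attaching the function to the class instead makes it a regular method of every instance.

```python
import types

def g(self):
    return "g() sees x = " + str(self.x)

class Holder(object):  # hypothetical class, for illustration only
    def __init__(self):
        self.x = 1

d = Holder()
d.g = types.MethodType(g, d)  # bind g to this instance only
print(d.g())                  # self is now passed automatically

Holder.h = g                  # attach g to the class: all instances get h
other = Holder()
print(other.h())
print(hasattr(other, 'g'))    # False: g was bound to d only
```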


## Inheritance 

* a new class can inherit attributes from one or more "base" classes
 * this new class is called a derived class
 * the base class must be visible in the scope where the derived class is defined
* if a requested attribute is not found in the class, the search proceeds to the base class   
 * this rule is applied recursively if the base class is itself derived
* a derived class can override methods of a base class
 * a method of a base class may end up calling a method of a derived class
 * this behaviour is equivalent to C++ virtual functions: every Python method is effectively virtual
 * the overridden base class method can be invoked explicitly (baseClass.methodName(self,args)) 
* sometimes overriding methods have a lot in common with the base definition
 * super() is a built-in function that can be used to extend the base class method
* --
* isinstance(inst,class): True if inst is an instance of class or of a class derived from it 
* issubclass(subClass,class): True if subClass is class or derives from it 

In [291]:
%reset -f
class base(object):
    def f0(self): return 'f0 is a method of base class'

class derived(base): # derived class inherits from base class
    pass

d = derived()
print(d.f0()) 

f0 is a method of base class


In [296]:
%reset -f
def anotherScope():
    class base(object):
        def f0(self): return 'f0 is a method of base class'

try:
    class derived(base): # base must be visible in the scope
                         # where derived is defined
        pass
except NameError as err:
    print("\nerr:",err)


err: name 'base' is not defined


In [309]:
%reset -f
class A(object):
    def f0(self): return "f0 is a method of class A: "+\
                         "it can be inherited recursively"
class B(A): # B inherits f0 from A
    pass
class C(B): # C inherits f0 from A through B
    pass
class D(A): # D inherits f0 from A
    pass

d = D()
print(d.f0())

f0 is a method of class A: it can be inherited recursively


In [311]:
%reset -f
class A(object):
    def f0(self): return "f0 is a method of class A"
class B(A): 
    def f0(self): return "f0 is a method of class B"

b = B()
print(b.f0()) # f0 was overriden by class B

f0 is a method of class B


In [1]:
%reset -f
class A(object):
    def f(self): return self.f0() # f0 is looked up on type(self), not necessarily on A
    def f0(self): return "f0 is a method of class A"
class B(A): 
    def f0(self): return "f0 is a method of class B"

a = A()    
b = B()
print(a.f()) 
print(b.f()) # b.f() -> A.f(b) -> b.f0()
             # b inherits f from class A, and f calls self.f0()
             # however, for b, f0 resolves to B.f0: A.f0() was overridden! 

f0 is a method of class A
f0 is a method of class B


In [47]:
%reset -f
class A(object):
    def   f(self): return a.__class__.f0(a) # uses the global instance a, so this is always A.f0(a)
    def  _f(self): return A.f0(self)        # explicit call of the base class method on self
    def f0(self): return "f0 is a method of class A"
class B(A): 
    def f0(self): return "f0 is a method of class B"

a = A()    
b = B()
print(a.f()) 
print(b.f())   # b.f()  -> A.f(b)  -> a.__class__.f0(a) == A.f0(a)
print(b._f())  # b._f() -> A._f(b) -> A.f0(b)
               # the use of A.f0() is explicit, no override happens

f0 is a method of class A
f0 is a method of class A
f0 is a method of class A


In [112]:
%reset -f
class A(object):
    def __init__(self): self._1=1
class B(A): 
    def __init__(self): 
        super().__init__() 
        '''
         A.__init__() runs in the line above,
         so self._1 already exists!
         and I saved (many?) lines of code
         b._2 also exists, while instances of A have no _2  
        ''' 
        self._2=2    
    
a = A() # sets self._1 
b = B() # sets self._1 and self._2
for attr in dir(b): # attributes that are in b and not in a 
    if attr not in dir(a):
        print(attr)

_2
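
The same idea applies to ordinary methods, not only to `__init__`. The sketch below uses hypothetical classes `Base` and `Derived`, assumes Python 3, and compares the two ways mentioned above of reusing an overridden method: through `super()` and through an explicit call on the base class.

```python
class Base(object):  # hypothetical classes, for illustration only
    def describe(self):
        return ["Base part"]

class Derived(Base):
    def describe(self):
        parts = super().describe()   # reuse the base implementation...
        parts.append("Derived part") # ...and extend it
        return parts

    def describe_explicit(self):
        # same result, naming the base class explicitly
        return Base.describe(self) + ["Derived part"]

d = Derived()
print(d.describe())           # ['Base part', 'Derived part']
print(d.describe_explicit())  # ['Base part', 'Derived part']
```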


In [9]:
%reset -f
class A(object):
    def f(self): pass
class B(A):       # inherits from A
    pass
class C(object):  # does not inherit from A
    pass
class D(B):       # inherits from B, which inherits from A
    pass

print("issubclass(B,A) ->",issubclass(B,A)) # True
print("issubclass(C,A) ->",issubclass(C,A)) # False
print("issubclass(D,A) ->",issubclass(D,A)) # True 

a = A()
b = B()   
print()
print("isinstance(b,B) ->",isinstance(b,B),\
      " .... b is an instance of B")  
print("isinstance(b,A) ->",isinstance(b,A),\
      " .... because B is a subclass of A") 
print("isinstance(a,B) ->",isinstance(a,B),\
      ".... however a is NOT an instance of B") 

issubclass(B,A) -> True
issubclass(C,A) -> False
issubclass(D,A) -> True

isinstance(b,B) -> True  .... b is an instance of B
isinstance(b,A) -> True  .... because B is a subclass of A
isinstance(a,B) -> False .... however a is NOT an instance of B


## Multiple inheritance

* a class can be derived from multiple base classes
* search for attributes follows the Method Resolution Order (MRO)
 * roughly depth-first, left-to-right, but each class appears only once and always before its own base classes 
 * if the base classes cannot be ordered consistently, TypeError is raised
* type(obj).mro(): returns the order in which classes are searched for the attributes of obj 
* super(cls,self) searches the classes that come after cls in type(self).mro(); a bare super() inside a method uses the enclosing class as cls 

In [34]:
%reset -f
class C(object): 
    def f(self): return "C"
class B(C): 
    def f(self): return "B"
class A(B,C): # B->C,C... clearly B is searched first
    pass

a = A() 
print("a.f() is ",a.f()) # B or C? 
print("MRO is", type(a).mro())

a.f() is  B
MRO is [<class '__main__.A'>, <class '__main__.B'>, <class '__main__.C'>, <class 'object'>]


In [113]:
%reset -f
class C(object): 
    def f(self): return "C"
class B(C): 
    def f(self): return "B"

try:    
    class A(C,B): # C,B->C... which class should be searched first?
                  # raises TypeError: no consistent method resolution order
        pass      
except TypeError as err: 
    print(err)

Cannot create a consistent method resolution
order (MRO) for bases B, C


In [9]:
%reset -f
class A(object):
    def __init__(self):
        self.A = None
class B(A): 
    def __init__(self):
        super(A,self).__init__()      # super(A,self) -> object
        self.B = None
class C(B,A): 
    def __init__(self):
        super(C,self).__init__()      # super(C,self) -> B        
        self.C = None
class D(B,A): 
    def __init__(self):  
        super().__init__()            # super()       -> B
                                      # super(D,self) -> B
        self.D = None
class E(D,C,B,A):
    def __init__(self):  
        super(B,self).__init__() # -> A
        super(C,self).__init__() # -> B
        super(D,self).__init__() # -> C        
        super(E,self).__init__() # -> D   

a = A()        
b = B() 
c = C() 
d = D() 
e = E() 

for obj in [a,b,c,d,e]: # which attributes have been set?     
    print("\n",type(obj),"\n\tlist of attributes: ",end="")  
    for attr in dir(obj): 
        if attr not in dir(A): # Please note: A is not A()... 
            print(attr,end=" ")                              


 <class '__main__.A'> 
	list of attributes: A 
 <class '__main__.B'> 
	list of attributes: B 
 <class '__main__.C'> 
	list of attributes: B C 
 <class '__main__.D'> 
	list of attributes: B D 
 <class '__main__.E'> 
	list of attributes: A B C D 
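
In the cell above some `__init__` methods are skipped on purpose. A common alternative, sketched here under the assumption that every class in the hierarchy cooperates, is to call a bare `super().__init__()` in each `__init__`, so that each initializer runs exactly once in MRO order. The class names mirror the ones above but belong to a separate, hypothetical hierarchy.

```python
class A(object):
    def __init__(self):
        super().__init__()   # next in the MRO (object for a plain A)
        self.A = None

class B(A):
    def __init__(self):
        super().__init__()
        self.B = None

class C(A):
    def __init__(self):
        super().__init__()
        self.C = None

class D(B, C):
    def __init__(self):
        super().__init__()   # runs B, then C, then A initializers
        self.D = None

d = D()
print([cls.__name__ for cls in D.mro()])  # ['D', 'B', 'C', 'A', 'object']
print(sorted(vars(d)))                    # ['A', 'B', 'C', 'D']
```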

## super() and Method Resolution Order  

* inside a method of class X, super() means super(X,self): the search starts at the class after X in type(self).mro()
 * the MRO decides what "next" means 
 * diamond structure: in which order are the methods called?
```txt
|               A.f()
|              / \
|             /   \
|            /     \
|           B.f()   C.f()
|            \     / 
|             \   /  
|              \ /   
|               D.f()
```


In [7]:
%reset -f
class A(object):
    def f(self): print("   calls A.f()")
class B(A):
    def f(self):
        print(" enters B.f()")
        super().f()
        print(" exit B.f()")
class C(A):
    def f(self): 
        print("  enters C.f()")
        super().f()
        print("  exit C.f()")
class D(B,C):
    def __init__(self): self.f() 
    def f(self): 
        print("enters D.f()")
        super().f()
        print("exit D.f()")
        
d=D()        
print(D.mro())

enters D.f()
 enters B.f()
  enters C.f()
   calls A.f()
  exit C.f()
 exit B.f()
exit D.f()
[<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>]
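
The output above shows `B.f()` handing over to `C.f()`, although C is not a base class of B. A small sketch of why: the class that `super()` searches first depends on the type of `self`. The helper `next_in_mro` is hypothetical and only reads `type(instance).mro()`. The method `who` is also an illustrative name.

```python
class A(object):
    def who(self): return ["A"]

class B(A):
    def who(self): return ["B"] + super().who()

class C(A):
    def who(self): return ["C"] + super().who()

class D(B, C):
    def who(self): return ["D"] + super().who()

def next_in_mro(cls, instance):
    """Hypothetical helper: the class super(cls, instance) searches first."""
    mro = type(instance).mro()
    return mro[mro.index(cls) + 1]

print(B().who())                     # ['B', 'A']
print(D().who())                     # ['D', 'B', 'C', 'A']
print(next_in_mro(B, B()).__name__)  # A: for a B instance, super() in B is A
print(next_in_mro(B, D()).__name__)  # C: for a D instance, super() in B is C
```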


## Practical example using super()

* force a list to hold integer values only

In [223]:
class intList(list):
    def __init__(self,items):
        for item in items: self._validate(item)
        super().__init__()
        self.items = list(items)
    def __str__(self): return str(self.items)
    def _validate(self,item): 
        if not isinstance(item, int):
            raise TypeError('{} only supports integer values.'.format(self.__class__.__name__))
            
class sorted_intList(intList):
    def __init__(self,int_list):    
        super().__init__(int_list.items)   
        self.items.sort()
    def __str__(self): return str(self.items)
            
try: 
    l = intList([""])            
except TypeError as err:
    print(err,end="\n\n")
l = intList([3,10,8,6,7,2,1])
sl = sorted_intList(l)
print("my intList is.........",l)
print("once sorted becomes...",sl)

intList only supports integer values.

my intList is......... [3, 10, 8, 6, 7, 2, 1]
once sorted becomes... [1, 2, 3, 6, 7, 8, 10]
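
The class above keeps its values in a separate `items` attribute, so the inherited list itself stays empty. As a hedged alternative, here is a sketch of a variant (hypothetical name `IntList`) that passes the validated values to `list.__init__` through `super()` and also validates on `append`. It assumes that only construction and `append` need checking.

```python
class IntList(list):  # hypothetical variant, for illustration only
    def __init__(self, items=()):
        items = list(items)
        for item in items:
            self._validate(item)
        super().__init__(items)      # let list itself store the values

    def _validate(self, item):
        # note: bool is a subclass of int, so True/False pass this check
        if not isinstance(item, int):
            raise TypeError('{} only supports integer values.'.format(
                self.__class__.__name__))
        return item

    def append(self, item):
        super().append(self._validate(item))  # extend list.append

l = IntList([3, 10, 8])
l.append(6)
try:
    l.append("7")
except TypeError as err:
    print(err)
print(l, sorted(l), len(l))
```

Other mutating methods (`extend`, `insert`, item assignment, `+=`) are not overridden in this sketch, so they would still accept non-integer values.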


## Exercises to do 

* implement the classes in the order given (and use inheritance and super())
* raise exceptions of type Lecture2Err when signalling an error
* -- 
* define Point(...): 
 * receives a list of two numeric values (i.e., the coordinates), otherwise raises err
 * method isnumeric(self,c): returns c if numeric, otherwise raises err
 * method is2D(self,coord): returns coord if Point is 2D, otherwise raises err
* define Shape(...): 
 * receives a list of Point objects, otherwise raises err
 * method ispoint(self,p): returns p if it is a Point, otherwise raises err
 * define any other method you may need
* define Circle(...), Segment(...), Triangle(...): 
 * Circle.\__init\__(self,center,radius): center is a Point (only 1 Point), radius is numeric, otherwise raises err
 * Segment.\__init\__(self,points): points is a list of exactly 2 Points, otherwise raises err
 * Triangle.\__init\__(self,points): points is a list of exactly 3 Points, otherwise raises err
 * method checkNpoints(...): raises err if an instance was defined with the wrong number of points
 * method calc_perimeter(self): returns the value of the perimeter 
 * method calc_area(self): returns the value of the area
 * define any other method you may need