<a href="https://colab.research.google.com/github/fbeilstein/machine_learning/blob/master/lecture_1_intro_and_python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Introduction

#Machine learning landscape

#Curriculum

**This course will consist of 3 main parts:**


  * Intro
  * Tools of ML
  * Methods of ML
---

**I expect to cover:**

---

* Intro
    - General Info, Python
    - More about Python

---

  * Tools of ML
   - NumPy Arrays
   - Pandas
   - Matplotlib

---

  * Methods of ML
    - Naive Bayes Classification
    - Linear Regression
    - Support Vector Machines
    - Decision Trees and Random Forests
    - Principal Component Analysis
    - Manifold Learning
    - k-Means Clustering
    - Gaussian Mixture Models
    - Kernel Density Estimation
    - What's next: NNs and beyond


#Instruments we will use

#Python (expected to span at least two lectures)

##Types and Operators

###Big picture

**Preview: built-in objects**
 
Object type | Example constants/usage
---|---
Numbers | 3.14, 1234, 999L (2.X only), 3+4j, decimal
Strings | 'spam', "spam's"
Lists | [1, [2, 'three'], 4]
Dictionaries | {'food':'spam', 'taste':'yum'}
Tuples | (1,'spam', 4, 'U')
Files | text = open('eggs', 'r').read()
Others | sets, types, None, bool

 

**Built-in Types**

   * Key terms: “sequence”, “immutable”, “mapping”
   * Key ideas: no fixed types, no fixed sizes, arbitrary nesting
   * Full story: dir(object), help(object.method), manuals

**Python program structure**

   * Programs are composed of modules
   * Modules contain statements
   * Statements contain expressions
   * Expressions create and process objects

**Why use built-in types?**

   * Python provides objects and supports extensions
   * Built-in objects make simple programs easy to write
   * Built-in objects are components of extensions
   * Often more efficient than custom data structures

###Numbers

**Standard types and operators**

   * Integer, floating-point, hex/octal constants
   * ‘long’ integer type with unlimited precision
   * Built-in mathematical functions: ‘pow’, ‘abs’
   * Utility modules: ‘random’, ‘math’
   * Complex numbers, ‘**’ power operator

**Numeric Python (NumPy)**

   * An optional extension, beyond core language
   * For advanced numeric programming in Python
   * Matrix object, interfaces to numeric libraries, etc.
   * Plus SciPy, matplotlib, pypar, IPython shell, others
   * Python + NumPy = open source MATLAB alternative

**Numeric literals**
 
Constant | Interpretation
---|---
1234, -24 | integers (C longs, 3.X: unlimited size)
99999999L | 2.X long integers (unlimited size)
1.23, 3.14e-10 | floating-point (C doubles)
0o177,0x9f,0b101 | octal, hex, binary integer literals
3+4j, 3.0+4.0j | complex number literals
Decimal('0.11') | fixed-precision decimal (2.4+)
Fraction(2, 3) | rational type (2.6+, 3.0+)

 **Python expressions**

   * Usual algebraic operators: ‘+’ , ‘-’, ‘*’, ‘/’, . . .
   * C’s bitwise operators: “<<”,  “&”, . . .
   * Mixed types: converted up just as in C
   * Parenthesis group sub-expressions

**Numbers in action**

   * Variables created when assigned
   * Variables replaced with their value when used
   * Variables must be assigned before used
   * Expression results echoed back
   * Mixed integer/float: casts up to float
   * Integer division: ‘/’ truncates in 2.X but is true division in 3.X; use ‘//’ to force floor division

In [0]:
a = 3 # name created
b = 4
b / 2 + a # same as ((4 / 2) + 3)

5.0

In [0]:
b / (2.0 + a) # same as (4 / (2.0 + 3))

0.8

In [0]:
1 / 2, 1 // 2 # / keeps remainder, // does not

(0.5, 0)

In [0]:
4 / 5.0

0.8

###The dynamic typing

In [0]:
a = 3 # Where are the missing declarations?

   * Names versus objects

   * Names are always “references” (pointers) to objects

   * Names are created when first assigned (or so)

   * Objects have types, names do not

   * Each value is a distinct object (normally)

   * Objects are pieces of memory with value + operations

   * Shared references to mutables are open to side effects (on purpose)
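
The points above can be sketched in a few lines. This is a minimal illustration; the names `first`, `second`, `type_after` and `same_object` are made up for the example.

```python
# Names are references; the object, not the name, carries the type.
a = 3                      # the name a refers to an int object
a = 'spam'                 # the same name now refers to a str object
type_after = type(a).__name__

# Two names can share one mutable object.
first = [1, 2, 3]
second = first             # no copy: both names reference the same list
second.append(4)           # the in-place change is visible through both names

same_object = first is second
print(type_after, first, same_object)
```

Rebinding a name never changes the object it used to reference; only in-place operations on a mutable object (such as `append`) show up through every reference.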

In [0]:
x = 1
x << 2 # shift left 2 bits

4

In [0]:
x | 2 # bitwise OR

3

In [0]:
x & 1 # bitwise AND

1

**Long integers**

   * The plain ‘int’ type in 3.X Python
   * In 2.X: a separate type, written with an ‘L’ suffix
   * Some performance penalty
   * In 2.X, integers are auto-converted to long if too big (“L” optional)

In [0]:
9999999999999999999999999999 + 1

10000000000000000000000000000

**Decimal and Fraction extension types**

In [0]:
0.1 + 0.1 + 0.1 - 0.3 

5.551115123125783e-17

In [0]:
from decimal import Decimal

Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')

Decimal('0.0')

In [0]:
Decimal('0.1') + Decimal('0.10') + Decimal('0.10') - Decimal('0.30')

Decimal('0.00')

In [0]:
from fractions import Fraction

Fraction(1, 3) + Fraction(2, 8)

Fraction(7, 12)

In [0]:
Fraction(1, 10) + Fraction(1, 10) + Fraction(1, 10) - Fraction(3, 10)

Fraction(0, 1)
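
One detail worth trying yourself: how the value is built matters. The sketch below assumes Python 3 and only the standard `decimal` and `fractions` modules; the variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

from_float = Decimal(0.1)      # captures the float's exact binary value
from_text = Decimal('0.1')     # exactly one tenth
same_value = from_float == from_text

third = Fraction(1, 3)
whole = third * 3              # exact rational arithmetic, no rounding
quarter = Fraction('0.25')     # strings are parsed exactly

print(same_value, whole, quarter)
```

Building a `Decimal` from a string avoids inheriting the float's representation error, which is why the cells above pass strings.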

**Python operators and precedence**

   * Operators lower in table bind tighter (parens force order)
   * Preview: all Python operators may be overloaded by Python classes and C extension types
   * Added in Python 2.0: +=, *=, &=, … augmented assignment statements, not operators
   * Python 3.X: `X` → repr(X), X / Y → true div, X <> Y → X != Y
   * Recent operator/expression additions:

Operators | Description
---|---
x if y else z | Ternary if, same as 4-line if/else statement
yield [from] x | Generator function’s iteration result (return can send one too)
await x | For 3.5+ async def coroutines
x @ y | Matrix multiply in 3.5+ (but not used by core Python itself!)
[x, *iter] | Unpacks (flattens) objects in literals in 3.5+

Operators | Description
---|---
x or y, lambda args: expr | Logical ‘or’ (y is only evaluated if x is false), anonymous function
x and y | Logical ‘and’ (y is only evaluated if x is true)
not x | Logical negation
<, <=, >, >=, ==, <>, !=, is, is not, in, not in | Comparison operators, sequence membership
x \| y | Bitwise ‘or’
x ^ y | Bitwise ‘exclusive or’
x & y | Bitwise ‘and’
x << y, x >> y | Shift x left or right by y bits
x + y, x - y | Addition/concatenation, subtraction
x * y, x / y, x % y, x // y | Multiply/repetition, divide, remainder/format, floor divide
x ** y, -x, +x, ~x | Power, unary negation, identity, bitwise complement
x[i], x[i:j], x.y, x(...) | Indexing, slicing, qualification, function calls
(...), [...], {...}, `...` | Tuple, list, dictionary, conversion to string
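
A few expressions show how the precedence table resolves mixed operators, plus two of the newer expression forms. This is a small sketch for Python 3.5+; all variable names are illustrative.

```python
# Operators lower in the table bind tighter.
mult_first = 2 + 3 * 4            # parsed as 2 + (3 * 4)
grouped = (2 + 3) * 4             # parentheses force the order
logic = not 1 > 2 and 3 < 4       # parsed as (not (1 > 2)) and (3 < 4)

# Two of the newer expression forms.
label = 'big' if grouped > 10 else 'small'   # ternary if/else
merged = [0, *range(1, 4), 4]                # unpacking inside a list literal

print(mult_first, grouped, logic, label, merged)
```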

###Strings

*     Ordered collections of characters
*     No char in Python, just 1-character strings
*     Constants, operators, utility modules (string, re)
*     Strings are immutable sequences
*     See re module for pattern-based text processing

About Unicode support: this section covers basic, ASCII text strings. In 3.X, strings are always Unicode and support encoding to bytes, and bytes strings represent truly binary 8-bit data and support decoding to strings. In 2.X, strings are essentially the same as 3.X bytes strings (containing just 8-bit characters), and Unicode is a special type similar to 3.X strings.
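
The str/bytes split in 3.X can be sketched as a round trip. The example assumes Python 3 and the UTF-8 encoding; the variable names are illustrative.

```python
text = 'A\xc4B'                      # a 3.X str: three Unicode code points
encoded = text.encode('utf-8')       # str -> bytes
decoded = encoded.decode('utf-8')    # bytes -> str

print(len(text), len(encoded))       # code points versus raw bytes
print(encoded, decoded == text)
```

The non-ASCII character takes two bytes in UTF-8, so the bytes object is one item longer than the string it encodes.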

**Common string operations**

Operation |	 Interpretation
---|---
s1 = '' |	single quotes (empty)
s2 = "spam's" |	double quotes (same)
block = """...""" |	triple-quoted blocks, can span multiple lines
r'C:\new\text\file.txt' |	raw strings (\ kept)
s1 + s2, s2 * 3 |	concatenate, repeat
s2[i], s2[i:j], s2[i:j:k], len(s2) |	index, slice, length
'a %s parrot' % 'dead', 'a {} parrot'.format('dead') |	string formatting: original, 2.6+ option
u'A\xC4B', 'A\xC4B' |	Unicode: 2.X, 3.X
b'\x00spam\x01' |	bytes: 3.X (and 2.X)
f'we get {spam} a lot' |	f-string formatting: 3.6+
for x in s2, 'm' in s2 |	iteration/membership
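
The three formatting rows of the table produce the same text. A quick comparison, assuming Python 3.6+ for the f-string (the variable names are illustrative):

```python
spam = 'spam'

by_percent = 'we get %s a lot' % spam          # original % expression
by_format = 'we get {} a lot'.format(spam)     # .format() method (2.6+)
by_fstring = f'we get {spam} a lot'            # f-string (3.6+)

all_equal = by_percent == by_format == by_fstring
has_m = 'm' in spam                            # membership test
letters = [ch for ch in spam]                  # iteration over characters

print(by_fstring, all_equal, has_m, letters)
```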

**Newer extensions**

*    String methods:
     -      X.split('+') same as older string.split(X, '+')
     -      string module requires import, methods do not
     -      methods now faster, preferred to string module
*    Unicode strings:
     -      Multi-byte characters, for internationalization (I18N)
     -      U'xxxx' constants, Unicode modules, auto conversions
     -      Can mix with normal strings, or convert: str(U), unicode(S)
     -      Varies in 3.X: see the Unicode note above for more details
*    Template formatting: string module, see ahead
*    String .format() method: largely redundant with %

**Strings in action**

In [0]:
'abc' + 'def' # concatenation: a new string

'abcdef'

In [0]:
'Ni!' * 4 # like "Ni!" + "Ni!" + ...

'Ni!Ni!Ni!Ni!'

**Indexing and slicing**

In [0]:
S = 'spam'
S[0], S[-2] # indexing from front or end

('s', 'a')

In [0]:
S[1:3], S[1:], S[:-1] # slicing: extract section

('pa', 'pam', 'spa')

**Changing and formatting**

In [0]:
S = S + 'Spam!' # to change a string, make a new one
S

'spamSpam!'

In [0]:
'That is %d %s bird!' % (1, 'dead') # like C sprintf

'That is 1 dead bird!'

**Advanced formatting examples**

In [0]:
x = 1234
res = "integers: ...%d...%-6d...%06d" % (x, x, x)
res

'integers: ...1234...1234  ...001234'

In [0]:
x = 1.23456789
x

1.23456789

In [0]:
'%e | %f | %g' % (x, x, x)

'1.234568e+00 | 1.234568 | 1.23457'

In [0]:
'%-6.2f | %05.2f | %+06.1f' % (x, x, x)

'1.23   | 01.23 | +001.2'

In [0]:
int(x)

1

In [0]:
round(x, 2)

1.23

In [0]:
round(x, 4)

1.2346

In [0]:
"%o %x %X" % (64, 64, 255)

'100 40 FF'

In [0]:
hex(255), int('0xff', 16), eval('0xFF')

('0xff', 255, 255)

In [0]:
ord('s'), chr(115)

(115, 's')

**Formatting with dictionaries**

In [0]:
D = {'xx': 1, 'yy': 2}
"%(xx)d => %(yy)s" % D

'1 => 2'

In [0]:
aa = 3
bb = 4
"%(aa)d => %(bb)s" % vars()

'3 => 4'

In [0]:
reply = """
Greetings...
Hello %(name)s!
Your age is %(age)s
"""
values = {'name': 'Bob', 'age': 40}
print(reply % values)


Greetings...
Hello Bob!
Your age is 40



**Formatting method**

In [0]:
'{} {}'.format(42, 'spam')

'42 spam'

In [0]:
'{0:.2f}'.format(1.234)

'1.23'

In [0]:
'{0:,.2f}'.format(1234567.234)

'1,234,567.23'

**Template formatting**

In [0]:
('%(page)i: %(title)s' % {'page':2, 'title': 'The Best of Times'})

'2: The Best of Times'

In [0]:
import string
t = string.Template('$page: $title')
t.substitute({'page':2, 'title': 'The Best of Times'})

'2: The Best of Times'

In [0]:
s = string.Template('$who likes $what')
s.substitute(who='bob', what=3.14)

'bob likes 3.14'

In [0]:
s.substitute(dict(who='bob', what=3.14))

'bob likes 3.14'

**Common string tools**

In [0]:
S = "spammify"
S.upper() # convert to uppercase

'SPAMMIFY'

In [0]:
S.find("mm") # return index of substring

3

In [0]:
int("42"), str(42) # convert from/to string

(42, '42')

In [0]:
S.split('mm') # splitting and joining

['spa', 'ify']

In [0]:
'XX'.join(S.split("mm"))

'spaXXify'

**Example: replacing text**

In [0]:
# replace method
S = 'spammy'
S = S.replace('mm', 'xx')
S

'spaxxy'

In [0]:
S = 'xxxxSPAMxxxxSPAMxxxx'
S.replace('SPAM', 'EGG') # replace all

'xxxxEGGxxxxEGGxxxx'

In [0]:
# finding and slicing
S = 'xxxxSPAMxxxxSPAMxxxx'
where = S.find('SPAM') # search for position
where # occurs at offset 4

4

In [0]:
S = S[:where] + 'EGGS' + S[(where+4):]
S

'xxxxEGGSxxxxSPAMxxxx'

In [0]:
# exploding to/from list
S = 'spammy'
L = list(S) # explode to list
L

['s', 'p', 'a', 'm', 'm', 'y']

In [0]:
L[3] = 'x' # multiple in-place changes
L[4] = 'x' # can't do this for strings
L

['s', 'p', 'a', 'x', 'x', 'y']

In [0]:
S = ''.join(L) # implode back to string
S

'spaxxy'

**Example: parsing with slices**

In [0]:
line = 'aaa bbb ccc'
col1 = line[0:3] # columns at fixed offsets
col3 = line[8:]
col1

'aaa'

In [0]:
col3

'ccc'

**Example: parsing with splits**

In [0]:
line = 'aaa bbb ccc' # split around whitespace
cols = line.split()
cols

['aaa', 'bbb', 'ccc']

In [0]:
line = 'bob,hacker,40' # split around commas
line.split(',')

['bob', 'hacker', '40']

**Generic type concepts**

*    Types share operation sets by categories
*    Numbers support addition, multiplication, . . .
*    Sequences support indexing, slicing, concatenation, . . .
*    Mappings support indexing by key, . . .
*    Mutable types can be changed in place
*    Strings are immutable sequences

**Concatenation and repetition**

*      X + Y makes a new sequence object with the contents of both operands
*      X * N makes a new sequence object with N copies of the sequence operand

**Indexing and slicing**

*    Indexing
     -      Fetches components via offsets: zero-based
     -      Negative indexes: adds length to offset
     -      S[0] is the first item
     -      S[-2] is the second from the end (4 - 2)
     -      Also works on mappings, but index is a key
*    Slicing
     -      Extracts contiguous sections of a sequence
     -      Slices default to 0 and the sequence length if omitted
     -      S[1:3] fetches from offsets 1 up to but not including 3
     -      S[1:] fetches from offsets 1 through the end (length)
     -      S[:-1] fetches from offsets 0 up to but not including last
     -      S[I:J:K] newer, I to J by K, K is a stride/step (S[::2])
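
The rules above, including the stride form, can be checked on the same four-character string. A short sketch; the variable names are illustrative.

```python
S = 'spam'

from_end = S[-2]                 # negative index: counted from the end
same_item = S[len(S) - 2]        # the length is added to a negative offset
middle = S[1:3]                  # offsets 1 up to but not including 3
whole_copy = S[:]                # both bounds omitted: 0 and len(S)
every_other = S[::2]             # stride of 2
backwards = S[::-1]              # negative stride walks right to left

print(from_end, same_item, middle, whole_copy, every_other, backwards)
```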

###Lists

*     Arrays of object references
*     Access by offset
*     Variable length, heterogeneous, arbitrarily nestable
*     Category: mutable sequence
*     Ordered collections of arbitrary objects

**Common list operations**

Operation |	Interpretation
---|---
L1 = [] |	an empty list
L2 = [0, 1, 2, 3] |	4-items: indexes 0..3
['abc', ['def', 'ghi']] |	nested sublists
L2[i], L2[i:j], len(L2) |	index, slice, length
L1 + L2, L2 * 3 |	concatenate, repeat
L1.sort(), L2.append(4) |	methods: sort, grow
del L2[k], L2[i:j] = [] |	shrinking
L2[i:j] = [1,2,3] |	slice assignment
list(range(4)) |	make integer lists (2.X: range(4), xrange(0, 4))
for x in L2, 3 in L2 |	iteration/membership
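
The growing, shrinking and slice-assignment rows of the table can be traced step by step. A minimal sketch for Python 3; the comments show the list after each statement.

```python
L2 = [0, 1, 2, 3]

L2.append(4)                  # grow at the end:        [0, 1, 2, 3, 4]
del L2[0]                     # delete one item:        [1, 2, 3, 4]
L2[1:3] = []                  # delete a slice:         [1, 4]
L2[1:1] = [2, 3]              # insert, replace none:   [1, 2, 3, 4]
L2[0:2] = ['a', 'b', 'c']     # replace 2 items with 3: ['a', 'b', 'c', 3, 4]

numbers = list(range(4))      # in 3.X, range must be wrapped in list()
print(L2, numbers)
```

Slice assignment does not require the replacement to match the slice in length, which is why it can both shrink and grow a list.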

**Lists in action**

In [0]:
[1, 2, 3] + [4, 5, 6] # concatenation

[1, 2, 3, 4, 5, 6]

In [0]:
['Ni!'] * 4 # repetition

['Ni!', 'Ni!', 'Ni!', 'Ni!']

**Indexing and slicing**

In [0]:
L = ['spam', 'Spam', 'SPAM!']
L[2]

'SPAM!'

In [0]:
L[1:]

['Spam', 'SPAM!']

**Changing lists in-place**

In [0]:
L[1] = 'eggs' # index assignment
L

['spam', 'eggs', 'SPAM!']

In [0]:
L[0:2] = ['eat', 'more'] # slice assignment
L # replace items 0,1

['eat', 'more', 'SPAM!']

In [0]:
L.append('please') # append method call
L

['eat', 'more', 'SPAM!', 'please']

*      Only works for mutable objects: not strings
*      Index assignment replaces an object reference
*      Slice assignment deletes a slice and inserts new items
*      Append method inserts a new item on the end (realloc)

**Preview: iteration/membership**

In [0]:
for x in L: print(x, end=',')

eat,more,SPAM!,please,

**Example: 2-dimensional array**

In [0]:
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
matrix

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [0]:
matrix[1]

[4, 5, 6]

In [0]:
matrix[1][1]

5

In [0]:
matrix[2][0]

7

###Dictionaries

*     Tables of object references
*     Access by key, not offset (hash-tables)
*     Variable length, heterogeneous, arbitrarily nestable
*     Category: mutable mappings (not a sequence)
*     Unordered collections of arbitrary objects (insertion order is kept since 3.7)

**Common dictionary operations**

Operation |	Interpretation
---|---
d1 = {} |	empty dictionary
d2 = {'spam': 2, 'eggs': 3} |	2 items
d3 = {'food': {'ham': 1, 'egg': 2}} |	nesting
d2['eggs'], d3['food']['ham'] |	indexing by key
'eggs' in d2, d2.keys() |	membership (d2.has_key('eggs') in 2.X), keys
d2.get('eggs', default) |	default values
len(d1) |	length (entries)
d2[key] = new, del d2[key] |	adding/changing

**Dictionaries in action**

In [0]:
d2 = {'spam': 2, 'ham': 1, 'eggs': 3}
d2['spam']

2

In [0]:
len(d2) # number entries

3

In [0]:
d2.keys() # view of keys (a list in 2.X)

dict_keys(['spam', 'ham', 'eggs'])

**Changing dictionaries**

In [0]:
d2['ham'] = ['grill', 'bake', 'fry']
d2

{'eggs': 3, 'ham': ['grill', 'bake', 'fry'], 'spam': 2}

In [0]:
del d2['eggs']
d2

{'ham': ['grill', 'bake', 'fry'], 'spam': 2}

**Making dictionaries**

In [0]:
# literals
D = {'name': 'Bob', 'age': 42, 'job': 'dev'}
D

{'age': 42, 'job': 'dev', 'name': 'Bob'}

In [0]:
# keywords
D = dict(name='Bob', age=42, job='dev')
D

{'age': 42, 'job': 'dev', 'name': 'Bob'}

In [0]:
# field by field
D = {}
D['name'] = 'Bob'
D['age'] = 42
D['job'] = 'dev'
D

{'age': 42, 'job': 'dev', 'name': 'Bob'}

In [0]:
# zipped keys/values
pairs = zip(['name', 'age', 'job'], ('Bob', 42, 'dev'))
pairs

<zip at 0x7fb2d7cc9b08>

In [0]:
D = dict(pairs)
D

{'age': 42, 'job': 'dev', 'name': 'Bob'}

In [0]:
# key lists
D = dict.fromkeys(['name', 'age', 'job'], '?')
D

{'age': '?', 'job': '?', 'name': '?'}

**A language table**

In [0]:
table = {'Perl': 'Larry Wall', 
         'Tcl': 'John Ousterhout', 
         'Python': 'Guido van Rossum' }

language = 'Python'
creator = table[language]
creator

'Guido van Rossum'

In [0]:
for lang in table.keys(): print(lang)

Perl
Tcl
Python


**Dictionary usage notes**

*      Sequence operations don't work!
*      Assigning to new indexes adds entries
*      Keys need not always be strings
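
The three notes can be checked directly. A small sketch for Python 3; the names `has_99`, `missing` and `concat_error` are illustrative.

```python
D = {}
D[99] = 'spam'               # assigning to a new key adds an entry
D[(1, 2)] = 'tuple key'      # any immutable (hashable) object can be a key
D['name'] = 'Bob'

has_99 = 99 in D             # membership tests the keys
missing = D.get(100, 'n/a')  # default value instead of a KeyError

try:
    D + {'job': 'dev'}       # concatenation is a sequence operation
except TypeError as exc:
    concat_error = type(exc).__name__

print(len(D), has_99, missing, concat_error)
```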

**Example: simulating auto-grown lists**

In [0]:
L = [] # L=[0]*100 would help
print("ERROR EXPECTED")
L[99] = 'spam'

ERROR EXPECTED


IndexError: ignored

In [0]:
D = {}
D[99] = 'spam'
D[99]

'spam'

In [0]:
D

{99: 'spam'}

**Example: dictionary-based records**

In [0]:
rec = {}
rec['name'] = 'mel'
rec['age'] = 40
rec['job'] = 'trainer/writer'
print(rec['name'])

mel


In [0]:
mel = {'name': 'Mark', 
       'jobs': ['trainer', 'writer'], 
       'web': 'www.rmi.net/~lutz', 
       'home': {'state': 'CO', 'zip':80503}}

In [0]:
mel['jobs']

['trainer', 'writer']

In [0]:
mel['jobs'][1]

'writer'

In [0]:
mel['home']['zip']

80503

**Example: more object nesting**

In [0]:
rec = {'name': {'first': 'Bob', 'last': 'Smith'},
       'job': ['dev', 'mgr'],
       'age': 40.5}
rec['name']

{'first': 'Bob', 'last': 'Smith'}

In [0]:
rec['name']['last']

'Smith'

In [0]:
rec['job'][-1]

'mgr'

In [0]:
rec['job'].append('janitor')
rec

{'age': 40.5,
 'job': ['dev', 'mgr', 'janitor'],
 'name': {'first': 'Bob', 'last': 'Smith'}}

In [0]:
db = {}
db['bob'] = rec # collecting records into a db
db

{'bob': {'age': 40.5,
  'job': ['dev', 'mgr', 'janitor'],
  'name': {'first': 'Bob', 'last': 'Smith'}}}

Preview: Bob could be a record in a real database, using the shelve or pickle persistence modules: watch for more details in the class and database units. This nested dictionary/list structure is also the genesis of JSON: see Python's json standard library module for a trivial translation utility, and the Database unit for a simple example.

**Example: dictionary-based sparse matrix**

In [0]:
Matrix = {}
Matrix[(2,3,4)] = 88 # tuple key is coordinates
Matrix[(7,8,9)] = 99

X = 2; Y = 3; Z = 4 # ; separates statements
Matrix[(X,Y,Z)]

88

In [0]:
Matrix

{(2, 3, 4): 88, (7, 8, 9): 99}

In [0]:
Matrix.get((0, 1, 2), 'Missing')

'Missing'

###Tuples

*     Arrays of object references
*     Access by offset
*     Fixed length, heterogeneous, arbitrarily nestable
*     Category: immutable sequences (can't be changed)
*     Ordered collections of arbitrary objects

**Common tuple operations**

Operation |	Interpretation
---|---
() |	an empty tuple
T1 = (0,) |	a one-item tuple
T2 = (0, 1, 2, 3) |	a 4-item tuple
T2 = 0, 1, 2, 3 |	another 4-item tuple
T3 = ('abc', ('def', 'ghi')) |	nested tuples
T2[i], T2[i:j], len(T2) |	index, slice, length
T1 + T2, T2 * 3 |	concatenate, repeat
for x in T2, 3 in T2 |	iteration/membership

**Tuples in action**

In [0]:
T1 = (1, 'spam')
T2 = (2, 'ni')
 
T1 + T2

(1, 'spam', 2, 'ni')

In [0]:
T1 * 4

(1, 'spam', 1, 'spam', 1, 'spam', 1, 'spam')

In [0]:
T2[1]

'ni'

In [0]:
T2[1:]

('ni',)

**Why lists and tuples?**

*      Immutability provides integrity
*      Some built-in operations require tuples (argument lists)
*      Guido was a mathematician: sets versus data structures
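
Two of these points are easy to demonstrate. The sketch below assumes Python 3; `grid`, `point` and `total` are hypothetical names for the example.

```python
point = (2, 3)
grid = {point: 'treasure'}       # an immutable tuple works as a dictionary key

try:
    grid[[2, 3]] = 'trap'        # a mutable list does not
except TypeError:
    list_key_ok = False
else:
    list_key_ok = True


def total(*args):
    # extra positional arguments always arrive packed in a tuple
    return sum(args), type(args).__name__


result = total(1, 2, 3)
print(grid[(2, 3)], list_key_ok, result)
```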

###Files

*     A wrapper around C's stdio file system (io module in 3.X)
*     The builtin open function returns a file object
*     File objects export methods for file operations
*     Files are not sequences or mappings (methods only)
*     Files are a built-in C extension type

**Common file operations**

Operation |	Interpretation
---|---
O = open('/tmp/spam', 'w') |	create output file
I = open('data', 'r') |	create input file
I.read(), I.read(1) |	read file, byte
I.readline(), I.readlines() |	read line, lines list
O.write(S), O.writelines(L) |	write string, lines
O.close() |	manual close (or on free)
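
The operations in the table can be combined with the `with` statement, which closes the file automatically. A sketch for Python 3; the scratch path built with `tempfile` is an assumption made to keep the example self-contained.

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'spam.txt')

with open(path, 'w') as out:              # closed automatically on block exit
    out.write('spam\n')                   # write one string
    out.writelines(['eggs\n', 'ham\n'])   # write a list of strings

with open(path) as inp:                   # mode defaults to 'r'
    first_char = inp.read(1)              # one character
    rest_of_line = inp.readline()         # up to and including the newline
    remaining = inp.readlines()           # list of the remaining lines

print(repr(first_char), repr(rest_of_line), remaining)
```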

**Files in action**

In [0]:
newfile = open('test.txt', 'w')
newfile.write(('spam' * 5) + '\n')
newfile.close()
 

myfile = open('test.txt')
text = myfile.read()
text

'spamspamspamspamspam\n'

**Related Python tools**

*      Descriptor based files: os module
*      DBM keyed files
*      Persistent object shelves
*      Pipes, fifos, sockets

###General object properties

**Type categories revisited**

*      Objects share operations according to their category
*      Only mutable objects may be changed in-place

Object type |	Category |	Mutable?
---|---|---
Numbers |	Numeric |	No
Strings |	Sequence |	No
Lists |	Sequence |	Yes
Dictionaries |	Mapping |	Yes
Tuples |	Sequence |	No
Files |	Extension |	n/a

**Generality**

*     Lists, dictionaries, and tuples can hold any kind of object
*     Lists, dictionaries, and tuples can be arbitrarily nested
*     Lists and dictionaries can dynamically grow and shrink

**Nesting example**

In [0]:
L = ['abc', [(1, 2), ([3], 4)], 5]
L[1][1]

([3], 4)

In [0]:
L[1][1][0]

[3]

In [0]:
L[1][1][0][0]

3

**Shared references**

*     Assignments always create references to objects
*     Can generate shared references to the same object
*     Changing a mutable object impacts all references
*     To avoid effect: make copies with X[:], list(X), etc.
*     Tip: distinguish between names and objects!
       -   Names have no "type", but objects do

In [0]:
X = [1, 2, 3]
L = ['a', X, 'b']
D = {'x':X, 'y':2}

X[1] = 'surprise' # changes all 3 references!
L

['a', [1, 'surprise', 3], 'b']

In [0]:
D

{'x': [1, 'surprise', 3], 'y': 2}
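
To avoid the side effect, copy instead of sharing. The sketch below compares an alias, a top-level copy and a deep copy, assuming the standard `copy` module; the variable names are illustrative.

```python
import copy

X = [1, [2, 3]]
alias = X                    # shared reference: no copy at all
shallow = X[:]               # top-level copy (list(X) is equivalent)
deep = copy.deepcopy(X)      # copies nested objects too

X[0] = 'surprise'            # top-level change
X[1].append(4)               # change inside the nested list

print(alias)
print(shallow)
print(deep)
```

A slice copy protects only the top level: the nested list is still shared, so the appended item shows up in `shallow` but not in `deep`.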

**Equality and truth**

*     Applied recursively for nested data structures
*     is tests identity (object address)
*     True: non-zero number or non-empty data structure
*     None is a special empty/false object

In [0]:
L1 = [1, ('a', 3)] # same value, unique objects
L3 = [1, ('a', 3)]
L1 == L3, L1 is L3 # equivalent?, same object?

(True, False)
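
The truth rules can be tabulated with `bool()`. A brief sketch; the sample values are arbitrary.

```python
values = [0, 1, '', 'spam', [], [0], {}, None, 0.0]
truth = [bool(v) for v in values]     # zero and empty objects are false

nested_equal = [1, ('a', [2, 3])] == [1, ('a', [2, 3])]   # == recurses

nothing = None
is_none = nothing is None             # identity is the usual test for None

print(truth, nested_equal, is_none)
```

Note that `[0]` is true: a list is judged by whether it is empty, not by what it contains.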

**Other comparisons**

*     Applied recursively for nested data structures
*     Strings compared lexicographically
*     Lists and tuples compared depth-first, left-to-right
*     Dictionaries compared by sorted (key, value) lists (2.X only)
*     Dictionaries don't support magnitude comparisons (<, >) in 3.X, but sorted lists of their .items() do

In [0]:
L1 = [1, ('a', 3)]
L2 = [1, ('a', 2)]
L1 < L2, L1 == L2, L1 > L2

(False, False, True)
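
For dictionaries in 3.X, equality still works but magnitude tests do not. A minimal sketch of the workaround with sorted item lists; `d1` and `d2` are illustrative.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 2, 'a': 3}

equal = d1 == d2                  # equality is supported

try:
    d1 < d2                       # magnitude comparison is not (3.X)
except TypeError:
    ordering_supported = False
else:
    ordering_supported = True

# Compare sorted (key, value) lists instead.
by_items = sorted(d1.items()) < sorted(d2.items())

print(equal, ordering_supported, by_items)
```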

###Summary: Python's type hierarchies

*     Everything is an object type in Python: first class
*     Types are objects too: type(X) returns type object of X
*     Preview: C extension modules and types use same mechanisms as Python types

**How to break your code's flexibility**

In [0]:
L = [1, 2, 3]
if type(L) == type([]):
  print('yes')

yes


In [0]:
if type(L) == list:
  print('yes')

yes


In [0]:
if isinstance(L, list):
  print('yes')

yes
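
Why these tests limit flexibility: an exact type comparison rejects subclasses, and any type test rejects objects that merely behave like a list. A sketch with hypothetical names (`Stack`, `total_length`):

```python
class Stack(list):
    """A list subclass, used only for illustration."""


s = Stack([1, 2])
exact_match = type(s) == list        # rejects the subclass
instance_match = isinstance(s, list) # accepts it


def total_length(items):
    """Sum len() over the items; relies on behaviour, not on type."""
    return sum(len(item) for item in items)


from_list = total_length(['ab', 'cde'])
from_tuple = total_length(('ab', [1, 2, 3]))
from_dict = total_length({'ab': 1, 'cde': 2})   # iterates over the keys

print(exact_match, instance_match, from_list, from_tuple, from_dict)
```

The function never checks types, so it works for lists, tuples, dictionaries and anything else that supports iteration.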


**Newer types**

*      Decimal, fraction (modules): see above
*      Boolean (bool, True, False): see next section
*      Sets: 2.4 (module in 2.3), 3.X/2.7 literal/comprehension
*      Other extension types: namedtuple, collections (std lib)

In [0]:
True + 2 # bool: True/False like 1/0 (next section)

3

**Sets**

In [0]:
x = set('abcde') # set constructor
y = set('bdxyz')
x

{'a', 'b', 'c', 'd', 'e'}

In [0]:
'e' in x # membership

True

In [0]:
x - y # difference

{'a', 'c', 'e'}

In [0]:
x | y # union

{'a', 'b', 'c', 'd', 'e', 'x', 'y', 'z'}

In [0]:
x & y # intersection

{'b', 'd'}

In [0]:
x = {'a', 'b', 'c', 'd'} # 3.X set literal
x

{'a', 'b', 'c', 'd'}

In [0]:
{'b', 'd'} < x # subset test

True

In [0]:
{ord(c) for c in 'spam'} # 3.X set comprehension (ahead)

{97, 109, 112, 115}

In [0]:
{c: ord(c) for c in 'spam'} # 3.X dict comprehension (ahead)

{'a': 97, 'm': 109, 'p': 112, 's': 115}

###Built-in type gotchas

*    Assignment creates references, not copies

In [0]:
L = [1, 2, 3]
M = ['X', L, 'Y']
M

['X', [1, 2, 3], 'Y']

In [0]:
L[1] = 0
M

['X', [1, 0, 3], 'Y']

*    Repetition adds 1-level deep

In [0]:
L = [4, 5, 6]
X = L * 4 # like [4, 5, 6] + [4, 5, 6] + ...
Y = [L] * 4 # [L] + [L] + ... = [L, L,...]
X, Y

([4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6],
 [[4, 5, 6], [4, 5, 6], [4, 5, 6], [4, 5, 6]])

In [0]:
L[1] = 0
X, Y

([4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6],
 [[4, 0, 6], [4, 0, 6], [4, 0, 6], [4, 0, 6]])

*    Cyclic structures print oddly (loop in 1.5)

In [0]:
L = ['hi.']; L.append(L) # append reference to self
L # dots=cycle today (no loop)

['hi.', [...]]

*    Immutable types can't be changed in-place

In [0]:
T = (1, 2, 3)
print("ERROR EXPECTED")
T[2] = 4 # error!

ERROR EXPECTED


TypeError: ignored

In [0]:
T = T[:2] + (4,) # okay: (1, 2, 4)
T

(1, 2, 4)

##Basic Statements

**Python program structure**

*     Programs are composed of modules
*     Modules contain statements
*     Statements contain expressions: logic
*     Expressions create and process objects

Statement |	Examples
---|---
Assignment |	curly, moe, larry = 'good', 'bad', 'ugly'
Calls |	stdout.write("spam, ham, toast\n")
Print (a call in 3.X) |	print 1, "spam", 4, 'u',
If/elif/else |	if "python" in text: mail(poster, spam)
For/else |	for peteSake in spam: print peteSake
While/else |	while 1: print 'spam',i; i=i+1
Pass |	while 1: pass
Break, Continue |	while 1: break
Try/except/finally |	try: spam() except: print 'spam error'
Raise |	raise overWorked, cause
Import, From |	import chips; from refrigerator import beer
Def, Return, Yield |	def f(a, b, c=1, *d): return a+b+c+d[0]
Class |	class subclass(superclass): staticData = []
Global, Nonlocal (3.X) |	def function(): global x, y; x = 'new'
Del |	del spam[k]; del spam[i:j]; del spam.attr
Exec (a call in 3.X) |	exec "import " + moduleName in gdict, ldict
Assert |	assert name != "", "empty name field"
With/As (2.6+) |	with open('text.dat') as F: process(F)

###General syntax concepts

**Python syntax**

*     No variable/type declarations
*     No braces or semicolons
*     The "what you see is what you get" of languages

**Python assignment**

*     Assignments create object references
*     Names are created when first assigned
*     Names must be assigned before being referenced

**A Tale of Two Ifs**

C++/Java/etc.:
```c
if (x)
{
    x = y + z; // braces, semicolons, parens
}
```

Python:
```python
if x:
  x = y + z # indented blocks, end of line, colon
```

*      What Python removes [(), ;, {}], and adds [:]
*      Why indentation syntax? [readability counts!]

###Assignment

*     = assigns object references to names or components
*     Implicit assignments: import, from, def, class, for, calls

Operation |	Interpretation
---|---
spam = 'SPAM' |	basic form
spam, ham = 'yum', 'YUM' |	tuple assignment
[spam, ham] = ['yum', 'YUM'] |	list assignment
a, b, c, d = 'spam' |	sequence assign
spam = ham = 'lunch' |	multiple-target
spam += 42; ham *= 12 |	Augmented (2.0)
a, *b, c = [1, 2, 3, 4] |	Extended (3.X)
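
The forms in the table, run in order. A short sketch for Python 3; the names other than those in the table are illustrative.

```python
a, b, c, d = 'spam'                 # sequence assignment: one character each
head, *body, tail = [1, 2, 3, 4]    # extended unpacking (3.X): body is a list

spam = ham = []                     # multiple-target: both names share one list
spam.append('lunch')                # so ham sees the change too

left, right = 'yum', 'YUM'
left, right = right, left           # tuple assignment swaps without a temporary

count = 1
count += 42                         # augmented assignment

print(a, d, body, ham, left, count)
```

Multiple-target assignment is safe with immutable values such as strings, but with a mutable object every target refers to the same instance.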

**Variable name rules**

*     (_ or letter) + (any number of letters, digits, _s)
*     Case matters: SPAM is not spam
*     But can't use reserved words (table below), plus:
    -     yield, for generators (2.3 and later)
    -     with and as, for context managers (2.6, optional in 2.5 though not in 2.5 IDLE!)
    -     3.X: minus print, exec; plus None, True, False, nonlocal (and since 3.7: async, await)
*     Also applies to module file names:
    -     some-code.py can be run, but not imported! (notice dash!)

and |	assert |	break |	class
---|---|---|---
continue |	def |	del |	elif
else |	except |	exec |	finally
for |	from |	global |	if
import |	in |	is |	lambda
not |	or |	pass |	print
raise |	return |	try |	while
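
The running interpreter can report its own reserved words, which is safer than memorising a version-specific table. A sketch using the standard `keyword` module and `str.isidentifier` in Python 3; the variable names are illustrative.

```python
import keyword

yield_reserved = keyword.iskeyword('yield')
print_reserved = keyword.iskeyword('print')      # a plain function name in 3.X
nonlocal_reserved = keyword.iskeyword('nonlocal')

valid_name = '_spam1'.isidentifier()
starts_with_digit = '1spam'.isidentifier()
dashed_module = 'some-code'.isidentifier()       # why some-code.py can't be imported

print(yield_reserved, print_reserved, nonlocal_reserved)
print(valid_name, starts_with_digit, dashed_module)
```

`keyword.kwlist` holds the full list for the interpreter you are running.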

###Expressions

*     Useful for calls, and interactive prints
*     Expressions can be used as statements
*     But statements cannot be used as expressions (=)

Operation |	Interpretation
---|---
spam(eggs, ham) |	function calls
spam.ham(eggs) |	method calls
spam |	interactive print
spam < ham and ham != eggs |	compound expr's
spam < ham < eggs |	range tests

###Print

*     print statement writes objects to the stdout stream
*     File object write methods write strings to files
*     Adding a trailing comma suppresses line-feed
*     Reset sys.stdout to catch print output
*     3.X: a built-in function call with arguments in ()

**Python 2.X form:**

Operation |	Interpretation
---|---
print spam, ham |	print objects to sys.stdout
print spam, ham, |	don't add linefeed at end
print>>file, spam |	Python 2.0: not to stdout

**Python 3.X form:**

```python
print(spam, ham, sep='::', end='.\n', file=open('save.txt', 'w'), flush=True)
```

**Usable in 2.X via:**

```python
from __future__ import print_function
```


**Otherwise, 2.X to 3.X mappings:**

2.X | 3.X
---|---
print a | print(a)
print a, b, c | print(a, b, c)
print a, b, | print(a, b, end='')
print>>F, a | print(a, file=F)
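
The 3.X keyword arguments can be tried without touching real files. The sketch below assumes an in-memory `io.StringIO` buffer as the output target; `buffer` and `written` are illustrative names.

```python
import io

buffer = io.StringIO()                       # file-like object held in memory

print('spam', 'ham', sep='::', end='.\n', file=buffer)
print('a', 'b', end='', file=buffer)         # like the 2.X trailing comma

written = buffer.getvalue()
print(repr(written))
```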

**The Python Hello world program**

*    Expression results don't need to be printed at top-level

In [0]:
print('hello world')

hello world


*    The hard way

In [0]:
x = 'hello world'
import sys
sys.stdout.write(str(x) + '\n')

hello world


*    sys.stdout can be assigned

In [0]:
_old_out = sys.stdout
sys.stdout = open('log', 'a') # or a class with .write
print(x)
sys.stdout.close() # flush and close the log file
sys.stdout = _old_out # restore the original stream
!cat log

hello world
hello world
hello world
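
In 3.4 and later, the standard library offers a context manager that performs the same swap and restores `sys.stdout` even if an error occurs. A sketch using `contextlib.redirect_stdout` with an in-memory buffer; the names `captured` and `text` are illustrative.

```python
import contextlib
import io
import sys

captured = io.StringIO()
original = sys.stdout

with contextlib.redirect_stdout(captured):
    print('hello world')              # goes to the buffer, not the screen

restored = sys.stdout is original     # the stream is put back on exit
text = captured.getvalue()
print(repr(text), restored)
```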


###If selections

*     Python's main selection construct
*     No switch: via if/elif/else, dictionaries, or lists
*     See also expression ahead: X = B if A else C

**General format**

```python
if <test1>:
  <statements1>
elif <test2>: # optional elifs
  <statements2>
else: # optional else
  <statements3>
```

**Examples**

In [0]:
if 3 > 2:
  print('yep')

yep


In [0]:
x = 'killer rabbit'

if x == 'bunny':
  print('hello little bunny')
elif x == 'bugs':
  print("what's up doc?")
else:
  print('Run away! Run away!')

Run away! Run away!


In [0]:
choice = 'ham'

print({'spam': 1.25, # dictionary switch
       'ham': 1.99,
       'eggs': 0.99,
       'bacon': 1.10}[choice])

1.99


In [0]:
# with actions
choice = 'square'
arg = 2
{'linear': (lambda x: x),
 'square': (lambda x: x**2),
}[choice](arg)

4
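
A dictionary switch raises KeyError for an unknown choice. One way to add the equivalent of a default branch is `dict.get`; this is a sketch, and the `dispatch` function and its fallback are hypothetical.

```python
def dispatch(choice, arg):
    """Look up an action by name; unknown names fall back to a default."""
    actions = {
        'linear': lambda x: x,
        'square': lambda x: x ** 2,
    }
    action = actions.get(choice, lambda x: None)   # default branch
    return action(arg)


squared = dispatch('square', 2)
unknown = dispatch('cube', 2)
print(squared, unknown)
```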

###Python syntax rules

*     Compound statements = header, :, indented statements
*     Block and statement boundaries detected automatically
*     Comments run from # through end of line
*     Documentation strings at top of file, class, function

**Block delimiters**

*      Block boundaries detected by line indentation
*      Indentation is any combination of spaces and tabs
*      Tabs = N spaces up to multiple of 8 (but don't mix: 3.X rejects inconsistent tabs and spaces)

**Statement delimiters**

*      Statements normally end at end-of-line, or ';'
*      Statements may span lines if open syntactic pair: ( ), { }, [ ]
*      Statements may span lines if end in backslash (outdated feature)
*      Some string constants span lines too (triple-quotes)

**Special cases**

In [0]:
L = ["Good",
     "Bad",
     "Ugly"] # open pairs may span lines
L

['Good', 'Bad', 'Ugly']

In [0]:
x = 1; y = 2; print(x) # more than 1 simple statement

1


In [0]:
if 1: print('hello') # simple statement on header line

hello


**Nesting code blocks**

In [0]:
x = 1 # block0
if x:
  y = 2 # block1
  if y:
    print('block2')
  print('block1')
print('block0')

block2
block1
block0


###The documentation sources interlude

Form |	Role
---|---
# comments |	In-file documentation
The dir function |	Lists of attributes available on objects
Docstrings: __doc__ |	In-file documentation attached to objects
PyDoc: The help function |	Interactive help for objects
PyDoc: HTML reports |	Module documentation in a browser
Standard manual set |	Official language and library descriptions
Web resources |	Online tutorial, examples, and so on
Published books |	Commercially-available texts (see Resources)
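
The first three rows of the table apply to your own code as well. A small sketch; the `square` function is a made-up example.

```python
def square(x):
    """Return x multiplied by itself."""
    return x * x      # a comment: visible only in the source file


doc = square.__doc__                  # the docstring travels with the object
has_upper = 'upper' in dir(str)       # dir lists attribute names as strings

print(doc, has_upper)
```

`help(square)` would display the same docstring in formatted form.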

In [0]:
import sys
dir(sys)[:5] # also works on types, objects, etc.

['__displayhook__',
 '__doc__',
 '__excepthook__',
 '__interactivehook__',
 '__loader__']

In [0]:
print(sys.__doc__)

This module provides access to some objects used or maintained by the
interpreter and to functions that interact strongly with the interpreter.

Dynamic objects:

argv -- command line arguments; argv[0] is the script pathname if known
path -- module search path; path[0] is the script directory, else ''
modules -- dictionary of loaded modules

displayhook -- called to show results in an interactive session
excepthook -- called to handle any uncaught exception other than SystemExit
  To customize printing in an interactive session or to install a custom
  top-level exception handler, assign other functions to replace these.

stdin -- standard input file object; used by input()
stdout -- standard output file object; used by print()
stderr -- standard error object; used for error messages
  By assigning other file objects (or objects that behave like files)
  to these, it is possible to redirect all of the interpreter's I/O.

last_type -- type of last uncaught exception
last_value -- value

In [0]:
help(sys) # Help on built-in module sys

Help on built-in module sys:

NAME
    sys

MODULE REFERENCE
    https://docs.python.org/3.6/library/sys
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This module provides access to some objects used or maintained by the
    interpreter and to functions that interact strongly with the interpreter.
    
    Dynamic objects:
    
    argv -- command line arguments; argv[0] is the script pathname if known
    path -- module search path; path[0] is the script directory, else ''
    modules -- dictionary of loaded modules
    
    displayhook -- called to show results in an interactive session
    excepthook -- called to handle any uncaught exception other than SystemExit
      To customize printing 

**PyDoc**

*      GUI till 3.2:
*      Start/App menu, PythonXX/ModuleDocs
*      Or run Python script pydocgui.pyw in std lib
*      Or command line python -m pydoc -g (or C:\Python3x\python )
*      Browser in 3.2+:
*      Command line python -m pydoc -b
*      Or on Windows py -3 -m pydoc -b
*      Or on some Python 3.X: Start/App menu, PythonXX/ModuleDocs

In [0]:
"""
Module documentation
Words Go Here
"""

spam = 40
 
def square(x):
  """
  function documentation
  can we have your liver then?
  """
  return x ** 2
 

class SomeClass: # see ahead
  """
  class docs go here, and can also
  be in nested method def statements
  """
  pass
 


help(SomeClass)
help(square)

Help on class SomeClass in module __main__:

class SomeClass(builtins.object)
 |  class docs go here, and can also
 |  be in nested method def statements
 |  
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

Help on function square in module __main__:

square(x)
    function documentation
    can we have your liver then?



###Truth tests revisited

*     True = non-zero number, or non-empty object
*     Comparison operators return True (1) or False (0)
*     Boolean operators short-circuit when result known
*     Boolean operators return an operand object

Object |	Value
---|---
"spam" |	true
"" |	false
[] |	false
{} |	false
1 |	true
0.0 |	false
None |	false

**Examples**

In [0]:
2 < 3, 3 < 2 # return True (1) or False (0)

(True, False)

In [0]:
2 or 3, 3 or 2 # return left operand if true else return right operand (T|F)

(2, 3)

In [0]:
[] or 3

3

In [0]:
[] or {}

{}

In [0]:
{} or []

[]

In [0]:
2 and 3, 3 and 2 # return left operand if false else return right operand (T|F)

(3, 2)

In [0]:
[] and {}

[]

In [0]:
3 and []

[]

**Ternary operator in Python**

In [0]:
x = 'a' if 3 > 2 else 'b'
x

'a'

**Boolean type**

*     bool is a subclass of int
*      bool has two instances: True and False
*      True,False are 1,0 but print differently

In [0]:
1 > 0

True

In [0]:
True == 1, True is 1

(True, False)

In [0]:
True + 1

2
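
Because bool is a subclass of int, True and False can take part in arithmetic. A small sketch of one common use, counting how many items pass a test; the list contents are invented for the example.

```python
words = ['spam', '', 'eggs', '', 'ham']

# bool(w) is True for non-empty strings; sum() adds each True as 1.
non_empty = sum(bool(w) for w in words)

# isinstance confirms the subclass relationship.
is_int = isinstance(True, int)

print(non_empty, is_int)
```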

###While loops

*     Python's most general iteration construct
*     One of two looping statements: while, for
*     Implicit looping tools: map, reduce (functools.reduce in 3.X), filter, in, list (and other) comprehensions

**General format**

```python
while <test>:
  <statements>
else: # optional else
  <statements2> # run if didn't exit with break
```

**Examples**

In [0]:
while True:
  print('Press Stop to stop me!')

In [0]:
count = 5
while count:
  print(count)
  count -= 1

5
4
3
2
1


In [0]:
x = 'spam'
while x:
  print(x)
  x = x[1:] # strip first char off x

spam
pam
am
m


In [0]:
a=0; b=10
while a < b: # one way to code counter loops
  print(a, end=',')
  a += 1

0,1,2,3,4,5,6,7,8,9,

###Break, continue, pass, and the loop else

*     break jumps out of the closest enclosing loop
*     continue jumps to the top of the closest enclosing loop
*     pass does nothing: an empty statement placeholder
*     loop else runs if loop exits normally, without a break
*     3.X: a literal ellipsis (...) can work like a pass

**General loop format**

```python
while <test>:
  <statements>
  if <test>: break # exit loop now, skip else
  if <test>: continue # go to top of loop now
else:
  <statements> # if we didn't hit a break
```

**Examples**

*    Pass: an infinite loop
*    Better example: stubbed-out function bodies, TBD

In [0]:
while True: pass # Stop button to stop!
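
The "stubbed-out function bodies" use of pass mentioned above might look like this. The function names are hypothetical placeholders, not part of any real API.

```python
def load_data():
  pass          # placeholder: body to be written later

def train_model():
  ...           # 3.X: a literal ellipsis is also a valid do-nothing body

# Both stubs are callable and return None until real code is filled in.
results = (load_data(), train_model())
print(results)
```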

*    Continue: print even numbers
*    Avoids statement nesting (but use sparingly!)

In [0]:
x = 10
while x:
  x = x-1
  if x % 2 != 0: continue # odd?--skip
  print(x, end=",")

8,6,4,2,0,

*    Break: find factors
*    Avoids search status flags

In [0]:
y = 121
x = y // 2
while x > 1:
  if y % x == 0: # remainder
    print(y, 'has factor', x)
    break # skip else
  x = x-1
else: # normal exit
  print(y, 'is prime')
 


121 has factor 11


###For loops

*     A general sequence (iterable) iterator
*     Works on strings, lists, tuples, other
*     Replaces most counter style loops
*     Original: repeatedly indexes object until IndexError detected
*     Newer: calls iter(object) once, then next() on the resulting iterator until StopIteration
*     Preview: also works on Python classes and C types

**General format**

```python
for <target> in <object>: # assign object items to target
  <statements>
  if <test>: break # exit loop now, skip else
  if <test>: continue # go to top of loop now
else:
  <statements> # if we didn't hit a break
```

**Examples**

In [0]:
for x in ["spam", "eggs", "spam"]:
  print(x)

spam
eggs
spam


In [0]:
prod = 1
for i in (1, 2, 3, 4): prod *= i # tuples

prod

24

In [0]:
S = 'spam'
for c in S: print(c) # strings

s
p
a
m


**Works on any iterable object: files, dicts**

In [0]:
! echo "hello" >  data.txt &&\
  echo "world" >> data.txt &&\
  echo "!!!!" >> data.txt

In [0]:
for line in open('data.txt'):
  print(line.upper()) # calls next(), catches exc

HELLO

WORLD

!!!!



In [0]:
D = {'hello':'world', 'this is':'Sparta'}
for key in D:
  print(key, D[key])

hello world
this is Sparta


**iteration protocol and files**

*      file.__next__() gets next line, like file.readline(), but eof differs
*      Use file.next() in 2.X, or next(file) in either 2.X or 3.X
*      All iteration tools/contexts use the iteration protocol
*      Protocol: I=iter(X), call next(I) * N, catch StopIteration
*      iter(X) optional for single-scan iterables (e.g., files)
*      Any such object works in for loops, comprehensions, etc.
*      line.rstrip(), line.upper() are string methods
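
The protocol described above can be run by hand. The following sketch does manually what a for loop does automatically; the list contents are arbitrary.

```python
X = ['spam', 'eggs', 'ham']

I = iter(X)                # obtain an iterator from the iterable
collected = []
while True:
  try:
    item = next(I)         # same as I.__next__()
  except StopIteration:    # raised once the items are exhausted
    break
  collected.append(item.upper())

print(collected)
```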

**Comprehensions:**

*      [expr-with-var for var in iterable]
*      [expr-with-var for var in iterable if expr-with-var]
*      [expr-with-vars for var1 in iterable1 for var2 in iterable2]
*      Also works for sets, dictionaries, generators in 2.7+: {}, ()
*      Hint: comps have statement equivalents: for/if/append() nesting
*      Hint: KISS! if it's hard for you to read, don't do it
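
A short sketch of the comprehension forms listed above, followed by the statement equivalent of a filtered comprehension. The data and names are invented for the example.

```python
words = ['spam', 'eggs', 'ham', 'spam']

lengths_list = [len(w) for w in words]        # list
unique_set = {w for w in words}               # set
length_dict = {w: len(w) for w in words}      # dictionary
total_chars = sum(len(w) for w in words)      # generator expression

# Statement equivalent of: [w.upper() for w in words if len(w) > 3]
long_upper = []
for w in words:
  if len(w) > 3:
    long_upper.append(w.upper())

print(lengths_list, unique_set, length_dict, total_chars, long_upper)
```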

###Loop coding techniques

*     for subsumes most counter loops
*     range generates a series of integers to iterate over (a list in 2.X, a lazy range object in 3.X)
*     xrange (2.X only) is similar, but doesn't create a real list; 3.X's range already works this way
*     avoid range, and the temptation to count things!
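
In Python 3.X, range returns a lazy range object rather than a list, much like 2.X's xrange. A minimal sketch of what that means in practice, assuming a 64-bit build for the large length used here:

```python
r = range(10**12)          # no trillion-element list is built in 3.X

size = len(r)              # length is computed, not counted
fifth, last = r[5], r[-1]  # indexing works without materializing items
found = 10**11 in r        # membership test on ints is arithmetic

small = list(range(3))     # wrap in list() only when a real list is needed
print(size, fifth, last, found, small)
```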

In [0]:
# The easy (and fast) way

X = 'spam'
for item in X: print(item) # step through items

s
p
a
m


In [0]:
# The hard way: a C-style for loop

i = 0
while i < len(X): # manual while indexing
  print(X[i]); i += 1

s
p
a
m


In [0]:
# Range and fixed repetitions
list(range(5)), list(range(2, 5))

([0, 1, 2, 3, 4], [2, 3, 4])

In [0]:
for i in range(4): print('A shrubbery!')

A shrubbery!
A shrubbery!
A shrubbery!
A shrubbery!


In [0]:
X = 'spam'
len(X)

4

In [0]:
# Using range to generate offsets (not items!)
list(range(len(X)))

[0, 1, 2, 3]

In [0]:
for i in range(len(X)): print(X[i]) # step through offsets

s
p
a
m


In [0]:
# Using range and slicing for non-exhaustive traversals
list(range(2, 10, 2))

[2, 4, 6, 8]

In [0]:
S = 'abcdefghijk'
for i in range(0,len(S),2): print(S[i])

a
c
e
g
i
k


In [0]:
for c in S[::2]: print(c) # S[::-1] reverses

a
c
e
g
i
k


In [0]:
# Using range and enumerate
L = [1, 2, 3, 4]
for x in L: x += 10
L

[1, 2, 3, 4]

In [0]:
for i in range(len(L)): L[i] += 10
L

[11, 12, 13, 14]

In [0]:
#List comprehensions
M = [x + 10 for x in L]
M

[21, 22, 23, 24]

In [0]:
lines = [line.rstrip() for line in open('sample_data/README.md')]
lines[:5]

['This directory includes a few sample datasets to get you started.',
 '',
 '*   `california_housing_data*.csv` is California housing data from the 1990 US',
 '    Census; more information is available at:',
 '    https://developers.google.com/machine-learning/crash-course/california-housing-data-description']

In [0]:
matrix = [[1,2,3], 
          [4,5,6], 
          [7,8,9]]
[row[1] for row in matrix] 

[2, 5, 8]

In [0]:
L = [2, 4, 6, 8]
for (i, x) in enumerate(L):
  L[i] = x * 2

L

[4, 8, 12, 16]

In [0]:
enumerate(L)

<enumerate at 0x7efc2ed67090>

In [0]:
list(enumerate(L))

[(0, 4), (1, 8), (2, 12), (3, 16)]

In [0]:
E = enumerate(L)
E.__next__()

(0, 4)

In [0]:
E.__next__()

(1, 8)

In [0]:
# Traversing sequences in parallel with zip
L1 = [1,2,3,4]
L2 = [5,6,7,8]
list(zip(L1,L2))

[(1, 5), (2, 6), (3, 7), (4, 8)]

In [0]:
for (x,y) in zip(L1, L2):
  print(x, y, '--', x+y)

1 5 -- 6
2 6 -- 8
3 7 -- 10
4 8 -- 12


In [0]:
# Traversing dictionaries by sorted keys
D = {'a':1, 'b':2, 'c':3}
D

{'a': 1, 'b': 2, 'c': 3}

In [0]:
Ks = sorted(D.keys())
for k in Ks: print(D[k])

1
2
3


###Comprehensive loop examples

**Common ways to read from files**

In [0]:
# file creation
myfile = open('myfile.txt', 'w')
for i in range(3):
  myfile.write(('spam' * (i+1)) + '\n')
myfile.close()

In [0]:
# all at once
print(open('myfile.txt').read())

spam
spamspam
spamspamspam



In [0]:
# line by line
myfile = open('myfile.txt')
while True:
  line = myfile.readline()
  if not line: break
  print(line)

spam

spamspam

spamspamspam



In [0]:
# all lines at once
for line in open('myfile.txt').readlines():
  print(line)

spam

spamspam

spamspamspam



In [0]:
# file iterators: line by line
for line in open('myfile.txt'):
  print(line)

spam

spamspam

spamspamspam



In [0]:
# by byte counts
myfile = open('myfile.txt')
while True:
  line = myfile.read(10)
  if not line: break
  print('[' + line + ']')

[spam
spams]
[pam
spamsp]
[amspam
]


**Summing data file columns**

In [0]:
! echo "001.1 002.2 003.3" > data.txt &&\
  echo "010.1 020.2 030.3 040.4" >> data.txt &&\
  echo "100.1 200.2 300.3" >> data.txt

In [0]:
print(open('data.txt').read())

001.1 002.2 003.3
010.1 020.2 030.3 040.4
100.1 200.2 300.3



In [0]:
sums = {}
for line in open('data.txt'):
  cols = [float(col) for col in line.split()] # next!
  for pos, val in enumerate(cols):
    sums[pos] = sums.get(pos, 0.0) + val

for key in sorted(sums):
  print(key, '=', sums[key])

0 = 111.3
1 = 222.6
2 = 333.90000000000003
3 = 40.4


In [0]:
sums

{0: 111.3, 1: 222.6, 2: 333.90000000000003, 3: 40.4}

###Basic coding gotchas

*    Don't forget to type a : at the end of compound statement headers
*    Be sure to start top-level (unnested) code in column 1
*    Blank lines in compound statements are ignored in files, but end the statement at the interactive prompt
*    Avoid mixing tabs and spaces in indentation, unless you're sure what your editor does with tabs
*    C programmers: you don't need ( ) around tests in if and while; you can't use { } around blocks
*    In-place change operations like list.append() and list.sort() don't return a value (really, they return None); call them without assigning the result.
*    Add parens to call a function: file.close() is a call, file.close is a reference only
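
Two of the gotchas above, sketched in code: in-place methods return None, and a function name without parentheses is only a reference. The names are made up for the illustration.

```python
L = [3, 1, 2]

wrong = L.sort()           # sorts L in place, but returns None
right = sorted([3, 1, 2])  # sorted() returns a new list instead

def greet():
  return 'hello'

ref = greet                # no parens: just another reference to the function
val = greet()              # parens: the function is actually called

print(wrong, L, right, ref is greet, val)
```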

##Functions

**Why use functions?**
   * Code reuse
   * Procedural decomposition
   * Alternative to cut-and-paste: redundancy

**Function topics**

   * The basics
   * Scope rules
   * Argument matching modes
   * Odds and ends
   * Generator expressions and functions
   * Design concepts
   * Functions are objects
   * Function gotchas

###Function basics

   * def is an executable statement; usually run during import
   * def creates a function object and assigns to a name
   * return sends a result object back to the caller
   * Arguments are passed by object reference (assignment)
   * Arguments, return types, and variables are not declared
   * Polymorphism: code to object interfaces, not datatypes

**General form**

```python
def <name>(arg1, arg2,… argN):
  <statements>
  return <value>
```
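
Since def is an executable statement that binds a name to a function object, it may appear anywhere a statement can, including inside an if. A small sketch; verbose and report are hypothetical names chosen for the example.

```python
verbose = True               # hypothetical switch, for illustration only

if verbose:
  def report(x):             # this def runs, binding the name 'report'
    return 'value: ' + str(x)
else:
  def report(x):             # this def is never executed
    return str(x)

show = report                # a function is an object: assign it to another name
print(show(42))
```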

**Definition**

In [0]:
def times(x, y):      # create and assign function
  return x * y      # body executed when called

**Calls**

In [0]:
times(2, 4)           # arguments in parentheses

8

In [0]:
times('Hi', 4)        # functions are 'typeless'

'HiHiHiHi'

**“Polymorphism”**

   * The meaning of an operation depends on its subject.
   * By not caring about types, code becomes more flexible.
   * Any object with a compatible interface will work.
   * Most errors are best caught by Python, not your code.

**Example: intersecting sequences**

   * Definition

In [0]:
def intersect(seq1, seq2):
  res = []                     # start empty
  for x in seq1:               # scan seq1
    if x in seq2:              # common item?
      res.append(x)            # add to end
  return res

* Calls

In [0]:
s1 = "SPAM"
s2 = "SCAM"
intersect(s1, s2)               # strings

['S', 'A', 'M']

In [0]:
intersect([1, 2, 3], (1, 4))    # mixed types

[1]

###Scope rules in functions

   * Enclosing module is a ‘global’ scope
   * Each call to a function is a new ‘local’ scope
   * Assigned names are local, unless declared “global”
   * All other names are global or builtin
   * Added in 2.2: ‘nonlocal’ enclosing function locals (if any) searched before global
   * Added in 3.X: ‘nonlocal’ variables can be changed if declared, just like ‘globals’


**Name resolution: the “LEGB” rule**

   *   References search up to 4 scopes:
       1. Local (function)
       2. Enclosing functions (if any)
       3. Global (module)
       4. Builtin (\_\_builtin__ in 2.X, builtins in 3.X)
   *   Assignments create or change local names by default
   *   “global” declarations map assigned names to module

Example
   * Global names: ‘X’, ‘func’
   * Local names: ‘Y’, ‘Z’
   * Interactive prompt: module ‘__main__’

In [0]:
X = 99            # X and func assigned in module
def func(Y): # Y and Z assigned in function
  Z = X + Y     # X not assigned: global
  return Z
func(1) # func in module: result=100

100
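
All four LEGB scopes can be seen in one small sketch. The function names are invented for the illustration; each one finds its name in a different scope.

```python
X = 'global'

def outer():
  X = 'enclosing'
  def inner_local():
    X = 'local'              # assignment makes X local to inner_local
    return X
  def inner_enclosing():
    return X                 # not assigned here: found in outer's scope
  return inner_local(), inner_enclosing()

def reads_global():
  return X                   # not local, no enclosing def: module scope

def uses_builtin():
  return len('spam')         # 'len' is found last, in the builtins scope

print(outer(), reads_global(), uses_builtin())
```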

**Enclosing Function Scopes (2.2+)**

In [0]:
def f1():
  x = 88
  def f2():
    print(x)         # 2.2: x found in enclosing function
  f2()

f1()

88


**More useful with lambda (ahead)**

In [0]:
def func():
  x = 42
  action = (lambda n: x ** n)          # 2.2
  return action

f = func()
f(2)

1764

**Most useful for closures: state retention (non-OOP)**

In [0]:
def maker(N):
  def action(X):            # make, don’t call
    return X ** N
  return action             # return new func

In [0]:
f = maker(3)                  # “remembers” 3 (N)
f(2), f(3)                    # arg to X, not N

(8, 27)

In [0]:
g = maker(4)                  # “remembers” 4 (N)
g(2), g(3)

(16, 81)

In [0]:
f(2), f(3)                    # f still has 3

(8, 27)

###More on “global”, and 3.X “nonlocal”

   * ‘global’ means assigned at top-level of a module file
   * Global names must be declared only if assigned
   * Global names may be referenced without being declared
   * 3.X: “nonlocal X” means X in enclosing def changeable

All Pythons

In [0]:
y, z = 1, 2         # global variables in module
def all_global():
  global x        # declare globals assigned
  x = y + z       # no need to declare y,z: found by the LEGB rule

all_global()  
print(x)  

3


enclosing function vars are changeable (state retention) (Python 3.X Only)

In [0]:
def outer():
  x = 1
  def inner():
    nonlocal x
    x += 1
    print(x)
  return inner

f = outer()      # f is really an inner
f()

2


In [0]:
f()

3


###More on “return”


   * Return sends back an object as value of call
   * Can return multiple values in a tuple
   * Can return modified argument name values

In [0]:
def multiple(x, y):
  x = 2
  y = [3, 4]
  return x, y

X = 1
L = [1, 2]
X, L = multiple(X, L)
X, L

(2, [3, 4])

###More on argument passing


   * Pass by object reference: assign shared object to local name
   * In def, assigning to argument name doesn’t affect caller
   * In def, changing mutable object argument may impact caller
   * Not pass ‘by reference’ (C++), but:
     - Immutables act like ‘by value’ (C)
     - Mutables act like ‘by pointer’ (C)

In [0]:
def changer(a, b):
  a = 2             # changes local name's value only
  b[0] = 'spam'     # changes shared object in-place

X = 1
L = [1, 2]
changer(X, L)
X, L

(1, ['spam', 2])

**Equivalent to these assignments:**

In [0]:
X = 1
a = X           # they share the same object
a = 2           # resets a only, X is still 1
L = [1, 2]
b = L           # they share the same object
b[0] = 'spam'   # in-place change: L sees the change too
X, L

(1, ['spam', 2])

![assignments](https://learning-python.com/class/Workbook/unit05_files/image002.gif)
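
If the caller should not see in-place changes, one common approach is to pass a copy. A minimal sketch reusing the changer function from above; a slice makes a shallow copy only, so nested mutable objects would still be shared.

```python
def changer(a, b):
  a = 2                      # changes local name's value only
  b[0] = 'spam'              # in-place change to whatever list was passed

L = [1, 2]
changer(1, L[:])             # pass a shallow copy: caller's list is untouched
after_copy = list(L)

changer(1, L)                # pass the list itself: caller sees the change
after_shared = list(L)

print(after_copy, after_shared)
```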

###Special argument matching modes

   * Positional: matched left-to-right in header (normal)
   * Keywords: matched by name in header
   * Varargs: catch unmatched positional or keyword args
   * Defaults: header can provide default argument values
   * 3.X: Keyword-only, after "*" in def, must be passed by name

Operation | Location | Interpretation
---|---|---
func(value) | caller | normal argument: matched by position
func(name=value) | caller | keyword argument: matched by name
def func(name) | function | normal argument: matches any by name or position
def func(name=value) | function | default argument value, if not passed in call
def func(*name) | function | matches remaining positional args (tuple)
def func(**name) | function | matches remaining keyword args (dictionary)
func(*args, **kargs) | caller | subsumes old apply(): unpack tuple/dict of args
def func(a, *b, c) or def func(a, *, c) | function | 3.X keyword-only (c must be passed by name only)

About the stars…

In Python 3.4 and earlier, the special *X and **X star syntax forms can appear in 3 places:

   1.     In assignments, where a *X in the recipient collects unmatched items in a new list (3.X sequence assignments)
   2.     In function headers, where the two forms collect unmatched positional and keyword arguments in a tuple and dict
   3.     In function calls, where the two forms unpack iterables and dictionaries into individual items (arguments)
   
In Python 3.5 and later, this star syntax is also usable within data structure literals—where it unpacks collections into individual items, like its original use in function calls (#3 above). The unpacking star syntax now also works in lists, tuples, sets, and dictionaries where it unpacks or "flattens" another object's contents in-place:

```python
[x, *iter]      # list:  unpack iter's items
(x, *iter, y)   # tuple: ditto (parentheses or not)
{*iter, x}      # set:   ditto (unordered, unique)
{x:y, **dict}   # dict:  unpack dict's keys/values
```

For example, in 3.5+:

In [0]:
x = [1, 2]
y = [*x, *x]
y

[1, 2, 1, 2]
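
The three classic star contexts, plus the 3.5+ literal form, can be put side by side. A short sketch; collect is a hypothetical helper written only for this example.

```python
# 1. Assignment: *rest collects unmatched items in a new list (3.X).
first, *rest = [1, 2, 3, 4]

# 2. Function header: *pargs collects positionals, **kargs collects keywords.
def collect(*pargs, **kargs):
  return pargs, kargs

# 3. Function call: * unpacks an iterable, ** unpacks a dictionary.
collected = collect(*[1, 2], **{'x': 3})

# 3.5+: unpacking inside a dictionary literal.
merged = {**{'a': 1}, 'b': 2}

print(first, rest, collected, merged)
```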

###Examples

**Positionals and keywords**

In [0]:
def f(a, b, c): print(a, b, c)
  
f(1, 2, 3)

1 2 3


In [0]:
f(c=3, b=2, a=1)

1 2 3


In [0]:
f(1, c=3, b=2)

1 2 3


**Defaults**

In [0]:
def f(a, b=2, c=3): print(a, b, c)
  
f(1)

1 2 3


In [0]:
f(1, 4, 5)

1 4 5


In [0]:
f(1, c=6)

1 2 6


**Arbitrary positionals**

In [0]:
def f(*args): print(args)

f(1)

(1,)


In [0]:
f(1,2,3,4)

(1, 2, 3, 4)


**Arbitrary keywords**

In [0]:
def f(**args): print(args)
  
f()  

{}


In [0]:
f(a=1, b=2)

{'a': 1, 'b': 2}


In [0]:
def f(a, *pargs, **kargs): print(a, pargs, kargs)
  
f(1, 2, 3, x=1, y=2)  

1 (2, 3) {'x': 1, 'y': 2}


**Example: mixing keywords and defaults**

   *   Only deals with matching: still passed by assignment
   *   Defaults retain an object: may change if mutable

In [0]:
def func(spam, eggs, toast=0, ham=0):   # first 2 required
  print(spam, eggs, toast, ham)

func(1, 2)                      # output: 1 2 0 0
func(1, ham=1, eggs=0)          # output: 1 0 0 1
func(spam=1, eggs=0)            # output: 1 0 0 0
func(toast=1, eggs=2, spam=3)   # output: 3 2 1 0
func(1, 2, 3, 4)                # output: 1 2 3 4

1 2 0 0
1 0 0 1
1 0 0 0
3 2 1 0
1 2 3 4


**Ordering rules**

   * Call: keyword arguments after non-keyword arguments
   * Header: normals, then defaults, then *name (3.X: then keyword-only names), then **name

**Matching algorithm (see exercise)**

1.   Assign non-keyword arguments by position
2.   Assign keyword arguments by matching names
3.   Assign extra non-keyword arguments to *name tuple
4.   Assign extra keyword arguments to **name dictionary
5.   Unassigned arguments in header assigned default values
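
The matching steps above can be watched in action with a function that uses every mode. This is a sketch only; f is a throwaway name, and the last call shows what happens when one argument is matched twice.

```python
def f(a, b=2, *pargs, **kargs):
  return a, b, pargs, kargs

only_required = f(1)                 # b falls back to its default
with_extras = f(1, 3, 4, 5, x=6)     # extras land in pargs and kargs

try:
  f(1, a=2)                          # 'a' matched by position and by name
  error = None
except TypeError as exc:
  error = str(exc)

print(only_required, with_extras, error)
```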

###Odds and ends

   * lambda expression creates anonymous functions
   * list comprehensions, map, filter apply expressions to sequences (see also prior unit)
   * Generator expressions (2.4+)
   * Generator functions and yield (new in 2.2, 2.3)
   * apply function calls functions with arguments tuple (2.X only; in 3.X use func(*args, **kargs))
   * Functions return ‘None’ if they don’t use a real ‘return’
   * Python 3.X function annotations and keyword arguments

####Lambda expressions

In [0]:
def func(x, y, z): return x + y + z

func(2, 3, 4)

9

In [0]:
f = lambda x, y, z: x + y + z
f(2, 3, 4)

9

####List comprehensions

In [0]:
ord('s')

115

In [0]:
res = []
for x in 'spam': res.append(ord(x))
res

[115, 112, 97, 109]

In [0]:
list(map(ord, 'spam'))              # apply func to sequence

[115, 112, 97, 109]

In [0]:
[ord(x) for x in 'spam']      # apply expr to sequence

[115, 112, 97, 109]

**adding arbitrary expressions**

In [0]:
[x ** 2 for x in range(10)]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [0]:
list(map((lambda x: x**2), range(10)))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [0]:
lines = [line[:-1] for line in open('sample_data/README.md')]
lines[:3]

['This directory includes a few sample datasets to get you started.',
 '',
 '*   `california_housing_data*.csv` is California housing data from the 1990 US']

**adding if tests**

In [0]:
[x for x in range(10) if x % 2 == 0]

[0, 2, 4, 6, 8]

In [0]:
list(filter((lambda x: x % 2 == 0), range(10)))

[0, 2, 4, 6, 8]

**advanced usage**

In [0]:
[x**2 for x in range(10) if x % 2 == 0]

[0, 4, 16, 36, 64]

In [0]:
[x+y for x in 'abc' for y in 'lmn']

['al', 'am', 'an', 'bl', 'bm', 'bn', 'cl', 'cm', 'cn']

In [0]:
res = []
for x in 'abc':
  for y in 'lmn':
    res.append(x+y)
res

['al', 'am', 'an', 'bl', 'bm', 'bn', 'cl', 'cm', 'cn']

**nice for matrices**

In [0]:
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
M[1]

[4, 5, 6]

In [0]:
col2 = [row[1] for row in M]
col2

[2, 5, 8]

In [0]:
quad = [M[i][j] for i in (0,1) for j in (0, 1)]
quad

[1, 2, 4, 5]

List comprehensions can become incomprehensible when nested, but map and list comprehensions may run faster than equivalent for loops.
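
The speed claim is easy to check on your own machine. The sketch below uses the standard timeit module to time three equivalent ways of building a list of squares. The function names are illustrative, and timings vary by machine and Python version, so no numbers are quoted here.

```python
import timeit

def with_loop():
  res = []
  for x in range(1000):
    res.append(x ** 2)
  return res

def with_comprehension():
  return [x ** 2 for x in range(1000)]

def with_map():
  return list(map(lambda x: x ** 2, range(1000)))

# Time 200 calls of each variant and print the total seconds.
for fn in (with_loop, with_comprehension, with_map):
  seconds = timeit.timeit(fn, number=200)
  print(fn.__name__, round(seconds, 4))
```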

####Generator expressions (2.4+)

**list comprehensions generate entire list in memory**

In [0]:
squares = [x**2 for x in range(5)]
squares

[0, 1, 4, 9, 16]

**generator expressions yield 1 result at a time: saves memory, distributes work**

In [0]:
squares = (x**2 for x in range(5))
squares

<generator object <genexpr> at 0x7f379e6745c8>

In [0]:
next(squares)

0

In [0]:
next(squares)

1

In [0]:
next(squares)

4

In [0]:
list(squares)

[9, 16]

**iteration contexts automatically call next()**

In [0]:
for x in (x**2 for x in range(5)):
  print(x)

0
1
4
9
16


In [0]:
sum(x**2 for x in range(5))

30

####Generator functions and yield

  * Generator implements iteration protocol: \_\_next__()
  * Retains local scope when suspended
  * Distributes work over time, may save memory (see also: threads)
  * Related: generator expressions, enumerate function, file iterators

**functions are compiled specially when they contain yield**

In [0]:
def gensquares(N):
  for i in range(N):         # suspends and resumes itself
    yield i ** 2           # <- return value and resume here later

**generator objects support iteration protocol: \_\_next__()**

In [0]:
x = gensquares(10)
x                              # also retain all local variables between calls

<generator object gensquares at 0x7f379ddab9e8>

In [0]:
x.__next__()

0

In [0]:
x.__next__()

1

In [0]:
x.__next__()

4

In [0]:
x.__next__()

9

…StopIteration exception raised at end…
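
A generator's local variables survive between calls, and StopIteration marks the end. A minimal sketch of both points; running_total is a made-up example function.

```python
def running_total(values):
  total = 0                          # local state survives between yields
  for v in values:
    total += v
    yield total

gen = running_total([1, 2, 3])
first_two = [next(gen), next(gen)]   # 1, then 1+2
last = next(gen)                     # 1+2+3

try:
  next(gen)                          # nothing left to yield
  exhausted = False
except StopIteration:
  exhausted = True

print(first_two, last, exhausted)
```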

**for loops (and others) automatically call next()**

In [0]:
for i in gensquares(5):        # resume the function each time
  print(i)               # print last yielded value


0
1
4
9
16


####Apply syntax (all Pys)

In [0]:
def func(a, b, c):
  return a + b + c

func(*(2, 3, 4))

9

In [0]:
func(*(2, 3), **{'c': 4})     # 2.X and 3.X

9

**call syntax is more flexible:**

In [0]:
def func(a, b, c, d): return a + b + c + d

args1 = (1, 2)
args2 = {'c': 3, 'd': 4}
func(*args1, **args2)

10

In [0]:
func(1, *(2,), **args2)

10

####Default return values

In [0]:
def proc(x):
  print(x)
  
x = proc('testing 123...')  

testing 123...


In [0]:
print(x)

None


####Python 3.X function annotations

In [0]:
def func(a: int, b: 'spam', c: 88 = 99) -> float:
  print(a, b, c)

func(1, 2)

1 2 99


In [0]:
func.__annotations__

{'a': int, 'b': 'spam', 'c': 88, 'return': float}

####Python 3.X keyword-only arguments

In [0]:
def f(a, b, *, c=3, d): print(a, b, c, d)

print("ERROR EXPECTED")
f(1, 2)

ERROR EXPECTED


TypeError: f() missing 1 required keyword-only argument: 'd'

In [0]:
print("ERROR EXPECTED")
f(1, 2, 3)

ERROR EXPECTED


TypeError: f() takes 2 positional arguments but 3 were given

In [0]:
f(1, 2, d=4)

1 2 3 4


In [0]:
f(1, 2, c=3, d=4)

1 2 3 4


###Function design concepts

   * Use global variables only when absolutely necessary
   * Use arguments for input, ‘return’ for outputs
   * Don’t change mutable arguments unless expected
   * But without classes, globals (and closures) are the only state-retention tools
   * But classes depend on mutable arguments (‘self’)
   
   ![function](https://learning-python.com/class/Workbook/unit05_files/image004.gif)

####Functions are objects: indirect calls

   * Function objects can be assigned, passed, etc.
   * Can call objects generically: function, bound method, ...

In [0]:
def echo(message): print(message)

x = echo
x('Hello world!')

Hello world!


In [0]:
def indirect(func, arg):
  func(arg)

indirect(echo, 'Hello world!')

Hello world!


In [0]:
schedule = [ (echo, 'Hello!'), (echo, 'Ni!') ]
for (func, arg) in schedule:
    func(arg)

Hello!
Ni!


**File scanners**

In [0]:
!echo -e "this \n is \n Sparta" > data.txt

In [0]:
# definition
def scanner(name, function):
  file = open(name, 'r')          # open file for reading
  for line in file.readlines():
    function(line)              # call function
  file.close()

In [0]:
# usage
def processLine(line):
  print(line.upper())
  
scanner("data.txt", processLine)    # start scanner

THIS 

 IS 

 SPARTA



###Function gotchas

In [0]:
X = 99

def selector():
  X = 88          # X classified as a local name
  print(X)
  
selector()
X

88


99

In [0]:
X = 99

def selector():
  global X        # force X to be global
  X = 88          # X classified as a global name
  print(X)
  
selector()
X

88


88

In [0]:
X = 99

def selector():
  print(X)        # X classified as a global name
  
selector()
X

99


99

**Mutable defaults created just once**

In [0]:
def grow(A, B=[]):
  B.append(A)
  return B

grow(1)

[1]

In [0]:
grow(1)

[1, 1]

In [0]:
grow(1)

[1, 1, 1]
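
A common way around the shared default is to use None as a sentinel and build the list inside the function. The sketch below is one such variant; grow_safe is a hypothetical name, not part of the original example.

```python
def grow_safe(A, B=None):
  if B is None:                # create a fresh list on every call
    B = []
  B.append(A)
  return B

first = grow_safe(1)
second = grow_safe(1)          # does not see the list from the first call
explicit = grow_safe(2, [0])   # a passed-in list is still used as given

print(first, second, explicit)
```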

**Use defaults to save references**

still required to retain current value of loop variables!

In [0]:
def outer(x, y):
  def inner():
    return x ** y
  return inner

x = outer(2, 4)
x()

16

In [0]:
def outer(x, y):
  return lambda a=x, b=y: a**b

y = outer(2, 5)
y()

32

**Typical case: lambdas created inside a loop (for I in someiterable)**

In [0]:
actions = []
for I in [0,1,2,3,4]:
  actions.append(lambda I=I: print(I))  # retain current I, not last!
  
actions[2]()

2
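
To see why the default is needed, compare lambdas that capture the loop variable by default value with ones that simply reference it. A small sketch; make_actions is a hypothetical wrapper used to keep the loop variable in a function scope.

```python
def make_actions():
  late, early = [], []
  for I in [0, 1, 2, 3, 4]:
    late.append(lambda: I)         # looks I up when called, after the loop
    early.append(lambda I=I: I)    # default evaluated now, at creation time
  return late, early

late, early = make_actions()
late_results = [f() for f in late]     # every lambda sees the final I
early_results = [f() for f in early]   # each lambda kept its own I

print(late_results, early_results)
```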


###Optional reading: set functions

   * Functions process passed-in sequence objects
   * Work on any type of sequence objects
   * Supports mixed types: list and tuple, etc.

In [0]:
def intersect(seq1, seq2):
  res = []                     # start with an empty list
  for x in seq1:               # scan the first sequence
    if x in seq2:
      res.append(x)        # add common items to end
  return res

def union(seq1, seq2):
  res = list(seq1)        # copy of seq1
  for x in seq2:          # add new items in seq2
    if x not in res:
      res.append(x)
  return res

s1 = "SPAM"
s2 = "SCAM"
intersect(s1, s2), union(s1, s2)           # strings

(['S', 'A', 'M'], ['S', 'P', 'A', 'M', 'C'])

In [0]:
intersect([1,2,3], (1,4))                  # mixed types

[1]

In [0]:
union([1,2,3], (1,4))

[1, 2, 3, 4]

###Supporting multiple operands: *varargs

In [0]:
def intersect(*args):
  res = []
  for x in args[0]:                  # scan first sequence
    for other in args[1:]:         # for all other args
      if x not in other: break   # item in each one?
    else:                          # no break: x is in every other arg
      res.append(x)              # add items to end
  return res

def union(*args):
  res = []
  for seq in args:                   # for all args
    for x in seq:                  # for all nodes
      if x not in res:
        res.append(x)          # add items to result
  return res
  
s1, s2, s3 = "SPAM", "SCAM", "SLAM"
intersect(s1, s2), union(s1, s2)           # 2 operands

(['S', 'A', 'M'], ['S', 'P', 'A', 'M', 'C'])

In [0]:
intersect([1,2,3], (1,4))

[1]

In [0]:
intersect(s1, s2, s3)                      # 3 operands

['S', 'A', 'M']

In [0]:
union(s1, s2, s3)

['S', 'P', 'A', 'M', 'C', 'L']

##Modules

**Why use modules?**

*     Code reuse
*     System name-space partitioning
*     Implementing shared services or data

**Module topics**
 
*     The basics
*     Import variations
*     Reloading modules
*     Design concepts
*     Modules are objects
*     Package imports
*     Odds and ends
*     Module gotchas

###Module basics

*     Creating modules: Python files, C extensions; Java classes (Jython)
*     Using modules: import, from, reload() (2.X), importlib.reload() (3.4+; imp.reload() is deprecated)
*     Module search path: $PYTHONPATH
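
The search path that Python actually uses is exposed as the list sys.path, which is seeded in part from $PYTHONPATH. A minimal sketch for inspecting and extending it; the my_modules directory is a hypothetical example and need not exist.

```python
import os
import sys

# sys.path is the list Python searches for modules, in order.
for entry in sys.path[:3]:
  print(repr(entry))

# PYTHONPATH may be unset, so use .get() with a default.
print(os.environ.get('PYTHONPATH', '(not set)'))

# Directories can also be added at run time (hypothetical path).
extra_dir = os.path.join(os.getcwd(), 'my_modules')
if extra_dir not in sys.path:
  sys.path.append(extra_dir)
```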

In [0]:
! echo "def printer(x): # module attribute" > module1.py &&\
  echo "  print(x)" >> module1.py

**Module usage**

In [0]:
import module1 # get module
module1.printer('Hello world!')

Hello world!


In [0]:
from module1 import printer # get an export
printer('Hello world!')

Hello world!


In [0]:
from module1 import * # get all exports
printer('Hello world!')

Hello world!


from * can obscure the meaning of variables

In [0]:
! echo "def func(): # module attribute" > module1.py &&\
  echo "  print('module 1')" >> module1.py
! echo "def func(): # module attribute" > module2.py &&\
  echo "  print('module 2')" >> module2.py
! echo "def func(): # module attribute" > module3.py &&\
  echo "  print('module 3')" >> module3.py

In [0]:
from module1 import * # may overwrite my names
from module2 import * # no way to tell what we get
from module3 import *

func() # ← ?!

# Advice: use from * with at most 1 module per file

module 3


from does not play well with reload

In [0]:
! echo "def func(): # module attribute" > moduleA.py &&\
  echo "  print('module before change')" >> moduleA.py

In [0]:
from moduleA import func # copy variable out
func() # test it

module before change


In [0]:
# change moduleA.py

! echo "def func(): # module attribute" > moduleA.py &&\
  echo "  print('module AFTER change')" >> moduleA.py

In [0]:
from importlib import reload # required in 3.X (imp was removed in 3.12)
print("ERROR EXPECTED")
reload(moduleA) # <- FAILS: unbound name!

ERROR EXPECTED


NameError: name 'moduleA' is not defined

In [0]:
import moduleA # must bind name here
reload(moduleA) # ok: loads new code
func() # <- FAILS: old object!

module before change


In [0]:
moduleA.func() # this works now

module AFTER change


In [0]:
from moduleA import func # so does this
func()

# Advice: don't do that--run scripts other ways

module AFTER change


###Module files are a namespace

*     A single scope: local==global
*     Module statements run on first import
*     Top-level assignments create module attributes
*     Module namespace: attribute __dict__, or dir()

In [0]:
# file: moduleB.py

! echo "print('starting to load')" > moduleB.py &&\
  echo "import sys" >> moduleB.py &&\
  echo "name = 42" >> moduleB.py &&\
  echo "def func(): pass" >> moduleB.py &&\
  echo "class klass: pass" >> moduleB.py &&\
  echo "print('done loading.')" >> moduleB.py

Usage

In [0]:
import moduleB

starting to load
done loading.


In [0]:
moduleB.sys

<module 'sys' (built-in)>

In [0]:
moduleB.name

42

In [0]:
moduleB.func, moduleB.klass

(<function moduleB.func>, moduleB.klass)

In [0]:
moduleB.__dict__.keys() # add list() in 3.X

dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__', '__cached__', '__builtins__', 'sys', 'name', 'func', 'klass'])

###Name qualification

*     Simple variables
      -      X searches for name X in current scopes
*     Qualification
      -      X.Y searches for attribute Y in object X
*     Paths
      -      X.Y.Z gives a path of objects to be searched
*     Generality
      -      Qualification works on all objects with attributes: modules, classes, built-in types, etc.
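
A short sketch of qualification on different kinds of objects. `Box` is a hypothetical class used only for illustration; `os.path.sep` is a real three-step path through the standard library.

```python
import os

sep_direct = os.path.sep                           # X.Y.Z: module -> module -> string
sep_by_name = getattr(getattr(os, "path"), "sep")  # same path, spelled with getattr
shout = "spam".upper()                             # built-in types have attributes too

class Box:                                         # hypothetical class
    size = 3

print(sep_direct == sep_by_name, shout, Box.size, Box().size)
```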

![alt text](https://learning-python.com/class/Workbook/unit06_files/image002.gif)

###Import variants

*     Module import model
      -      module loaded and run on first import or from
      -      running a module's code creates its top-level names
      -      later import/from fetches already-loaded module
*     import and from are assignments
      -      import assigns an entire module object to a name
      -      from assigns selected module attributes to names

Operation |	 Interpretation
---|---
import mod |	fetch a module as a whole
from mod import name |	fetch a specific name from a module
from mod import * |	fetch all top-level names from a module
importlib.reload(mod) |	force a reload of a module's code

![alt text](https://learning-python.com/class/Workbook/unit06_files/image004.gif)
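
Because `from` is an assignment, the imported name is a separate variable that initially shares the module's object. The sketch below builds an in-memory module with `types.ModuleType` (the name `assign_demo` is hypothetical) to show that rebinding the copied name leaves the module untouched, while mutating a shared object is visible through both.

```python
import sys
import types

# Build a module object in memory instead of writing a file.
demo = types.ModuleType("assign_demo")
demo.x = 1
demo.y = [1, 2]
sys.modules["assign_demo"] = demo

from assign_demo import x, y   # copies references into this namespace
x = 42                         # rebinds the local name only
y[0] = 99                      # changes the shared list in place

import assign_demo
print(assign_demo.x, assign_demo.y)
```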

###Reloading modules

*     Imports only load/run module code first time
*     Later imports use already-loaded module
*     reload function forces module code reload/rerun
*     Allows programs to be changed without stopping
*     3.X: must first run from importlib import reload to use it!

**General form**

```python
import module # initial import
# ... use module.attributes ...
# ... change the module file ...

from importlib import reload # required in 3.X
reload(module) # get updated exports
# ... use module.attributes ...
```

**Usage details**

*     A function, not a statement
*     Requires a module object, not a name
*     Changes a module object in-place:
      -      runs the module file's new code in the current namespace
      -      assignments replace top-level names with new values
      -      impacts all clients that use import to fetch module
      -      impacts future from clients (see earlier example)

**Reload example**

*     Change and reload a file without stopping Python
*     Other common uses: GUI callbacks, embedded code, etc.

In [0]:
! echo "message = 'First version'" > changer.py &&\
  echo "def printer():" >> changer.py &&\
  echo "  print(message)" >> changer.py

In [0]:
import changer
changer.printer()

First version


In [0]:
! echo "message = 'After editing'" > changer.py &&\
  echo "def printer():" >> changer.py &&\
  echo "  print(message)" >> changer.py

In [0]:
import changer
changer.printer() # no effect: uses loaded module

First version


In [0]:
import importlib
importlib.reload(changer) # forces new code to load/run
changer.printer()

After editing


###Package imports


Module package imports name directory paths:
*      Module name -> dir.dir.dir in import statements and reloads
       -      import dir1.dir2.mod -> loads dir1\dir2\mod.py
       -      from dir1.dir2.mod import name
*      dir1 must be contained by a directory on sys.path (., PYTHONPATH, etc.)
*      Each dir must have an \_\_init\_\_.py file, possibly empty (required before 3.3, optional since)
*      \_\_init\_\_.py gives the directory's namespace, can use \_\_all\_\_ for from *
*      Simplifies path, disambiguates same-named module files

**Example**

```
For:
    dir0\dir1\dir2\mod.py
And:
    import dir1.dir2.mod
``` 

-      dir0 (container) must be listed on the module search path
-      dir1 and dir2 both must contain an \_\_init\_\_.py file
-      dir0 does not require an \_\_init\_\_.py
 
```
dir0\
    dir1\
      __init__.py
      dir2\
          __init__.py
          mod.py
```
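
The layout above can be reproduced in a temporary directory. In this sketch the temporary directory plays the role of dir0, and the package names `demo_dir1` and `demo_dir2` are made up to avoid clashing with anything real.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()                          # container (dir0)
inner = os.path.join(root, "demo_dir1", "demo_dir2")
os.makedirs(inner)
for d in (os.path.join(root, "demo_dir1"), inner):
    open(os.path.join(d, "__init__.py"), "w").close()   # empty package files
with open(os.path.join(inner, "mod.py"), "w") as f:
    f.write("name = 'spam'\n")

sys.path.insert(0, root)                           # only the container goes on the path
importlib.invalidate_caches()

import demo_dir1.demo_dir2.mod as mod
from demo_dir1.demo_dir2.mod import name
print(mod.__name__, name)
```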

**Why packages?**
 

```
root\
    sys1\
        __init__.py (__init__ needed if dir in import)
        util.py
        main.py (import util finds here)
        other.py
    sys2\
        __init__.py
        util.py
        main.py
        other.py
    sys3\ (here or elsewhere)
        __init__.py (your new code here)
        myfile.py (import util depends on path)
                  (import sys1.util doesn't)
```

**Advanced: Relative import syntax (2.5+)**

To enable in 2.X (standard in 3.X):
```python
from __future__ import absolute_import # till 2.7, from stmt only
``` 

In code located in package folder pkg:

```python
import string # skips pkg: finds the standard library's version

from .string import name1, name2 # import names from pkg.string only
from . import string # import pkg.string
```

**Advanced: Namespace packages (3.3+)**

Extension to the usual import algorithm: directories without \_\_init\_\_.py, located anywhere on the path, are checked last and used only if no normal module or package is found at that level. The concatenation of all such directories becomes a virtual package for deeper imports. Not yet used much in practice.

###Odds and ends

*     Python 2.0+: import module as name
      -      Like import module + name = module
      -      Also good for packages: import sys1.util as util

*     Loading modules by name string
      -      exec('import ' + name)
      -      \_\_import\_\_(name)

*     Modules are compiled to byte code on first import
      -      .pyc files serve as recompile dependency
      -      compilation is automatic and hidden

*     Data hiding is a convention
      -      Exports all names defined at the top-level of a module
      -      Special case: \_\_all\_\_ list gives names exported by from *
      -      Special case: _X names aren't imported by a from *

*     The \_\_name\_\_ == '\_\_main\_\_' trick
      -      \_\_name\_\_ auto set to '\_\_main\_\_' only when run as script
      -      allows modules to be imported and/or run
      -      Simplest unit test protocol, dual usage modes for code
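
The data-hiding and load-by-string points can be checked with two throwaway modules. Everything named here (`hide_a`, `hide_b`, `star_names`) is hypothetical; `importlib.import_module` is the standard-library wrapper usually preferred over calling \_\_import\_\_ directly.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
with open(os.path.join(root, "hide_a.py"), "w") as f:
    f.write("__all__ = ['public']\npublic = 1\nother = 2\n")
with open(os.path.join(root, "hide_b.py"), "w") as f:
    f.write("shown = 1\n_hidden = 2\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

def star_names(modname):
    """Names that 'from modname import *' would copy."""
    ns = {}
    exec("from %s import *" % modname, ns)
    return sorted(k for k in ns if k != "__builtins__")

print(star_names("hide_a"))              # limited by __all__
print(star_names("hide_b"))              # _X names are skipped

mod = importlib.import_module("hide_b")  # load a module by name string
print(mod._hidden)                       # still reachable by qualification
```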

In [0]:
! echo "def func(s):" > tester.py &&\
  echo "  print(s)" >> tester.py &&\
  echo "if __name__ == '__main__': # only when run" >> tester.py &&\
  echo "  func('This is Sparta') # not when imported" >> tester.py

**Usage modes**

In [0]:
import tester
tester.func("This is Kyiv")

This is Kyiv


In [0]:
! python -u tester.py

This is Sparta


###Module design concepts

*     Always in a module: interactive = module \_\_main\_\_
*     Minimize module coupling: global variables
*     Maximize module cohesion: unified purpose
*     Modules should rarely change other modules' variables

**Suppose mod.py contains**

```python
X = 99 # reader sees only this
```

**always okay to use**

```python
import mod
print(mod.X ** 2)
```

**almost always a Bad Idea!**

```python
import mod
mod.X = 88
```

 **better: isolates coupling**

```python
import mod
result = mod.func(88)
```

![alt text](https://learning-python.com/class/Workbook/unit06_files/image006.gif)

###Modules are objects: metaprograms

*     A module which lists namespaces of other modules
*     Special attributes: module.\_\_name\_\_, \_\_file\_\_, \_\_dict\_\_
*     getattr(object, name) fetches attributes by string name
*     Add to $PYTHONSTARTUP to preload automatically

**Example**

```python
verbose = 1

def listing(module):
  if verbose:
    print("-" * 30)
    print("name: %s file: %s" % (module.__name__, module.__file__))
    print("-" * 30)
 

  count = 0
  for attr in module.__dict__.keys(): # scan names
    print("%02d) %s" % (count, attr), end=",")
    if attr[0:2] == "__":
      print("<built-in name>") # skip specials
    else:
      print(getattr(module, attr)) #__dict__[attr]
    count = count+1
 

  if verbose:
    print("-" * 30)
    print(module.__name__, "has %d names" % count)
    print("-" * 30)

if __name__ == "__main__":
  import mydir
  listing(mydir) # self-test code: list myself
  ```

In [0]:
! curl -O https://raw.githubusercontent.com/fbeilstein/machine_learning/master/lecture_1/mydir.py

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   621  100   621    0     0   2083      0 --:--:-- --:--:-- --:--:--  2083


*     Running the module on itself

In [0]:
! python -u mydir.py

------------------------------
name: mydir file: /content/mydir.py
------------------------------
00) __name__,<built-in name>
01) __doc__,<built-in name>
02) __package__,<built-in name>
03) __loader__,<built-in name>
04) __spec__,<built-in name>
05) __file__,<built-in name>
06) __cached__,<built-in name>
07) __builtins__,<built-in name>
08) verbose,1
09) listing,<function listing at 0x7f2fb5db5d90>
------------------------------
mydir has 10 names
------------------------------


**Another program about programs**

*     exec runs strings of Python code
*     os.system runs a system shell command
*     \_\_dict\_\_ attribute is module namespace dictionary
*     sys.modules is the loaded-module dictionary

In [0]:
def python(cmd):
  import importlib
  import __main__
  namespace = __main__.__dict__
  exec(cmd, namespace, namespace)

def fix(modname):
  import sys # edit,(re)load
  if modname in sys.modules.keys():
    python('import importlib; importlib.reload(' + modname + ')')
  else:
    python('import ' + modname)   

In [0]:
! echo "def func():" > tester.py &&\
  echo "  print('UNEDITED')" >> tester.py

In [0]:
import tester
tester.func()

UNEDITED


In [0]:
! echo "def func():" > tester.py &&\
  echo "  print('hello world')" >> tester.py

In [0]:
import tester
tester.func()

UNEDITED


In [0]:
fix("tester")
tester.func()

hello world


###Module gotchas

In [0]:
# nested1.py

! echo "X = 99" > nested1.py &&\
  echo "def printer(): print(X)" >> nested1.py
 
# nested2.py

! echo "from nested1 import X, printer" > nested2.py &&\
  echo "X = 88" >> nested2.py &&\
  echo "printer()" >> nested2.py

# nested3.py
! echo "import nested1" > nested3.py &&\
  echo "nested1.X = 88" >> nested3.py &&\
  echo "nested1.printer()" >> nested3.py


In [0]:
! python -u nested2.py
! python -u nested3.py

99
88


Statement order matters at top-level
 
*     Solution: put most immediate code at bottom of file

In [0]:
#func1() # error: "func1" not yet assigned

def func1():
  print(func2()) # okay: "func2" looked up later

#func1() # error: "func2" not yet assigned

def func2():
  return "Hello"

func1() # okay: "func1" and "func2" assigned

Hello


Recursive from import gotchas

*     Solution: use import, or from inside functions

In [0]:
#file: recur1.py
! echo "X = 1" > recur1.py &&\
  echo "import recur2" >> recur1.py &&\
  echo "Y = 2" >> recur1.py

#file recur2.py
! echo "from recur1 import X" > recur2.py &&\
  echo "from recur1 import Y" >> recur2.py

In [0]:
import recur1

ImportError: ignored

In [0]:
import recur2

In [0]:
from recur1 import Y # error: "Y" not yet assigned
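
A sketch of the suggested fix, using two made-up modules `fixed1` and `fixed2`. The second module imports the first as a whole and reads its attributes only when the function is called, by which time both modules have finished loading.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
with open(os.path.join(root, "fixed1.py"), "w") as f:
    f.write("X = 1\nimport fixed2\nY = 2\n")
with open(os.path.join(root, "fixed2.py"), "w") as f:
    f.write("import fixed1                    # whole module, no names copied\n"
            "def total():\n"
            "    return fixed1.X + fixed1.Y   # looked up at call time\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

import fixed1
import fixed2
print(fixed2.total())
```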

**reload may not impact from imports**

*     reload overwrites existing module object
*     But from names have no link back to module
*     Use import to make reloads more effective

-> import module: module.X reflects module reloads

-> from module import X: X may not reflect module reloads!
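
The difference can be seen in a few lines. This sketch uses a throwaway module called `reload_demo` (a hypothetical name); the two versions of the file deliberately differ in size so that a cached bytecode file cannot be mistaken for current.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
path = os.path.join(root, "reload_demo.py")

def write(source):
    with open(path, "w") as f:
        f.write(source)

write("X = 1\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

import reload_demo
from reload_demo import X         # copies the current value out

write("X = 22\n")                 # edit the file
importlib.reload(reload_demo)     # rerun its code in the same module object

print(reload_demo.X, X)           # qualified name is updated, copied name is not
```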

**reload isn't applied transitively**

*     Use multiple reloads to update subcomponents
*     Or use recursion to traverse import dependencies
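
One possible shape for the recursive approach is sketched below. The function `reload_all` and its `root` parameter are assumptions made for this example: it only follows module objects found in a module's namespace, so names pulled in with from are not traced, and it skips anything outside `root` to avoid reloading the standard library.

```python
import importlib
import os
import sys
import tempfile
import types

def reload_all(module, root, visited=None):
    """Reload module, then every module it references that lives under root."""
    visited = set() if visited is None else visited
    path = getattr(module, "__file__", None) or ""
    if module.__name__ in visited or not path.startswith(root):
        return visited                       # already done, or not our code
    visited.add(module.__name__)
    importlib.reload(module)
    for value in list(vars(module).values()):
        if isinstance(value, types.ModuleType):
            reload_all(value, root, visited)
    return visited

# Demo with two throwaway modules: outer_demo imports inner_demo.
root = tempfile.mkdtemp()
with open(os.path.join(root, "inner_demo.py"), "w") as f:
    f.write("X = 1\n")
with open(os.path.join(root, "outer_demo.py"), "w") as f:
    f.write("import inner_demo\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

import outer_demo
reloaded = sorted(reload_all(outer_demo, root))
print(reloaded)
```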

###Optional reading: a shared stack module

*     Manages a local stack, initialized on first import
*     All importers share the same stack: single instance
*     Stack accessed through exported functions
*     Stack can hold any kind of object (heterogeneous)

In [0]:
! curl -O https://raw.githubusercontent.com/fbeilstein/machine_learning/master/lecture_1/mystack.py
! cat mystack.py

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   727  100   727    0     0   2506      0 --:--:-- --:--:-- --:--:--  2515
stack = [] # on first import
error = 'mystack.error' # local exceptions

def push(obj):
  global stack # 'global' to change
  stack = [obj] + stack # add item to front

def pop():
  global stack
  if not stack:
    raise Exception(error, 'stack underflow') # raise local error
  top, stack = stack[0], stack[1:] # remove front item
  return top

def top():
  if not stack: # raise local error
    raise Exception(error, 'stack underflow') # or let IndexError
  return stack[0]

def empty(): return not stack # is the stack []?
def member(obj): return obj in stack # item in stack?
def item(offset): return stack[offset] # index the stack
def length(): return len(stack) # number entries
def dump(): print('<Stack:%s>' % stack)


**Using the stack module**

In [0]:
import mystack
for i in range(5): mystack.push(i)

mystack.dump()

<Stack:[4, 3, 2, 1, 0]>


**Sequence-like tools**

In [0]:
mystack.item(0), mystack.item(-1), mystack.length()

(4, 0, 5)

In [0]:
mystack.pop(), mystack.top()

(4, 3)

In [0]:
mystack.member(4), mystack.member(3)

(False, True)

In [0]:
for i in range(mystack.length()): print(mystack.item(i),end=",")

3,2,1,0,

**Exceptions**

In [0]:
while not mystack.empty(): x = mystack.pop()
try:
  mystack.pop()
except Exception as e:
  print(e.args)

('mystack.error', 'stack underflow')


**Module clients**

In [0]:
from mystack import *
push(123) # module-name not needed
dump()

<Stack:[123]>


In [0]:
import mystack
mystack.dump()
if not mystack.empty(): # qualify by module name
  x = mystack.pop()
mystack.push(1.23)
mystack.dump()

<Stack:[123]>
<Stack:[1.23]>


##Classes


**Why use classes?**
 
*     Implementing new objects
*      Multiple instances
*      Customization via inheritance
*      Operator overloading


**Class topics**

*     Class basics
*     The class statement
*     Class methods
*     Attribute inheritance
*     Operator overloading
*     Name spaces and scopes
*     OOP, inheritance, and composition
*     Classes and methods are objects
*     Odds and ends
*     Class gotchas

###OOP: The Big Picture


**How it works**

*      All about: object.attr
*      Kicks off a search for first attr → inheritance
*      Searches trees of linked namespace objects
*      Search is DFLR (depth-first, left-to-right), except in diamonds, per order in class headers
*      Class objects: supers and subs
*      Instance objects: generated from a class
*      Classes can also define expression behavior

![classes](https://learning-python.com/class/Workbook/unit07_files/image002.gif)

```python
class C1: x = 1 # make class objects (ovals)
class C2: x = 2
class C3(C1, C2): x = 3 # links to superclasses, search order
 

I1 = C3() # make instance objects (rectangles)
I2 = C3() # linked to their class
I1.x # finds customized version in C3
```
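
The diamond exception is easiest to see through a class's `__mro__` attribute, which lists the actual search order. The classes below are hypothetical; in Python 3 every class is new-style, so the shared superclass `A` is searched after both `B` and `C`.

```python
class A:
    attr = 'A'

class B(A):
    pass

class C(A):
    attr = 'C'

class D(B, C):        # diamond: B and C share the superclass A
    pass

order = [klass.__name__ for klass in D.__mro__]
print(order)
print(D().attr)       # pure depth-first search would have found A.attr first
```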

**Why OOP?**

*      OOP great at code reuse, structure, and encapsulation
*      Program by customizing in new levels of hierarchy, not changing
*      Extra structure of classes virtually required once programs hit 1K lines (see frigcal)
*      Inheritance: basis of specializing, customizing software; lower in tree means customization
*      Example: an employee database app: company, departments, employees
*      Much more complete than functional paradigm (though the two can often work together)


###A first look: class basics


*   Multiple Instances
*   Specialization by inheritance
*   Implementing operators

**1. Classes generate multiple instance objects**

*      Classes implement new objects: state + behavior
*      Calling a class like a function makes a new instance
*      Each instance inherits class attributes, and gets its own
*      Assignments in class statements make class attributes
*      Assignments to self.attr make per-instance attributes

In [0]:
class FirstClass: # define a class object
  def setdata(self, value): # define class methods
    self.data = value # self is the instance
 
  def display(self):
    print(self.data) # self.data: per instance

    
x = FirstClass() # make two instances
y = FirstClass() # each is a new namespace
 
x.setdata("King Arthur") # call methods: self=x/y
y.setdata(3.14159)
 

x.display() # self.data differs in each
y.display()

King Arthur
3.14159


In [0]:
x.data = "New value" # can get/set attributes
x.display() # outside the class too

New value


![2classes](https://learning-python.com/class/Workbook/unit07_files/image004.gif)

**The world's simplest Python class?**

In [0]:
# an empty class, class and instance attrs filled in later
class rec: pass # empty namespace object

In [0]:
rec.name = 'Bob' # just objects with attributes
rec.age = 40

print(rec.name) # like a struct in C, a record

Bob


In [0]:
x = rec() # instances inherit class names
y = rec()
x.name, y.name

('Bob', 'Bob')

In [0]:
x.name = 'Sue' # but assignment changes x only
rec.name, x.name, y.name

('Bob', 'Sue', 'Bob')

In [0]:
# really just linked dictionaries
rec.__dict__.keys()

dict_keys(['__module__', '__dict__', '__weakref__', '__doc__', 'name', 'age'])

In [0]:
x.__dict__.keys()

dict_keys(['name'])

In [0]:
y.__dict__.keys()

dict_keys([])

In [0]:
x.__class__

__main__.rec

In [0]:
# even methods can be created on the fly (but not typical)
 
def upperName(self):
  return self.name.upper() # still needs a self

rec.method = upperName
x.method()

'SUE'

In [0]:
y.method() # run method to process y

'BOB'

In [0]:
rec.method(x) # can call through instance or class

'SUE'

**2. Classes are specialized by inheritance**

*      Superclasses listed in parentheses in a class's header
*      Classes inherit attributes from their superclasses
*      Instances inherit attributes from all accessible classes
*      Logic changes made in subclasses, not in-place

In [0]:
class SecondClass(FirstClass): # inherits setdata
  def display(self): # changes display
    print('Current value = "%s"' % self.data)

z = SecondClass()
z.setdata(42) # setdata found in FirstClass
z.display() # finds/calls overridden method

Current value = "42"


In [0]:
x.display() # x is a FirstClass instance

King Arthur


![second class](https://learning-python.com/class/Workbook/unit07_files/image006.gif)

**3. Classes can intercept Python operators**

*      Methods with names like "\_\_X\_\_" are special hooks
*      Called automatically when Python evaluates operators
*      Classes may override most built-in type operations
*      Allows classes to integrate with Python's object model

In [0]:
class ThirdClass(SecondClass): # isa SecondClass
  def __init__(self, value): # "ThirdClass(x)"
    self.data = value

  def __add__(self, other): # "self + other"
    return ThirdClass(self.data + other)

  def __mul__(self, other): # "self * other"
    self.data = self.data * other # changes self in place, returns None

a = ThirdClass("abc") # new __init__ called
a.display() # inherited method

Current value = "abc"


In [0]:
b = a.__add__("de") # new __add__ called
b.display()

Current value = "abcde"


In [0]:
a * 3 # new __mul__ called
a.display()

Current value = "abcabcabc"


![overload](https://learning-python.com/class/Workbook/unit07_files/image008.gif)

###A closer look: class terminology

  - Dynamic typing and polymorphism are keys to Python
  - self and \_\_init\_\_ are key concepts in Python OOP

*     Class
  *      An object (and statement) which defines inherited members and methods
*     Instance
  *      Objects created from a class, which inherit its attributes; each instance is a new namespace
*     Member
  *      An attribute of a class or instance object, that's bound to an object
*     Method
  *      An attribute of a class object, that's bound to a function object (a callable member)
*     Self
  *      By convention, the name given to the implied instance object in methods
*     Inheritance
  *      When an instance or class accesses a class's attributes
*     Superclass
  *      Class or classes another class inherits attributes from
*     Subclass
  *      Class which inherits attribute names from another class

###Using the class statement

*     Python's main OOP tool (like C++)
*     Superclasses are listed in parentheses
*     Special protocols, operator overloading: \_\_X\_\_
*     Multiple inheritance: class X(A, B, C)
*     Search = DFLR (depth-first, left-to-right), except in diamonds, where new-style classes (all classes in 3.X) search breadth-first

**General form**

```python
class <name>(superclass, ...): # assign to name
  data = value # shared class data
  
  def method(self, ...): # methods
    self.member = value # per-instance data
```

**Example**

*      class introduces a new local scope
*      Assignments in class create class object attributes
*      self.name = X creates/changes instance attribute

In [0]:
class Subclass(ThirdClass): # define subclass
  data = 'spam' # assign class attr

  def __init__(self, value): # assign class attr
    self.data = value # assign instance attr

  def display(self):
    print(self.data, Subclass.data) # instance, class

    
x, y = Subclass(1), Subclass(2)
x.display(); y.display()

1 spam
2 spam


###Using class methods

*     class statement creates and assigns a class object
*     Calling a class object generates an instance object
*     Class methods provide behavior for instance objects
*     Methods are nested def functions, with a self
*     self is passed the implied instance object
*     Methods are all public and virtual in C++ terms

![subclass](https://learning-python.com/class/Workbook/unit07_files/image010.gif)

**Example**

In [0]:
class NextClass: # define class
  def printer(self, text): # define method
    print(text)

x = NextClass() # make instance
x.printer('Hello world!') # call its method

Hello world!


In [0]:
NextClass.printer(x, 'Hello!') # same method, called through the class

Hello!


**Commonly used for calling superclass constructors**

```python
class Super:
  def __init__(self, x):
    ... # default code

class Sub(Super):
  def __init__(self, x, y):
    Super.__init__(self, x) # run superclass init
    ... # custom code: do my init actions

I = Sub(1, 2)

# See also: super() built-in for generic superclass access
# But this call has major issues in multiple-inheritance trees
```
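
For comparison, here is a minimal single-inheritance sketch of the `super()` form mentioned in the comments. The attribute names `x` and `y` are illustrative; the zero-argument call is Python 3 syntax.

```python
class Super:
    def __init__(self, x):
        self.x = x                 # default code

class Sub(Super):
    def __init__(self, x, y):
        super().__init__(x)        # same effect here as Super.__init__(self, x)
        self.y = y                 # custom code

I = Sub(1, 2)
print(I.x, I.y)
```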

###Customization via inheritance

*     Inheritance uses attribute definition tree (namespaces)
*     object.attr searches up namespace tree for first attr
*     Lower definitions in the tree override higher ones

**Attribute tree construction:**
 
1.    Instance → assignments to self attributes
2.    Class → statements (assignments) in class statements
3.    Superclasses → classes listed in parentheses in header

![inheritance](https://learning-python.com/class/Workbook/unit07_files/image012.gif)
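
The three levels of the tree can be observed directly. In this sketch (`Base`, `Derived` and the attribute names are made up), each attribute is defined at a different height, and the lowest definition wins.

```python
class Base:
    a = 'superclass'
    b = 'superclass'
    c = 'superclass'

class Derived(Base):
    b = 'class'            # overrides Base.b
    c = 'class'

obj = Derived()
obj.c = 'instance'         # overrides Derived.c for this object only

print(obj.a, obj.b, obj.c)
print(Derived().c)         # other instances still see the class value
```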

###Specializing inherited methods

*     Inheritance finds names in subclass before superclass
*     Subclasses may inherit, replace, extend, or provide
*     Direct superclass method calls: Class.method(self, ...)

In [0]:
class Super:
  def method(self):
    print('in Super.method')


class Sub(Super):
  def method(self):
    print('starting Sub.method')
    Super.method(self)
    print('ending Sub.method')


x = Super()
x.method()

in Super.method


In [0]:
x = Sub()
x.method()

starting Sub.method
in Super.method
ending Sub.method


![inh](https://learning-python.com/class/Workbook/unit07_files/image014.gif)

In [0]:
class Super:
  def method(self):
    print('in Super.method') # default
    
  def delegate(self):
    self.action() # expected


class Inheritor(Super):
  pass


class Replacer(Super):
  def method(self):
    print('in Replacer.method')


class Extender(Super):
  def method(self):
    print('starting Extender.method')
    Super.method(self)
    print('ending Extender.method')


class Provider(Super):
  def action(self):
    print('in Provider.action')


for klass in (Inheritor, Replacer, Extender):
  print('\n' + klass.__name__)
  klass().method()
print('\nProvider')
Provider().delegate()


Inheritor
in Super.method

Replacer
in Replacer.method

Extender
starting Extender.method
in Super.method
ending Extender.method

Provider
in Provider.action


###Operator overloading in classes

*     Lets classes intercept normal Python operations
*     Can overload all Python expression operators
*     Can overload object operations: print, call, qualify, etc.
*     Makes class instances more like built-in types
*     Via providing specially-named class methods

In [0]:
class Number:
  def __init__(self, start): # on Number()
    self.data = start
    
  def __add__(self, other): # on x + other
    return Number(self.data + other)
  

X = Number(4)
Y = X + 2
Y.data  

6

![ovl](https://learning-python.com/class/Workbook/unit07_files/image016.gif)

**Common operator overloading methods**
 
*     Special method names have two underscores before and after
*     See Python manuals or reference books for the full set

Method | Overloads | Called for
---|---|---
\_\_init\_\_ | Constructor | object creation: X()
\_\_del\_\_ | Destructor | object reclamation
\_\_add\_\_ | operator + | X + Y
\_\_or\_\_ | operator \| | X \| Y
\_\_repr\_\_ | Printing | print(X), repr(X)
\_\_call\_\_ | function calls | X()
\_\_getattr\_\_ | Qualification | X.undefined
\_\_getitem\_\_ | Indexing | X[key], iteration, in
\_\_setitem\_\_ | Index assignment | X[key] = value
\_\_len\_\_ | Length | len(X), truth tests
\_\_cmp\_\_ | Comparison, 2.X only (3.X: \_\_eq\_\_, \_\_lt\_\_, ...) | X == Y, X < Y
\_\_radd\_\_ | operator + | non-instance + X
\_\_iter\_\_ | iteration | for item in X, I=iter(X)
\_\_next\_\_ | iteration | next(I)
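
Several rows of the table can be exercised in one small class. `Money` is a hypothetical example, not part of the lecture code. It uses \_\_eq\_\_ and \_\_lt\_\_ (the 3.X replacements for \_\_cmp\_\_), with `functools.total_ordering` filling in the remaining comparisons, and \_\_radd\_\_ so that the built-in `sum`, which starts from the integer 0, works.

```python
import functools

@functools.total_ordering
class Money:
    def __init__(self, amount):
        self.amount = amount

    def __repr__(self):                       # used by print() and repr()
        return 'Money(%r)' % self.amount

    def __add__(self, other):                 # Money + Money, Money + number
        extra = other.amount if isinstance(other, Money) else other
        return Money(self.amount + extra)

    __radd__ = __add__                        # number + Money

    def __eq__(self, other):
        return self.amount == other.amount

    def __lt__(self, other):
        return self.amount < other.amount

total = sum([Money(1), Money(2)])             # 0 + Money(1) goes through __radd__
print(total, Money(1) < Money(2), Money(2) >= Money(1))
```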


 **Examples**
 
*     \_\_call\_\_: a function interface, with memory

In [0]:
class Callback:
  def __init__(self, color): # state information
    self.color = color

  def __call__(self, *args): # support calls
    print('turn', self.color)

      
cb1 = Callback('blue') # remember blue
cb2 = Callback('green')

cb1() # on events
cb2()

turn blue
turn green


In [0]:
cb3 = (lambda color='red': 'turn ' + color) # or: defaults
cb3()

'turn red'

*     \_\_getitem\_\_ intercepts all index references

In [0]:
class indexer:
  def __getitem__(self, index):
    return index ** 2

X = indexer()
for i in range(5):
  print(X[i]) # __getitem__

0
1
4
9
16


*     \_\_getattr\_\_ catches undefined attribute references

In [0]:
class empty:
  def __getattr__(self, attrname):
    return attrname + ' not supported!'

X = empty()
X.age # __getattr__

'age not supported!'

*     \_\_init\_\_ called on instance creation
*     \_\_add\_\_ intercepts + expressions
*     \_\_repr\_\_ returns a string when called by print

In [0]:
class adder:
  def __init__(self, value=0):
    self.data = value # init data

  def __add__(self, other):
    self.data = self.data + other # add other

  def __repr__(self):
    return str(self.data) # to string

X = adder(1) # __init__
X + 2 # __add__
X + 2 # __add__
X # __repr__

5

*     \_\_iter\_\_ called on start of iterations, auto or manual
*     \_\_next\_\_ called to fetch each item along the way

In [0]:
class squares:
  def __init__(self, start): # on squares()
    self.count = start
    
  def __iter__(self): # on iter()
    return self # or other object with state
  
  def __next__(self): # on next()
    if self.count == 1:
      raise StopIteration # end iteration
    else:
      self.count -= 1
    return self.count ** 2

  
for i in squares(5): # automatic iterations
  print(i)

16
9
4
1


In [0]:
S = squares(10) # manual iterations
I = iter(S) # iter() optional if returns self
next(I)

81

In [0]:
next(I)

64

In [0]:
list(I)

[49, 36, 25, 16, 9, 4, 1]

*     Attribute access management
      - \_\_setattr\_\_: partial attribute privacy for Python classes
      - \_\_getattr\_\_: full get/set attribute privacy for Python classes
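
A minimal sketch of the \_\_setattr\_\_ idea, assuming a made-up policy that the attribute `age` may not be assigned. The privacy is only partial because the instance's \_\_dict\_\_ can still be changed directly.

```python
class Record:                                  # hypothetical class
    def __setattr__(self, name, value):
        if name == 'age':                      # assumed policy for this example
            raise AttributeError("'age' is private")
        self.__dict__[name] = value            # plain self.name = value would recurse

r = Record()
r.name = 'Bob'                                 # allowed
try:
    r.age = 40                                 # intercepted
    blocked = False
except AttributeError:
    blocked = True

print(r.name, blocked)
```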

###Namespace rules: the whole story

*     Unqualified names (X) deal with lexical scopes
*     Qualified names (O.X) use object namespaces
*     Scopes initialize object namespaces: modules, classes

**The Zen of Python Namespaces**

In [0]:
# all 5 Xs are different variables
X = 1 # global
 
def f():
  X = 2 # local

class C:
  X = 3 # class
  
  def m(self):
    X = 4 # local
    self.X = 5 # instance

**Unqualified names: global unless assigned**

*     Assignment: X = value
       -      Makes names local: creates or changes name in the current local scope, unless declared global or nonlocal
 

*     Reference: X
        -      Looks for names in the current local scope, then any enclosing function scopes, then the current global scope, then the outer built-in scope
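
The assignment and reference rules can be seen side by side with nested functions. All names in this sketch are illustrative. A reference finds the enclosing function's variable, an assignment creates a new local, and `nonlocal` redirects assignment to the enclosing scope.

```python
def scopes():
    X = 'enclosing'            # local to scopes()
    def reader():
        return X               # reference only: found in the enclosing scope
    def shadower():
        X = 'local'            # assignment: creates a new local X
        return X
    return reader(), shadower(), X

def make_counter():
    count = 0
    def bump():
        nonlocal count         # assignment now targets the enclosing name
        count += 1
        return count
    return bump

bump = make_counter()
print(scopes(), bump(), bump(), len('abc'))   # len comes from the built-in scope
```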
 

 

**Qualified names: object name-spaces**

*     Assignment: X.name = value
      -      Creates or alters the attribute name in the namespace of the object being qualified
*     Reference: X.name
        -      Searches for the attribute name in the object, and then all accessible classes above it (none for modules)
 


**Namespace dictionaries**
 
*     Object name-spaces: built-in \_\_dict\_\_ attributes
*     Qualification == indexing a name-space dictionary
*      To get a name from a module M:
   - ```python   
   M.name
   ```
   - ```python   
   M.__dict__['name']
   ```
   - ```python   
   sys.modules['M'].name
   ```
   - ```python   
   sys.modules['M'].__dict__['name']
   ```
   - ```python   
   sys.__dict__['modules']['M'].__dict__['name']
   ```
*  Attribute inheritance == searching dictionaries

In [0]:
class super: # note: this name shadows the built-in super in this session
  def hello(self):
    self.data = 'spam' # in self.__dict__

class sub(super):
  def howdy(self): pass

X = sub()
X.__dict__ # a new name-space/dict

{}

In [0]:
X.hola = 42 # add member to X object
X.__dict__

{'hola': 42}

In [0]:
sub.__dict__

mappingproxy({'__doc__': None,
              '__module__': '__main__',
              'howdy': <function __main__.sub.howdy>})

In [0]:
super.__dict__

mappingproxy({'__dict__': <attribute '__dict__' of 'super' objects>,
              '__doc__': None,
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'super' objects>,
              'hello': <function __main__.super.hello>})

In [0]:
X.hello()
X.__dict__

{'data': 'spam', 'hola': 42}
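
The dictionary search can be written out by hand. The `lookup` function below is a deliberately simplified model, not Python's real algorithm: it ignores descriptors, \_\_getattr\_\_ and \_\_slots\_\_, so it is only meant for plain data attributes. `Base` and `Child` are hypothetical classes.

```python
def lookup(obj, name):
    """Search the instance dict, then each class dict in MRO order."""
    if name in vars(obj):
        return vars(obj)[name]
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    raise AttributeError(name)

class Base:
    kind = 'base'
    tag = 'base'

class Child(Base):
    tag = 'child'

c = Child()
c.own = 42
print(lookup(c, 'own'), lookup(c, 'tag'), lookup(c, 'kind'))
```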

###OOP and inheritance

*     Inheritance based on attribute qualification
*     In OOP terminology: is-a relationship
*     On X.name, looks for name in:
  1.   Instance <--- X's own name-space
  2.   Class <--- class that X was made from
  3.   Superclasses <--- depth-first, left-to-right

**Example: a zoo hierarchy in Python**

In [0]:
class Animal:
  def reply(self): self.speak()
    
  def speak(self): print('spam')


class Mammal(Animal):
  def speak(self): print('huh?')


class Cat(Mammal):
  def speak(self): print('meow')


class Dog(Mammal):
  def speak(self): print('bark')


class Primate(Mammal):
  def speak(self): print('Hello world!')


class Hacker(Primate): pass

![classes](https://learning-python.com/class/Workbook/unit07_files/image018.gif)

In [0]:
spot = Cat()
spot.reply() # Animal.reply, Cat.speak
data = Hacker()
data.reply() # Animal.reply, Primate.speak

meow
Hello world!


###OOP and composition

*     Class instances simulate objects in a domain
*     Nouns→classes, verbs→methods
*     Class objects embed and activate other objects
*     In OOP terminology: has-a relationship

**Example: the dead-parrot skit in Python**

In [0]:
class Actor:
  def line(self): print(self.name + ':', self.says())


class Customer(Actor):
  name = 'customer'
  
  def says(self): return "that's one ex-bird!"


class Clerk(Actor):
  name = 'clerk'

  def says(self): return "no it isn't"


class Parrot(Actor):
  name = 'parrot'

  def says(self): return None


class Scene:
  def __init__(self):
    self.clerk = Clerk() # embed some instances
    self.customer = Customer() # Scene is a composite
    self.subject = Parrot()


  def action(self):
    self.customer.line() # delegate to embedded
    self.clerk.line()
    self.subject.line()
  
  
Scene().action() # activate nested objects  

customer: that's one ex-bird!
clerk: no it isn't
parrot: None


###Classes are objects: factories

*     Everything is a first-class object
*     Only objects derived from classes are OOP objects
*     Classes can be passed around as data objects

In [0]:
def factory(aClass, *args): # varargs tuple
  return aClass(*args) # call aClass
 

class Spam:
  def doit(self, message):
    print(message)
    
    
class Person:
  def __init__(self, name, job):
    self.name = name
    self.job = job
    
    
object1 = factory(Spam) # make a Spam
object2 = factory(Person, "Guido", "guru") # make a Person
object2.name

'Guido'

###Methods are objects: bound or unbound

*     Unbound class methods: call with a self (plain functions in 3.X)
*     Bound instance methods: instance + method pairs

In [0]:
object1 = Spam()
x = object1.doit # bound method object
x('hello world') # instance is implied


t = Spam.doit # unbound method object (a plain function in 3.X)
t(object1, 'howdy') # pass in instance

hello world
howdy
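
The "instance + method pair" can be inspected directly. A small sketch, assuming Python 3 and a hypothetical class `Greeter`:

```python
class Greeter:                      # hypothetical class for illustration
    def greet(self, message):
        return 'greeter says: ' + message

g = Greeter()
bound = g.greet                     # pairs the instance with the function

print(bound.__self__ is g)                    # the instance half of the pair
print(bound.__func__ is Greeter.greet)        # the function half
print(bound('hi') == Greeter.greet(g, 'hi'))  # both call styles do the same work
```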


###Odds and ends

**Pseudo-private attributes**

*     Data hiding is a convention (until Python 1.5 or later)
*     "We're all consenting adults", per Python's BDFL
*     1.5 name mangling: self.\_\_X → self._Class\_\_X
*     Class name prefix makes names unique in self instance
*     Only works in class, and only if at most 1 trailing _
*     Mostly for larger, multi-programmer, OO projects
*     See \_\_getattr\_\_ above for implementing full privacy

In [0]:
class C1:
  def meth1(self): self.__X = 88 # now X is mine
  
  def meth2(self): print(self.__X) # becomes _C1__X in I


class C2:
  def metha(self): self.__X = 99 # me too
  
  def methb(self): print(self.__X) # becomes _C2__X in I


class C3(C1, C2): pass
  

I = C3() # two X names in I
I.meth1()
I.metha()
print(I.__dict__)

I.meth2()
I.methb()

{'_C1__X': 88, '_C2__X': 99}
88
99
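
Two of the rules above are easy to verify: mangled names are still reachable from outside, and names with two trailing underscores are left alone. A minimal sketch with a hypothetical class `Account`:

```python
class Account:                      # hypothetical class for illustration
    def __init__(self):
        self.__balance = 100        # stored as _Account__balance
        self.__tag__ = 'dunder'     # two trailing underscores: not mangled

acct = Account()
names = sorted(vars(acct))
print(names)

print(acct._Account__balance)       # the mangled name is still reachable

try:
    acct.__balance                  # no mangling outside a class statement
except AttributeError:
    print('no attribute named __balance')
```

So mangling avoids accidental name clashes; it does not enforce privacy.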


**Documentation strings**

*     Still not universally used (but very close!)
*     Works for classes, modules, functions, methods
*      String constant before any statements
*      Stored in the object's \_\_doc\_\_ attribute

In [0]:
"I am: docstr.__doc__"
 

class spam:
  "I am: spam.__doc__ or docstr.spam.__doc__"

  def method(self, arg):
    "I am: spam.method.__doc__ or self.method.__doc__"
    pass
 

def func(args):
  "I am: docstr.func.__doc__"
  pass
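
Reading the strings back is plain attribute access. A short sketch; the names `area` and `Shape` are hypothetical:

```python
def area(width, height):            # hypothetical function for illustration
    "Return the area of a width x height rectangle."
    return width * height

class Shape:
    "Base class for drawable shapes."

    def draw(self):
        "Render the shape (placeholder)."
        pass

print(area.__doc__)
print(Shape.__doc__)
print(Shape.draw.__doc__)
```

`help(area)` formats the same text for interactive use.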

**Classes versus modules**

*     Modules
  *     Are data/logic packages
  *     Creation: files or extensions
  *     Usage: imported

*     Classes
  *     Implement new objects
  *     Always live in a module
  *     Creation: statements
  *     Usage: called

**OOP and Python**
 
*     Inheritance
  *      Based on attribute lookup: X.name
*     Polymorphism
  *      In X.method(), the meaning of method depends on the type (class) of X
*     Encapsulation
  *      Methods and operators implement behavior; data hiding is a convention (for now)

In [0]:
class C:
  def meth(self, x): # like x=1; x=2
    pass
  
  def meth(self, x, y, z): # the last one wins!
    pass


class C:
  def meth(self, *args):
    if len(args) == 1:
      pass
    elif args and type(args[0]) == int:
      pass


class C:
  def meth(self, x): # the python way:
    x.operation() # assume x does the right thing
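
The third variant can be made concrete. A minimal sketch in which `Duck` and `Robot` are hypothetical, unrelated classes that both happen to provide `operation`:

```python
class Duck:                          # hypothetical classes for illustration
    def operation(self):
        return 'quack'

class Robot:
    def operation(self):
        return 'beep'

class C:
    def meth(self, x):
        return x.operation()         # no type tests: any object with operation() works

c = C()
results = [c.meth(obj) for obj in (Duck(), Robot())]
print(results)
```

The two argument classes share no superclass; `meth` relies only on the interface.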

**Python's dynamic nature**

*      Members may be added/changed outside class methods

In [0]:
class C: pass

X = C()
X.name = 'bob'
X.job = 'psychologist'

*     Scopes may be expanded dynamically: run-time binding

In [0]:
def printer():
  print(message) # name resolved when referenced

message = "Hello" # set message now
printer()

Hello


###Subclassing builtin types

*     All types behave like classes: list, str, tuple, dict, etc.
*     Subclass to customize builtin object behavior
*     Alternative to writing wrapper code

In [0]:
# subclass builtin list type/class
# map 1..N to 0..N-1, call back to built-in version
 

class MyList(list):
  def __getitem__(self, offset):
    print('(indexing %s at %s)' % (self, offset))
    return list.__getitem__(self, offset - 1)
  
  
print(list('abc'))
x = MyList('abc') # __init__ inherited from list
print(x) # __repr__ inherited from list
print(x[1]) # MyList.__getitem__
print(x[3]) # customizes list superclass method

x.append('spam')
print(x) # attributes from list superclass

x.reverse()
print(x)

['a', 'b', 'c']
['a', 'b', 'c']
(indexing ['a', 'b', 'c'] at 1)
a
(indexing ['a', 'b', 'c'] at 3)
c
['a', 'b', 'c', 'spam']
['spam', 'c', 'b', 'a']


###New Style Classes in 2.2+ (Advanced)

 

*     Adds new features, changes one inheritance case (diamonds)
*     Py 2.X: only if object or a builtin type as a superclass
*     Py 3.X: all classes automatically new style (object added auto)

In [0]:
class newstyle(object): # NS requires object in 2.X only
  pass

*     Changes: behavior of diamond multiple inheritance
Per the linear MRO (method resolution order): the DFLR (depth-first, left-to-right) path, with all but the last appearance of each class removed

In [0]:
class A(object): attr = 1 # NEW STYLE
class B(A): pass
class C(A): attr = 2
class D(B,C): pass # tries C before A
x = D() # more breadth-first
x.attr

2
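
The linearized order is available as the class's `__mro__` attribute. A small sketch that rebuilds the same diamond (in 3.X the explicit `object` base is optional):

```python
class A: attr = 1
class B(A): pass
class C(A): attr = 2
class D(B, C): pass

order = [klass.__name__ for klass in D.__mro__]
print(order)        # A is kept only at its last appearance, after C
print(D().attr)     # found in C, which precedes A
```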

*     Adds: slots, limits legal attributes set
Used to catch typos and limit memory requirements in pathological cases (ONLY!)

In [0]:
class limiter(object):
  __slots__ = ['age', 'name', 'job']

  
x = limiter()
print("ERROR EXPECTED")
x.ape = 1000

ERROR EXPECTED


AttributeError: ignored
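
The same check can be run without stopping the notebook by catching the error. A minimal sketch, assuming a slots class with no other base than `object`; `Limiter` is a hypothetical name:

```python
class Limiter:
    __slots__ = ['age', 'name', 'job']

x = Limiter()
x.age = 40                           # a listed name: allowed

try:
    x.ape = 1000                     # a typo: not in __slots__
    rejected = False
except AttributeError as exc:
    rejected = True
    print('rejected:', exc)

print(hasattr(x, '__dict__'))        # slots replace the per-instance dictionary
```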

*     Adds: properties, computed attributes alternative.
Used to route attribute access to databases, approvals, special-case code.

In [0]:
class classic():
  def __getattr__(self, name):
    if name == 'age':
      return 40
    else:
      raise AttributeError

x = classic()
x.age # <= runs __getattr__

40

In [0]:
# less verbose way (needs object in 2.X)
class newprops():
  def getage(self):
    return 40
  
  age = property(getage, None, None, None) # get,set,del

x = newprops()
x.age # <= runs getage

40
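
The remaining `property` arguments handle assignment, deletion and documentation. A sketch with a validating setter; the class `Person` and the `_age` storage name are assumptions made for the example:

```python
class Person:                        # hypothetical class for illustration
    def __init__(self, age):
        self.age = age               # routed through setage below

    def getage(self):
        return self._age

    def setage(self, value):
        if value < 0:
            raise ValueError('age must be non-negative')
        self._age = value

    age = property(getage, setage, None, 'age in years')  # get, set, del, doc

p = Person(40)
p.age = 41
print(p.age)

try:
    p.age = -1
except ValueError as exc:
    print('rejected:', exc)
```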

*     Adds: static and class methods, new calling patterns
Used to process class data, instead of per-instance data

In [0]:
class Spam: # static: no self
  numInstances = 0 # class: class, not instance

  def __init__(self):
    Spam.numInstances += 1

  def printNumInstances():
    print("Number of instances:", Spam.numInstances)

  # can avoid this in 3.X
  #printNumInstances = staticmethod(printNumInstances)


a = Spam()
b = Spam()
c = Spam()
Spam.printNumInstances()

Number of instances: 3
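
The cell above shows only the static flavor. For comparison, a sketch of a class method, which receives the class it was called through; `Counter` and `SubCounter` are hypothetical names:

```python
class Counter:                       # hypothetical class for illustration
    numInstances = 0

    def __init__(self):
        type(self).count()

    @classmethod
    def count(cls):                  # receives the class, not an instance
        cls.numInstances += 1

    @staticmethod
    def describe():                  # receives nothing automatically
        return 'instances so far: %d' % Counter.numInstances

class SubCounter(Counter):
    numInstances = 0                 # separate tally for the subclass

a, b = Counter(), Counter()
c = SubCounter()
print(Counter.numInstances, SubCounter.numInstances)
print(Counter.describe())
```

Because `count` gets `cls`, each class in the tree can keep its own counter, which a static method cannot do without naming the class explicitly.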


*     Function and class decorators (not just newstyle).
Rebinds names to objects that process functions and classes, or later calls to them.

**Function name rebinding**

```python
@funcdecorator
def F():
  ...

# is equivalent to

def F():
  ...
F = funcdecorator(F) # rebind name, possibly to proxy object
```

**Class name rebinding**

```python
@classdecorator
class C:
  ...

# is equivalent to

class C:
  ...
C = classdecorator(C) # rebind name, possibly to proxy object
```
  
**Use for static methods (and properties, etc.)**

```python
class C:
  @staticmethod
  def meth():
    ...

# is equivalent to

class C:
  def meth():
    ...
  meth = staticmethod(meth) # rebind name to call handler
 
# ditto for properties

class newprops(object):
  @property
  def age(self): # age = property(age)
    return 40 # use X.age, not X.age()
``` 

**Nesting: multiple augmentations**

```python
@A
@B
@C
def f():
  ...

# is equivalent to

def f():
  ...
f = A(B(C(f)))
```

**Arguments: closures, retain state for later calls**

```python
@funchandler(a, b)
def F():
  ...

# is equivalent to

def F():
  ...
F = funchandler(a, b)(F)
 ```
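
A runnable instance of the argument form may help. In this sketch `repeat` is a hypothetical decorator factory: the outer call keeps `times` in a closure, and the returned decorator wraps the function.

```python
import functools

def repeat(times):                       # hypothetical decorator factory
    def decorator(func):
        @functools.wraps(func)           # keep func's name and docstring
        def wrapper(*args, **kwargs):
            return [func(*args, **kwargs) for _ in range(times)]
        return wrapper
    return decorator

@repeat(3)                               # same as: greet = repeat(3)(greet)
def greet(name):
    "Return a greeting."
    return 'hello ' + name

print(greet('bob'))
print(greet.__name__)
```

The name `greet` now refers to `wrapper`, which remembers both `times` and the original function.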

 

 


###Class gotchas

*     Multiple inheritance: order matters
*      Solution: use sparingly and/or carefully

In [0]:
class Super1:
  def method2(self): # a 'mixin' superclass
    print('in Super1.method2')


class Super2:
  def method1(self):
    self.method2() # calls my method2??
  
  def method2(self):
    print('in Super2.method2')


class Sub1(Super1, Super2):
  pass # gets Super1's method2


class Sub2(Super2, Super1):
  pass # gets Super2's method2


class Sub3(Super1, Super2):
  method2 = Super2.method2 # pick method manually


Sub1().method1()
Sub2().method1()
Sub3().method1()

in Super1.method2
in Super2.method2
in Super2.method2


###Example: a set class

*     Wraps a Python list in each instance
*     Supports multiple instances
*     Adds operator overloading
*     Supports customization by inheritance
*     Allows any type of component: heterogeneous

In [0]:
class Set:
  def __init__(self, value = []): # constructor
    self.data = [] # manages a list
    self.concat(value)
    
  def intersect(self, other): # other is a sequence
    res = [] # self is the subject
    for x in self.data:
      if x in other:
        res.append(x)
    return Set(res) # return a new Set
  
  def union(self, other):
    res = self.data[:] # copy of my list
    for x in other:
      if not x in res:
        res.append(x)
    return Set(res)
  
  def concat(self, value): # value: list, Set
    for x in value: # removes duplicates
      if not x in self.data:
        self.data.append(x)

  def __len__(self): return len(self.data)
  def __getitem__(self, key): return self.data[key]
  def __and__(self, other): return self.intersect(other)
  def __or__(self, other): return self.union(other)
  def __repr__(self): return 'Set:' + str(self.data)

In [0]:
x = Set([1,2,3,4]) # __init__
y = Set([3,4,5])

x & y, x | y # __and__,__or__,__repr__

(Set:[3, 4], Set:[1, 2, 3, 4, 5])

In [0]:
z = Set("hello") # set of strings
z[0] # __getitem__

'h'

In [0]:
z & "mello", z | "mello"

(Set:['e', 'l', 'o'], Set:['h', 'e', 'l', 'o', 'm'])

###Summary: OOP in Python

*     Class objects provide default behavior
    *      Classes support multiple copies, attribute inheritance, and operator overloading
    *      The class statement creates a class object and assigns it to a name
    *      Assignments inside class statements create class attributes, which export object state and behavior
    *      Class methods are nested defs, with special first arguments to receive the instance
*     Instance objects are generated from classes
    *      Calling a class object like a function makes a new instance object
    *      Each instance object inherits class attributes, and gets its own attribute namespace
    *      Assignments to the first argument ("self") in methods create per-instance attributes
*     Inheritance supports specialization
    *      Inheritance happens at attribute qualification time: on object.attribute, if object is a class or instance
    *      Classes inherit attributes from all classes listed in their class statement header line (superclasses)
    *      Instances inherit attributes from the class they are generated from, plus all its superclasses
    *      Inheritance searches the instance, then its class, then all accessible superclasses (depth-first, left-to-right)

##Exceptions

**Why use exceptions?**

*     Error handling
*     Event notification
*     Special-case handling
*     Unusual control-flows

**Exception topics**

*     The basics
*     Exception idioms
*     Exception catching modes
*     Class exceptions
*     Exception gotchas

###Exception basics

*     A high-level control flow device
*     try statements catch exceptions
*     raise statements trigger exceptions
*     Exceptions are raised by Python or programs

**Basic forms**

-      Python 2.5+: except/finally can now be mixed
-      Python 2.5+: with/as context managers
-      Python 3.X: except E as X, raise from E

*     try/except/else
```python
try:
  <statements> # run/call actions
except <name>:
  <statements> # if name raised during try block
except <name> as <var>:
  <statements> # if name raised during try block
else:
  <statements> # if no exception was raised
```

*  try/finally
```python
try:
  <statements>
finally:
  <statements> # always run 'on the way out'
```

* raise
```python
raise <name> # manually trigger an exception
```

* assert
```python
assert <test>, <message>
# if not test: raise AssertionError(message)
``` 

* with/as context managers (2.5+)

```python
# alternative to common try/finally idioms

# file reading example
with open('/etc/passwd', 'r') as f: # auto-closed after with
  for line in f: # even if exception in block
    print(line)
    # more processing code

# thread locking example
import threading
lock = threading.Lock()
with lock: # auto acquired, released
  pass # critical section of code

# classes may define managers
```
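
A class becomes a context manager by defining `__enter__` and `__exit__`. A minimal sketch; `Tracker` is a hypothetical class that only records what happens:

```python
class Tracker:                            # hypothetical manager for illustration
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self                       # value bound by "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append('exit')
        return False                      # False: let any exception propagate

tracker = Tracker()
try:
    with tracker as t:
        t.events.append('body')
        raise ValueError('boom')
except ValueError:
    tracker.events.append('caught')

print(tracker.events)
```

`__exit__` runs even though the body raised, which is what makes `with` a substitute for try/finally.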

###First examples

**Builtin exceptions**

*     Python triggers builtin exceptions on errors
*     Displays message at top-level if not caught

In [0]:
def kaboom(list, n):
  print(list[n]) # trigger IndexError

try:
  kaboom([0, 1, 2], 30)
except IndexError: # catch exception here
  print('Hello world!')

Hello world!


**User-defined exceptions**

*     Python (and C) programs raise exceptions too
*     User-defined exceptions are objects

In [0]:
class TestFailed(Exception):
    def __init__(self, m):
        self.message = m
    def __str__(self):
        return self.message

def stuff(args):
  raise TestFailed('oops')
  
  
args = 'some data'
try:
  stuff(args) # raises exception
except TestFailed as e:
    print(e)
finally:
  print("failed...finally") # always close file

oops
failed...finally


###Exception idioms

*     EOFError sometimes signals end-of-file
```python
while True:
  try:
    line = input() # read from stdin (raw_input in 2.X)
  except EOFError:
    break
  else:
    <process next line here>
```

*     Outer try statements can be used to debug code
```python
try:
  <run program>
except: # all uncaught exceptions come here
  import sys
  print('uncaught!', sys.exc_info()[:2]) # type, value
```

###Exception catching modes

*     Try statements nest (are stacked) at runtime
*     Python selects first clause that matches exception
*     Try blocks can contain a variety of clauses
*     Multiple excepts: catch 1-of-N exceptions
*     Before 2.5, try could contain except or finally, but not both; 2.5+ allows mixing them

**Try block clauses**

Operation |	 Interpretation
---|---
except: |	catch all exception types
except name: |	catch a specific exception only
except name, value: |	2.X: catch exception and its extra data
except name as value: |	3.X: catch exception and its instance
except (name1, name2): |	catch any of the listed exceptions
else: |	run block if no exceptions raised
finally: |	always perform block, exception or not

*     Exceptions nest at run-time
      -      Runs most recent matching except clause

In [0]:
def action2():
  print(1 + []) # generate TypeError
  
def action1():
  try:
    action2()
  except TypeError: # most recent matching try
    print('inner try')

    
try:
  action1()
except TypeError: # here iff action1 re-raises
  print('outer try')

inner try


*     Catching 1-of-N exceptions
  -      Runs first match: top-to-bottom, left-to-right
  -      See manuals or reference text for a complete list

In [0]:
try:
  action2()
except NameError:
  print("name error")
except IndexError:
  print("index error")
except KeyError:
  print("key error")
except (AttributeError, TypeError, SyntaxError):
  print("more errors")
else:
  print("did not except")

more errors
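
Since Python 2.5 a single try statement can combine except, else and finally. A small sketch of the order in which the clauses run; `convert` is a hypothetical helper:

```python
def convert(text):
    log = []
    try:
        value = int(text)
    except ValueError:             # runs only if int() failed
        log.append('except')
        value = None
    else:                          # runs only if no exception was raised
        log.append('else')
    finally:                       # runs in both cases
        log.append('finally')
    return value, log

print(convert('42'))
print(convert('spam'))
```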


*     finally clause executed on the way out
  -      useful for cleanup actions: closing files, etc.
  -      block executed whether exception occurs or not
  -      Python propagates exception after block finishes
  -      but exception lost if finally runs a raise, return, or break

In [0]:
def divide(x, y):
  return x / y # divide-by-zero error?

def tester(y):
  try:
    print(divide(8, y))
  finally:
    print('on the way out')
    
    
print('\nTest 1:')
tester(2)
print('\nTest 2:')
tester(0) # trigger error


Test 1:
4.0
on the way out

Test 2:
on the way out


ZeroDivisionError: ignored
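
The last point in the list above, the lost exception, is worth seeing once. In this sketch `swallow` and `keep` are hypothetical functions; newer Python releases may warn about a `return` inside `finally`.

```python
def swallow():
    try:
        return 1 / 0                # raises ZeroDivisionError
    finally:
        return 'exception lost'     # return here discards the pending exception

def keep():
    try:
        return 1 / 0
    finally:
        pass                        # no return/raise/break: exception propagates

print(swallow())
try:
    keep()
except ZeroDivisionError:
    print('still raised')
```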

*     Optional data
  -      Provides extra exception details
  -      Python passes None if no explicit data

In [0]:
class myException(Exception):
    def __init__(self, m):
        self.message = m
    def __str__(self):
        return self.message

def raiser1():
  raise myException("hello") # raise, pass data
  
def raiser2():
  raise myException("world") # raise, pass different data
  
def tryer(func):
  try:
    func()
  except myException as e:
    print('got this:', e)
    
tryer(raiser1)
tryer(raiser2)

got this: hello
got this: world


###Class exceptions

-      Should use classes today: only option in 3.X, per BDFL
-      Useful for catching categories of exceptions
-      String exception match (legacy): same object (is identity)
-      Class exception match: named class or subclass of it
-      Class exceptions support exception hierarchies

**General raise forms**

```python
raise Exception(string) # instance carrying a message
raise Exception(string, data) # optional extra data, stored in args
raise SomeClass # class is instantiated with no arguments
raise instance # matched by instance.__class__ or any of its superclasses
# 2.X only: raise class, instance
```
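
Python 3.X also offers the `raise ... from` form listed under Basic forms. A minimal sketch of chaining a low-level error to a higher-level one; `ConfigError` and `load` are hypothetical names:

```python
class ConfigError(Exception):            # hypothetical exception for illustration
    pass

def load(settings, key):
    try:
        return settings[key]
    except KeyError as exc:
        raise ConfigError('missing setting: ' + key) from exc   # chain the cause

try:
    load({}, 'port')
except ConfigError as err:
    caught = err
    print(err)
    print(type(err.__cause__).__name__)
```

The original exception stays available as `__cause__`, so the traceback shows both errors.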

In [0]:
class Super(Exception): pass
class Sub(Super): pass
 
def raiser1():
  X = Super() # raise listed class instance
  raise X
 
def raiser2():
  X = Sub() # raise instance of subclass
  raise X
 
for func in (raiser1, raiser2):
  try:
    func()
  except Super: # match Super or a subclass
    import sys
    print('caught:', sys.exc_info()[0])
 


caught: <class '__main__.Super'>
caught: <class '__main__.Sub'>


In [0]:
class MyBad(Exception):
  def __init__(self, file, line):
    self.file = file
    self.line = line

  def display(self):
    print(self.file * 2)

def parser():
  raise MyBad('spam.txt', 5)

try:
  parser()
except MyBad as X:
  print(X.file, X.line)
  X.display()

spam.txt 5
spam.txtspam.txt


In [0]:
# built-in file error numbers
def parser():
  open('nonesuch')

try:
  parser()
except IOError as X:
  print(X.errno, '=>', X.strerror)
 


2 => No such file or directory


###Exception gotchas

**What to wrap in a try statement?**

-      Things that commonly fail: files, sockets, etc.
-      Calls to large functions, not code inside the function
-      Anything that shouldn't kill your script
-      Simple top-level scripts often should die on errors
-      See also atexit module for shutdown time actions

**Catching too much?**

-      Empty except clauses catch everything
-      But may intercept error expected elsewhere
```python
try:
  []
except:
  [] # everything comes here: even sys.exit()!
```
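
The `sys.exit()` case can be demonstrated safely. In this sketch `bare` and `narrower` are hypothetical functions; the second one catches `Exception`, which does not cover `SystemExit`.

```python
import sys

def bare():
    try:
        sys.exit(1)
    except:                     # traps SystemExit too
        return 'trapped'

def narrower():
    try:
        sys.exit(1)
    except Exception:           # SystemExit is not an Exception subclass
        return 'trapped'

print(bare())
try:
    narrower()
except SystemExit:
    print('exit request passed through')
```

Catching `Exception` rather than using an empty except is a common middle ground: ordinary errors are handled, while exit requests still get through.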

**Catching too little?**

-      Specific except clauses only catch listed exceptions
-      But need to be updated if add new exceptions later
-      Class exceptions would help here: category name
```python
try:
  []
except (myerror1, myerror2): # what if I add a myerror3?
  [] # non-errors
else:
  [] # assumed to be an error
```

**Solution: exception protocol design**