![MLTrain logo](https://mltrain.cc/wp-content/uploads/2017/11/mltrain_logo-4.png "MLTrain logo")

In [1]:
% run changeNBLayout.py

# TL;DR #
This is a crash course on Python. __It focuses on the language features that will be used throughout the rest of the lessons, but is not limited to them__.  
  
We'll go through the following language features in sequence:
1. Literals
2. Containers
3. Comprehensions and 'Generator Expressions'
4. Function Objects and Closures
5. The Python data model

  
### References ###
[Fluent Python](http://shop.oreilly.com/product/0636920032519.do) from O'Reilly. A great book for mastering Python programming.  
  
[Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do). Probably the best book on Python for scientific computing.

# Coding Semantics #

Python uses whitespace (tabs or spaces) to structure code instead of using braces as in many other languages like C++, Java or R.  
A __colon__ (:) denotes the start of an indented code block, after which all of the code must be indented by the same amount until the end of the block.  
__Comments__ start with a hash (#) and include everything to the end of the line.

``` Python
for x in array:
    # This is a comment line
    if x < pivot:
        less.append(x) # This is comment too
    else:
        greater.append(x)
```

---

# Literals #


In [4]:
# integers
print 123456

# floats
print 1., 1.0, 1e-2

# complex numbers
print 1 + 3.j

# strings and characters
print 'a', 'abcdefg', "hijklmnop"

# Strings of Hex and octal codes
print '\x12\x40\x41', '\123\456\124'

# Strings with and without interpreted backslash
print 'escaped quote: \'', "unescaped quote: '\ta" 
print r'\a\b\c\n\t'

# Multiline strings
print """
Tensorflow MLTrain Athens
The first Deep-learning course ever"""

# Boolean literals and expressions
print True, False

123456
1.0 1.0 0.01
(1+3j)
a abcdefg hijklmnop
@A S.T
escaped quote: ' unescaped quote: '	a
\a\b\c\n\t

Tensorflow MLTrain Athens
The first Deep-learning course ever
True False


# Variables #

Variables are defined by assigning them values.  
This is called __binding__ in Python because a reference to the value, rather than the value itself, is assigned to the variable.  

In [7]:
x = 1
y = 123e-4

a = True
b = 'MLTrain Athens'

# Assignments are quite flexible (or vague) in Python. 
# More on this later
b = a
print b

True


# Operators and Expressions #

In [5]:
# Arithmetic operators
x, y = 123, 1e-2
print x + y, 1 - y, x/y, y**2

# Logical operators
x, y = True, False
print x and y, x or y, not x 
print not (x or y)

# Bitwise operators
x, y = 1, 0
print x & y, x | y, x >> 1, ~y

# Relational (comparison) operators
x, y = 1, -1
print x < y, x <= x, x >= y, x > y, x == y, x != y

123.01 0.99 12300.0 0.0001
False True False
False
0 1 0 -1
False True True True False True


# Objects and Object References #

Every number, string, data structure, function, class and module exists
in the Python interpreter in its own “box”, which is referred to as a Python object.  
Each object has an associated type (for example, string or function) and internal data.  

When we assign a value to a variable in Python, we actually assign a reference to the object (in effect, __its memory address__).
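
The difference between binding a second name to an object and copying the object can be checked with the `is` operator. The following is a minimal sketch; the variable names are hypothetical, and it uses the `print()` call form so that it should run on Python 3 as well as Python 2.7.

```python
# Hypothetical names, chosen for illustration only
original = [1, 2, 3]
alias = original          # binds a second name to the SAME list object
clone = list(original)    # builds a NEW list holding the same elements

original[0] = -1          # mutate the object through one of its names

print(alias is original)  # True: both names refer to one object
print(clone is original)  # False: the copy is a separate object
print(alias)              # [-1, 2, 3]
print(clone)              # [1, 2, 3]
```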


In [9]:
# All assignments copy references to objects e.g.
a = [1, 2, 3]
b = a

a[0] = -1
print b

[-1, 2, 3]


In [10]:
# Even literals are objects and as such have attributes and methods
# Calling the as_integer_ratio method on a float literal:
1e-2.as_integer_ratio()

(5764607523034235, 576460752303423488)

### <span style = "color: red"> Quiz </span> ###

``` Python
a = 4
b = a
a = 2
```

What is the value of b?

# Function calls and arguments #

A function named `foo` taking two arguments a and b and returning val is defined as follows:  

``` Python
def foo(a, b):
    # function body statements
    return val
```

`foo` can be called either as  
  
`foo(2, 3)`  
or  
`foo(2, b = 3)`  
or  
`foo(a = 2, b = 3)`
  
In the last call `a` and `b` are referred to as _keyword arguments_.  
In the first call they are _positional arguments_.
  
__NB:__ Positional arguments cannot follow keyword arguments
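
The calling conventions above can be tried out with a version of `foo` that simply returns its arguments. This is an illustrative sketch, not part of the course code; the `rejected` flag and the use of `compile` to probe the invalid call are assumptions made for the demonstration. It is written in the `print()` call form so that it should also run on Python 3.

```python
def foo(a, b):
    # Return both arguments so that the binding is visible
    return (a, b)

print(foo(2, 3))        # (2, 3)  positional
print(foo(2, b=3))      # (2, 3)  positional, then keyword
print(foo(b=3, a=2))    # (2, 3)  keywords may come in any order

# A positional argument after a keyword argument is rejected
# by the parser, before the call is ever executed.
rejected = False
try:
    compile('foo(a=2, 3)', '<example>', 'eval')
except SyntaxError:
    rejected = True
print(rejected)         # True
```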

In [41]:
from os import linesep as endl

def foo(a, b):
    return a + b

print 'integer foo:', endl, foo(1, 2)
print endl, 'string foo:', endl, foo('ab', 'c')
print endl, 'list foo', endl, foo([1, 2, 3], [4, 5, 6])

integer foo: 
3

string foo: 
abc

list foo 
[1, 2, 3, 4, 5, 6]


---
# Containers #

### Lists ###

1. Lists are __mutable__ sequences of objects.  
2. Lists hold references to the contained objects, so they can store elements of different types, including other lists (i.e. lists of lists)


In [6]:
from os import linesep as endl

lis = [1, 2, 3, 4]
print lis, endl

lis.append('five')
print lis, endl

lis.insert(0, 0.)
print lis

[1, 2, 3, 4] 

[1, 2, 3, 4, 'five'] 

[0.0, 1, 2, 3, 4, 'five']


------------------------------
### List methods and operators ###

In [7]:
# List methods
print len(lis), lis.index(2), (lis * 2).count(lis.index(0))

# List operators
print endl, lis + lis, endl, lis * 3

6 2 2

[0.0, 1, 2, 3, 4, 'five', 0.0, 1, 2, 3, 4, 'five'] 
[0.0, 1, 2, 3, 4, 'five', 0.0, 1, 2, 3, 4, 'five', 0.0, 1, 2, 3, 4, 'five']


-----------------
### List Indexing ###
  
Subsets of list elements (slices) are defined by up to __3 numbers__ separated by colons: __start:stop:step__  
Any of them can be negative or omitted  
A negative _start_ or _stop_ is counted from the end of the list: it means _list length_ minus the _number_  
A negative _step_ means `move backwards`
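
The rules above can be checked directly. The sketch below uses a hypothetical list named `letters`, and shows that the stop index is always exclusive and that a slice can be stored in a `slice` object. It is written in the `print()` call form so that it should also run on Python 3.

```python
letters = list('abcdefghijklm')   # 13 elements
n = len(letters)

# A negative start or stop is counted from the end: -k means n - k
assert letters[-3:] == letters[n - 3:]
assert letters[1:-1] == letters[1:n - 1]

# The stop index is exclusive, whatever the direction of travel
print(letters[2:8:2])     # ['c', 'e', 'g']
print(letters[8:2:-2])    # ['i', 'g', 'e']

# A slice is an ordinary object, so it can be named and reused
every_third = slice(None, None, 3)
print(letters[every_third])   # ['a', 'd', 'g', 'j', 'm']
```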

In [17]:
# List ctor from string
lis = list('abcdefghijklm')

print lis[:5], endl, lis[5:] 

['a', 'b', 'c', 'd', 'e'] 
['f', 'g', 'h', 'i', 'j', 'k', 'l', 'm']


Print the whole list except the first and last element:

In [18]:
print lis[1:-1] 

['b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']


Traverse backwards in steps of 2, from the last element down to the 3rd element (the stop index 1 is excluded)

In [23]:
print lis[-1:1:-2] 

['m', 'k', 'i', 'g', 'e', 'c']


Reverse the list

In [24]:
print lis[::-1]

['m', 'l', 'k', 'j', 'i', 'h', 'g', 'f', 'e', 'd', 'c', 'b', 'a']


Create a composite list from a string:

In [34]:
sl = 'My name is Christos Malliopoulos'.split()
sl1 = [sl[:2], sl[2], sl[-2:]]
print sl1

[['My', 'name'], 'is', ['Christos', 'Malliopoulos']]


Now reference my first name:

In [35]:
print sl1[-1][0]

Christos


### String functions and methods ###

In [13]:
import string
from os import linesep as endl

s = string.ascii_letters
print s


abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ


Split 's' into lowercase and uppercase strings:

In [26]:
lower, upper = s.split('A')
upper = 'A' + upper

print 'lowercase:', endl, lower 
print 'uppercase', endl, upper

lowercase: 
abcdefghijklmnopqrstuvwxyz
uppercase 
ABCDEFGHIJKLMNOPQRSTUVWXYZ


Split 's' using a 'find' string method:

In [25]:
lower, upper = s[:s.find('A')], s[s.find('A'):]
print 'lowercase:', endl, lower 
print 'uppercase', endl, upper

lowercase: 
abcdefghijklmnopqrstuvwxyz
uppercase 
ABCDEFGHIJKLMNOPQRSTUVWXYZ


Searching in a string:

In [43]:
s = 'Killroy was here'
print 'killroy' in s.lower()

True


How many times does the letter 'l' occur in it?

In [47]:
s.count('l')

2

Convert a list of characters back to string:

In [52]:
letters = [chr(i) for i in range(40, 60)]
print ''.join(letters)
print '-'.join(letters)

()*+,-./0123456789:;
(-)-*-+-,---.-/-0-1-2-3-4-5-6-7-8-9-:-;


__String formatting__  
The `format` string method is analogous to `sprintf` in C: it formats objects into a string.  
`format` has very rich formatting capabilities. We'll go through its elementary features:

In [60]:
anInt = 123
aFloat = 1e-2
aString = 'MLTrain'

'{} calculator: {:<4d} by {:0<12.4f} is {}'.format(aString, anInt, aFloat, anInt * aFloat)

'MLTrain calculator: 123  by 0.0100000000 is 1.23'
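
A few more elementary format specifications, shown as a small sketch. The values and names (`value`, `name`) are hypothetical; the general shape of a field is `{field:fill align width .precision type}`, with most parts optional. The code uses the `print()` call form so that it should also run on Python 3.

```python
value = 3.14159
name = 'pi'

# Positional and named fields
print('{0} is about {1:.2f}'.format(name, value))        # pi is about 3.14
print('{n} is about {v:.3f}'.format(n=name, v=value))    # pi is about 3.142

# Width and alignment: '<' left, '>' right, '^' centred,
# optionally preceded by a fill character
print('[{:<8}]'.format(name))     # [pi      ]
print('[{:>8}]'.format(name))     # [      pi]
print('[{:*^8}]'.format(name))    # [***pi***]

# The same integer in decimal, hexadecimal, octal and binary
print('{0:d} {0:x} {0:o} {0:b}'.format(255))   # 255 ff 377 11111111
```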

----------------
### Tuples ###
Tuples are __immutable__ sequences of objects.  
Tuples also hold references to their elements, so they can contain objects of any type, including other tuples.  
  
What is the use of immutability?  
A tuple whose elements are all hashable has a hash code that never changes, so it can be used as a key in other structures described later (namely dictionaries and sets).
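
As a short illustration of why hashability matters, the sketch below uses tuples as dictionary keys and set members, and shows that a list is rejected in the same role. The data (`labels`, `visited`) is hypothetical. It uses the `print()` call form so that it should also run on Python 3.

```python
# Hypothetical data: grid coordinates mapped to labels
labels = {(0, 0): 'origin', (1, 0): 'east'}
print(labels[(1, 0)])             # east

# A set of tuples works for the same reason
visited = {(0, 0), (0, 1), (0, 0)}
print(len(visited))               # 2

# A list is mutable, hence unhashable, and is rejected as a key
try:
    labels[[0, 1]] = 'north'
except TypeError as err:
    print('TypeError: {}'.format(err))

# A tuple is hashable only if every element it contains is hashable
try:
    hash((1, [2, 3]))
except TypeError as err:
    print('TypeError: {}'.format(err))
```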

In [9]:
tup = tuple(range(1, 12, 2))

# immutability
try: 
    tup[0] = 4
except TypeError as e: 
    print 'Throws a TypeError with message:', str(e)

# Tuple methods:
# 'count(value)' returns the number of occurrences of value in tuple
print tup.count(2)
# 'index' returns the position of value in tuple (or in the part of the tuple between 'start' and 'stop')
print tup.index(5, 1, 4)

# Tuple operators
print tup * 4, endl, tup + tup

Throws a TypeError with message: 'tuple' object does not support item assignment
0
2
(1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11) 
(1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11)


### Tuple unpacking ###
It is the automatic assignment of tuple members to variables.  
The tuple members and the variables can be tuples themselves, thereby permitting nested unpackings
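
Nested unpacking can be sketched as follows. The records and their coordinate values are illustrative placeholders, not course data. The code uses the `print()` call form so that it should also run on Python 3.

```python
# Hypothetical record: (name, (latitude, longitude)), approximate values
record = ('Athens', (37.98, 23.73))

# The nesting on the left mirrors the nesting on the right
city, (lat, lon) = record
print(city)        # Athens
print(lat)         # 37.98
print(lon)         # 23.73

# The same pattern works in the header of a for loop
records = [('Athens', (37.98, 23.73)), ('Patras', (38.25, 21.73))]
summary = []
for name, (lat, lon) in records:
    summary.append('{}: {:.2f}, {:.2f}'.format(name, lat, lon))
print(summary)     # ['Athens: 37.98, 23.73', 'Patras: 38.25, 21.73']
```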

In [10]:
tup = ('Athens', 'PFBDAML101', 2017, 16)

city, title, year, hrs = tup
yearHrs = tup[-2:]

print tup
print city, title, year, hrs
print yearHrs

('Athens', 'PFBDAML101', 2017, 16)
Athens PFBDAML101 2017 16
(2017, 16)


In [14]:
# Interesting: Swap values in one step using tuple unpacking
a, b = 10, 20
b, a = a, b
print a, b

20 10


----------------------
### Dictionaries ###
Dictionaries are mutable key-value collections.  

In [16]:
di = {'one': 1, 2: 'two', 3.: (1, 1, 1)}
print di['one'], di.get(3.)
print di.keys(), di.values()
print di.items()

1 (1, 1, 1)
[2, 3.0, 'one'] ['two', (1, 1, 1), 1]
[(2, 'two'), (3.0, (1, 1, 1)), ('one', 1)]


Composite (nested) dictionaries:

In [37]:
# a small binary tree stored as nested dictionaries
d = {'1': {'11': 1, '12': 2}, '2': {'21': 3, '22': 4}}

# Its rightmost leaf:
print d['2']['22']

4


### Sets ###
Sets are collections of unique hashable objects of any type.  
Sets implement the standard set operations as methods: union, intersection, difference and symmetric_difference  
Use sets to discard duplicates or when seeking an efficient 'in' operator
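
Each of the four methods also has an operator spelling. The sketch below, with hypothetical sets `evens` and `small`, checks that the two spellings agree and shows duplicate removal. It uses the `print()` call form so that it should also run on Python 3.

```python
evens = {0, 2, 4, 6, 8}
small = {0, 1, 2, 3, 4}

# Method form and operator form give the same result
assert evens.union(small) == evens | small
assert evens.intersection(small) == evens & small
assert evens.difference(small) == evens - small
assert evens.symmetric_difference(small) == evens ^ small

# sorted() is used only to get a stable order for printing
print(sorted(evens & small))   # [0, 2, 4]
print(sorted(evens - small))   # [6, 8]
print(sorted(evens ^ small))   # [1, 3, 6, 8]

# Discarding duplicates from a list (the original order is not preserved)
print(sorted(set([3, 1, 3, 2, 1])))   # [1, 2, 3]
```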

In [17]:
s = set(range(10) * 2)
print s

print 3 in s

# Sets do not support indexing
try: 
    s[0]
except TypeError as te:
    print "I don't support indexing: (" + str(te) + ")"

set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
True
I don't support indexing: ('set' object does not support indexing)


### Set Operations ###

In [18]:
s1 = set(range(12))
s2 = {1, 1, 2, 3, 0}

# Set difference
print set.difference(s1, s2)

# Intersection
print set.intersection(s1, set.difference(s2, s1))

# Union
print set.union(s1, s2)

# Membership test
if 2 in s1: print '2 in s1' 
else: print '2 not in s1'

set([4, 5, 6, 7, 8, 9, 10, 11])
set([])
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
2 in s1


--------------------------------------------------------------------------
# 3. List, dict and set comprehensions #

Jargon: List comprehensions are also referred to as `listcomps`
  
Comprehensions construct lists, dicts and sets by wrapping nested for-loops and if-then statements elegantly  
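
To make the "nested for-loops" part concrete, here is a sketch that writes the same computation twice: once as explicit nested loops with a filter, and once as a single listcomp. The names `ranks`, `suits`, `pairs_loop` and `pairs_comp` are hypothetical. It uses the `print()` call form so that it should also run on Python 3.

```python
ranks = ['J', 'Q', 'K']
suits = ['spades', 'hearts']

# Explicit nested loops with a filter
pairs_loop = []
for suit in suits:
    for rank in ranks:
        if rank != 'Q':
            pairs_loop.append((rank, suit))

# The same as a listcomp: the for clauses appear in the same
# order as the loops above, and the if clause comes last
pairs_comp = [(rank, suit) for suit in suits for rank in ranks if rank != 'Q']

print(pairs_comp == pairs_loop)   # True
print(pairs_comp)
# [('J', 'spades'), ('K', 'spades'), ('J', 'hearts'), ('K', 'hearts')]
```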

In [12]:
# List comprehensions
symbols = '~!@#$%^&*'

# for loop:
res = []
for sym in symbols:
    if sym in '^&*':
        res.append(ord(sym))
print 'Filtered for-loop:', endl, res

# Listcomp
res = [ord(sym) for sym in symbols if sym in '^&*']
print endl, 'Listcomp', endl, res

Filtered for-loop: 
[94, 38, 42]

Listcomp 
[94, 38, 42]


In [13]:
# Dictcomps
print {sym: ord(sym) for sym in symbols if sym in '^&*'}

{'*': 42, '&': 38, '^': 94}


In [17]:
# Set comprehensions
import random as ran

multiSymbols = list(symbols * 3)
ran.shuffle(multiSymbols)

print 'multiSymbols:', endl, multiSymbols
print endl, 'Setcomp removes duplicate entries:', endl, {ord(sym) for sym in multiSymbols if sym in '^&*'}


multiSymbols: 
['#', '%', '@', '$', '~', '&', '^', '@', '*', '&', '&', '$', '*', '!', '^', '~', '^', '!', '@', '*', '%', '#', '~', '#', '$', '!', '%']

Setcomp removes duplicate entries: 
set([42, 94, 38])


---------------------
# 4. Generator Expressions #
Generator expressions work like comprehensions, with the notable difference that they produce their values lazily, one at a time, upon request.  
  
This is particularly useful when multiple transformations must be applied to an initial dataset, because it __eliminates copies of the intermediate results__
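
Syntactically, a generator expression is a comprehension written in parentheses instead of square brackets. The sketch below chains two transformations without building an intermediate list; all names are hypothetical, and the size comparison via `sys.getsizeof` is only a rough indication of memory use. It uses the `print()` call form so that it should also run on Python 3.

```python
import sys

numbers = range(1, 1001)

# A listcomp builds the whole intermediate list in memory
squares_list = [n * n for n in numbers]

# The same expression in parentheses is a generator: nothing is computed yet
squares_gen = (n * n for n in numbers)

# A second transformation chained on top, still without an intermediate list
even_squares = (s for s in squares_gen if s % 2 == 0)

# Values are produced only when something consumes them
total = sum(even_squares)
print(total)   # 167167000

# The generator object stays small however many values it will yield
print(sys.getsizeof(squares_gen) < sys.getsizeof(squares_list))   # True
```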


In [22]:
# xrange is a built-in that produces its values lazily, one at a time
for t in xrange(10000000):
    print t ** 2,
    if t > 10: print; break

# range builds the whole list in memory first: DO NOT try it with large values
for t in range(10):
    print t ** 2,


0 1 4 9 16 25 36 49 64 81 100 121
0 1 4 9 16 25 36 49 64 81


### Creating generator functions ###

In [23]:
def xrangeEmulator(upper_):
    i = 0
    while i < upper_: 
        yield i
        i += 1

In [24]:
for i in xrangeEmulator(100000000):
    print i**2,
    if i > 10: break

0 1 4 9 16 25 36 49 64 81 100 121


# 5. Functions and Function Objects #

- Functions are created with the statement `def <name>(<arguments>):`  
- Since Python is dynamically typed (objects obtain their type at instantiation, and names can be bound to objects of any type), functions can return different types and the types of their arguments can vary

In [25]:
# A recursive function
def factorial(n_):
    ''' Returns the factorial of a number 
    '''
    return n_ * factorial(n_ - 1) if n_ > 1 else 1

print factorial(10)

# A function accepting args of different types and returning objects of different types:
def firstElement(arg_):
    # Only sequences support positional indexing; anything else is returned unchanged
    return arg_[0] if isinstance(arg_, (list, tuple, str)) else arg_

print firstElement([1,2])
print firstElement('qwerty')
print firstElement({1, 2, 3})
print firstElement(1e2)
print firstElement(True)


3628800
1
q
set([1, 2, 3])
100.0
True


### Function Objects ###

In programming-language theory, a function is a first-class object if it can be:
- assigned to a variable or an element of a data structure
- passed as a function argument
- returned as a result from a function
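
The three properties can be demonstrated in a few lines. This is a minimal sketch with hypothetical helper functions (`square`, `cube`, `apply_twice`, `pick`); it uses the `print()` call form so that it should also run on Python 3.

```python
def square(x):
    return x * x

def cube(x):
    return x * x * x

# 1. Assigned to a variable and stored in a data structure
f = square
powers = {'square': square, 'cube': cube}
print(f(3))                     # 9
print(powers['cube'](3))        # 27

# 2. Passed as a function argument
def apply_twice(func, value):
    return func(func(value))

print(apply_twice(square, 3))   # 81

# 3. Returned as a result from a function
def pick(name):
    return powers[name]

print(pick('square')(5))        # 25
```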

In [26]:
x = factorial
x(5)

120

In [27]:
# We can assign arbitrary attributes to functions
x.myAttr = True
print x.__dict__

{'myAttr': True}


In [18]:
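# We get ALL the attributes of an object through the 'dir' function.
# Note: the output recorded below lists integer attributes (bit_length, numerator, ...),
# so 'x' was bound to an int, not to 'factorial', when this cell was run.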
print dir(x)

['__abs__', '__add__', '__and__', '__class__', '__cmp__', '__coerce__', '__delattr__', '__div__', '__divmod__', '__doc__', '__float__', '__floordiv__', '__format__', '__getattribute__', '__getnewargs__', '__hash__', '__hex__', '__index__', '__init__', '__int__', '__invert__', '__long__', '__lshift__', '__mod__', '__mul__', '__neg__', '__new__', '__nonzero__', '__oct__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdiv__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'bit_length', 'conjugate', 'denominator', 'imag', 'numerator', 'real']


<div style = "color: darkred; font-size: 140%; font-weight: bold;  text-decoration: underline"> Quiz </div>

Using the set operations described in the 'Set Operations' section and the definition of a dummy class that does nothing, find the attributes that are unique to function objects  
  
Hint: a dummy class is defined as

```Python
class Dummy(object): pass
```

Use `dir(obj)` to get a list of the attributes of an object in Python

In [50]:
class fooC(): pass
def fooF(): pass

print set(dir(fooF)) - set(dir(fooC))

set(['func_closure', '__str__', '__reduce__', '__dict__', '__sizeof__', '__code__', '__init__', 'func_code', '__setattr__', '__reduce_ex__', '__new__', '__format__', '__class__', '__closure__', '__call__', 'func_globals', 'func_dict', 'func_name', '__getattribute__', '__subclasshook__', '__name__', '__get__', '__defaults__', '__globals__', '__delattr__', 'func_defaults', '__repr__', '__hash__', 'func_doc'])


### Anonymous Functions (lambdas) ###

In [25]:
# Remove the numeric digits from string
import re

removeDigits = lambda _: re.sub('[0-9]', '', _)
print removeDigits('123456abcdefg789')

abcdefg


### Higher-order functions ###

In [27]:
# sort a list of strings based on the number of distinct letters in the string
# 'sort' is a higher order function

strings = ['foo', 'card', 'bar', 'aaaa', 'abab']
strings.sort(key = lambda _: len(set(_)))
print strings

['aaaa', 'foo', 'abab', 'bar', 'card']


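### Closures: Functions as return values ###

A function that builds and returns another function is a function factory; the returned inner function, together with the variables it captures from the enclosing scope, is called a __closure__  
Closures are good for generating families of related functions and, most importantly, for creating stateful functions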
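
As a minimal sketch of a function factory, the hypothetical `make_scaler` below returns a new function for each `factor`; the captured value can be inspected through the function's `__closure__` attribute. It uses the `print()` call form so that it should also run on Python 3.

```python
def make_scaler(factor):
    # 'factor' is captured by the inner function
    def scale(x):
        return x * factor
    return scale

double = make_scaler(2)
triple = make_scaler(3)

print(double(10))        # 20
print(triple(10))        # 30
print(double('ab'))      # abab (works for any type that supports *)

# The captured value lives in the function's closure cells
print(double.__closure__[0].cell_contents)   # 2
```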

In [31]:
# A stateful function that keeps track of the arguments it has been called with:
from os import linesep as endl

def make_watcher():
    have_seen = {}

    def has_been_seen(x):
        if x in have_seen:
            return True
        else:
            have_seen[x] = True
        return False

    return has_been_seen

watcher = make_watcher()
print [watcher(_) for _ in range(3) * 2]

[False, False, False, True, True, True]


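__Caveat:__  
If you try to _rebind_ in the inner function a variable defined in the outer function,  
you get an `UnboundLocalError` (`local variable referenced before assignment`),  
because assigning to a name inside the inner function makes it local and hides the outer one.
  
There is no problem if you only _mutate_ the object that the outer variable refers to, without rebinding the name: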

In [28]:
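def rebindFoo():
    outerVar = 123
    def ret(innerArg):
        # Assignment makes outerVar local to ret, so reading it on the right-hand side fails
        outerVar = innerArg - outerVar
        print outerVar
    return ret

def mutateFoo():
    outerVar = [123]
    def ret(innerArg):
        # Item assignment mutates the list; the name outerVar is never rebound
        outerVar[0] = innerArg - outerVar[0]
        print outerVar[0]
    return ret

# rebindFoo()(4)  # raises UnboundLocalError
mutateFoo()(4)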


-119


-----------------------------------------
# The Python data model #

We have seen above that Python is a __dynamic language__: a variable is a lexical symbol for a reference that can be bound to anything:  
a string, then a list, then a float, and so on. The interpreter will not complain.  
There is no static type-checking before the code runs; however, this does not mean that there is no type-checking at all!  
There are quite strict type checks at run time.  
  
This language feature is the tip of the iceberg that is called __the Python data model__. It is a thorough set of interfaces without the interface (abstract) classes!  
If you want the objects of your class to have a length (that is, to work with the built-in `len` function), all you need to do is implement a `__len__(self)` method.  

You can even write a class that has a length in some parts of your code and not in others, because you can assign methods to classes _at will_. This is really _a new kind of polymorphism!_
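
Assigning a protocol method to a class at run time can be sketched as follows. The `Basket` class and `basket_len` function are hypothetical and exist only for this illustration. The code uses the `print()` call form so that it should also run on Python 3.

```python
class Basket(object):
    def __init__(self, items):
        self.items = list(items)

basket = Basket(['apple', 'pear'])

# No __len__ yet, so len() is rejected at run time
try:
    len(basket)
except TypeError as err:
    print('TypeError: {}'.format(err))

# Attach the protocol method to the class after the fact
def basket_len(self):
    return len(self.items)

Basket.__len__ = basket_len
print(len(basket))       # 2, even for the instance created earlier

# Remove it again and instances of the class no longer have a length
del Basket.__len__
```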
  
__Let's see how this works in practice:__

# A Pythonic card-deck #

In [15]:
import collections

Card = collections.namedtuple('Card', ['rank', 'suit'])

class FrenchDeck(object):
    ranks = [str(n) for n in range(2, 11)] + list('JQKA')
    suits = 'spades diamonds clubs hearts'.split()

    def __init__(self):
        self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks]

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, position):
        return self._cards[position]

In [16]:
from os import linesep as endl

beer_card = Card('7', 'diamonds')
print beer_card

Card(rank='7', suit='diamonds')


In [17]:
deck = FrenchDeck()

print len(deck)
print deck[0], deck[-1]

52
Card(rank='2', suit='spades') Card(rank='A', suit='hearts')


Should we create a method to pick a random card?  
No need. Python already has a function to get a random item from a sequence: `random.choice`.   
We can just use it on a deck instance:

In [18]:
from random import choice
choice(deck)

Card(rank='6', suit='clubs')

------------------------------
<span style = "font-size: 180%; font-weight: bold; color: darkgreen">Exercise</span>  
Attribute access and assignment in the Python data model are specified by the protocol methods `__getattr__` and `__setattr__`.  
Indexing is specified by `__getitem__` and `__setitem__`.  
  
Create a dictionary with attribute-like getters and setters by simply assigning the base class's `__getitem__` and `__setitem__` to `__getattr__` and `__setattr__`.  

In [None]:
ad = AttrDict({'one': 1, 'two': 2, 3: 'three'})
print ad.one

ad.four = 4
print ad.four

# Any caveats?