# Introduction to Python II

## Intermediate Programming Constructs

### DATA 601: Fall 2019 

**Usman Alim ([ualim@ucalgary.ca](mailto:ualim@ucalgary.ca))** 

Further Reading:

* **Python for Data Analysis** (second edition), by _Wes McKinney_ (Chapter 3). ([Library link](https://ucalgary-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=01UCALG_ALMA51642853910004336&context=L&vid=UCALGARY&search_scope=EVERYTHING&tab=everything&lang=en_US) for book)
* [**The Python Tutorial**](https://docs.python.org/3/tutorial/index.html) by the Python Software Foundation.


## Outline

- [Sets and Dictionaries](#sets)
- [Tuple Packing and List Slicing](#packingAndSlicing)
- [Comprehensions](#comprehensions)
- [Anonymous Functions and Generators](#lambdas)

## <a name="sets"></a>Sets

- A set is a mathematical set, i.e. an _unordered_ collection of unique Python objects.
- Declare a set using curly braces, e.g. `{1, 2, 3}`. Empty braces (`{}`) create a dictionary, so use `set()` for an empty set.
- Set operations like unions, intersections and differences are supported.
- Sets are _mutable_.

In [4]:
# Sets and set operations

A = {0,2,4,6,8}
B = {1,3,5,7,9}

print( A | B ) # set union, can also use A.union(B)
print( A & B ) # set intersection, can also use A.intersection(B)
print("\n")

# subset, superset and disjoint sets
print(A.issubset(B))
print(A.issuperset(B))
print(A.isdisjoint(B))
print("\n")

# We cannot index sets, the following is not allowed.
# A[0]
print(list(A))


{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
set()


False
False
True


[0, 2, 4, 6, 8]


In [5]:
# Duplicates are automatically removed when constructing sets
A = set(range(10))
B = {0,2,2,4,6,8}

print(A)
print(B)
print("\n")

# sets are mutable, we can do a set operation and an assignment together 
A -= B
print(A)
A |= B
print(A)
print("\n")

# A set can be converted to a list or a tuple
X = {5,4,3,2,1}
print(tuple(X))
print(list(X))

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
{0, 2, 4, 6, 8}


{1, 3, 5, 7, 9}
{1, 3, 4, 5, 2, 7, 0, 9, 6, 8}


(1, 2, 3, 4, 5)
[1, 2, 3, 4, 5]
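A pitfall worth illustrating: empty curly braces create a dictionary, not a set. The sketch below (all variable names are illustrative) creates an empty set with `set()` and modifies it in place with `add`, `discard` and `remove`.

```python
# {} is an empty dict; set() is the way to get an empty set
empty_dict = {}
empty_set = set()
print(type(empty_dict))
print(type(empty_set))

# Sets are mutable: elements can be added and removed in place
S = set()
S.add(3)
S.add(3)      # adding a duplicate has no effect
S.add(5)
S.discard(7)  # discard() silently ignores a missing element
S.remove(3)   # remove() would raise a KeyError if 3 were missing
print(S)

# Membership is tested with the 'in' operator
print(5 in S)
```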


### Dictionaries

- A *dictionary* is a collection of *key*,  *value* pairs where both the  **key**  and the **value** are Python objects.
- A dictionary is an associative array, a key is mapped to a value.
- Declare a `dict` using curly braces (`{}`), separating each key from its value with a colon (`:`) and the pairs from each other with commas.
- Dictionaries are variable-length.
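- Keys must be _hashable_ (in practice, _immutable_ objects such as strings, numbers and tuples), while the associated values can be any object and can be reassigned.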
- Keys in a dictionary must be unique.
- Any _hashable_ object can be used as a key.
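To make the hashability requirement concrete, here is a small sketch. The dictionaries `grid` and `scores` and their contents are made up for illustration.

```python
# Tuples of immutable objects are hashable and can serve as keys
grid = {(0, 0): 'origin', (1, 0): 'east'}
print(grid[(1, 0)])

# Lists are mutable and therefore not hashable: this raises a TypeError
try:
    bad = {[0, 0]: 'origin'}
except TypeError as err:
    print("TypeError:", err)

# Values can be any object, including mutable ones such as lists
scores = {'susan': [90]}
scores['susan'].append(85)
print(scores)
```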

In [6]:
# Working with dictionaries

ages = {'susan':23, 'brian':25, 'joe':28, 'al':21}
print(ages)
print("\n")

# Keys are used for indexing
ages['susan'] = 22
print(ages)
print("\n")

# Adding, deleting and modifying key-value pairs
ages['frank'] = 30  # A new key-value pair is created
del ages['brian']
ages.update({'salim':27, 'joe':29})
print(ages)
print("\n")

# Check for membership by the key
print('alim' in ages)
print(ages.get('alim')) # Returns None
print(ages['alim']) # Raises a KeyError



{'susan': 23, 'brian': 25, 'joe': 28, 'al': 21}


{'susan': 22, 'brian': 25, 'joe': 28, 'al': 21}


{'susan': 22, 'joe': 29, 'al': 21, 'frank': 30, 'salim': 27}


False
None


KeyError: 'alim'

In [7]:
# More dictionary examples

# Iterating over elements in a dictionary
for k in ages:
    print((k, ages[k]))
print("\n")
    
# using enumerate()
mapping = {}
for i, k in enumerate(ages):
    mapping[k] = i
print(mapping)
print("\n")

# using zip()
cities = dict(zip(ages.keys(),['Calgary', 'Calgary', 'Vancouver', 'Toronto', 'Beirut']))
print(cities)

('susan', 22)
('joe', 29)
('al', 21)
('frank', 30)
('salim', 27)


{'susan': 0, 'joe': 1, 'al': 2, 'frank': 3, 'salim': 4}


{'susan': 'Calgary', 'joe': 'Calgary', 'al': 'Vancouver', 'frank': 'Toronto', 'salim': 'Beirut'}
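The loops above look up each value by its key. As a possible alternative, the `items()` method yields key-value tuples that can be unpacked directly in the loop. The dictionary `ages_demo` below is a stand-in so that the `ages` dictionary from the previous cells is left untouched.

```python
ages_demo = {'susan': 22, 'joe': 29, 'al': 21}

# items() yields (key, value) tuples that can be unpacked in the loop header
for name, age in ages_demo.items():
    print(name, age)

# keys() and values() provide views of the keys and of the values
print(list(ages_demo.keys()))
print(list(ages_demo.values()))

# get() accepts a default that is returned when the key is missing
print(ages_demo.get('alim', 0))
```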


In [8]:
# Exercise:
# Make sure the previous cell is evaluated so that we can work with
# the ages and cities dictionaries.
#
# Make a new dictionary that maps a student name (key) to a 
# 2-tuple (value) consisting of the student's age
# and the city they are from.
# Iterate over the dictionary and print out each student's information 
# in the format:
#
# name  age   city
    


## <a name="packingAndSlicing"></a>Packing and Slicing

- Python provides convenient mechanisms for packing/unpacking tuples in assignment statements, for loops and function return values.
- Lists can be sliced and diced in a number of ways to get sublists.

In [9]:
# Unpacking tuples (or lists)

# If a tuple appears on the RHS of an assignment statement and
# variables appear on the left, the tuple gets unpacked. This requires 
# that there are as many variables on the left side of the equals sign 
# as there are elements in the tuple.

tup = ('a', 'b', (1,2))
a = tup
p, q, r = tup


# Keep the first and discard the rest, or keep the last and discard the rest. An underscore is 
# conventionally used for unwanted values; a starred name collects them in a list.
x, *_ = tup 
*_, y = tup

print(a)
print(r)
print(x)
print(y)

print("\n")
print(_) 


('a', 'b', (1, 2))
(1, 2)
a
(1, 2)


['a', 'b']
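A few more unpacking patterns, sketched with illustrative variable names: swapping two variables, a starred name in the middle of the targets, unpacking a nested tuple, and the error raised when the number of targets does not match.

```python
# Swapping two variables without a temporary
left, right = 1, 2
left, right = right, left
print(left, right)

# A starred name can appear in the middle; it always collects a list
first, *middle, last = range(6)
print(first, middle, last)

# Nested tuples can be unpacked in a single statement
nested = ('a', 'b', (1, 2))
p, q, (r1, r2) = nested
print(r1, r2)

# A mismatch between targets and elements raises a ValueError
try:
    u, v = nested
except ValueError as err:
    print("ValueError:", err)
```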


In [10]:
# Tuple unpacking in for loops and functions

# We can unpack a tuple in a for loop to get access to individual
# scalars, e.g.

for x, y, z in ((0,1,2), (1,2,3), (3,4,5)):
    print("x={0}, y={1}, z={2}".format(x,y,z))
    
print("\n")
    
# Another use is to return multiple values from a function, e.g.

def fun():
    return (1, 2)
    
a = fun()
x, y = fun()

print(a)
print(x)
print(y)

    

x=0, y=1, z=2
x=1, y=2, z=3
x=3, y=4, z=5


(1, 2)
1
2


In [11]:
# The star (*) syntax is also used when defining functions that can take an
# arbitrary number of arguments.
# In a parameter list, *args packs any extra positional arguments into a tuple.

def dist2(x, y, *args):
    "Returns the squared Euclidean distance from the origin of a point in two or more dimensions"
    result = x**2 + y**2
    print(args)
    for v in args:
        result += v**2
    return result

print(dist2(1,1))
print(dist2(1,1,1))
print(dist2(1,1,1,1,1))

()
2
(1,)
3
(1, 1, 1)
5
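The star syntax also works in the other direction, at the call site, where it unpacks a sequence into positional arguments; `**` does the same for dictionaries and keyword arguments. The sketch below uses two hypothetical helpers, `sq_dist` (a quiet variant of the function above) and `describe`.

```python
def sq_dist(x, y, *args):
    "Squared Euclidean distance from the origin (no printing)"
    return x**2 + y**2 + sum(v**2 for v in args)

# Unpack a tuple into positional arguments
point = (1, 2, 2)
print(sq_dist(*point))  # same as sq_dist(1, 2, 2)

def describe(**kwargs):
    "Collects arbitrary keyword arguments into a dict and returns them sorted"
    return sorted(kwargs.items())

# Unpack a dict into keyword arguments
options = {'n': 10, 'p': 0.5}
print(describe(**options))  # same as describe(n=10, p=0.5)
```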


In [12]:
# Slicing is used to extract portions of lists (or tuples).
# Use the colon (:) operator


x = list(range(10))

# The last index is exclusive but the first is inclusive.
print(x[0:5]) 
print("\n")

# First and last can be omitted.
print(x[:5])
print(x[5:])
print("\n")

# Negative indices are used to index relative to the end
print(x[-5:-1])
print(x[-5:])
print("\n")

# We can also specify a step size with a second colon
print(x[0:10:2])
print("\n")

# Conveniently, the step can be negative
print(x[::-1])
print(x[9:5:-1])

[0, 1, 2, 3, 4]


[0, 1, 2, 3, 4]
[5, 6, 7, 8, 9]


[5, 6, 7, 8]
[5, 6, 7, 8, 9]


[0, 2, 4, 6, 8]


[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
[9, 8, 7, 6]


In [13]:
# Slicing can be used in assignments

x = list(range(10))
x[::2] = [0,0,0,0,0]
print(x)
print("\n")

# The following is not allowed. With a step other than 1, the sizes on the LHS and RHS must be the same
# x[::2] = []

# Slicing can be used to slice tuples. However, since tuples are
# immutable, we cannot do assignments.

x = tuple(range(10))
print(x[::2])
print(x[::-2])
# The following is not allowed.
# x[::2] = (1,3,5,7)

[0, 1, 0, 3, 0, 5, 0, 7, 0, 9]


(0, 2, 4, 6, 8)
(9, 7, 5, 3, 1)
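The size restriction applies only when a step is given. With a plain `start:stop` slice, an assignment may grow or shrink the list. The sketch below (the name `nums` is illustrative) shows this, along with copying by slicing and a reusable `slice` object.

```python
nums = list(range(10))

# Replace three elements by one; the list shrinks
nums[2:5] = [20]
print(nums)

# Assigning to an empty slice inserts elements
nums[1:1] = [10, 11]
print(nums)

# A full slice makes a shallow copy
copy = nums[:]
print(copy == nums, copy is nums)

# A slice object can be stored and reused
evens = slice(0, None, 2)
print(list(range(10))[evens])
```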


## <a name="comprehensions"></a>Comprehensions

- Comprehensions are a convenient (syntactically) and preferred (for efficiency reasons) way to filter items in a collection and create a new collection as a result.
- A comprehension is a filter+map operation.
- Comprehensions work with lists, sets and dicts.
- The syntax for a list comprehension is as follows:

  `[<expr> for v in collection if <condition>]` 
  
  
- Set and dict comprehensions are similar. 
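To illustrate the filter+map view, the sketch below writes the same computation twice: once as a list comprehension and once with the builtin `filter` and `map` functions. The helper functions `is_odd` and `squared` are defined here only for the comparison.

```python
def is_odd(v):
    return v % 2 == 1

def squared(v):
    return v**2

values = range(10)

# Comprehension: the 'if' clause filters, the leading expression maps
squares_of_odds = [squared(v) for v in values if is_odd(v)]

# The same result using filter() followed by map()
same = list(map(squared, filter(is_odd, values)))

print(squares_of_odds)
print(squares_of_odds == same)

# Set and dict comprehensions follow the same pattern
remainders = {v % 3 for v in values}
square_map = {v: v**2 for v in values if v < 4}
print(remainders)
print(square_map)
```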

In [14]:
# List comprehensions
odds = [v for v in list(range(20)) if v % 2 == 1]
odds2 = [v**2 for v in list(range(20)) if v % 2 == 1]
print(odds)
print(odds2)
print("\n")

# Set comprehensions are similar, just use curly braces
evens = {v for v in list(range(20)) if v % 2 == 0}
print(evens)
print("\n")

# For a dict comprehension, use a key-expr and a value-expr
# separated by a colon. The entire expression is wrapped inside 
# curly braces. For example:
Ages = {v.upper() : ages[v] for v in ages if ages[v] < 30}
print(Ages)

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
[1, 9, 25, 49, 81, 121, 169, 225, 289, 361]


{0, 2, 4, 6, 8, 10, 12, 14, 16, 18}


{'SUSAN': 22, 'JOE': 29, 'AL': 21, 'SALIM': 27}


In [15]:
# Comprehensions can also be arbitrarily nested. Two levels of nesting 
# are common but additional levels can make the code hard to read. 

my_list = [[0,1,2],[3,4,5,6],[6,7,8]]
flattened = [x for l in my_list if len(l) == 3 for x in l if 0 < x < 5]
print(flattened)

# The above is equivalent to:
flattened = []
for l in my_list:
    if len(l) == 3:
        for x in l:
            if 0 < x < 5:
                flattened.append(x)
print(flattened)
print("\n")

# A nested comprehension and a comprehension as an expression in
# another comprehension are different.
test = [[p for p in li if 0 < p < 5] for li in my_list]
print(test)



[1, 2]
[1, 2]


[[1, 2], [3, 4], []]


### Exercises:

- Write code for a comprehension that iterates over a list of 3-tuples and returns a list consisting of the maximum in each tuple.
- Write a list comprehension to determine all the divisors of a positive integer $n$.
- Use a comprehension to generate the following set: $$F = \{(i,j,k) \in \mathbb{Z}^3 \;\lvert\; -5 \le i,j,k \le 5 \text{ AND } i+j+k = 0\}$$


## <a name="lambdas"></a>Anonymous Functions (Lambdas)

- Recall that everything in Python is an object. Functions are objects too, so we can pass them around as arguments.
- We can also create functions __on-the-fly__ where a function is expected as an argument. These functions have no name and are not accessible from anywhere else.
- Use the `lambda` keyword to define an anonymous function. An anonymous function consists of a single expression, the result of which is returned. It can have any number of parameters.

In [16]:
# A function as an argument

def apply_to_list( f, l ):
    return [f(v) for v in l]
    
def square(x):
    return x**2

print(apply_to_list(square, list(range(10))))
print("\n")

# The above can be achieved with an anonymous function
print(apply_to_list( lambda x: x**2 , list(range(10))))

print([(lambda y: y**2)(x) for x in range(10)])
    

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


In [17]:
# Function currying

# Currying (more precisely, partial application) is a way to derive a new function
# from an existing function by fixing some of its arguments. For example:

def adder(x,y):
    return x + y

adder_ten = lambda x: adder(x,10)

print(adder_ten(5))
print("\n")

# The builtin 'sorted' function optionally takes a key function
# that computes the value by which each element is compared.

tab = [(i,j) for i in range(3) for j in range(4)]
print(tab)
print("\n")
print(sorted(tab, key = lambda x: x[0])) # sort based on first component
print(sorted(tab, key = lambda x: x[1])) # sort based on second component


15


[(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2), (2, 3)]


[(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2), (2, 3)]
[(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2), (0, 3), (1, 3), (2, 3)]
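As an alternative to writing a lambda by hand, the standard library's `functools.partial` can fix some arguments of an existing function. The sketch below repeats the `adder` example with it and also shows a key function that returns a tuple to sort on more than one component. The names `add_ten`, `add_ten_kw` and `by_second_then_first_desc` are illustrative.

```python
from functools import partial

def adder(x, y):
    return x + y

# Fix the first positional argument (x = 10)
add_ten = partial(adder, 10)
print(add_ten(5))

# Arguments can also be fixed by keyword (y = 10)
add_ten_kw = partial(adder, y=10)
print(add_ten_kw(5))

# A key function may return a tuple: sort by the second component
# ascending, breaking ties by the first component descending
tab = [(i, j) for i in range(3) for j in range(4)]
by_second_then_first_desc = sorted(tab, key=lambda t: (t[1], -t[0]))
print(by_second_then_first_desc)
```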


## Generators and Generator Expressions

- Python allows you to iterate over objects that are _iterable_. Tuples, Lists, Sets and Dicts are all iterable.
- A _generator_ is a way to construct iterable objects in Python so that they can be used inside the context of a `for` loop.
- Defining a generator is similar to defining a function. Use `yield` instead of `return` to yield the next item in the iteration. The generator will remember the state of the iterator between subsequent calls. 
- _Generator expressions_ provide syntactic sugar for creating generators. Their syntax is similar to comprehensions. 
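Before looking at a larger example, here is a minimal sketch of how a generator keeps its state between calls. The generator `countdown` is a made-up example. The builtin `next` advances it one step at a time.

```python
def countdown(n):
    "A hypothetical generator that yields n, n-1, ..., 1"
    while n > 0:
        yield n   # execution pauses here until the next value is requested
        n -= 1

gen = countdown(3)
print(next(gen))  # first value
print(next(gen))  # resumes where it left off

# The remaining values can be collected into a list
print(list(gen))

# A generator can only be consumed once; it is now exhausted
print(list(gen))

# Calling the generator function again gives a fresh generator
for v in countdown(3):
    print(v)
```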


In [18]:
# Suppose we want to iterate over the Fibonacci numbers

def fib_itr(n=10):
    "A Fibonacci number generator that generates up to the n-th Fibonacci number"
    prev = 1
    prevprev = 1
    i = -1
    while i < n:
        i = i+1
        if i == 0:
            yield prevprev
        elif i == 1:
            yield prev
        else:
            next_num = prev + prevprev
            yield next_num
            prevprev = prev
            prev = next_num


In [19]:
# The generated values can be converted to a sequential type, e.g. 
print([*fib_itr()]) # can use unpacking operator

#print( list(fib_itr()) ) # this syntax can also be used


# Builtin functions like 'max', 'min' and 'sum' can take a generator as an argument
print(min(fib_itr(10)))
print(max(fib_itr(10)))
print(sum(fib_itr(10)))
print("\n")

# Generator expressions are similar to comprehensions except that 
# they yield a generator as a result
fib10 = (f for f in fib_itr(10))
print(type(fib10))
print(list(fib10))

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
1
89
232


<class 'generator'>
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
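Because generators produce values lazily, they can also describe sequences with no fixed end. The sketch below assumes a hypothetical variant, `fib_forever`, that never stops on its own, and uses `itertools.islice` from the standard library to take a finite prefix. It also shows a generator expression passed directly to `sum` without building an intermediate list.

```python
from itertools import islice

def fib_forever():
    "A hypothetical unbounded Fibonacci generator"
    prev, curr = 1, 1
    while True:
        yield prev
        prev, curr = curr, prev + curr

# islice() lazily takes the first 10 items from the unbounded generator
first_ten = list(islice(fib_forever(), 10))
print(first_ten)

# A generator expression as the sole argument needs no extra parentheses
total = sum(v**2 for v in range(1000))
print(total)
```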


### Exercise


- Suppose $X$ is distributed according to the binomial distribution, i.e. the probability of getting $k$ successes in $n$ trials is given by: 
$$\mathsf{Pr}(X=k) = \frac{n!}{(n-k)! k!} p^k (1-p)^{n-k},$$
where $p \in [0,1]$ represents the probability of success for one trial. The cumulative distribution function (CDF) is given by:
$$
\mathsf{Pr}(X \le k) = \sum_{i=0}^k \frac{n!}{(n-i)! i!} p^i (1-p)^{n-i}.
$$
Write a generator to compute the CDF of the binomial distribution. The generator should take $n$ and $p$ as parameters.