# Introduction to Python and Natural Language Technologies

# Lecture 04, Week 04

### February 28, 2018

# List comprehension

- transform any iterable into a list in one line
- syntactic sugar
- example: create a list of the first N odd numbers starting from 1

In [None]:
l = []
for i in range(10):
    l.append(2*i+1)
l

one-liner equivalent

In [None]:
l = [2*i+1 for i in range(10)]
l

## The general form of list comprehension is

~~~
[<expression> for <element> in <sequence>]
~~~

conditional expressions can be added to filter the sequence:

~~~
[<expression> for <element> in <sequence> if <condition>]
~~~

In [None]:
even = [n*n for n in range(20) if n % 2 == 0]
even

which is equivalent to

In [None]:
even = []
for n in range(20):
    if n % 2 == 0:
        even.append(n*n)
even

- since this expression implements a filtering mechanism, there is no `else` clause

- an if-else clause can be used as the first expression though:

In [None]:
l = [1, 0, -2, 3, -1, -5, 0]

signum_l = [int(n / abs(n)) if n != 0 else 0 for n in l]
signum_l

More than one sequence may be traversed. Is this depth-first or breadth-first traversal?

In [None]:
l1 = [1, 2, 3]
l2 = [4, 5, 6]

[(i, j) for i in l1 for j in l2]

In [None]:
[(i, j) for j in l2 for i in l1]
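
A minimal sketch of the question above: the two-`for` comprehension can be unrolled into nested loops, where the leftmost `for` is the outer loop. The name `pairs` is only used for illustration.

```python
l1 = [1, 2, 3]
l2 = [4, 5, 6]

pairs = []
for i in l1:        # leftmost `for` of the comprehension: outer loop
    for j in l2:    # next `for`: inner loop, varies fastest
        pairs.append((i, j))

# True: the loops build the same list as the comprehension
print(pairs == [(i, j) for i in l1 for j in l2])
```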

List comprehensions may be nested by replacing the first expression with another list comprehension:

In [None]:
matrix = [
    [1, 2, 3],
    [5, 6, 7]
]

[[e*e for e in row] for row in matrix]
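
For comparison, a small sketch of the difference between a nested comprehension and a single comprehension with two `for` clauses: the first keeps the row structure, the second flattens it. The names `squared_rows` and `squared_flat` are illustrative.

```python
matrix = [
    [1, 2, 3],
    [5, 6, 7],
]

# nested comprehension: one inner list per row
squared_rows = [[e*e for e in row] for row in matrix]

# two `for` clauses in one comprehension: a single flat list
squared_flat = [e*e for row in matrix for e in row]

print(squared_rows)
print(squared_flat)
```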

## What is the type of a (list) comprehension?

In [None]:
i = (i for i in range(10))
type(i)

# Generator expressions

Generator expressions are a generalization of list comprehension. They were introduced in PEP 289 in 2002.

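Compare the running time and memory consumption of these two cells. Note that `%%time` reports only the time; the memory usage of the Python process has to be watched separately.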

In [None]:
%%time
N = 8
s = sum([i*2 for i in range(int(10**N))])
print(s)

In [None]:
%%time
s = sum(i*2 for i in range(int(10**N)))
print(s)
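
One rough way to see the difference without an external monitor is `sys.getsizeof`. This is only a sketch: `getsizeof` reports the size of the container object itself, not of the integers it refers to, and the exact numbers depend on the Python build. A smaller `N` is assumed here so that the cell runs quickly.

```python
import sys

N = 10**5
as_list = [i*2 for i in range(N)]        # all elements exist at once
as_generator = (i*2 for i in range(N))   # elements are produced on demand

list_size = sys.getsizeof(as_list)
generator_size = sys.getsizeof(as_generator)
print(list_size, generator_size)

# both give the same sum
print(sum(as_list) == sum(as_generator))
```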

Generators do not generate a list in memory

In [None]:
even_numbers = (2*n for n in range(10))
even_numbers

therefore they can only be traversed once

In [None]:
for num in even_numbers:
    print(num)

the generator is empty after the first run

In [None]:
for num in even_numbers:
    print(num)

calling `next()` raises a `StopIteration` exception

In [None]:
# next(even_numbers)  # raises StopIteration
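
A short sketch of the same behaviour without a `for` loop. The built-in `next` also accepts a default value that is returned instead of raising `StopIteration`; the variable names are illustrative.

```python
even_numbers = (2*n for n in range(3))

first = next(even_numbers)     # consumes one element
rest = list(even_numbers)      # consumes everything that is left

try:
    next(even_numbers)
    exhausted = False
except StopIteration:
    exhausted = True

# with a default, next() does not raise on an exhausted generator
fallback = next(even_numbers, None)
print(first, rest, exhausted, fallback)
```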

these are actually the defining properties of the **iteration protocol**

# Iteration protocol

A class satisfies the iteration protocol if:

1. it has an `__iter__` method that returns an iterator, which
2. has a `__next__` method (called `next` in Python 2) that
3. raises `StopIteration` once there are no more elements

For loops use the iteration protocol.

In [None]:
class MyIterator:
    def __init__(self):
        self.iter_no = 5
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.iter_no <= 0:
            raise StopIteration()
        self.iter_no -= 1
        print("Returning {}".format(self.iter_no))
        return self.iter_no
    
myiter = MyIterator()

for i in myiter:
    print(i)
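
What a `for` loop does with the protocol can be sketched by hand with the built-in `iter` and `next` functions. This is an approximation of the loop's behaviour, not the interpreter's actual implementation.

```python
items = ["a", "b", "c"]

iterator = iter(items)            # calls items.__iter__()
collected = []
while True:
    try:
        element = next(iterator)  # calls iterator.__next__()
    except StopIteration:         # the loop ends silently here
        break
    collected.append(element)

print(collected)
```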

# Set and dict comprehension

Sets and dictionaries can be instantiated via generator expressions too.

A generator expression between curly brackets instantiates a set:

In [None]:
fruit_list = ["apple", "plum", "apple", "pear"]

fruits = {fruit.title() for fruit in fruit_list}

type(fruits), len(fruits), fruits

if the expression in the generator is a key-value pair separated by a colon, it instantiates a dictionary:

In [None]:
word_list = ["apple", "plum", "pear"]
word_length = {word: len(word) for word in word_list}
type(word_length), len(word_length), word_length

In [None]:
word_list = ["apple", "plum", "pear", "avocado"]
first_letters = {word[0]: word for word in word_list}
first_letters
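
In the cell above, "apple" and "avocado" share the key `"a"`, so the later word overwrites the earlier one. A possible sketch that keeps every word, combining a dict, a set and a list comprehension (the name `by_letter` is illustrative):

```python
word_list = ["apple", "plum", "pear", "avocado"]

# duplicate keys: the last value wins
last_wins = {word[0]: word for word in word_list}

# one list of words per first letter
by_letter = {
    letter: [word for word in word_list if word[0] == letter]
    for letter in {word[0] for word in word_list}
}

print(last_wins)
print(by_letter)
```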

# `yield` keyword

- if a function uses `yield` instead of `return`, it becomes a **generator function**
- `yield` hands a value back to the caller and suspends the function
- the generator function resumes right after the `yield` when the next element is requested

In [None]:
def hungarian_vowels():
    alphabet = ("a", "á", "e", "é", "i", "í", "o", "ó",
                "ö", "ő", "u", "ú", "ü", "ű")
    for vowel in alphabet:
        yield vowel

this function returns a generator object

In [None]:
type(hungarian_vowels())

In [None]:
for vowel in hungarian_vowels():
    print(vowel)

In [None]:
gen = hungarian_vowels()

print("first iteration: {}".format(", ".join(gen)))
print("second iteration: {}".format(", ".join(gen)))

The `next` function returns the next element of the generator.
A `StopIteration` is raised when no more elements are left:

In [None]:
gen = hungarian_vowels()

while True:
    try:
        print("The next element is {}".format(next(gen)))
    except StopIteration:
        print("No more elements left :(")
        break
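
To make the suspend-and-resume behaviour of `yield` visible, here is a small sketch that records what the generator function does in a list. The function `countdown` and the list `log` are hypothetical names used only for this illustration.

```python
log = []

def countdown(n):
    log.append("started")
    while n > 0:
        yield n                     # suspend here, hand n to the caller
        log.append("resumed after {}".format(n))
        n -= 1
    log.append("finished")

gen = countdown(2)
nothing_ran_yet = (log == [])       # the body does not run until next()

first = next(gen)                   # runs up to the first yield
second = next(gen)                  # resumes, runs up to the second yield
last = next(gen, None)              # resumes, the function body ends

print(first, second, last)
print(log)
```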

# Exercises

Generator expressions can be particularly useful for formatted output. We will demonstrate this through a few examples.

In [None]:
numbers = [1, -2, 3, 1]

# print(", ".join(numbers))  # raises TypeError
print(", ".join(str(number) for number in numbers))

In [None]:
shopping_list = ["apple", "plum", "pear"]

~~~
The shopping list is:
item 1: apple
item 2: plum
item 3: pear
~~~

In [None]:
shopping_list = ["apple", "plum", "pear"]

print("The shopping list is:\n{0}".format(
    "\n".join(
        "item {0}: {1}".format(i+1, item)
        for i, item in enumerate(shopping_list)
    )
))

In [None]:
shopping_list = ["apple", "plum", "pear"]

for i, item in enumerate(shopping_list):
    print("item {}: {}".format(i+1, item))

## Q. Print the following shopping list with quantities.

For example:

~~~
item 1: apple, quantity: 2
item 2: pear, quantity: 1
~~~

In [None]:
shopping_list = {
    "apple": 2,
    "pear": 1,
    "plum": 5,
}
print("\n".join(
    "item {0}: {1}, quantity: {2}".format(i+1, item, quantity)
    for i, (item, quantity) in enumerate(shopping_list.items())
))

## Q. Print the same format in alphabetical order.

- Variation: decreasing order by quantity (this is what the next cell implements)

In [None]:
shopping_list = {
    "apple": 2,
    "pear": 1,
    "plum": 5,
}
print("\n".join(
    "item {0}: {1}, quantity: {2}".format(i+1, item, quantity)
    for i, (item, quantity) in 
    enumerate(
        sorted(shopping_list.items(),
               key=lambda x: x[1], reverse=True)
)))
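
For the alphabetical version, one possible sketch is to sort the `(item, quantity)` pairs directly, since tuples compare by their first element first. The dictionary is deliberately written in non-alphabetical order here, and `lines` is an illustrative name.

```python
shopping_list = {
    "plum": 5,
    "apple": 2,
    "pear": 1,
}

lines = [
    "item {0}: {1}, quantity: {2}".format(i+1, item, quantity)
    for i, (item, quantity) in enumerate(sorted(shopping_list.items()))
]
print("\n".join(lines))
```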

## Q. Print the list of students. 

In [None]:
students = [
    ["Joe", "John", "Mary"],
    ["Tina", "Tony", "Jeff", "Béla"],
    ["Pete", "Dave"],
]

## Q. Print one class-per-line and print the size of the class too

Example:
~~~
class 1, size: 3, students: Joe, John, Mary
class 2, size: 2, students: Pete, Dave
~~~
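
One possible solution sketch, not the only one: a generator-style comprehension over `enumerate`, with `len` for the class size and `", ".join` for the names. Unlike the shortened example above, it prints all three classes.

```python
students = [
    ["Joe", "John", "Mary"],
    ["Tina", "Tony", "Jeff", "Béla"],
    ["Pete", "Dave"],
]

lines = [
    "class {0}, size: {1}, students: {2}".format(
        i+1, len(cls), ", ".join(cls))
    for i, cls in enumerate(students)
]
print("\n".join(lines))
```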

## Q. Sort the classes by size in increasing order

Example:
~~~
class 1, size: 2, students: Pete, Dave
class 2, size: 3, students: Joe, John, Mary
~~~
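
A possible solution sketch for the sorted variant: `sorted` with `key=len` orders the classes by size before they are numbered. The name `lines` is illustrative.

```python
students = [
    ["Joe", "John", "Mary"],
    ["Tina", "Tony", "Jeff", "Béla"],
    ["Pete", "Dave"],
]

lines = [
    "class {0}, size: {1}, students: {2}".format(
        i+1, len(cls), ", ".join(cls))
    for i, cls in enumerate(sorted(students, key=len))
]
print("\n".join(lines))
```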

# Exception handling

- an `except` clause can name the exception type it handles and bind the exception object with `as`

In [None]:
try:
    int("abc")
except ValueError as e:
    print(type(e), e)
    print(e)

- more than one `except` clause may be defined
- they should be ordered from most specific to least specific

In [None]:
try:
    age = int(input())
    if age < 0:
        raise Exception("Age cannot be negative")
except ValueError as e:
    print("ValueError caught")
except Exception as e:
    print("Other exception caught: {}".format(type(e)))

### More than one type of exception can be handled in the same except clause

In [None]:
def age_printer(age):
    next_age = age + 1
    print("Next year your age will be " + next_age)
    
try:
    your_age = input()
    your_age = int(your_age)
    age_printer(your_age)
except ValueError:
    print("ValueError caught")
except TypeError:
    print("TypeError caught")

In [None]:
def age_printer(age):
    next_age = age + 1
    print("Next year your age will be " + next_age)
    
try:
    your_age = input()
    your_age = int(your_age)
    age_printer(your_age)
except (ValueError, TypeError) as e:
    print("{} caught".format(type(e).__name__))

### except without an Exception type

- without specifying a type, `except` catches everything (even `KeyboardInterrupt` and `SystemExit`), and the exception object is not bound to a name

In [None]:
try:
    age = int(input())
    if age < 0:
        raise Exception("Age cannot be negative")
except ValueError:
    print("ValueError caught")
except:
#except Exception as e:
    print("Something else caught")

- the empty `except` must be the last `except` block because it would shadow all the others
- placing it earlier is a `SyntaxError`

In [None]:
try:
    age = int(input())
    if age < 0:
        raise Exception("Age cannot be negative")
#except:
    #print("Something else caught")
except ValueError:
    print("ValueError caught")

### An except clause for a base class catches derived classes too

In [None]:
try:
    age = int(input())
    if age < 0:
        raise Exception("Age cannot be negative")
except Exception as e:
    print("Exception caught: {}".format(type(e)))
except ValueError:
    print("ValueError caught")
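
The same effect without keyboard input, as a sketch: because `ValueError` is a subclass of `Exception`, the first clause always wins and the second one can never run. The helper `classify` is a hypothetical name.

```python
def classify(text):
    try:
        int(text)
    except Exception as e:      # base class: matches ValueError too
        return "Exception clause caught {}".format(type(e).__name__)
    except ValueError:          # never reached
        return "ValueError clause"
    return "no exception"

print(issubclass(ValueError, Exception))
print(classify("abc"))
print(classify("12"))
```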

### finally

- the `finally` block is guaranteed to run regardless of whether an exception was raised

In [None]:
try:
    age = int(input())
except Exception as e:
    print(type(e), e)
finally:
    print("this always runs")

### else

- try-except blocks may have an else clause that **only** runs if no exception was raised

In [None]:
try:
    age = int(input())
except ValueError as e:
    print("Exception", e)
else:
    print("No exception was raised")
finally:
    print("this always runs")
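
The order in which the clauses run can be sketched by recording each step in a list. The helper `parse_age` is a hypothetical function that takes a string instead of reading from `input()`.

```python
def parse_age(text):
    steps = []
    try:
        steps.append("try")
        int(text)
    except ValueError:
        steps.append("except")    # only when int() fails
    else:
        steps.append("else")      # only when nothing was raised
    finally:
        steps.append("finally")   # in both cases
    return steps

print(parse_age("42"))
print(parse_age("abc"))
```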

### `raise` keyword

- `raise` throws/raises an exception
- an empty `raise` in an `except` block re-raises the exception that is being handled

In [None]:
try:
    int("not a number")
except Exception:
    # raise
    pass
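
A typical use of the empty `raise`, as a sketch: do something with the exception (here, record its message) and then let it propagate to the caller unchanged. The names `parse_int` and `errors` are illustrative.

```python
errors = []

def parse_int(text):
    try:
        return int(text)
    except ValueError as e:
        errors.append(str(e))   # record the problem...
        raise                   # ...and re-raise the same exception

try:
    parse_int("not a number")
    propagated = False
except ValueError:
    propagated = True

print(propagated, len(errors))
```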

### Defining exceptions

- any type that subclasses `Exception` (`BaseException` to be exact) can be used as an exception object

In [None]:
class NegativeAgeError(Exception):
    pass

try:
    age = int(input())
    if age < 0:
        raise NegativeAgeError("Age cannot be negative. Invalid age: {}".format(age))
except NegativeAgeError as e:
    print(e)
except Exception as e:
    print("Something else happened. Caught {}, with message {}".format(type(e), e))

Using exceptions for trial and error is considered Pythonic:

In [None]:
try:
    int(input())
except ValueError:
    print("not an int")
else:
    print("looks like an int")

# Context managers

- there are two types of resources: managed and unmanaged

## Managed resources

- resource acquisition and release are automatically done
- no need for manual resource management
- example: memory
  - C++ has both managed and unmanaged memory. The stack is managed, but the heap is not, so we need to call `new` and `delete` manually.

## Unmanaged resources

- unmanaged resources need explicit release
- otherwise the operating system may run out of the resource
- examples include files, network sockets

In [None]:
fh = []
while True:
    try:
        fh.append(open("abc.txt", "w"))
    except OSError:
        break
len(fh)

In [None]:
for f in fh:
    f.close()

- we need to close the file manually
- what happens when an exception occurs?

In [None]:
s1 = "important text"
fh = open("file.txt", "w")
# fh.write(s2)  # raises NameError
fh.close()

- if `write` raises an exception, `close` is never reached and the file descriptor **is leaked**
- a solution would be to use try-except blocks with `finally` clauses

In [None]:
from sys import stderr

fh = open("file.txt", "w")
try:
    fh.write(important_variable)
except Exception as e:
    stderr.write("{0} happened".format(type(e).__name__))
finally:
    print("Closing file")
    fh.close()

## Context managers handle this automatically

- the `with` statement acquires a resource
- keeps it open until execution leaves the `with` block
- releases the resource regardless of whether an exception is raised

In [None]:
with open("file.txt", "w") as fh:
    fh.write("abc\n")
    # fh.write(important_variable)  # raises NameError

## Defining context managers

- any class can be a context manager if it implements:
  1. `__enter__`: runs at the beginning of the `with`. Returns the resource.
  1. `__exit__`: runs after the with block. Releases the resource.

In [None]:
class DummyContextManager:
    def __init__(self, value):
        self.value = value
        
    def __enter__(self):
        print("Dummy resource acquired")
        return self.value
    
    def __exit__(self, *args):
        print("Dummy resource released")
        
with DummyContextManager(42) as d:
    print("Resource: {}".format(d))

`__exit__` takes 3 extra arguments that describe the exception: `exc_type`, `exc_value`, `traceback`. All three are `None` if no exception was raised.

In [None]:
class DummyContextManager:
    def __init__(self, value):
        self.value = value
        
    def __enter__(self):
        print("Dummy resource acquired")
        return self.value
    
    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            print("{0} with value {1} caught\nTraceback: {2}".format(exc_type, exc_value, traceback))
        print("Dummy resource released")
        
with DummyContextManager(42) as d:
    print(d)
    # raise ValueError("just because I can")  # __exit__ will be called anyway
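
As an alternative to writing a class, the standard library's `contextlib.contextmanager` decorator turns a generator function into a context manager. This is a sketch of the same dummy resource, not part of the original lecture code; `dummy_resource` and `events` are illustrative names.

```python
from contextlib import contextmanager

events = []

@contextmanager
def dummy_resource(value):
    events.append("acquired")       # plays the role of __enter__
    try:
        yield value                 # the value bound by `as`
    finally:
        events.append("released")   # plays the role of __exit__

with dummy_resource(42) as d:
    events.append("using {}".format(d))

try:
    with dummy_resource(1):
        raise ValueError("just because I can")
except ValueError:
    events.append("ValueError propagated")

print(events)
```

The `try`/`finally` around the `yield` is what guarantees the release step when the body of the `with` block raises.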