# 3.2 more language features

# 1 overview

1. Our advice is to **skip this lecture on first pass**, unless we have a burning desire to read it.
   - It's here
     - as a **reference**, so we can link back to it when required, and
     - for those who have worked through a number of applications, and now want to learn more about the Python language.
   - A variety of topics are treated in the lecture, including 
     - **generators**, 
     -  exceptions and 
     - descriptors.

# 2 iterables and iterators

1. We've already said something in lecture 2.4 about iterating in Python.
   - Now let's look more closely at how it all works, focusing on Python's implementation of the `for` loop.

## 2.1 iterators

1. Iterators are a **uniform interface** to stepping through elements in a collection: every iterator is advanced in the same way, whatever kind of collection it comes from.
   - Here, we will talk about using iterators
     - later we will learn how to build our own.
2. Formally, an **iterator** is an object with **a `__next__` method**.
   - e.g., **file objects** are iterators.
     - To see this, let's have another look at the US cities data, which is written to the present working directory in Program 1.
     - We see that file objects do indeed have a `__next__` method, and that calling this method returns the next line in the file, see Program 2-3.
       - The `__next__` method can also be accessed via the built-in function `next()`, which directly calls this method, see Program 4.
   - The **objects returned by `enumerate()`** are also iterators, see Program 5-6.
   - As are the **reader objects from the `csv` module**, see Program 7-9.
     - Let's create a small csv file that contains data from the NIKKEI index.

In [20]:
%%file us_cities.txt    
new york: 8244910
los angeles: 3819702
chicago: 2707120
houston: 2145146
philadelphia: 1536471
phoenix: 1469471
san antonio: 1359758
san diego: 1326179
dallas: 1223229
# Program 1

Overwriting us_cities.txt


In [21]:
# Program 2

f = open('us_cities.txt')
f.__next__()

'new york: 8244910\n'

In [22]:
# Program 3

f.__next__()

'los angeles: 3819702\n'

In [23]:
# Program 4

next(f)

'chicago: 2707120\n'

In [24]:
# Program 5

e = enumerate(['foo', 'bar'])
next(e)

(0, 'foo')

In [25]:
# Program 6

next(e)

(1, 'bar')

In [26]:
%%file test_table.csv
Date,Open,High,Low,Close,Volume,Adj Close
2009-05-21,9280.35,9286.35,9189.92,9264.15,133200,9264.15
2009-05-20,9372.72,9399.40,9311.61,9344.64,143200,9344.64
2009-05-19,9172.56,9326.75,9166.97,9290.29,167000,9290.29
2009-05-18,9167.05,9167.82,8997.74,9038.69,147800,9038.69
2009-05-15,9150.21,9272.08,9140.90,9265.02,172000,9265.02
2009-05-14,9212.30,9223.77,9052.41,9093.73,169400,9093.73
2009-05-13,9305.79,9379.47,9278.89,9340.49,176000,9340.49
2009-05-12,9358.25,9389.61,9298.61,9298.61,188400,9298.61
2009-05-11,9460.72,9503.91,9342.75,9451.98,230800,9451.98
2009-05-08,9351.40,9464.43,9349.57,9432.83,220200,9432.83

# Program 7

Overwriting test_table.csv


In [27]:
# Program 8

from csv import reader

f = open('test_table.csv', 'r')
nikkei_data = reader(f)
next(nikkei_data)

['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']

In [28]:
# Program 9

next(nikkei_data)

['2009-05-21', '9280.35', '9286.35', '9189.92', '9264.15', '133200', '9264.15']
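
The same behavior can be checked without writing any files. The sketch below is only an illustration: it uses `io.StringIO` as an in-memory stand-in for the `us_cities.txt` file (an assumption made here so that the example is self-contained), and shows that calling `__next__` directly and calling `next()` advance the same iterator.

```python
import io

# In-memory stand-in for us_cities.txt (an assumption, for illustration only)
f = io.StringIO("new york: 8244910\nlos angeles: 3819702\n")

has_next = hasattr(f, '__next__')   # file-like objects are iterators
first = f.__next__()                # call the method directly
second = next(f)                    # next() calls the same method

print(has_next, repr(first), repr(second))
```

Each call consumes one line, so the second call picks up where the first one stopped.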

## 2.2 iterators in `for` loops

1. All iterators can be placed to the right of the `in` keyword in `for` loop statements.
   - In fact, this is how the `for` loop works:
     - if we write Program 10, then the interpreter
       - calls `iterator.__next__()` and binds `x` to the result,
       - executes the code block,
       - repeats until a `StopIteration` error occurs.
   - Now we know how the magical looking syntax in Program 11 works.
     - The interpreter just keeps
       - calling `f.__next__()` and binding `line` to the result,
       - executing the body of the loop.
     - This continues until a `StopIteration` error occurs.

In [None]:
# Program 10
for x in iterator:
    <code block>

In [None]:
# Program 11
f = open('somefile.txt', 'r')
for line in f:
    pass  # do something with line
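
The steps listed above can be written out by hand. The following is a minimal sketch, not how the interpreter is actually implemented: `manual_for` is a hypothetical helper that mimics `for x in iterator: action(x)` using only `next()` and a `StopIteration` handler.

```python
def manual_for(iterator, action):
    """Mimic `for x in iterator: action(x)` (hypothetical helper)."""
    while True:
        try:
            x = next(iterator)     # calls iterator.__next__() and binds x
        except StopIteration:      # raised once the iterator is exhausted
            break
        action(x)                  # the body of the loop


collected = []
manual_for(enumerate(['foo', 'bar']), collected.append)
print(collected)
```

The loop ends quietly when `StopIteration` is raised, which is why we never see that error when using an ordinary `for` loop.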

## 2.3 iterables

1. We already know that we can put a Python list to the right of `in` in a `for` loop, e.g. see Program 12.
   - So does that mean that a list is an iterator?
     - The answer is no, see Program 13-14, due to the definition of iterators in 2.1.
   - So why can we iterate over a list in a `for` loop?
     - The reason is that a list is iterable (as opposed to an iterator).
2. Formally, an object is **iterable** if it can be **converted to an iterator** using the **built-in function `iter()`**.
   - **Lists** are one such object, see Program 15-19.
   - Many other objects are iterable, such as **dictionaries and tuples**.
   - But **not all objects are iterable**, see Program 20.
3. To conclude our discussion of `for` loops
   - `for` loops work on either **iterators or iterables**,
   - in the second case (e.g., the list in Program 12), the iterable is converted into an iterator via `iter()` before the loop starts.

In [29]:
# Program 12

for i in ['spam', 'eggs']:
    print(i)

spam
eggs


In [30]:
# Program 13

x = ['foo', 'bar']
type(x)

list

In [31]:
# Program 14
next(x)

TypeError: 'list' object is not an iterator

In [36]:
# Program 15

x = ['foo', 'bar']
type(x)

list

In [38]:
# Program 16

y = iter(x)
type(y)

list_iterator

In [39]:
# Program 17

next(y)

'foo'

In [40]:
# Program 18

next(y)

'bar'

In [41]:
# Program 19

next(y)

StopIteration: 

In [42]:
# Program 20

iter(42)

TypeError: 'int' object is not iterable
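
To illustrate the claim that dictionaries and tuples are iterable too, here is a small sketch. The helper `is_iterable` is hypothetical (it is not a built-in); it simply reports whether `iter()` succeeds on an object.

```python
pairs = {'a': 1, 'b': 2}
point = (3, 4)

dict_it = iter(pairs)     # iterating over a dictionary yields its keys
tuple_it = iter(point)

keys = [next(dict_it), next(dict_it)]
coords = [next(tuple_it), next(tuple_it)]


def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


print(keys, coords, is_iterable([1, 2]), is_iterable(42))
```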

## 2.4 iterables and built-ins

1. Some **built-in functions** that act on **sequences** also work with **iterables**.
   - `max()`, `min()`, `sum()`, `all()`, `any()`.
     - e.g. see Programs 21-23.
   - One thing to remember about iterators is that they are depleted by use (they are disposable), see Program 24-25.

In [43]:
# Program 21

x = [10, -10]
max(x)

10

In [44]:
# Program 22

y = iter(x)
type(y)

list_iterator

In [45]:
# Program 23

max(y)

10

In [46]:
# Program 24

x = [10, -10]
y = iter(x)
max(y)

10

In [48]:
# Program 25

max(y)

ValueError: max() arg is an empty sequence
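
Depletion is easy to see by consuming the same iterator twice. This is a small sketch along the lines of Program 24-25; the variable names `first_pass`, `second_pass` and `fresh` are chosen here for illustration.

```python
x = [10, -10]
y = iter(x)

first_pass = list(y)      # consumes every element of y
second_pass = list(y)     # y is now empty, so nothing is left to collect
fresh = max(iter(x))      # a new iterator over the same list works fine

print(first_pass, second_pass, fresh)
```

The list `x` itself is not depleted; only the iterator `y` is. Calling `iter(x)` again gives a fresh iterator.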

# 3 names and name resolution



## 3.1 variable names in python

1. Consider the Python statement in Program 1.
   - We now know that when this statement is executed, Python creates an object of type `int` in our computer's memory, containing
     - the value `42`,
     - some associated attributes.
   - But what is `x` itself?
2. In Python, `x` is called a **name**, and the statement `x=42` **binds** the name `x` to the integer object we have just discussed.
   - Under the hood, this **process of binding names to objects** is implemented as a **dictionary**.
   - There is no problem binding two or more names to the one object, regardless of what that object is, see Program 2-3.
     - In the first step, a function object is created, and the name `f` is bound to it.
     - After binding the name `g` to the same object, we can use it anywhere we would use `f`.
   - What happens when the number of names bound to an object goes to zero?
     - Here's an example of this situation, where the name `x` is first bound to one object and then rebound to another, see Program 4-5.
       - What happens here is that the first object (the string `'foo'` from Program 4) can be garbage collected.
       - In other words, the memory slot that stores that object is deallocated and made available for reuse. The different `id` values in Program 4-5 show that `x` now refers to a different object.

In [52]:
# Program 1

x = 42

In [53]:
# Program 2: create a function called f that prints any string it's called with

def f(string):
    print(string)
    
g = f
id(g) == id(f)

True

In [54]:
# Program 3

g('test')

test


In [89]:
# Program 4

x = 'foo'
id(x)

140553208819824

In [59]:
# Program 5: x is rebound, so no names remain bound to the object from Program 4

x = 'bar'
id(x)

140553208819760
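
The difference between a name and the object it is bound to can also be seen by deleting a name. This is a minimal sketch; `greet` and `alias` are names made up for this illustration.

```python
def greet(string):
    return 'hello ' + string


alias = greet                  # a second name bound to the same function object
same_object = alias is greet   # both names refer to one object

del greet                      # removes the name, not the object
result = alias('world')        # the object is still reachable through alias

print(same_object, result)
```

The function object survives `del greet` because one name, `alias`, is still bound to it.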

## 3.2 namespaces

1. Recall from the preceding discussion that the statement in Program 1 binds the name `x` to the integer object on the right-hand side.
   - We also mentioned that this process of binding `x` to the correct object is implemented as a dictionary.
     - The dictionary is called a **namespace**.
     
2. **Definition:** A **namespace** is a [symbol table](https://en.wikipedia.org/wiki/Symbol_table) that maps names to objects in memory.
   - Python uses multiple namespaces, creating them on the fly as necessary.
     - e.g., every time we import a module, Python creates a namespace for that module.
       - To see this in action, suppose we write a script `math2.py` with a single line, see Program 6.
       - Now we start the **Python interpreter** and import it, see Program 7.
       - We also import the standard `math` module, see Program 8.
       - Both of these modules (`math` and `math2`) have an attribute called `pi`, see Program 9-10.
         - These two different bindings of `pi` exist in different namespaces, each one implemented as a dictionary.
   - We can look at the dictionary directly, using `module_name.__dict__` (Program 11-12 call `.items()` on it to list the name-object pairs).
     - The output for `math2` is much longer because the namespace of a module loaded from a `.py` file also holds a `__builtins__` entry, which is printed in full.
   - We access elements of the namespace using the **dotted attribute notation**, see Program 9.
     - In fact, this is entirely equivalent to `math.__dict__['pi']`, see Program 13.

In [60]:
%%file math2.py
pi = 'foobar'

Writing math2.py


In [61]:
# Program 7

import math2

In [94]:
# Program 8

import math

In [63]:
# Program 9

math.pi

3.141592653589793

In [64]:
# Program 10

math2.pi

'foobar'

In [68]:
# Program 11

math.__dict__.items()

dict_items([('__name__', 'math'), ('__doc__', 'This module provides access to the mathematical functions\ndefined by the C standard.'), ('__package__', ''), ('__loader__', <_frozen_importlib_external.ExtensionFileLoader object at 0x7fd5480fe750>), ('__spec__', ModuleSpec(name='math', loader=<_frozen_importlib_external.ExtensionFileLoader object at 0x7fd5480fe750>, origin='/Users/shuhu/anaconda3/lib/python3.7/lib-dynload/math.cpython-37m-darwin.so')), ('acos', <built-in function acos>), ('acosh', <built-in function acosh>), ('asin', <built-in function asin>), ('asinh', <built-in function asinh>), ('atan', <built-in function atan>), ('atan2', <built-in function atan2>), ('atanh', <built-in function atanh>), ('ceil', <built-in function ceil>), ('copysign', <built-in function copysign>), ('cos', <built-in function cos>), ('cosh', <built-in function cosh>), ('degrees', <built-in function degrees>), ('erf', <built-in function erf>), ('erfc', <built-in function erfc>), ('exp', <built-in funct

In [69]:
# Program 12

math2.__dict__.items()

All Rights Reserved.

Copyright (c) 2000 BeOpen.com.
All Rights Reserved.

Copyright (c) 1995-2001 Corporation for National Research Initiatives.
All Rights Reserved.

Copyright (c) 1991-1995 Stichting Mathematisch Centrum, Amsterdam.
All Rights Reserved., 'credits':     Thanks to CWI, CNRI, BeOpen.com, Zope Corporation and a cast of thousands
    for supporting Python development.  See www.python.org for more information., 'license': Type license() to see the full license text, 'help': Type help() for interactive help, or help(object) for help about object., '__IPYTHON__': True, 'display': <function display at 0x7fd508077200>, 'get_ipython': <bound method InteractiveShell.get_ipython of <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7fd5380a4790>>}), ('pi', 'foobar')])

In [70]:
# Program 13

math.__dict__['pi']  == math.pi

True
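
The idea that each module carries its own dictionary can be reproduced without creating a file. In this sketch, `types.ModuleType` builds an empty module object in memory; it is used here as a stand-in for the `math2.py` script (an assumption for illustration, not how the lecture creates `math2`).

```python
import math
import types

# Hypothetical stand-in for math2.py: a module object created in memory
math2 = types.ModuleType('math2')
math2.pi = 'foobar'        # attribute assignment writes into math2's namespace

ns = vars(math2)           # the same dictionary as math2.__dict__

print(ns['pi'], math.pi)
print(ns is math2.__dict__)
print(math.__dict__['pi'] == math.pi)
```

The two bindings of `pi` never collide, because each lives in the dictionary of its own module.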

## 3.3 viewing namespaces
1. As we saw above, the `math` namespace can be printed by typing `math.__dict__`.
2. Another way to see its contents is to type `vars(math)`, see Program 14.
   - If we just want to see the names, then we can type `dir(math)`; Program 15 shows the first 10 of them using `dir(math)[0:10]`.
3. Notice the special names `__doc__` and `__name__`.
   - These are **initialized** in the namespace when any module is imported.
     - `__doc__` is the **doc string of the module**, see Program 16,
     - `__name__` is the **name of the module**, see Program 17.

In [73]:
# Program 14

vars(math).items()

dict_items([('__name__', 'math'), ('__doc__', 'This module provides access to the mathematical functions\ndefined by the C standard.'), ('__package__', ''), ('__loader__', <_frozen_importlib_external.ExtensionFileLoader object at 0x7fd5480fe750>), ('__spec__', ModuleSpec(name='math', loader=<_frozen_importlib_external.ExtensionFileLoader object at 0x7fd5480fe750>, origin='/Users/shuhu/anaconda3/lib/python3.7/lib-dynload/math.cpython-37m-darwin.so')), ('acos', <built-in function acos>), ('acosh', <built-in function acosh>), ('asin', <built-in function asin>), ('asinh', <built-in function asinh>), ('atan', <built-in function atan>), ('atan2', <built-in function atan2>), ('atanh', <built-in function atanh>), ('ceil', <built-in function ceil>), ('copysign', <built-in function copysign>), ('cos', <built-in function cos>), ('cosh', <built-in function cosh>), ('degrees', <built-in function degrees>), ('erf', <built-in function erf>), ('erfc', <built-in function erfc>), ('exp', <built-in funct

In [92]:
# Program 15

dir(math)[0:10]

['__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'acos',
 'acosh',
 'asin',
 'asinh']

In [93]:
# Program 16

print(math.__doc__)

This module provides access to the mathematical functions
defined by the C standard.


In [81]:
# Program 17

math.__name__

'math'

## 3.4 interactive sessions

1. In Python, **all** code executed by the interpreter runs in some module.
   - What about commands typed at the prompt (or in a Jupyter notebook cell)?
     - These are also regarded as being executed within a module.
       - In this case, a module called `__main__`.
       - To check this, we can **look at the current module name** via the value of `__name__` given at the prompt, see Program 18.
   - When we run a script using IPython's `run` command, the **contents of the file** are executed as **part of `__main__`** too.
     - To see this, let's create a file `mod.py` that prints its own `__name__` attribute, see Program 19.
     - Now let's look at **two different ways of running it** in IPython, see Program 20-21.
       - In the second case (Program 21), the code is executed as part of `__main__`, so `__name__` is equal to `__main__`.
     - To see the contents of the **namespace of `__main__`**, we use `vars()` rather than `vars(__main__)`.
       - If we do this in IPython, then we will see a whole lot of **variables that IPython needs**, and has **initialized** when we started up our session.
       - If we prefer to see **only the variables we have initialized**, use `%whos`, see Program 22.

In [98]:
# Program 18

print(__name__)

__main__


In [83]:
%%file mod.py
print(__name__)

Writing mod.py


In [84]:
# Program 20: Way1-Standard import

import mod

mod


In [100]:
# Program 21: Way2

%run mod.py  # Run interactively

__main__


In [103]:
# Program 22: see only the variables we have initialized

x = 2
y = 3

import numpy as np
%whos

Variable      Type                          Data/Info
-----------------------------------------------------
e             enumerate                     <enumerate object at 0x7fd51844d1e0>
f             function                      <function f at 0x7fd5382434d0>
g             function                      <function f at 0x7fd5382434d0>
i             str                           eggs
math          module                        <module 'math' from '/Use<...>h.cpython-37m-darwin.so'>
math2         module                        <module 'math2' from '/Us<...>ith John/0a qe/math2.py'>
mod           module                        <module 'mod' from '/User<...> with John/0a qe/mod.py'>
name          str                           
nikkei_data   reader                        <_csv.reader object at 0x7fd5184412d0>
np            module                        <module 'numpy' from '/Us<...>kages/numpy/__init__.py'>
reader        builtin_function_or_method    <built-in function reader>
x            

## 3.5 the global namespace

1. Python documentation often makes reference to the "global namespace".
   - The **global namespace** is the **namespace of the module currently being executed**.
     - e.g., suppose that we start the interpreter and begin assignments.
     - We are now working in the module `__main__`, and hence the namespace for `__main__` is the global namespace.
     - Next, we import a module called `amodule`.
       - At this point, the interpreter creates a namespace for the module `amodule` and starts executing commands in the module.
       - While this occurs, the namespace `amodule.__dict__` is the global namespace.
     - Once execution of the module finishes, the interpreter returns to the module from where the import statement was made.
       - In this case, it's `__main__`, so the namespace of `__main__` again becomes the global namespace.
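
One way to see that code in an imported module runs against that module's namespace is to inspect a function defined there. The sketch below uses `textwrap`, a standard-library module written in Python, purely as a convenient example; any pure-Python module would do.

```python
import textwrap   # a standard-library module written in Python

# A function keeps a reference to the global namespace of the module
# in which it was defined.
func_globals = textwrap.dedent.__globals__

same_namespace = func_globals is vars(textwrap)
module_name = func_globals['__name__']

print(same_namespace, module_name)
```

So when `textwrap.dedent` runs, its "global namespace" is `textwrap.__dict__`, not the namespace of the module that called it.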

## 3.6 local namespaces

1. **Important fact**: when we call a function, the interpreter **creates a local namespace** for that function, and **registers the variables in that namespace**.
   - The reason for this is explained in subsection 3.8.
2. Variables in the local namespace are called **local variables**.
   - After the function returns, the namespace is deallocated and lost.
   - While the function is executing, we can **view the contents of the local namespace** by calling `locals()` inside the function body.
     - e.g., consider Program 23.
       - We can see the local namespace of `f` before it is destroyed.

In [123]:
# Program 23

def f(x):
    a = 2
    print(locals())
    return a * x

f(1)

{'x': 1, 'a': 2}


2
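
A slightly extended sketch shows that every call gets a fresh local namespace and that local names do not leak out of the function. The names `scale`, `factor` and `snapshot` are made up for this illustration.

```python
def scale(x):
    factor = 2
    snapshot = dict(locals())    # copy of the local namespace at this point
    return factor * x, snapshot


value1, ns1 = scale(1)
value2, ns2 = scale(5)           # a new local namespace is created for this call

try:
    factor                       # not defined outside the function
    leaked = True
except NameError:
    leaked = False

print(ns1, ns2, leaked)
```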

## 3.7 the `__builtins__` namespace

1. We have been using various built-in functions, such as `max(), dir(), str(), list(), len(), range(), type()`, etc.
   - How does access to these names work?
     - The definitions of these functions are stored in a module called `builtins` (`__builtin__` in Python 2).
     - They have their own namespace, which is available in `__main__` under the name `__builtins__`, see Program 24-25.
   - We can access elements of the namespace as in Program 26.
     - But `__builtins__` is special, because we can always access them directly as well, see Program 27-28.
     - Subsection 3.8 explains how this works.
   

In [110]:
# Program 24: the first 10 names in the current namespace

dir()[0:10]

['In', 'Out', '_', '_10', '_102', '_105', '_106', '_107', '_108', '_109']

In [109]:
# Program 25

dir(__builtins__)[0:10]

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'BytesWarning',
 'ChildProcessError',
 'ConnectionAbortedError']

In [112]:
# Program 26

__builtins__.max

<function max>

In [113]:
# Program 27

max

<function max>

In [114]:
# Program 28

__builtins__.max == max

True
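
Outside an interactive session, a portable way to reach the same namespace is to import the `builtins` module explicitly. A minimal sketch:

```python
import builtins

same_max = builtins.max is max       # the bare name resolves to the same object
has_len = 'len' in vars(builtins)    # built-in names live in this namespace
largest = builtins.max(3, 7)

print(same_max, has_len, largest)
```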

## 3.8 name resolution

1. **Namespaces** are great because they **help us organize variable names**.
   - Type `import this` at the prompt and look at the last item that's printed.
2. However, we do need to understand how the **Python interpreter** works with **multiple namespaces**.
   - At any point of execution, there are, in fact, **at least two** namespaces that can be **accessed directly**.
     - "Accessed directly" means without using a dot, as in `pi` rather than `math.pi`.
     - These namespaces are
       - The global namespace (of the module being executed)
       - The builtin namespace
   - If the interpreter is executing a function, then the directly accessible namespaces are
     - The local namespace of the function.
     - The global namespace (of the module being executed)
     - The builtin namespace
   - Sometimes, functions are defined within other functions, see Program 28.
     - Here, `f` is the enclosing function for `g` (that is, `g` is defined inside the body of `f`), and each function gets its own namespace.
3. Now, we can give the rule for how namespace resolution works:
   - The order in which the interpreter searches for names is
     - the local namespace (if it exists)
     - the hierarchy of enclosing namespaces (if they exist)
     - the global namespace
     - the builtin namespace
   - This is called the **LEGB rule** (local, enclosing, global, builtin).
     - If the **name** is not in any of these namespaces, the interpreter raises a `NameError`.
     - Here's an example that helps to illustrate.
       - Consider a script `test.py` that looks as follows, see Program 29.
         - What happens when we run this script?
           - See Program 30-31.
       - First,
         - The global namespace `{}` is created.
         - The function object is created, and `g` is bound to it within the global namespace.
         - The name `a` is bound to `0`, again in the global namespace.
       - Next, `g` is called via `y = g(10)`, leading to the following sequence of actions:
         - The local namespace for the function is created.
         - Local names `x` and `a` are bound, so that the local namespace becomes `{'x': 10, 'a': 1}`.
         - Statement `x = x + a` uses the local `a` and local `x` to compute `x + a`, and binds local name `x` to the result.
         - This value is returned, and `y` is bound to it in the global namespace.
         - Local `x` and `a` are discarded, and the local namespace is deallocated.
       - Note that the global `a` was not affected by the local `a`.

In [116]:
# Program 28

def f():
    a = 2
    def g():
        b = 4
        print(a * b)
    g()

In [118]:
%%file test.py
def g(x): # Program 29
    a = 1
    x = x + a
    return x

a = 0
y = g(10)
print("a = ", a, "y = ", y)

Overwriting test.py


In [119]:
# Program 30

%run test.py

a =  0 y =  11


In [120]:
# Program 31: the global x (set to 2 in Program 22) is untouched by the local x in g

x

2
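
The search order can be seen in one place with a small sketch in which the same name `x` is bound at several levels. All function names here are made up for illustration.

```python
x = 'global'


def outer():
    x = 'enclosing'

    def inner_with_local():
        x = 'local'
        return x              # found in the local namespace

    def inner_without_local():
        return x              # not local, so found in the enclosing namespace

    return inner_with_local(), inner_without_local()


def uses_global():
    return x                  # no local or enclosing x, so the global one is used


def uses_builtin():
    return len('abc')         # len is found only in the builtin namespace


results = outer(), uses_global(), uses_builtin()
print(results)
```

Each lookup stops at the first namespace that contains the name, which is why the inner bindings hide the outer ones without changing them.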

## 3.9 mutable vs immutable parameters

1. This is a good time to say a little more about mutable vs immutable objects.
   - Consider an immutable case, see Program 32.
     - We now understand what will happen here:
       - The code prints `2` as the value of `f(x)` and `1` as the value of `x`.
       - The call `f(x)` creates a local namespace and adds `x` to it, bound to `1`.
       - Next, this local `x` is rebound to the new integer object `2`, and this value is returned.
     - None of this affects the global `x`.
   - However, it's a different story when we use a **mutable** data type, such as a **list**.
     - e.g., see Program 33.
       - This prints `[2]` as the value of `f(x)` and the same for `x`.
     - Here's what happens:
       - `f` is registered as a function in the global namespace.
       - `x` is bound to `[1]` in the global namespace.
       - The call `f(x)`
         - Creates a local namespace.
         - Adds `x` to the local namespace, bound to the same list object `[1]` as the global `x`.
         - Modifies the list `[1]` to `[2]`.
         - Returns the list `[2]`.
       - The local namespace is deallocated, and local `x` is lost.
       - Global `x` has been modified.

In [121]:
# Program 32

def f(x):
    x = x + 1
    return x

x = 1
print(f(x), x)

2 1


In [122]:
# Program 33

def f(x):
    x[0] = x[0] + 1
    return x

x = [1]
print(f(x), x)

[2] [2]
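
The contrast comes down to rebinding a local name versus modifying a shared object. The following sketch puts the two side by side; `rebind` and `mutate` are hypothetical names chosen for this illustration.

```python
def rebind(x):
    x = x + [99]      # builds a new list and rebinds the local name only
    return x


def mutate(x):
    x.append(99)      # changes the list object that the caller also sees
    return x


a = [1]
b = rebind(a)         # a is left untouched

c = [1]
d = mutate(c)         # c is modified in place, and d is the same object

print(a, b, c, d, d is c)
```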


# 4 handling errors

1. Sometimes, it's possible to anticipate errors as we're writing code.
   - e.g., the unbiased sample variance of sample $y_1, \dots, y_n$ is defined as 
     $$
     s^2 = \frac{1}{n-1} \sum_{i=1}^n (y_i - \bar y)^2,
     \qquad \bar y = \text{sample mean}
     $$
     - This can be calculated in NumPy using `np.var` with `ddof=1`.
     - But if we were writing a function to handle such a calculation, we might anticipate a divide-by-zero error when the sample size is one.
     - One possible action is to do nothing: the program will just crash, and spit out an error message.
   - But sometimes it's worth writing our code in a way that anticipates and deals with runtime errors that we think might arise.
     - Why?
       - Because the debugging information provided by the interpreter is often **less useful** than the information on possible errors we have in our head when writing code.
       - Because errors causing execution to stop are frustrating if we're in the middle of a large computation.
       - Because it reduces confidence in our code on the part of our users (if we are writing for others).
     
## 4.1 assertions

1. A relatively easy way to handle checks is with the `assert` keyword.
   - e.g., pretend for a moment that the `np.var` function doesn't exist and we need to write our own, see Program 1.
   - If we run this with an array of length one, the program will terminate and print our error message, see Program 2.
2. The advantage is that we can
   - fail early, as soon as we know there will be a problem,
   - supply specific information on why a program is failing.

In [125]:
# Program 1

def var(y):
    y = np.asarray(y)  # accept lists as well as NumPy arrays
    n = len(y)
    assert n > 1, 'Sample size must be greater than one.'
    return np.sum((y - y.mean())**2) / float(n-1)

In [126]:
# Program 2

var([1])

AssertionError: Sample size must be greater than one.
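
For readers without NumPy at hand, here is a pure-Python sketch of the same check. It is an illustrative variant, not the lecture's function: it computes the mean by hand and compares the result with `statistics.variance` from the standard library, which also uses the $n-1$ denominator.

```python
import statistics


def var(y):
    n = len(y)
    assert n > 1, 'Sample size must be greater than one.'
    y_bar = sum(y) / n
    return sum((y_i - y_bar) ** 2 for y_i in y) / (n - 1)


sample = [2.0, 4.0, 4.0, 6.0]
ours = var(sample)
reference = statistics.variance(sample)   # standard-library unbiased variance

try:
    var([1.0])                            # too small: the assertion fails early
    message = None
except AssertionError as err:
    message = str(err)

print(ours, reference, message)
```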

## 4.2 handling errors during runtime

1. The approach used above is a bit limited, because it always leads to termination.
   - Sometimes, we can handle errors more **gracefully**, by treating special cases.
     - Let's look at how this is done.

### exceptions

1. There are many different types of error.
   - Here's an example of a common error type, called **SyntaxError**, see Program 3.
     - Since illegal syntax cannot be executed, a **syntax error** terminates execution of the program.
   - Here's a different kind of error, called **ZeroDivisionError**, unrelated to syntax, see Program 4.
   - Here's another, called **NameError**, see Program 5.
   - And another, called **TypeError**, see Program 6.
   - And another, called **IndexError**, see Program 7.
2. On each occasion, the interpreter informs us of the error type:
   - `NameError`, `TypeError`, `IndexError`, `ZeroDivisionError`, etc.
   - In Python, these errors are called **exceptions**.

In [127]:
# Program 3: SyntaxError

def f:

SyntaxError: invalid syntax (<ipython-input-127-84229afe8ed2>, line 3)

In [128]:
# Program 4: ZeroDivisionError

1 / 0

ZeroDivisionError: division by zero

In [130]:
# Program 5: NameError

x1 = y1

NameError: name 'y1' is not defined

In [132]:
# Program 6: TypeError

'foo' + 6

TypeError: can only concatenate str (not "int") to str

In [133]:
# Program 7: IndexError

X = []
x = X[0]

IndexError: list index out of range

### catching exceptions

1. We can catch and deal with exceptions using [`try-except` blocks](https://docs.python.org/tutorial/errors.html#handling-exceptions).
   - e.g., see Program 8.
   - When we call `f`, we get the following output, see Program 9-11.
   - The error is caught and execution of the program is not terminated.
   
2. Note that other error types are not caught.
   - If we are worried the user might pass in a string, then we can catch that error too, see Program 12.
   - Here's what happens, see Program 13-15.
   
3. If we feel lazy, then we can catch these errors together, see Program 16.
   - Here's what happens, see Program 17-19.
   
4. If we feel even lazier, then we can catch all error types as in Program 20.

5. In general, it's better to be specific, because a bare `except` also hides errors that we did not anticipate.

In [134]:
# Program 8

def f(x):
    try:
        return 1.0 / x
    except ZeroDivisionError:
        print('Error: division by zero. Returned None')
    return None

In [136]:
# Program 9

f(2)

0.5

In [137]:
# Program 10

f(0)

Error: division by zero. Returned None


In [138]:
# Program 11

f(0.0)

Error: division by zero. Returned None


In [139]:
# Program 12

def f(x):
    try:
        return 1.0 / x
    except ZeroDivisionError:
        print('Error: Division by zero. Returned None')
    except TypeError:
        print('Error: Unsupported operation. Returned None')
    return None

In [140]:
# Program 13

f(2)

0.5

In [141]:
# Program 14

f(0)

Error: Division by zero. Returned None


In [142]:
# Program 15

f('foo')

Error: Unsupported operation. Returned None


In [143]:
# Program 16

def f(x):
    try:
        return 1.0 / x
    except (TypeError, ZeroDivisionError):
        print('Error: Unsupported operation. Returned None')
    return None

In [144]:
# Program 17

f(2)

0.5

In [145]:
# Program 18

f(0)

Error: Unsupported operation. Returned None


In [146]:
# Program 19

f('foo')

Error: Unsupported operation. Returned None


In [147]:
# Program 20

def f(x):
    try:
        return 1.0 / x
    except:
        print('Error. Returned None')
    return None
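
When several exception types share one handler, binding the exception with `as` lets the message say which one occurred. This is a sketch of one possible variant of Program 16; the name `safe_reciprocal` is made up here.

```python
def safe_reciprocal(x):
    try:
        return 1.0 / x
    except (TypeError, ZeroDivisionError) as err:
        # type(err).__name__ tells us which of the two errors was raised
        print(f'Error: {type(err).__name__}: {err}. Returned None')
    return None


ok = safe_reciprocal(2)
zero = safe_reciprocal(0)
text = safe_reciprocal('foo')
```

Any exception not listed in the tuple is still raised as usual, which keeps unexpected problems visible.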

# 5 decorators and descriptors

1. Let's look at **some specific syntax elements** that are routinely used by Python developers.
   - We might not need the following concepts immediately, but we will see them in other people's code.
   - Hence, we need to understand them at some stage of our Python education.

## 5.1 decorators

1. Decorators are a bit of syntactic sugar that, while easily avoided, have turned out to be popular.
   - It's very easy to say what decorators do.
   - On the other hand, it takes a bit of effort to explain **why we might use them**.

### an example

1. Suppose we are working on a program that looks like Program 1.
   - Now suppose there's a **problem**:
     - occasionally negative numbers get fed to `f` and `g` in the calculations that follow.
   - If we try it, then we would see that when these functions are called with **negative numbers**, they return a NumPy object called `nan`.
     - This stands for "not a number" (and indicates that we are trying to evaluate a mathematical function at a point where it is not defined).
     - Perhaps this isn't what we want, because it causes other problems that are hard to pick up later on.
   - Suppose that instead we want the program to terminate whenever this happens, with a **sensible error message**.
     - This change is easy enough to implement, see Program 2.
2. Notice however that there is some repetition here, in the form of two identical lines of code.
   - Repetition makes our code longer and harder to maintain, and hence is something we try hard to avoid.
     - Here, it's not a big deal, but imagine now that instead of just `f` and `g`, we have 20 such functions that we need to modify in exactly the same way.
       - This means we need to repeat the test logic (i.e., the `assert` line testing nonnegativity) 20 times.
     - The situation is still worse if the test logic is longer and more complicated.
   - In this kind of scenario, the following approach would be neater, see Program 3.
     - This looks complicated, so let's work through it slowly.
       - To unravel the logic, consider what happens when we say `f = check_nonneg(f)`.
         - This calls the function `check_nonneg` with parameter `func` set equal to `f`.
         - Now `check_nonneg` creates a new function called `safe_function` that checks that `x` is nonnegative and then calls `func` on it (which is the same as `f`).
       - Finally, the global name `f` is set equal to `safe_function`.
         - Now the behavior of `f` is as we desire, and the same is true of `g`.
         - At the same time, the test logic is written only once.

In [148]:
# Program 1

import numpy as np

def f(x):
    return np.log(np.log(x))


def g(x):
    return np.sqrt(42 * x)

# The program continues with various calculations (?) using f and g

In [1]:
# Program 2

import numpy as np

def f(x):
    assert x >= 0, "Argument must be nonnegative"
    return np.log(np.log(x))


def g(x):
    assert x >= 0, "Argument must be nonnegative"
    return np.sqrt(42 * x)

# The program continues with various calculations (?) using f and g

In [2]:
# Program 3

import numpy as np

def check_nonneg(func):
    def safe_function(x):
        assert x >= 0, "Argument must be nonnegative"
        return func(x)      # call the original function once the check passes
    return safe_function    # hand back the wrapped version

def f(x):
    return np.log(np.log(x))


def g(x):
    return np.sqrt(42 * x)

f = check_nonneg(f)
g = check_nonneg(g)

# The program continues with various calculations (?) using f and g
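
To see the wrapped function in action, here is a minimal sketch of the approach in Program 3. It assumes `math.sqrt` in place of `np.sqrt` so that it runs without NumPy; that substitution and the test values are ours, not part of the program above.

```python
import math

def check_nonneg(func):
    def safe_function(x):
        assert x >= 0, "Argument must be nonnegative"
        return func(x)
    return safe_function

def g(x):
    return math.sqrt(42 * x)   # stand-in for np.sqrt (assumption)

g = check_nonneg(g)            # the name g now refers to safe_function

print(g(2.0))                  # the check passes, so the original g is evaluated

try:
    g(-1.0)                    # the check fails before sqrt is ever called
except AssertionError as err:
    message = str(err)
    print(message)             # Argument must be nonnegative
```

Note that `assert` statements are skipped when Python is run with the `-O` flag, so this style of check is best suited to debugging.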

### enter decorators

1. The last version of our code is still not ideal. 
   - e.g., if someone is reading our code and wants to know how `f` works, they will be looking for the function definition, which is in Program 4.
     - They may well miss the line `f = check_nonneg(f)`.
     - For this and other reasons, decorators were introduced to Python.
   - With decorators, we can replace the lines in Program 5 with Program 6.
     - These two pieces of code do exactly the same thing.
       - If they do the same thing, do we really need decorator syntax?
     - Well, notice that the decorators sit right on top of the function definitions.
       - Hence, anyone looking at the definition of the function will see them and be aware that the function is modified.
       - In the opinion of many people, this makes the decorator syntax a significant improvement to the language.

In [3]:
# Program 4

def f(x):
    return np.log(np.log(x))

In [4]:
# Program 5

def f(x):
    return np.log(np.log(x))

def g(x):
    return np.sqrt(42 * x)

f = check_nonneg(f)
g = check_nonneg(g)

In [5]:
# Program 6

@check_nonneg
def f(x):
    return np.log(np.log(x))

@check_nonneg
def g(x):
    return np.sqrt(42 * x)

## 5.2 descriptors

1. Descriptors solve a **common problem** regarding **management of variables**.
   - To understand the issue, consider a `Car` class that simulates a car.
     - Suppose that this class defines the variables `miles` and `kms`,
       - which give the distance traveled in miles and kilometers respectively.
     - A highly simplified version of the class might look like Program 7.
     - One potential problem we might have here is that **a user alters one of these variables but not the other**, see Program 8-10.
       - In Program 10, we see that `miles` and `kms` are out of sync.
       - What we really want is some mechanism whereby each time a user sets one of these variables, the other is automatically updated.


In [6]:
# Program 7

class Car:
    
    def __init__(self, miles=1000):
        self.miles = miles
        self.kms = miles * 1.61
        
    # Some other functionality, details omitted

In [7]:
# Program 8

car = Car()
car.miles

1000

In [8]:
# Program 9

car.kms

1610.0

In [9]:
# Program 10

car.miles = 6000
car.kms

1610.0

### a solution
1. In Python, this issue is solved **using descriptors**.
   - **A descriptor** is just **a Python object that implements certain methods**.
   - These methods are **triggered** when **the object is accessed through dotted attribute notation**.
2. The best way to understand this is to see it in action.
   - Consider this alternative version of the `Car` class in Program 11.
     - First let's check that we get the desired behavior, see Program 12-13.
     - Yep, that's what we want: `car.kms` is automatically updated.

In [11]:
# Program 11

class Car:
    
    def __init__(self, miles=1000):
        self._miles = miles
        self._kms = miles * 1.61
        
    def set_miles(self, value):
        self._miles = value
        self._kms = value * 1.61
        
    def set_kms(self, value):
        self._kms = value
        self._miles = value / 1.61
        
    def get_miles(self):
        return self._miles
    
    def get_kms(self):
        return self._kms
    
    miles = property(get_miles, set_miles)
    kms = property(get_kms, set_kms)

In [12]:
# Program 12
car = Car()
car.miles

1000

In [13]:
# Program 13

car.miles = 6000
car.kms

9660.0

### how it works

1. The names `_miles` and `_kms` are arbitrary names we are using to store the values of the variables.
2. The objects `miles` and `kms` are **properties**, a common kind of descriptor.
   - The methods `get_miles`, `set_miles`, `get_kms` and `set_kms` **define what happens when we get (i.e., access) or set (bind) these variables**.
     - So-called "getter" and "setter" methods.
   - The builtin Python function `property` takes getter and setter methods and creates a **property**.
     - e.g., in the class `Car`, the class attribute `miles` is a property, so access to `car.miles` on an instance `car` is routed through it.
     - Being a property, when we set its value via `car.miles = 6000`, its setter method is triggered.
       - In this case `set_miles`.
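
To make the phrase "implements certain methods" concrete, the sketch below writes a descriptor by hand instead of using `property`. The class name `Miles` is hypothetical and is not part of the lecture's code; `__get__` and `__set__` are the methods that Python's descriptor protocol looks for.

```python
class Miles:
    """Hypothetical hand-written descriptor that keeps kms in sync with miles."""

    def __get__(self, instance, owner=None):
        if instance is None:       # accessed on the class, e.g. Car.miles
            return self
        return instance._miles

    def __set__(self, instance, value):
        instance._miles = value
        instance._kms = value * 1.61


class Car:

    miles = Miles()                # the descriptor lives on the class

    def __init__(self, miles=1000):
        self.miles = miles         # triggers Miles.__set__

    @property
    def kms(self):                 # read-only view of the synced value
        return self._kms


car = Car()
car.miles = 6000                   # triggers Miles.__set__ again
print(car.miles, car.kms)
```

In this sketch only `miles` can be set; a matching descriptor for `kms` would follow the same pattern.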

### decorators and properties

1. These days, it's very common to see the `property` function used via a decorator.
   - Here's another version of our `Car` class that works as before but now uses decorators to set up the properties, see Program 14.
     - We won't go through all the details here.
   - For further information, we can refer to the descriptor documentation.

In [14]:
# Program 14

class Car:
    
    def __init__(self, miles=1000):
        self._miles = miles
        self._kms = miles * 1.61
        
    @property
    def miles(self):
        return self._miles
    
    @property
    def kms(self):
        return self._kms
    
    @miles.setter
    def miles(self, value):
        self._miles = value
        self._kms = value * 1.61
        
    @kms.setter
    def kms(self, value):
        self._kms = value
        self._miles = value / 1.61
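
As a quick check of Program 14, the sketch below repeats the class and sets `kms` rather than `miles`. The value `161` is just an illustrative choice.

```python
class Car:

    def __init__(self, miles=1000):
        self._miles = miles
        self._kms = miles * 1.61

    @property
    def miles(self):
        return self._miles

    @property
    def kms(self):
        return self._kms

    @miles.setter
    def miles(self, value):
        self._miles = value
        self._kms = value * 1.61

    @kms.setter
    def kms(self, value):
        self._kms = value
        self._miles = value / 1.61


car = Car()
car.kms = 161                 # triggers the kms setter
print(car.kms, car.miles)     # miles is updated to roughly 100
```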

# 6 generators

1. A generator is a kind of iterator (i.e., it works with a `next` function).
   - We will study two ways to **build generators**:
     - **generator expressions**, and
     - **generator functions**.

## 6.1 generator expressions

1. The **easiest** way to build generators is using **generator expressions**.
   - Just **like a list comprehension**, but **with round brackets**.
     - Here is the list comprehension, see Program 1-3.
     - And here is the generator expression, see Program 4-6.
   - Since `sum()` can be called on iterators, we can do this, see Program 7.
     - The function `sum()` calls `next()` to get the items and adds successive terms.
     - In fact, we can omit the outer brackets in this case, see Program 8.

In [15]:
# Program 1

singular = ('dog', 'cat', 'bird')
type(singular)

tuple

In [16]:
# Program 2

plural = [string + 's' for string in singular]
plural

['dogs', 'cats', 'birds']

In [17]:
# Program 3

type(plural)

list

In [23]:
# Program 4

singular = ('dog', 'cat', 'bird')
plural = (string + 's' for string in singular)
type(plural)

generator

In [19]:
# Program 5

next(plural)

'dogs'

In [20]:
# Program 6

next(plural)

'cats'

In [25]:
# Program 7

sum((x * x for x in range(10)))

285

In [26]:
# Program 8

sum(x * x for x in range(10))

285
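
One practical difference between the two forms is that a list can be traversed repeatedly, while a generator is consumed as we go. A minimal sketch, reusing `singular` from Program 1 (the other variable names are ours):

```python
singular = ('dog', 'cat', 'bird')

plural_list = [string + 's' for string in singular]   # all items built now
plural_gen = (string + 's' for string in singular)    # items built on demand

first_pass = list(plural_gen)    # consumes the generator
second_pass = list(plural_gen)   # nothing left to yield

print(first_pass)    # ['dogs', 'cats', 'birds']
print(second_pass)   # []
print(plural_list)   # ['dogs', 'cats', 'birds']
```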

## 6.2 generator functions

1. The **most flexible** way to create generator objects is to use **generator functions**.
   - Let's look at 2 examples.

### Example 1

1. Here's a very simple example of a generator function, see Program 9.
   - It looks like a function, but uses a keyword `yield` that we haven't met before.
2. Let's see how it works after running this code, see Program 10-15.
   - The generator function `f()` is used to **create generator objects**.
     - In this case, `gen`.
     - Generators are iterators, because they support a `__next__` method.
   - The first call (Program 12) to `next(gen)`
     - **executes** code in the body of `f()` until it meets a `yield` statement, and
     - **returns** that value to the caller of `next(gen)`.
   - The second call (Program 13) to `next(gen)` starts executing **from the next line** of Program 9 and continues until the next `yield` statement.
     - At that point, it returns the value following `yield` to the caller of `next(gen)`, and so on.
   - When the code block ends, the generator raises a `StopIteration` exception, see Program 15.

In [27]:
# Program 9

def f():
    yield 'start'
    yield 'middle'
    yield 'end'

In [28]:
# Program 10

type(f)

function

In [29]:
# Program 11

gen = f()
gen

<generator object f at 0x7fb1c90ffc50>

In [30]:
# Program 12

next(gen)

'start'

In [31]:
# Program 13

next(gen)

'middle'

In [32]:
# Program 14

next(gen)

'end'

In [33]:
# Program 15

next(gen)

StopIteration: 
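
A `for` loop hides this `StopIteration` from us: it keeps calling `next()` and stops quietly when the exception is raised. The sketch below spells that out by hand for the generator function in Program 9; the name `collected` is ours.

```python
def f():
    yield 'start'
    yield 'middle'
    yield 'end'

collected = []
gen = f()
while True:
    try:
        item = next(gen)       # resume the generator until its next yield
    except StopIteration:      # raised once the body of f() has finished
        break
    collected.append(item)

print(collected)               # ['start', 'middle', 'end']
print([item for item in f()])  # the for-loop form gives the same items
```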

### Example 2

1. Our next example receives an argument `x` from the caller, see Program 16.
2. Let's see how it works, see Program 17.
   - The call `gen = g(2)` binds `gen` to a generator, see Program 18.
     - Inside the generator, the name `x` is bound to `2`.
   - When we call `next(gen)`, see Program 19.
     - The body of `g()` executes until the line `yield x`, and the value of `x` is returned.
     - Note that the value of `x` is retained inside the generator.
   - When we call `next(gen)` again, execution continues **from where it left off**.
     - When `x < 100` fails, the generator raises a `StopIteration` exception.
     - Incidentally, the loop inside the generator can be infinite, see Program 23, because its condition (`while 1`, or equivalently `while True`) is always true, so the generator never raises `StopIteration` by itself.

In [34]:
# Program 16

def g(x):
    while x < 100:
        yield x
        x = x * x

In [35]:
# Program 17

g

<function __main__.g(x)>

In [36]:
# Program 18

gen = g(2)
type(gen)

generator

In [37]:
# Program 19

next(gen)

2

In [38]:
# Program 20

next(gen)

4

In [39]:
# Program 21

next(gen)

16

In [40]:
# Program 22

next(gen)

StopIteration: 

In [84]:
# Program 23

def g(x):
    while True:  # the condition is always true, so the loop never ends by itself
        yield x
        x = x * x
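
Because the generator in Program 23 never stops by itself, the caller has to decide when to stop. One way, sketched here, is `itertools.islice` from the standard library, which takes only the first few items. The starting value `2` and the count `5` are arbitrary choices.

```python
from itertools import islice

def g(x):
    while True:
        yield x
        x = x * x

first_five = list(islice(g(2), 5))   # stop after five items
print(first_five)                    # [2, 4, 16, 256, 65536]
```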

## 6.3 advantages of iterators

1. What's the advantage of using an iterator here?
2. Suppose we want to sample a `binomial(n, 0.5)`.
   - One way to do it is as follows, see Program 24.
     - But we are creating a huge list here, `draws` (in Python 3, `range(n)` is lazy and does not itself build a list).
     - This uses lots of memory and is very slow.
       - If we make `n` even bigger, the list needs correspondingly more memory and time to build, see Program 25.
   - We can avoid these problems using iterators.
     - Here is the generator function, see Program 26, which is used in Program 27-28.
     
3. In summary, iterables
   - avoid the need to create big lists/tuples, and
   - provide a uniform interface to iteration that can be used transparently in `for` loops.

In [59]:
# Program 24

import random
n = 10_000_000
draws = [random.uniform(0, 1) < 0.5 for i in range(n)]
sum(draws)

5001837

In [54]:
# Program 25

n = 100_000_000
draws = [random.uniform(0, 1) < 0.5 for i in range(n)]

In [55]:
# Program 26

def f(n):
    i = 1
    while i <= n:
        yield random.uniform(0, 1) < 0.5
        i += 1

In [57]:
# Program 27

n = 10_000_000
draws = f(n)
draws

<generator object f at 0x7fb1c90ffe50>

In [58]:
# Program 28

sum(draws)

5000775
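
One rough way to see the difference is to compare the size of the list with the size of the generator object. The sketch below uses a smaller `n` than Program 24 so that it runs quickly, and relies on `sys.getsizeof`, which reports only the size of the container itself (not of the items it refers to). Exact numbers depend on the Python build, so none are shown.

```python
import random
import sys

def f(n):
    i = 1
    while i <= n:
        yield random.uniform(0, 1) < 0.5
        i += 1

n = 100_000   # smaller than in Program 24, chosen only to keep this quick

draws_list = [random.uniform(0, 1) < 0.5 for i in range(n)]
draws_gen = f(n)

list_bytes = sys.getsizeof(draws_list)   # grows with n
gen_bytes = sys.getsizeof(draws_gen)     # does not depend on n

print(list_bytes, gen_bytes)
print(sum(draws_list), sum(draws_gen))   # two independent samples
```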

# 7 recursive function calls

1. This is not something that we will use every day, but it is still useful.
   - We should learn it at some stage.
2. Basically, a recursive function is a function that calls itself.
   - e.g., consider the problem of computing $x_t$ for some $t$ when
     $$
     x_{t+1} = 2x_t, \ x_0 =1 \tag{1}
     $$
     - Obviously, the answer is $2^t$.
   - We can compute this easily enough with a loop, see Program 1.
   - We can also use a recursive solution, see Program 2.
     - What happens here is that each successive call uses its own **frame** in the **stack**.
       - A frame is where the local variables of a given function call are held.
       - The stack is memory used to process function calls.
         - It works on a First In Last Out (FILO) basis: the frame of the most recent call is the first one to be removed when that call returns.
     - This example is somewhat contrived, since the first (iterative) solution would usually be preferred to the recursive solution.
       - We'll meet less contrived applications of recursion later on.

In [60]:
# Program 1

def x_loop(t):
    x = 1
    for i in range(t):
        x = 2 * x
    return x

In [61]:
# Program 2

def x(t):
    if t == 0:
        return 1
    else:
        return 2 * x(t-1)
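
Since every call gets its own frame, a deep recursion can run into the limit Python places on the depth of the stack. The sketch below compares the two programs above and then triggers a `RecursionError`. The depth used is an assumption derived from `sys.getrecursionlimit()`, which reports the interpreter's configured limit.

```python
import sys

def x_loop(t):
    x = 1
    for i in range(t):
        x = 2 * x
    return x

def x(t):
    if t == 0:
        return 1
    else:
        return 2 * x(t-1)

print(x_loop(10), x(10))                  # 1024 1024

too_deep = sys.getrecursionlimit() + 10   # deeper than the interpreter allows
try:
    x(too_deep)
    hit_limit = False
except RecursionError:
    hit_limit = True

print(hit_limit)
print(x_loop(too_deep) == 2 ** too_deep)  # the loop version has no such limit
```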

# 8 exercises

### Exercise 1

The Fibonacci numbers are defined by


<a id='equation-fib'></a>
$$
x_{t+1} = x_t + x_{t-1}, \quad x_0 = 0, \; x_1 = 1 \tag{2}
$$

The first few numbers in the sequence are $ 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 $.

Write a function to recursively compute the $ t $-th Fibonacci number for any $ t $.

In [65]:
def x(t):
    if t == 0:
        return 0
    if t == 1:
        return 1
    else:
        return x(t-1) + x(t-2)
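
A quick check of this solution against the first few values listed above (the list comprehension is only there for testing):

```python
def x(t):
    if t == 0:
        return 0
    if t == 1:
        return 1
    else:
        return x(t-1) + x(t-2)

first_values = [x(t) for t in range(11)]
print(first_values)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```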

### Exercise 2

Complete the following code, and test it using [this csv file](https://github.com/QuantEcon/QuantEcon.lectures.code/blob/master/python_advanced_features/test_table.csv). The solution below reads the file directly from its raw URL, `https://raw.githubusercontent.com/QuantEcon/QuantEcon.lectures.code/master/python_advanced_features/test_table.csv`, so there is no need to download it to the current working directory first.

In [186]:
import pandas as pd

In [191]:
def column_iterator(target_file, column_number):
    """A generator function for CSV files.
    When called with a file name target_file (string) and column number
    column_number (integer), the generator function returns a generator
    that steps through the elements of column column_number in file
    target_file.
    """
    # put your code here
    n = column_number
    df = pd.read_csv(target_file)
    target = df.iloc[:, n-1]
    return (t for t in target)
    

target_file = 'https://raw.githubusercontent.com/QuantEcon/QuantEcon.lectures.code/master/python_advanced_features/test_table.csv'
dates = column_iterator(target_file, 1)

i = 1
for date in dates:
    print(date)
    if i == 10:
        break
    i += 1

2009-05-21
2009-05-20
2009-05-19
2009-05-18
2009-05-15
2009-05-14
2009-05-13
2009-05-12
2009-05-11
2009-05-08
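
The solution above returns a generator expression built from a pandas column. An alternative sketch, closer to the generator functions of section 6.2, uses `yield` with the standard library's `csv` module. It assumes a local file and, like `pd.read_csv`, treats the first row as a header. The file `demo_table.csv` and its contents are made up for illustration and are not the test table.

```python
import csv
import os
import tempfile

def column_iterator(target_file, column_number):
    """Yield the entries of column column_number (counting from 1),
    skipping the header row."""
    with open(target_file, newline='') as f:
        reader = csv.reader(f)
        next(reader, None)                 # skip the header row
        for row in reader:
            yield row[column_number - 1]

# Hypothetical data written to a temporary directory, for illustration only
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_table.csv')
with open(demo_path, 'w', newline='') as f:
    f.write("Date,Value\n2020-01-03,3\n2020-01-02,2\n2020-01-01,1\n")

dates = list(column_iterator(demo_path, 1))
print(dates)   # ['2020-01-03', '2020-01-02', '2020-01-01']
```

Here the file is read one row at a time, so only the current row needs to be held in memory.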


### Exercise 3

Suppose we have a text file `numbers.txt` containing the following lines

In [3]:
%%writefile numbers.txt
prices
3
8

7
21

Writing numbers.txt


Using a `try`-`except` block, write a program to read in the contents of the file and sum the numbers, ignoring lines without numbers.

In [2]:
with open('numbers.txt') as f:
    y = f.read()

lines = y.splitlines()  # avoid shadowing the builtin `list`
lines

['prices', '3', '8', '', '7', '21']

In [3]:
with open('numbers.txt') as f:
    x = [line.strip() for line in f]

x

['prices', '3', '8', '', '7', '21']

In [7]:
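def f(target_file):
    # Use names that don't shadow the function `f` or the builtin `sum`
    with open(target_file) as file:
        lines = [line for line in file]
    total = 0
    for line in lines:
        try:
            total = total + float(line)  # or int(line)
        except ValueError:  # skip lines that are not numbers
            pass
    return total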

In [8]:
target = 'numbers.txt'
f(target)

39.0