# DAML 01 - Jupyter Exercises 2 - Solutions

Michal Grochmal <michal.grochmal@city.ac.uk>

Exercises rating:

★☆☆ - You should be able to do it based on Python knowledge plus lecture contents.

★★☆ - You will need to do extra thinking and some extra reading/searching.

★★★ - The answer is difficult to find by a simple search,
      requires you to do a considerable amount of extra work by yourself
      (feel free to ignore these exercises if you're short on time).

#### 1. Write a Python class that is instantiated with a list of numbers and has a **`.mean()`** method. (★☆☆)

In [1]:
class MyMeanie(object):

    def __init__(self, numbers):
        self.numbers = numbers

    def mean(self):
        return sum(self.numbers) / len(self.numbers)


meanie = MyMeanie([1, 2, 3, 4, 5, 6])
print(meanie.mean())

3.5


#### 2. Save the following into a file called **`by_zero.py`** on the **`U:`** drive (★★☆)

---

    def div_xy(x, y):
        return x / y

    def div_by_zero(x):
        return div_xy(x, 0.0)

---

**`import`** the module and execute the **`div_xy`** function to divide 10 by 2.

P.S. You do not need to worry about an **`__init__.py`** file if the module is in the same directory.

In [2]:
%%writefile by_zero.py

def div_xy(x, y):
    return x / y


def div_by_zero(x):
    return div_xy(x, 0.0)

Overwriting by_zero.py


Using `%%writefile` is a bit of a cheat, but it keeps the solution simple.
The file can also be created by hand:

1.  Go into the main page of the notebook
2.  Click `new` -> `Text File`
3.  Open the text file in the editor
4.  Click on the title of the file and rename it to `by_zero.py`
5.  Write the contents into the file
6.  Click `File` -> `Save`

In [3]:
import by_zero


print(by_zero.div_xy(10, 2))

5.0
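Outside of a notebook the same file can be produced with plain Python. The following is only a sketch: the temporary directory is a hypothetical stand-in for the `U:` drive, and the names `MODULE_SOURCE` and `module_dir` are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of the module, identical to the exercise text.
MODULE_SOURCE = '''\
def div_xy(x, y):
    return x / y


def div_by_zero(x):
    return div_xy(x, 0.0)
'''

# Assumption: a temporary directory stands in for the U: drive.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "by_zero.py").write_text(MODULE_SOURCE)

# A module can be imported once its directory is on sys.path.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
by_zero = importlib.import_module("by_zero")

print(by_zero.div_xy(10, 2))  # prints 5.0
```

In the notebook the `sys.path` step is not needed when the file sits next to the notebook, which is why a plain `import by_zero` works there.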


#### 3. What is the difference between the *magic* in the following cells? (★★☆)

This exercise introduces *magics*, commands that provide extra functionality not
available in vanilla Python.  The magic introduced is `%xmode`, which switches between
different ways of displaying errors.

Detailed error messages will be needed to figure out how to attempt the last two exercises.

In [4]:
%xmode Plain

by_zero.div_by_zero(3)

Exception reporting mode: Plain


ZeroDivisionError: float division by zero

In [5]:
%xmode Context

by_zero.div_by_zero(3)

Exception reporting mode: Context


ZeroDivisionError: float division by zero

In [6]:
%xmode Verbose

by_zero.div_by_zero(3)

Exception reporting mode: Verbose


ZeroDivisionError: float division by zero
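
The outputs above keep only the last line of each traceback. In IPython, `Plain` prints a compact traceback much like the standard interpreter, `Context` adds source lines around each frame, and `Verbose` also shows variable values. As a rough illustration of the call chain that all three modes report, the sketch below uses the standard `traceback` module outside of IPython; the two functions are redefined inline so that the block is self-contained.

```python
import traceback


def div_xy(x, y):
    return x / y


def div_by_zero(x):
    return div_xy(x, 0.0)


try:
    div_by_zero(3)
except ZeroDivisionError as error:
    # One entry per stack frame, outermost call first.
    frames = traceback.extract_tb(error.__traceback__)
    call_chain = [frame.name for frame in frames]
    message = str(error)

# The last two entries are the functions from by_zero.py.
print(call_chain)
print(message)
```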

#### 4. Use `%timeit` to time a function that takes the mean of a list, and compare it with `np.sum` (★★☆)

You can use **`np.arange(1024)`** to create an array holding the integers from 0 to 1023.  Here is a start:

(This ought to take a while to run.)

In [7]:
def my_mean(numbers):
    return sum(numbers)/len(numbers)

In [8]:
import numpy as np


long_list = np.arange(1024)
%timeit np.sum(long_list)
%timeit np.mean(long_list)
%timeit my_mean(long_list)

8.69 µs ± 50.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
20.7 µs ± 138 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
146 µs ± 208 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


Our function takes considerably more time than `np.sum`, and it is also much slower than `np.mean`,
which does the same job (146 µs against 20.7 µs above).
This suggests that NumPy does something quite clever to speed things up.
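
The same comparison can be sketched outside of IPython with the standard `timeit` module. The function `np_sum_mean` is a hypothetical variant added for illustration; it is not part of the exercise. Absolute timings depend on the machine, so none are quoted here.

```python
import timeit

import numpy as np


def my_mean(numbers):
    # The built-in sum iterates element by element in the interpreter.
    return sum(numbers) / len(numbers)


def np_sum_mean(numbers):
    # Hypothetical variant: np.sum performs the loop in compiled code.
    return np.sum(numbers) / len(numbers)


long_list = np.arange(1024)

for func in (my_mean, np_sum_mean, np.mean):
    # Total time for 1000 calls, reported in milliseconds.
    seconds = timeit.timeit(lambda: func(long_list), number=1000)
    print(f"{func.__name__}: {seconds * 1e3:.1f} ms per 1000 calls")
```

All three functions return the same value, so only the running time differs.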

#### 5. Use `%debug` to print out the value of `y` (inside `div_xy`) before the division by zero. (★★★)

In [9]:
import by_zero


%debug
by_zero.div_by_zero(3)

> /home/grochmal/programs/my/daml/city/01-python-jupyter/by_zero.py(3)div_xy()
      1 
      2 def div_xy(x, y):
----> 3     return x / y
      4 
      5 

ipdb> print(y)
0.0
ipdb> continue


ZeroDivisionError: float division by zero

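The objective here is to understand that we do have a debugger inside Jupyter,
and that, by default, the debugger starts during the processing of an exception.
Called without arguments, `%debug` opens the debugger post-mortem on the most recent exception,
so in the cell above it inspects the `ZeroDivisionError` left over from the previous exercise.

The tricky bit is that if you instead run the whole cell under the debugger
(for example with the `%%debug` cell magic at the very beginning of the cell) you will
step into the `import` statement, which will produce a lot of non-trivial debugger output
that is not relevant to our division by zero problem.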
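
What the debugger does when it prints `y` can be approximated in plain Python by walking the traceback to the frame that raised the exception and reading its local variables. This is a minimal sketch, not how `%debug` is implemented; the functions are redefined inline and the names `failing_function` and `y_at_failure` are made up for illustration.

```python
import sys


def div_xy(x, y):
    return x / y


def div_by_zero(x):
    return div_xy(x, 0.0)


try:
    div_by_zero(3)
except ZeroDivisionError:
    # Walk to the innermost frame, the one where the exception was raised.
    tb = sys.exc_info()[2]
    while tb.tb_next is not None:
        tb = tb.tb_next
    failing_function = tb.tb_frame.f_code.co_name
    y_at_failure = tb.tb_frame.f_locals["y"]

print(failing_function, y_at_failure)  # prints: div_xy 0.0
```

For an interactive prompt on the same traceback, the standard library offers `pdb.post_mortem()`.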

#### 6. Use `%prun` to profile the function you wrote in exercise 4. (★★★)

In [10]:
# This is just a trick to output into the notebook instead of the pager;
# for the purpose of the exercise it is completely fine to use the pager.
from IPython.core import page
page.page = print


import numpy as np


long_list = np.arange(32256)
%prun my_mean(long_list)

         6 function calls in 0.006 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.005    0.005    0.005    0.005 {built-in method builtins.sum}
        1    0.000    0.000    0.006    0.006 <ipython-input-7-51bc297f1678>:1(my_mean)
        1    0.000    0.000    0.006    0.006 {built-in method builtins.exec}
        1    0.000    0.000    0.006    0.006 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
 

Almost all of the time is spent in `sum` (`builtins.sum`), and almost none in `len` (`builtins.len`).
This suggests that if we replace our call to `sum` with `np.sum`, while keeping the data in a
NumPy array, we should get a speed similar to `np.mean`.
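
That substitution can be checked with the standard `cProfile` module, which also works outside of IPython. The sketch below is one possible way to do it: `np_sum_mean` and `profile_report` are hypothetical helper names, and the reported times will differ from machine to machine.

```python
import cProfile
import io
import pstats

import numpy as np


def my_mean(numbers):
    return sum(numbers) / len(numbers)


def np_sum_mean(numbers):
    # Hypothetical variant: swap the built-in sum for np.sum.
    return np.sum(numbers) / len(numbers)


def profile_report(func, data):
    """Return the cProfile report for a single call of func(data)."""
    profiler = cProfile.Profile()
    profiler.runcall(func, data)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by internal time, as in the %prun output.
    stats.sort_stats("tottime").print_stats()
    return stream.getvalue()


long_list = np.arange(32256)
print(profile_report(my_mean, long_list))
print(profile_report(np_sum_mean, long_list))
```

Comparing the two reports shows where the time goes in each version.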