# Week 1 in class

This is one of the notebooks that we work on together in class. In this way you can work on tougher problems, on your own or with a small group of other students, and have plenty of time to ask questions about things that you do not understand. 

If you do not manage to finish the notebook in class, then finish it yourself at home. The notebook is written in such a way that you can go through it on your own. If you do run into problems, ask us on the discussion forum, in the Q&A session, or next week.

#### Learning goals
- Extending your knowledge of object types to *tuples*.
- Introducing you to an important application of strings: *string formatting*.
- Increasing your comfort with `numpy`, with an application to financial arithmetic.

## Tuples
<a id='3'></a>

Tuples are very similar to lists and hold ordered collections of items.

However, tuples and lists have three main differences:

1. Tuples are created using parentheses, `(` and `)`, instead of
  square brackets, `[` and `]`.  
1. Tuples are *immutable*, which is a fancy computer science word
  meaning that they can’t be changed or altered after they are created.  
1. Tuples and multiple return values
  from functions are tightly connected, as we will see when we cover functions.  

In [None]:
t = (1, "hello", 3.0)
print("t is a", type(t))
t
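As a small sketch of the third difference: a function that returns several comma-separated values actually returns a single tuple, which can then be unpacked into separate names. The function `min_and_max` is a hypothetical helper made up for this illustration.

```python
def min_and_max(values):
    # Returning two comma-separated values packs them into one tuple
    return min(values), max(values)

result = min_and_max([3, 1, 2])
print(type(result))  # <class 'tuple'>
print(result)        # (1, 3)

# Tuple unpacking assigns each element to its own name
lowest, highest = result
print(lowest, highest)  # 1 3
```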

We can *convert* a list to a tuple by calling the `tuple` function on
a list.

In [None]:
x = [1,2]
print("x is a", type(x))
print("tuple(x) is a", type(tuple(x)))
tuple(x)

We can also convert a tuple to a list using the `list` function.

In [None]:
list(t)  # type conversion

### List vs Tuple: Which to Use?

Should you use a list or tuple?

This depends on what you are storing, whether you might need to reorder the elements,
and whether you could add
new elements without a complete reinterpretation of the
underlying data.

For example, take data representing the GDP (in trillions) and population
(in billions) for China in 2015.

In [2]:
china_data_2015 = ("China", 2015, 11.06, 1.371)

print(china_data_2015)

('China', 2015, 11.06, 1.371)


In this case, we have used a tuple since: (a) reordering the elements would
be meaningless; and (b) adding more data would require a
reinterpretation of the whole data structure.

On the other hand, consider a list of GDP in China between
2014 and 2015.

In [None]:
gdp_data = [10.48, 11.06]
print(gdp_data)

In this case, we have used a list, since adding on a new
element to the end of the list for GDP in 2016 would make
complete sense.

Along these lines, collecting data on China for different
years may make sense as a list of tuples (e.g. year, GDP,
and population), although we will see better ways to store this sort of data
when we learn Pandas.

In [None]:
china_data = [(2015, 11.06, 1.371), (2014, 10.48, 1.364)]
print(china_data)

#### Exercise 1

Verify that tuples are indeed immutable and lists are not, by attempting the following:

- Changing the GDP in 2015 to 12.00
- Appending observations for 2013: a GDP of 9.607, and a population of 1.357
- Sorting the data ascending in years

- Changing the GDP in 2015 to 12.00

In [3]:
china_data = [(2015, 11.06, 1.371), (2014, 10.48, 1.364)]
# list index:          0                     1
# tuple index:  0      1      2       0      1      2
china_data[0][1] = 12.00  # fails because tuples are immutable
# TypeError: 'tuple' object does not support item assignment

TypeError: 'tuple' object does not support item assignment
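One possible workaround, shown here only as a sketch: the tuples themselves cannot be changed, but the list that holds them can, so we can replace the whole 2015 tuple with a new one.

```python
china_data = [(2015, 11.06, 1.371), (2014, 10.48, 1.364)]

# Unpack the old tuple, then store a new tuple in the same list position
year, gdp, population = china_data[0]
china_data[0] = (year, 12.00, population)
print(china_data)  # [(2015, 12.0, 1.371), (2014, 10.48, 1.364)]
```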

- Appending observations for 2013: a GDP of 9.607, and a population of 1.357

In [4]:
china_data = [(2015, 11.06, 1.371), (2014, 10.48, 1.364)]
china_data.append((2013,9.607,1.357))
print(china_data)

[(2015, 11.06, 1.371), (2014, 10.48, 1.364), (2013, 9.607, 1.357)]


- Sorting the data ascending in years

In [6]:
china_data.sort() # inplace function
######## china_data.sort(reverse=True) # descending sort
print(china_data)

[(2013, 9.607, 1.357), (2014, 10.48, 1.364), (2015, 11.06, 1.371)]
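As a side note, tuples are compared element by element, so sorting a list of tuples orders it by the first entry (here, the year). If you want to keep the original list unchanged, the built-in `sorted` function returns a new sorted list instead of sorting in place. A minimal sketch (the name `ascending` is made up for this illustration):

```python
china_data = [(2015, 11.06, 1.371), (2014, 10.48, 1.364), (2013, 9.607, 1.357)]

# sorted() builds a new list and leaves china_data as it was
ascending = sorted(china_data)
print(ascending[0])   # (2013, 9.607, 1.357)
print(china_data[0])  # (2015, 11.06, 1.371)
```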


### Conclusion

In general, a rule of thumb is to use a list unless you *need* to use a tuple.

Key criteria for tuple use are when you want to:

- ensure the *order* of elements can’t change  
- ensure the actual values of the elements can’t
  change  
- use the collection as a key in a dictionary (we will learn what this
  means next week)  
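
The last criterion can be previewed with a small sketch. Dictionaries are only covered next week, so treat this purely as a preview; the name `gdp_by_country_year` is made up for this illustration, and the values come from the China example above.

```python
# A tuple can serve as a dictionary key because it cannot change
gdp_by_country_year = {("China", 2015): 11.06, ("China", 2014): 10.48}
print(gdp_by_country_year[("China", 2015)])  # 11.06

# A list cannot be used as a key, precisely because it is mutable
try:
    bad = {["China", 2015]: 11.06}
except TypeError as err:
    print("TypeError:", err)
```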

## String Formatting
<a id='1'></a>
Sometimes we’d like to reuse some portion of a string repeatedly, but
still make some relatively small changes at each usage.

We can do this with *string formatting*, which is done by using `{}` as a
*placeholder* where we’d like to change the string, filled in with a variable name
or expression.

Let’s look at an example.

In [None]:
country = "Vietnam"
GDP = 223.9
year = 2017
my_string = f"{country} had ${GDP} billion GDP in {year}"
print(my_string)

Rather than just substituting a variable name, you can use a calculation
or expression.

In [None]:
print(f"{5}**2 = {5**2}")

Or, using our previous example

In [None]:
my_string = f"{country} had ${GDP * 1_000_000_000} GDP in {year}"  # GDP is in billions
print(my_string)

In these cases, the `f` in front of the string causes Python to interpolate
any valid expression within the `{}` braces.
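
Inside the braces you can also add an optional *format specifier* after a colon to control how a number is displayed. The sketch below uses two standard specifiers, `,.0f` (thousands separators, no decimals) and `.1%` (percentage with one decimal). The growth figure reuses the China GDP numbers from the tuples section; the name `growth` is made up for this illustration.

```python
GDP = 223.9  # in billions, as in the Vietnam example

# ,.0f -> thousands separators and no decimal places
print(f"${GDP * 1_000_000_000:,.0f}")  # $223,900,000,000

# .1% -> multiply by 100 and show one decimal with a percent sign
growth = (11.06 - 10.48) / 10.48
print(f"GDP growth: {growth:.1%}")  # GDP growth: 5.5%
```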

Alternatively, to reuse a formatted string, you can call the `format` *method* (noting that you do **not** put `f` in front).

In [None]:
gdp_string = "{country} had ${GDP} billion in {year}"

gdp_string.format(country = "Vietnam", GDP = 223.9, year = 2017)
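
The benefit of `format` is that one template can be filled in many times. A minimal sketch, reusing the China figures from the tuples section (GDP in trillions); the names `template` and `rows` are made up for this illustration.

```python
template = "{country} had ${GDP} trillion GDP in {year}"
rows = [(2015, 11.06), (2014, 10.48)]

# Fill the same template once per (year, GDP) tuple
for year, gdp in rows:
    print(template.format(country="China", GDP=gdp, year=year))
# China had $11.06 trillion GDP in 2015
# China had $10.48 trillion GDP in 2014
```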

#### Exercise 2

Look up a country in the [World Bank database](https://data.worldbank.org/), and format a string showing the growth rate of its GDP over the last 2 years.

Note, incidentally, how we created a link in markdown in the previous sentence. Double-click on this cell to see the syntax for a link in markdown.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

Now, instead of hard-coding the values above, try using the `format` method, defining country, GDP and year as variables.

In [6]:
# YOUR CODE HERE
raise NotImplementedError()

#### Exercise 3

Create a new string and use formatting to produce each of the following statements:
- "The 1st quarter revenue was 110M"
- "The 2nd quarter revenue was 95M"
- "The 3rd quarter revenue was 100M"
- "The 4th quarter revenue was 130M"

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

## Numpy

Numpy’s core contribution is a data-type called an array. An array is similar to a list, but numpy imposes some additional restrictions on how the data inside is organized.

These restrictions allow numpy to

1. Be more efficient in performing mathematical and scientific computations.
2. Expose functions that allow numpy to do the necessary linear algebra for machine learning and statistics.

Before we get started, please note that the convention for importing the numpy package is to use the nickname np:

In [17]:
import numpy as np

### Array Indexing

An array is a multi-dimensional grid of values. Now that there are multiple dimensions, indexing might feel somewhat non-obvious. Do the rows or columns come first? In higher dimensions, what is the order of
the index?

In the example below, the array is built from nested lists. Indexing into the array corresponds to choosing an element from each level of nesting.

First, notice that a 3-dimensional array gives two stacked matrices:

In [None]:
x_3d_list = [[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]]
x_3d = np.array(x_3d_list)
print(x_3d)

We can access the matrices with

In [None]:
print(x_3d[0])
print(x_3d[1])

In the case of the first, it is synonymous with

In [None]:
print(x_3d[0, :, :])

Let’s work through another example to further clarify this concept with our
3-dimensional array.

Our goal will be to find the index that retrieves the `4` out of `x_3d`.

Recall that when we created `x_3d`, we used the list `[[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]]`.

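Notice that element 0 of that list is `[[1, 2, 3], [4, 5, 6]]`. This is the
list that contains the `4`, so the first index we use is 0. Within it,
`[4, 5, 6]` is element 1, and the `4` is element 0 of that row, so the
remaining indices are 1 and 0.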

In [None]:
print(x_3d[0, 1, 0])
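
To make the steps explicit, the sketch below peels off one index at a time and checks that the result matches indexing the nested list directly.

```python
import numpy as np

x_3d_list = [[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]]
x_3d = np.array(x_3d_list)

print(x_3d.shape)     # (2, 2, 3): 2 matrices, each with 2 rows and 3 columns
print(x_3d[0])        # the first matrix
print(x_3d[0, 1])     # second row of the first matrix: [4 5 6]
print(x_3d[0, 1, 0])  # first entry of that row: 4

# The same element via the nested list uses one bracket pair per level
print(x_3d_list[0][1][0])  # 4
```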

**Exercise 4**

What would you do to extract the array `[[5, 6], [50, 60]]` from `x_3d`?

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

### Array Functionality

#### Creating Arrays

It’s usually impractical to define arrays by hand as we have done so far.

We’ll often need to create an array with default values and then fill it
with other values.

We can create arrays with the functions `np.zeros` and `np.ones`.

Both functions take a tuple that denotes the shape of the array and create an
array filled with 0s or 1s, respectively.

In [None]:
sizes = (2, 3, 4)
x = np.zeros(sizes) # note, a tuple!
x

In [None]:
y = np.ones((4,))  # a one-element tuple needs a trailing comma; np.ones(4) also works
y
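
A minimal sketch of the "create, then fill" pattern described above: start from an array of zeros and assign values by index. The shape and the fill values are arbitrary choices for this illustration.

```python
import numpy as np

# Start from a 2x3 array of zeros (floats by default)
x = np.zeros((2, 3))

x[0, :] = [1, 2, 3]  # fill the whole first row
x[1, 1] = 5          # fill a single element
print(x)
print(x.shape)       # (2, 3)
```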

#### Broadcasting Operations

Two types of operations that will be useful for arrays of any dimension are:

1. Operations between an array and a single number.  
1. Operations between two arrays of the same shape.  


When we perform operations on an array by using a single number, we simply apply that operation to every element of the array.

In [None]:
# Using np.ones to create an array
x = np.ones((2, 2))
print("x = ", x)
print("2 + x = ", 2 + x)
print("2 - x = ", 2 - x)
print("2 * x = ", 2 * x)
print("x / 2 = ", x / 2)

Operations between two arrays of the same size, in this case `(2, 2)`, simply apply the operation element-wise between the arrays.

In [None]:
x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.ones((2, 2))
print("x = ", x)
print("y = ", y)
print("x + y = ", x + y)
print("x - y = ", x - y)
print("(elementwise) x * y = ", x * y)
print("(elementwise) x / y = ", x / y)

**Exercise 5**

Consider the code `[1, 2, 3] + [4]`. What is the difference between the output of this code and the output of `np.array([1, 2, 3]) + [4]`?

YOUR ANSWER HERE

**Exercise 6**

Do you recall what multiplication by an integer did for lists?

How does broadcasting differ?

YOUR ANSWER HERE

#### Universal Functions

We will often need to transform data by applying a function to every element of an array.

Numpy has good support for these operations, called *universal functions* or ufuncs for short.

The
[numpy documentation](https://docs.scipy.org/doc/numpy/reference/ufuncs.html?highlight=ufunc#available-ufuncs)
has a list of all available ufuncs.

>**Note**
>
>You should think of operations between a single number and an array, as we
just saw, as a ufunc.

Below, we will create an array that contains 10 evenly spaced points between 0.5 and 25.

In [28]:
# This is similar to range -- but spits out 10 evenly spaced points from 0.5 to 25.
x = np.linspace(0.5, 25, 10)
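
A quick way to check what `np.linspace` produced is to look at the length, the endpoints, and the spacing of the result:

```python
import numpy as np

x = np.linspace(0.5, 25, 10)
print(len(x))       # 10
print(x[0], x[-1])  # 0.5 25.0 (both endpoints are included)
print(x[1] - x[0])  # spacing between neighbouring points
```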

We will experiment with some ufuncs below:

In [None]:
# Applies the sin function to each element of x
np.sin(x)

You can use the inspector or the docstrings with `np.<TAB>` to see other available functions, such as

In [None]:
# Takes natural log of each element of x
np.log(x)

A benefit of using numpy arrays is that vectorized operations can be combined in succinct code.

In [None]:
# Calculate log(z) * z elementwise
z = np.array([1,2,3])
np.log(z) * z
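
For comparison, here is a sketch of the same calculation written as an explicit loop over a plain list, using `math.log` from the standard library. The names `vectorized` and `looped` are made up for this illustration.

```python
import math
import numpy as np

z = np.array([1, 2, 3])

# Vectorized: one expression handles every element
vectorized = np.log(z) * z

# Equivalent loop over a plain list, one element at a time
looped = []
for value in [1, 2, 3]:
    looped.append(math.log(value) * value)

print(np.allclose(vectorized, looped))  # True
```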

**Exercise 7**

Recall from your courses that the equation for pricing a bond with coupon payment $ C $,
face value $ M $, yield to maturity $ i $, and periods to maturity
$ N $ is

$$
\begin{align*}
    P &= \left(\sum_{n=1}^N \frac{C}{(1+i)^n}\right) + \frac{M}{(1+i)^N} \\
      &= C \left(\frac{1 - (1+i)^{-N}}{i} \right) + M(1+i)^{-N}
\end{align*}
$$
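
For reference, the second line follows from the finite geometric series. A short derivation, writing $ r = 1/(1+i) $ (a symbol introduced only for this step):

$$
\begin{align*}
    \sum_{n=1}^N \frac{1}{(1+i)^n} = \sum_{n=1}^N r^n &= \frac{r \left(1 - r^N\right)}{1 - r} \\
      &= \frac{1 - (1+i)^{-N}}{i}
\end{align*}
$$

since $ r / (1 - r) = 1 / i $. Multiplying by $ C $ gives the coupon term in the second line of the pricing equation.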

In the code cell below, we have defined variables for `i`, `M` and `C`.

You have two tasks:

1. Define a numpy array `N` that contains all maturities between 1 and 10, including endpoints (*hint:* look at the `np.arange` function).  
1. Using the equation above, determine the bond prices of all maturity levels in your array.  

In [None]:
i = 0.03
M = 100
C = 5

# Define array here
raise NotImplementedError()

# price bonds here
raise NotImplementedError()