# Exercises 3 - with solutions

### Exercise: the Fibonacci sequence

The [Fibonacci sequence](http://en.wikipedia.org/wiki/Fibonacci_number) is a famous sequence of numbers. It begins with 1, 1, and every later element is the sum of the previous two elements. Exercise: find the 100th element of the sequence! Also, how many digits does the 1000th element have?

* initialize a list `sequence = [1, 1]`
* use a while loop to test whether `sequence` has a length less than 1000
* if yes, calculate the next value and append to the sequence
* you can turn an integer into a string by using the `str()` function: this way you can determine the number of digits of a number (with which function?)

### Solution

In [1]:
sequence = [1, 1]
while len(sequence) < 1000:
    # generate new number
    new_num = sequence[-1] + sequence[-2]
    # add the new value
    sequence.append(new_num)
    
print("The 100th number in the Fibonacci sequence is %d" % sequence[99])
print("The 1000th number in the Fibonacci sequence has %d digits" % len(str(sequence[999])))

The 100th number in the Fibonacci sequence is 354224848179261915075
The 1000th number in the Fibonacci sequence has 209 digits
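
The solution above keeps all 1000 elements in a list. As a minimal alternative sketch, the same numbers can be computed with only two variables when the earlier elements are not needed. The function name `fibonacci` is an assumption made for this illustration, and it counts elements from 1 as the exercise does.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number, counting from fibonacci(1) == 1."""
    previous, current = 1, 1
    # after k passes through the loop, "previous" holds element k + 1
    for _ in range(n - 1):
        previous, current = current, previous + current
    return previous


print("The 100th number in the Fibonacci sequence is %d" % fibonacci(100))
print("The 1000th number has %d digits" % len(str(fibonacci(1000))))
```

Both printed values should agree with the list-based solution.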


### Exercise: the birthday paradox

The [birthday problem](http://en.wikipedia.org/wiki/Birthday_problem) is a classical and surprising observation in probability theory. It loosely states that in a group of people, there is a surprisingly high probability that two of them share a birthday. Even in a group of only 23 people, this probability is about 0.5. In this problem, you should verify this by simulation.

1. At the beginning of your code, put `import random`. This will load Python functions that generate random numbers.
2. To generate a random integer between a and b (both endpoints included), use `random.randint(a, b)`. Inspect its help by typing `random.randint?`
and experiment with it for some time.
3. Write a function that takes a parameter `n_class` and generates a list of birthdays. The list has length `n_class`. Each element of the list is a random integer between 1 and 365, representing the possible birthdays (don't bother with 29th February for now).
    * you can do this either by using a for-loop or using list comprehension
    * the "skeleton" of your function should look like
    ```
    def generate_class_birthdays(n_class):
          """
          Generate a list containing "n_class" random integers between 1 and 365,
          representing birthdays.
          """
          # here comes your code that will produce a variable "list_birthdays" that the
          # function returns
          return list_birthdays
    ```
4. Write another function `are_elements_unique` that takes a list as input and returns `True` if all elements are distinct and `False` if at least two elements are the same
    * you can get the number of unique elements of a list `list_a` by calling `len(set(list_a))`
    * there are two elements that are the same if the number of unique elements is strictly less than the number of elements
5. Write a third function `birthday_paradox` that takes two inputs, `n_class` and `n_reps`, the size of the class and the number of random classes we want to generate. Within the function, write a loop that does the following `n_reps` times:
    * generate a list containing birthdays using `generate_class_birthdays(n_class)`
    * use `are_elements_unique` to check whether the list contains common birthdays
        * if elements are unique, it means there are NO common birthdays
    * before the for loop, you should define an integer `count_if_common = 0` whose value you increase by 1 if the currently generated class contains common birthdays
After the loop, divide `count_if_common` by `n_reps` to get the probability that a class of size `n_class` contains common birthdays. BE CAREFUL! In Python 2, dividing an integer by a bigger integer gives zero, so convert `n_reps` to float first, that is, calculate `count_if_common / float(n_reps)`. (In Python 3, `/` already returns a float, so the conversion is harmless.)
    * if you want to print a `float`, use the `%.2f` formatter for 2 decimal places, `%.4f` for 4 decimal places, etc., similarly to `%d` and `%s`

6. Test your function using `n_class = 23` and `n_reps = 10000`.

### Solution

In [2]:
import random

def generate_class_birthdays(n_class):
    """
    Generate a list containing "n_class" random integers between 1 and 365,
    representing birthdays.
    """
    list_birthdays = []
    for x in range(n_class):
        list_birthdays.append(random.randint(1, 365))
    return list_birthdays
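
The exercise mentions that a list comprehension works as well as a for-loop. A minimal sketch of that variant follows; the name `generate_class_birthdays_lc` is an assumption chosen here so that it does not clash with the solution's function.

```python
import random


def generate_class_birthdays_lc(n_class):
    """
    Generate a list containing "n_class" random integers between 1 and 365,
    built with a list comprehension instead of an explicit for-loop.
    """
    # "_" signals that the loop variable itself is not used
    return [random.randint(1, 365) for _ in range(n_class)]


print(generate_class_birthdays_lc(5))
```

The printed list changes from run to run because the birthdays are random.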

In [3]:
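def are_elements_unique(list_a):
    """
    Return True if all elements of list_a are distinct, False otherwise.
    """
    # the parameter is called list_a so that it does not shadow the built-in "list"
    return len(list_a) == len(set(list_a))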

In [4]:
def birthday_paradox(n_class, n_reps):
    """
    Prints the probability that in a group of people with size n_class
    there are at least two people with the same birthdays.
    """
    count_if_common = 0
    for x in range(n_reps):
        group_birthdays = generate_class_birthdays(n_class)
        if not are_elements_unique(group_birthdays):
            count_if_common += 1
    
    print("The probability for a class of size %d is %.2f" % (n_class, count_if_common / float(n_reps)))
    

In [5]:
birthday_paradox(23, 10000)

The probability for a class of size 23 is 0.51
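
The simulated value can be compared with the exact probability. Assuming 365 equally likely birthdays, the chance that all `n_class` birthdays differ is the product of `(365 - i) / 365` for `i = 0, ..., n_class - 1`, and the probability of a shared birthday is one minus that product. A minimal sketch, with a function name that is an assumption made for this illustration:

```python
def exact_shared_birthday_probability(n_class, n_days=365):
    """
    Exact probability that at least two of n_class people share a birthday,
    assuming n_days equally likely birthdays.
    """
    p_all_distinct = 1.0
    for i in range(n_class):
        # person i + 1 must avoid the i birthdays already taken
        p_all_distinct *= (n_days - i) / float(n_days)
    return 1.0 - p_all_distinct


print("Exact probability for a class of size 23: %.4f"
      % exact_shared_birthday_probability(23))
```

For 23 people this gives roughly 0.507, so a simulated estimate such as 0.51 from 10000 repetitions is in line with the exact value.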


### Exercise: read and write .csv data

One level above the directory that contains the file you are working in, create a directory called "data" and put the data file `amazon_stock_data.csv` there. The file contains Amazon's stock price at a daily frequency.

1. Read the data using the `csv` module's `DictReader` and build a list of the daily price data. That is, this will be a list of dictionaries where each dictionary contains the data from one day.

2. Go through this list and for each dictionary, create a new key-value pair: `"Avg"` will be the key and the value will be the average of the `Open` and `Close` prices.
    * define an empty list `list_new_dicts` in which you'll collect the new dictionaries
    * write a for loop that steps through the dictionaries
        * from each dictionary, get the values corresponding to the keys `"Open"` and `"Close"`. These are strings; turn them into `float`s using the `float` function
        * take the average of these two float numbers and assign it to a variable `avg_price`. Its type is `float`.
        * add the new key-value pair, with `"Avg"` as key and the calculated average as value. CAUTION! First turn the float value into a string using `"{:.2f}".format(avg_price)`
        * append the updated dictionary to `list_new_dicts`

3. Take the new list and write it using `csv.DictWriter`. You have to supply an argument `fieldnames` that specifies the fields that are written (just copy and paste, and inspect how you can use it).

```
# write to a NEW file so that the original data file is not overwritten
with open("../data/amazon_stock_data_new.csv", "w") as f:
    writer = csv.DictWriter(f, fieldnames=["Date", "Open", "High", "Low", "Close", "Volume", "Adj Close", "Avg"],
                            delimiter=",")
    # write a header
    writer.writeheader()
    # loop through the new list of dictionaries
    for d in list_new_dicts:
        writer.writerow(d)
```
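
Before touching the real file, the three steps can be tried on a tiny in-memory example. The sketch below assumes Python 3 and uses `io.StringIO` in place of files. The two data rows are hypothetical values made up for illustration, and only three of the original columns are used.

```python
import csv
import io

# Hypothetical two-row sample; the numbers are made up for illustration only.
sample = "Date,Open,Close\n2000-01-03,10.00,12.00\n2000-01-04,12.00,11.00\n"

# 1. read: every row becomes a dictionary whose values are strings
rows = list(csv.DictReader(io.StringIO(sample)))

# 2. add the "Avg" key, stored as a string with two decimal places
for row in rows:
    avg_price = (float(row["Open"]) + float(row["Close"])) / 2
    row["Avg"] = "{:.2f}".format(avg_price)

# 3. write: "fieldnames" fixes which columns are written and in what order
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["Date", "Open", "Close", "Avg"])
writer.writeheader()
writer.writerows(rows)

result = out.getvalue()
print(result)
```

Here `writerows` writes all dictionaries at once; it is equivalent to calling `writerow` inside a for loop.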

### Solution

In [9]:
# 1.
import csv

list_dicts = []
with open("../data/amazon_stock_data.csv", "r") as f:
    reader = csv.DictReader(f, delimiter=",")
    for d in reader:
        list_dicts.append(d)

In [11]:
# 2.
list_new_dicts = []
for d in list_dicts:
    avg_price = (float(d["Open"]) + float(d["Close"]))/2
    d["Avg"] = "{:.2f}".format(avg_price)
    list_new_dicts.append(d)

In [12]:
# 3.
with open("../data/amazon_stock_data_new.csv", "w") as f:
    writer = csv.DictWriter(f, fieldnames=["Date", "Open", "High", "Low", "Close", "Volume", "Adj Close", "Avg"],
                            delimiter=",")
    # write a header
    writer.writeheader()
    # loop through the new list of dictionaries
    for d in list_new_dicts:
        writer.writerow(d)