# Homework 3 Solutions

This homework is mostly an introduction to NumPy, arguably the most useful library in all of scientific programming. We will see just a taste of what it is truly capable of. Next week we will explore how you can use it with other libraries to do some pretty spectacular things, but for now we will focus on the basics.

The cell directly below is where we typically write our `import` statements. We call this the `import` suite (later there will be a lot more things to import).

In [None]:
import numpy as np  # import NumPy under the alias np

# Playing with Numpy Arrays (20 Points)

## Problem 1 (5 Points)

Create an $8 \times 8$ array with a checkerboard pattern of zeros and ones using a slicing + striding approach.

In [None]:
###Your Code Here###
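One possible sketch of the slicing + striding approach is shown here. The variable name `board` is just an illustrative choice, and this is only one of several ways to build the pattern.

```python
import numpy as np

# Start from an 8x8 array of zeros (integer dtype so it prints as 0/1).
board = np.zeros((8, 8), dtype=int)

# Odd-numbered rows (1, 3, 5, 7) get ones in the even columns (0, 2, 4, 6).
board[1::2, ::2] = 1
# Even-numbered rows (0, 2, 4, 6) get ones in the odd columns (1, 3, 5, 7).
board[::2, 1::2] = 1

print(board)
```

The slice `1::2` reads as "start at index 1 and take every second element", so two assignments are enough to fill every other square.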

## Problem 2 (5 Points)

Create a random $5\times5$ array using the function `np.random.rand(5, 5)` and print the result.

In [None]:
###Your Code Here###

## Problem 3 (5 Points)

Use `np.random.rand(5,5)` to create a random 5x5 array. Use boolean indexing to set the nine entries in a 3x3 block in the bottom right corner to 0. Print the result.

(Hint: Use `np.ones()` or `np.zeros()` with the dtype argument to get a 5x5 array full of `True`'s or `False`'s. Use this array to build a boolean filter, in order to set only the entries you want equal to 0.)

In [None]:
###Your Code Here###
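A minimal sketch of the boolean-filter idea from the hint might look like the following. The names `values` and `mask` are illustrative assumptions, not names required by the problem.

```python
import numpy as np

values = np.random.rand(5, 5)

# Boolean filter: False everywhere, then True on the bottom-right 3x3 block.
mask = np.zeros((5, 5), dtype=bool)
mask[2:, 2:] = True

# Boolean indexing: only the entries where mask is True are overwritten.
values[mask] = 0
print(values)
```

The filter has the same shape as the data, so `values[mask]` selects exactly the nine entries marked `True`.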

## Problem 4 (5 Points)

Create a 1D numpy array that is equivalent to the following list:

    numbers = [1,2,3,4,5,6,7,8,9,10]
              
Once you have created this array, append the even values following the number 10 going all the way to 100. So your array should be the same as a list that looks like:

    modified_nums = [1,2,3,4,5,6,7,8,9,10,12,14,16...96,98,100]
    
Hint 1: Try using a for loop. You can use Python's `range` function, which gives you a range of numbers to iterate over. For example, `range(5)` gives an iterable object that behaves like the list `[0,1,2,3,4]`. I encourage you to google how to use it so that you can start your range at a number other than 0.

Hint 2: NumPy appends numbers to an array with slightly different syntax than Python's standard list syntax. I recommend googling the difference so that you don't get lots of errors.

In [None]:
###Your Code Here###
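The two hints can be combined roughly as follows. This is a sketch of one approach, not the only valid one; the key point it illustrates is that `np.append` returns a new array instead of changing the original in place.

```python
import numpy as np

numbers = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# np.append returns a NEW array; it does not modify its input in place,
# so the result has to be assigned back to a name each time.
modified_nums = numbers
for even in range(12, 101, 2):  # 12, 14, ..., 100 (the stop value 101 is excluded)
    modified_nums = np.append(modified_nums, even)

print(modified_nums)
```

Compare this with a Python list, where `my_list.append(value)` changes the list itself and returns `None`.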

# Challenge Problem

## Problem 5: Binomial Coefficients (20 Points)

[Adapted from Newman, Exercise 2.11 and Physics 77] The binomial coefficient $n \choose k$ is an integer equal to

$$ {n \choose k} = \frac{n!}{k!(n-k)!} = \frac{n \times (n-1) \times (n-2) \times \cdots \times (n-k + 1)}{1 \times 2 \times \cdots \times k} $$

when $k \geq 1$, or ${n \choose 0} = 1$ when $k=0$. (The special case $k=0$ can be included in the general definition by using the conventional definition $0! \equiv 1$.)

1. Write a function `factorial(n)` that takes an integer $n$ and returns $n!$ as an integer. It should yield $1$ when $n=0$. You may assume that the argument will always be an integer greater than or equal to 0.

1. Using the form of the binomial coefficient given above, write a function `binomial(n,k)` that calculates the binomial coefficient for given $n$ and $k$. Make sure your function returns the answer in the form of an integer (not a float) and gives the correct value of 1 for the case where $k=0$. (Hint: Use your `factorial` function from Part 1.)

1. Using your `binomial` function, write a function `pascals_triangle(N)` to print out the first $N$ lines of "Pascal's triangle" (starting with the $0$th line). The $n$th line of Pascal's triangle contains $n+1$ numbers, which are the coefficients $n \choose 0$, $n \choose 1$, and so on up to $n \choose n$. Thus the first few lines are
        1
        1 1
        1 2 1
        1 3 3 1
        1 4 6 4 1     
This would be the result of `pascals_triangle(5)`. Print the first 15 rows of Pascal's triangle.
        
1. The probability that an unbiased coin, tossed $n$ times, will come up heads $k$ times is ${n \choose k} / 2^n$. (Or instead of coins, perhaps you'd prefer to think of spins measured in a [Stern-Gerlach experiment](https://en.wikipedia.org/wiki/Stern%E2%80%93Gerlach_experiment).)
    - Write a function `heads_exactly(n,k)` to calculate the probability that a coin tossed $n$ times comes up heads exactly $k$ times.
    - Write a function `heads_atleast(n,k)` to calculate the probability that a coin tossed $n$ times comes up heads $k$ or more times.
    - Print the probabilities (to three decimal places) that a coin tossed 100 times comes up heads exactly 60 times, and at least 60 times. You should print corresponding statements with the numbers so it is clear what they each mean.

#### Output

To summarize, your program should output the following things:

1. The first 15 rows of Pascal's triangle
1. The probabilities (to three decimal places) that a coin tossed 100 times comes up heads exactly 60 times, and at least 60 times, with corresponding statements so it is clear what each number signifies.

#### Reminder

Remember to write informative docstrings, comment your code, and use descriptive function and variable names so others (and future you) can understand what you're doing!

**Since this is a challenge problem, don't worry about getting the whole thing right; an honest attempt will get you full points or a couple points docked.**

In [None]:
###Your Code Here###
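One way the pieces of this problem could fit together is sketched below. The function names come from the problem statement; the loop-based `factorial` and the use of floor division (`//`) to keep the result an integer are choices made for this sketch. It prints only 5 rows of the triangle to match the example above; change the argument to 15 for the full output.

```python
def factorial(n):
    """Return n! for an integer n >= 0 (0! is 1)."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def binomial(n, k):
    """Return the binomial coefficient n choose k as an integer."""
    # Floor division (//) keeps the result an int; the division is exact.
    return factorial(n) // (factorial(k) * factorial(n - k))


def pascals_triangle(N):
    """Print the first N lines of Pascal's triangle, starting at line 0."""
    for n in range(N):
        print(" ".join(str(binomial(n, k)) for k in range(n + 1)))


def heads_exactly(n, k):
    """Probability of exactly k heads in n tosses of an unbiased coin."""
    return binomial(n, k) / 2**n


def heads_atleast(n, k):
    """Probability of k or more heads in n tosses of an unbiased coin."""
    return sum(heads_exactly(n, j) for j in range(k, n + 1))


pascals_triangle(5)
print("P(exactly 60 heads in 100 tosses)  = {0:0.3f}".format(heads_exactly(100, 60)))
print("P(at least 60 heads in 100 tosses) = {0:0.3f}".format(heads_atleast(100, 60)))
```

Because `factorial(0)` returns 1, the case $k=0$ needs no special handling in `binomial`.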

# Data Manipulation (File IO)

### Note: The following introduction was adapted from Physics 77 and modified for our needs.

#### ASCII Files or .txt Files

Think of ASCII files as text files. You can open them using a text editor (like vim or emacs in Unix, or Notepad in Windows) and read the information they contain directly. There are a few ways to produce these files, and to read them once they've been produced. In Python, the simplest way is to use file objects. 

Let's give it a try. We create an abstract file object by calling the function `open( filename, access_mode )` and assigning its return value to a variable (usually `f`). The argument `filename` just specifies the name of the file we're interested in, and `access_mode` tells Python what we plan to do with that file:  

    'r': read the file  
    'w': write to the file (creates a new file, or clears an existing file)
    'a': append to the file  
     
Note that both arguments should be strings.
For full syntax and special arguments, see documentation at https://docs.python.org/2/library/functions.html#open

In [None]:
f = open( 'welcome.txt', 'w' )

**A note of caution**: as soon as you call `open()` in write mode (`'w'`), Python creates a new file with the name you pass to it, and it will overwrite any existing file of the same name.

Now we can write to the file using `f.write( thing_to_write )`. We can write anything we want, but it must be formatted as a string.

In [None]:
topics = ['Data types', 'Loops', 'Functions', 'Arrays']

In [None]:
f.write( 'Welcome to the Python DeCal, Fall 2020\n' ) # the newline command \n tells Python to start a new line
f.write( 'Topics we will learn about include:\n' )
for top in topics:
    f.write( top + '\n')
f.close()                                         # don't forget this part!
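As an aside, Python's `with` statement is a common alternative that closes the file for you, so there is no `f.close()` to forget. The sketch below repeats the pattern above; the filename `welcome_copy.txt` is a hypothetical name chosen so that the example does not overwrite `welcome.txt`.

```python
topics = ['Data types', 'Loops', 'Functions', 'Arrays']

# 'welcome_copy.txt' is a hypothetical filename, chosen so that this sketch
# does not overwrite welcome.txt.
with open('welcome_copy.txt', 'w') as f:
    f.write('Topics we will learn about include:\n')
    for top in topics:
        f.write(top + '\n')

# The file is closed automatically when the with-block ends.
print(f.closed)
```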

## Problem 6 (10 Points)

### Part a)

Use the syntax you have just learned to create an ASCII file titled "`cosine.txt`" with two columns containing 20 x and 20 y values. The x values should range from $0$ to $2\pi$; you can use `np.linspace()` to generate these values (as many as you want). The y values should be $y = \cos(x)$ (you can use `np.cos()` for this). Then, use a `for` loop as above to write a new line for each pair of x and y values. To make sure that each x,y pair is on a new line, remember to add `\n` to the end of each line like above. To separate the values by a tab so that the columns are nicely aligned, you can use the "character" `\t`. So `\n` inserts a new line and `\t` inserts a tab. You may wish to use some kind of string formatting to keep the decimals from running too long. Here is an example with just one data point:

    x = 0.5 * np.pi
    y = np.cos(x)
    print("{0:0.5f} \t {1:0.5f}".format(x,y))

Pay close attention to the fact that when you use the `write` function, the argument that you pass to it needs to be a string.

In [None]:
###Your Code Here###
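To illustrate the write-a-line-per-pair pattern without giving away the exact file, here is a sketch that uses a different function and fewer points. The filename `sine_example.txt`, the choice of `np.sin`, and the five sample points are all assumptions made for this example.

```python
import numpy as np

# Five sample points between 0 and pi (a smaller, made-up example).
x_values = np.linspace(0, np.pi, 5)
y_values = np.sin(x_values)

f = open('sine_example.txt', 'w')
for x, y in zip(x_values, y_values):
    # format() turns the two numbers into one string: x, a tab, y, a newline.
    f.write("{0:0.5f}\t{1:0.5f}\n".format(x, y))
f.close()
```

`zip` pairs up the two arrays so the loop visits one `(x, y)` pair at a time.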

### Part b)

We can use the code in the cell beneath this one to read our values back into the Jupyter notebook from our `welcome.txt` file. (If you are curious about what `.strip()` does, remove it and see what happens.)

In [None]:
f = open( 'welcome.txt', 'r' )
for line in f:
    print(line.strip())
f.close()

Your task for part b) is to read in the values of `cosine.txt` the same way we just did for the `welcome.txt` file above.

In [None]:
###Your Code Here###

Suppose we wanted to skip the first two lines of `welcome.txt` and print only the list of topics `('Data types', 'Loops', 'Functions', 'Arrays')`. We can use `readline()` to "read" the first two lines but not store their value, thereby ignoring them.

In [None]:
f = open( 'welcome.txt', 'r' )
f.readline()
f.readline() # skip the first two lines
topicList = []
for line in f:
    topicList.append(line.strip())
f.close()
print(topicList)

# The Easy Way

The above method is how you read in a text file using native Python with no libraries. I don't know about you, but that sucked. It was long and tedious and a lot to remember. Luckily, the people who created the `numpy` library thought so too. They figured out how to do almost all the stuff we just did in one line of code, using `numpy.loadtxt()`.


Below we load a `.csv` file. This is basically the same thing as a `.txt` file but much easier to work with. CSV stands for Comma Separated Values, meaning that inside the file itself everything is separated by a comma.

The `.loadtxt` function takes the file path/name as a string. (Keep track of the file's directory path; if your notebook is in the same directory as the file, you don't need to include the directory.) We set the `delimiter` parameter to a comma so that numpy knows how to separate everything the way we want. Finally, we set the `skiprows` argument to 1 to skip the first row of the file, because the first row has the names of each column.

In [None]:
data = np.loadtxt('sample_data.csv', delimiter=',', skiprows = 1)

In [None]:
data

Huzzah! This is just a 2D numpy array. Isn't that neat? It can be a little cumbersome to separate neatly, so let's add an extra argument called `unpack` and set it equal to `True`. This essentially tells numpy to split each column up into a 1D array that we can treat as a list.

Now we can assign names to each column's corresponding values and treat them as variables. This sample data comes from my own measurements of the gravitational acceleration at Earth's surface ($g_{earth}$), taken from my dorm.

In [None]:
trials, values = np.loadtxt('sample_data.csv', unpack=True, delimiter=',', skiprows = 1)

In [None]:
trials

In [None]:
values
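If you do not have `sample_data.csv` on hand, the behavior of `delimiter`, `skiprows`, and `unpack` can be tried out on a tiny stand-in file. Everything about `example_data.csv` below (its name, header, and numbers) is made up for illustration and is not the course data.

```python
import numpy as np

# Hypothetical stand-in for sample_data.csv: a header row plus made-up values.
with open('example_data.csv', 'w') as f:
    f.write('trial,value\n')
    f.write('1,9.79\n')
    f.write('2,9.83\n')
    f.write('3,9.81\n')

# Without unpack: one 2D array with a row for each data line of the file.
table = np.loadtxt('example_data.csv', delimiter=',', skiprows=1)
print(table.shape)

# With unpack=True: the array is transposed, so each column comes back
# as its own 1D array.
example_trials, example_values = np.loadtxt(
    'example_data.csv', delimiter=',', skiprows=1, unpack=True)
print(example_trials)
print(example_values)
```

Note that `np.loadtxt` reads every entry as a float by default, so the trial numbers come back as `1.0`, `2.0`, `3.0`.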

## Problem 7 (10 Points)

In lecture we discussed a couple of neat functions that numpy has built in and ready for us to use.

**Part a)** Your task for this last problem is to calculate:

- The Minimum
- The Maximum
- The Median
- The Mean
- The Standard Deviation

of the dataset. Print the values in a well-formatted string with three digits after the decimal point. The data is reported in SI units.

In [None]:
###Your Code Here###
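As a rough sketch of how these summary statistics can be computed and formatted, the example below uses a small made-up array in place of the real dataset. The array `example_values` and its numbers are assumptions for illustration only.

```python
import numpy as np

# Made-up measurements (not the course data), in m/s^2.
example_values = np.array([9.79, 9.83, 9.81, 9.76, 9.86])

print("Minimum:            {0:0.3f} m/s^2".format(np.min(example_values)))
print("Maximum:            {0:0.3f} m/s^2".format(np.max(example_values)))
print("Median:             {0:0.3f} m/s^2".format(np.median(example_values)))
print("Mean:               {0:0.3f} m/s^2".format(np.mean(example_values)))
# np.std computes the population standard deviation by default (ddof=0).
print("Standard deviation: {0:0.3f} m/s^2".format(np.std(example_values)))
```

If you want the sample standard deviation instead, `np.std` accepts `ddof=1`.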

**Part b)** Now have Python figure out which trial corresponds to the maximum and minimum data values. Do not just look at the data; use what you know to figure out which trial corresponds to which data point.

In [None]:
###Your Code Here###
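One possible approach, sketched here on made-up arrays, is to find the position of the extreme value with `np.argmax` / `np.argmin` and then use that position to index the trials array. The arrays `example_trials` and `example_values` are hypothetical stand-ins for the real columns.

```python
import numpy as np

# Made-up stand-ins for the two columns of the dataset.
example_trials = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
example_values = np.array([9.79, 9.83, 9.81, 9.76, 9.86])

# argmax / argmin return the INDEX of the largest / smallest entry.
index_of_max = np.argmax(example_values)
index_of_min = np.argmin(example_values)

# The same index picks out the matching trial number.
trial_of_max = example_trials[index_of_max]
trial_of_min = example_trials[index_of_min]

print("Maximum value {0:0.3f} occurred in trial {1:0.0f}".format(
    example_values[index_of_max], trial_of_max))
print("Minimum value {0:0.3f} occurred in trial {1:0.0f}".format(
    example_values[index_of_min], trial_of_min))
```

This works because the two arrays are aligned: entry `i` of the values array belongs to entry `i` of the trials array.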