# CMSE 202 Midterm (Section 002)

<img src="https://cdn.mos.cms.futurecdn.net/YctSHrw8bPz94HrLyfyruB-970-80.jpg" width=300px align="right" style="margin-left: 20px" alt="Image credit: http://www.jnslp.com">

The goal of this midterm is to give you the opportunity to test out some of the skills that you've developed thus far this semester. In particular, you'll practice setting up a GitHub repository, committing and pushing repository changes, downloading files with command line tools, using documentation to learn about an unfamiliar function, and understanding and editing a Python class. You should find that you have all of the skills necessary to complete this exam with even just eight weeks of CMSE 202 under your belt!

You are encouraged to look through the entire exam before you get started so that you can appropriately budget your time and understand the broad goals of the exam.  If you get stuck on a problem, move on to the next section and return later.

At the end of the exam, upload your solutions (**including relevant figures and generated code**) to D2L and also push your changes to your git repository. 

**Important note about using online resources**: This exam is "open internet". That means that you can look up documentation, google how to accomplish certain Python tasks, etc. Being able to effectively use the internet for computational modeling and data science is a very important skill, so we want to make sure you have the opportunity to exercise that skill. **However**: The use of any person-to-person communication software is absolutely not acceptable. If you are seen accessing your email, using a chat program (e.g. Slack), or any sort of collaborative cloud storage or document software (e.g. Google Documents), you will be at risk for receiving a zero on the exam.

**Keep your eyes on your screen!** Unfortunately, there isn't enough space in the room for everyone to sit at their own table so please do your best to keep your eyes on your own screen. This exam is designed to give *you* the opportunity to show the instructor what you can do and you should hold yourself accountable for maintaining a high level of academic integrity. If any of the instructors observe suspicious behavior, you will, again, risk receiving a zero.

## Part 1: Setting up a repository for tracking changes (20 points)

Before you get too far along in the assignment, you need to set up a new folder in your **private** GitHub repository that you created for the course. You will store this notebook in that folder and track the changes as you make them. For this section you should:

1. Navigate to your local copy of your "`cmse202-f19-turnin`" repository.
2. In the repository, create a new directory called "`midterm`".
3. Move this notebook into that new directory within your repository.
4. Add the notebook file to your git repository with the appropriate git command.
5. Commit the addition to your repository using the appropriate git command.
6. Finally, to test that everything is working, "git push" the file so that it ends up in your GitHub repository.

**Make sure that your instructor and TA have access to the repository (this should already be the case). Their GitHub usernames are "cmse202-repo" (for punch) and "Luis-Polanco" (for Luis, obviously).**

From this point on you will occasionally be asked to save the state of your notebook, commit the changes, and push it to your new repository.

**Note**: If you're struggling with getting the Git repository set up correctly with this new directory, you can always just work on the notebook as is and try to come back and figure out the repository component later. You may lose some points though since you won't have periodic commits as you make progress along the way, but it will be better than not working on the other parts of the exam!

###  During the exam you will generate plots that you save to `.png` files, as well as new `.py` files. When you create these new files, don't forget to add them to your repository. You must also turn them in to D2L (see "Finish Up" at the end).

# Working with $\pi$ (in total 80 points)

The number $\pi$ is an unusual number in that it is a "public" number: books and movies are based on it, and it has fascinated people for millennia. In case you didn't know (how could that be?), $\pi$ is the ratio of a circle's circumference to its diameter.

<img src="https://i.imgur.com/34NPdBn.png" width = 200>

It is an *irrational* number (it cannot be represented as a fraction of two integers), so its decimal expansion never terminates or repeats. It is also a *transcendental* number: it is not the root of any non-zero polynomial equation with rational coefficients. Because its digits never end, people have tried for centuries to compute them, in modern times to an enormous number of digits. We are going to use $\pi$ as a background to test our abilities in Python.

## Part 1, Monte Carlo estimate (20 points)

One way to estimate the size of $\pi$ uses a Monte Carlo algorithm. The idea is shown in the figure below.

<img src="https://i.imgur.com/AWYQnJR.gif" width=200>

We create a square box and inscribe a circle within it, so that the side of the box equals the diameter of the circle. If we were to drop points uniformly at random inside the box, the number of points inside the circle divided by the total number of points in the box would approximate the ratio of the two areas, which is $\pi/4$. To make it easier, we can work only in the upper right quadrant: a 1x1 box with a quarter circle of radius 1 inscribed in it. The ratio of inside/total is again $\pi/4$, so multiplying that ratio by 4 gives an estimate of $\pi$. This is the situation shown in the figure above.

Write code below that takes as input from the user the number of points to use, and prints the resulting estimate of $\pi$. Also print the difference between your estimate and `numpy.pi`.

Be aware that this converges **very slowly**. Millions of points would be needed for a decent estimate, so just make sure it works and don't worry about the accuracy.

Speaking of which, using the Pythagorean theorem to find a point's distance from the origin typically requires a square root calculation. Is that required here?
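The quarter-circle setup described above can be sketched as follows. This is only one possible approach, not a reference solution: the function name `estimate_pi` and its `seed` parameter are hypothetical, the standard-library `random` module stands in for whichever random number generator you prefer, and `math.pi` is used because it holds the same value as `numpy.pi`.

```python
import math
import random


def estimate_pi(n_points, seed=None):
    """Monte Carlo estimate of pi from n_points uniform samples in the unit square."""
    # `seed` is a hypothetical convenience so that runs can be reproduced.
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        # Compare the squared distance from the origin with 1 (the squared radius).
        if x * x + y * y <= 1.0:
            inside += 1
    # inside / n_points approximates pi / 4, so scale by 4.
    return 4.0 * inside / n_points


estimate = estimate_pi(100_000, seed=0)
print("estimate:", estimate)
print("difference from pi:", abs(estimate - math.pi))
```

To take the number of points from the user, the hard-coded `100_000` could be replaced with something like `int(input("Number of points: "))`.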

In [None]:
# write your code here

## Part 2, Infinite fractions (20 pts)

There are a number of infinite series that can be used to estimate the value of $\pi$. The more terms used, the better the accuracy of the estimate. Here is one such series:

$$ \pi = 3 + \frac{4}{2 \times 3 \times 4} - \frac{4}{4 \times 5 \times 6} + \frac{4}{6 \times 7 \times 8} - \frac{4}{8 \times 9 \times 10} + \ldots $$

Do the following:
- implement the calculation of $\pi$ listed above
- calculate the value of $\pi$ for different numbers of terms in the series from 100 to 1000 by 100
- plot the error of each series estimate (its absolute difference from `numpy.pi`) against the number of terms used.  Because the error drops quickly, use a logarithmic y axis (a **semi-log plot**).
- save the plot as `fraction.png`
- add the image to both your repository and D2L
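As a rough sketch of the first two steps, assuming that "number of terms" means the number of fractions added to the leading 3 (the exam does not say which convention to use), the series could be summed like this. The name `series_pi` is hypothetical, and `math.pi` is used here because it holds the same value as `numpy.pi`.

```python
import math


def series_pi(n_terms):
    """Return 3 plus the first n_terms fractions of the series above."""
    total = 3.0
    sign = 1.0
    for k in range(1, n_terms + 1):
        a = 2 * k  # first factor of the denominator: 2, 4, 6, ...
        total += sign * 4.0 / (a * (a + 1) * (a + 2))
        sign = -sign  # the fractions alternate between + and -
    return total


term_counts = list(range(100, 1001, 100))
errors = [abs(series_pi(n) - math.pi) for n in term_counts]
for n, err in zip(term_counts, errors):
    print(n, err)
```

The lists `term_counts` and `errors` can then be passed to a plotting call with a logarithmic y axis, for example `matplotlib.pyplot.semilogy`, and the figure saved with `matplotlib.pyplot.savefig`.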

In [None]:
# write your code here

## Part 3, Normality of $\pi$ (20 pts)

The term "normal", when applied to an irrational number, is a statement about the distribution of the digits in that number. If a number has been proven to be "normal in base 10", then the digits 0-9 should be evenly distributed. That is, if we were to tally a sufficiently large number of its digits, each digit should occur nearly the same number of times. The important word here is "prove". It has not been proven that $\pi$ is normal, though it appears to be true when we look at the actual digits.

A file called `pi-million.txt` is provided that has a text representation of the first million digits of $\pi$. Look at it to understand its format.

Your goal is to read in all the digits in that million-digit file and tally how many times each digit occurs. Report the results as an HTML table, in digit order: the digit 0 and its count, then 1 and its count, etc.
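One way the tally and table could be put together is sketched below. Since the format of `pi-million.txt` is left for you to inspect, the sketch works on a short stand-in string and simply ignores every non-digit character; whether the leading 3 should be counted is a choice you need to make for the real file. The function names `tally_digits` and `counts_to_html` are hypothetical.

```python
from collections import Counter


def tally_digits(text):
    """Count each decimal digit in text, ignoring any non-digit characters."""
    counts = Counter(ch for ch in text if ch in "0123456789")
    # Build the result in digit order, including digits that never occur.
    return {str(d): counts.get(str(d), 0) for d in range(10)}


def counts_to_html(counts):
    """Render the digit counts as a simple two-column HTML table."""
    rows = "".join(
        f"<tr><td>{digit}</td><td>{count}</td></tr>"
        for digit, count in counts.items()
    )
    return f"<table><tr><th>Digit</th><th>Count</th></tr>{rows}</table>"


# Stand-in text (an assumption); the real input would be the file contents.
sample = "3.14159265358979323846"
counts = tally_digits(sample)
html = counts_to_html(counts)
print(html)
```

For the real data, `sample` would be replaced by the text read from the file, for example with `open("pi-million.txt").read()`. Inside a notebook, the resulting string can be rendered with `IPython.display.HTML(html)`.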

In [None]:
# write your code here

## Visualizing Normality (20 pts)

One way to determine (not prove, but at least observe) whether the distribution of digits is normal is to visualize it. Let's try to visualize the distribution of the digits, at least in a simple way.

**Important**: Write this solution as a script and save it to a file named `visual-pi.py`.

- We will again use digits from the file `pi-million.txt`
- create a 2D numpy array of shape 1000 x 1000 with a white color as the value
- Read the first 30,000 digits from the file (start after the decimal point)
- Read digits in as groups of 6 from the file:
   - the first three represent an x location
   - the second three represent a y location
- in the array at location (x, y), change the color to black
- create an image from the array
- save the image as the file  `normal.png`
- place both the image and the script in both your repository and D2L
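The grouping step in the list above can be sketched in plain Python as follows. It assumes the digits after the decimal point have already been extracted into a string; the function name `digits_to_points` and the short stand-in string are hypothetical.

```python
def digits_to_points(digits, n_digits=30_000):
    """Split the first n_digits characters into (x, y) pairs of three digits each."""
    usable = digits[:n_digits]
    points = []
    # Step through complete groups of 6 only; a trailing partial group is dropped.
    for start in range(0, len(usable) - 5, 6):
        chunk = usable[start:start + 6]
        x = int(chunk[:3])  # first three digits: x location (0-999)
        y = int(chunk[3:])  # second three digits: y location (0-999)
        points.append((x, y))
    return points


# Stand-in input (an assumption): the first 18 digits after the decimal point.
fractional = "141592653589793238"
print(digits_to_points(fractional))
# → [(141, 592), (653, 589), (793, 238)]
```

With 30,000 digits this yields 5,000 points, and every coordinate fits within a 1000 x 1000 array. Each pair can then be used to index the array and set that entry to black; whether `x` selects the row or the column is a convention you choose. The finished array can be written to `normal.png` with, for example, `matplotlib.pyplot.imsave`.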

In [None]:
# write as a script and turn in. Do **NOT** put your solution here in the notebook

## Finish Up

Grading will be done using the files you turn in to D2L, but you must also push all your work to your git repository. Make sure you do **both**. You need to turn in 4 files to D2L under "Midterm":
- This notebook
- Two images: `normal.png` and `fraction.png`
- the script `visual-pi.py`