# Dartboard Model
## Professor Sterne
## Rita Kurban
## CS110
## 10.15.2017

In this simulation, I assume that the square has an area of 1 (width = 1, length = 1).
The circle is inscribed in this square and thus has a radius of 0.5.
The area of a circle is pi*r^2, which in our case is pi/4.
If we divide the number of darts that land inside the circle by the total number of darts thrown, we get an estimate of the ratio of the circle's area to the square's area.
As the area of the square is 1, this ratio converges to pi/4 as num_darts approaches infinity. Multiplying it by 4 gives an approximate value of pi.
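
The same argument can be written as one worked equation. The symbols N_circle (darts inside the circle) and N_total (all darts thrown) are introduced here only for illustration; they correspond to circle_hits and square_hits in the code.

```latex
\frac{N_{\text{circle}}}{N_{\text{total}}}
  \approx \frac{A_{\text{circle}}}{A_{\text{square}}}
  = \frac{\pi \, (0.5)^2}{1}
  = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4 \cdot \frac{N_{\text{circle}}}{N_{\text{total}}}
```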

In [47]:
import random
import math
import numpy as np

from datetime import datetime


def pi_estimator(num_darts):
    """Estimate pi by throwing num_darts random darts at the unit square."""
    circle_hits = 0
    # The loop runs the simulation num_darts times, one dart per iteration
    for _ in range(num_darts):
        # Choose a uniformly random coordinate for each dart
        x = random.random()
        y = random.random()

        # Distance between the dart and the center (0.5, 0.5),
        # found with the Pythagorean theorem
        dist = math.sqrt((x - 0.5)**2 + (y - 0.5)**2)
        if dist <= 0.5:
            circle_hits += 1
    # Every dart lands in the square, so the number of square hits is num_darts
    return 4.0 * circle_hits / num_darts


start = datetime.now()
print(pi_estimator(100000))
print(datetime.now() - start)

3.1496
0:00:00.098958


2. The standard error equals the standard deviation divided by sqrt(sample_size). To calculate it, I fixed the number of throws per estimate at 1000 and drew samples of different sizes, from 10 to 910 in steps of 100, where each sample is a list of independent pi estimates. I set the final approximation for each sample to be the mean of all the estimates in it. The resulting list of standard errors is passed to the plt.errorbar() function, which plots green standard error bars on the graph. The error bars depend on the sample size: the bigger the sample, the better the approximation. The graph below demonstrates this trend: the approximations of pi for small sample sizes are further from the real value of pi and have larger error bars. In asymptotic notation, the standard error is O(n^(−1/2)) as n → ∞, because we divide by sqrt(n).

In [27]:
x = []
y = []
errors = []
# Generate multiple samples with different sample sizes
for sample_size in range(10, 1000, 100):
    x.append(sample_size)
    # Each sample holds sample_size independent estimates (1000 throws each)
    sample = [pi_estimator(1000) for _ in range(sample_size)]
    # Find the estimate (the sample mean) and its standard error
    estimate = np.mean(sample)
    y.append(estimate)
    std_error = np.std(sample, ddof=1) / math.sqrt(sample_size)
    # Collect the errors that will be plotted as error bars in the graph
    errors.append(std_error)
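
As a rough cross-check on the O(n^(−1/2)) behaviour, the sketch below compares the empirical spread of single-run estimates with a closed-form value. It assumes that each dart is an independent hit-or-miss trial with hit probability pi/4, so the number of hits is binomial. Note that it looks at the standard deviation of one estimate as a function of the number of throws, not at the standard error of the sample mean computed in the cell above. The helper names `estimate_pi`, `analytic_sd` and `empirical_sd` are hypothetical and not part of the original notebook.

```python
import math
import random


def estimate_pi(num_darts, rng):
    """One dartboard run: 4 * (fraction of darts inside the inscribed circle)."""
    hits = 0
    for _ in range(num_darts):
        x, y = rng.random(), rng.random()
        # Compare squared distance to squared radius (0.5 ** 2 = 0.25)
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
            hits += 1
    return 4.0 * hits / num_darts


def analytic_sd(num_darts):
    """Standard deviation of one estimate under the binomial assumption."""
    p = math.pi / 4  # probability that a single dart lands in the circle
    return 4.0 * math.sqrt(p * (1 - p) / num_darts)


def empirical_sd(num_darts, num_runs, rng):
    """Sample standard deviation (ddof = 1) of num_runs independent estimates."""
    runs = [estimate_pi(num_darts, rng) for _ in range(num_runs)]
    mean = sum(runs) / num_runs
    return math.sqrt(sum((r - mean) ** 2 for r in runs) / (num_runs - 1))


rng = random.Random(0)  # fixed seed so the comparison is repeatable
for n in (100, 400, 1600):
    # Columns: throws, closed-form spread, measured spread over 200 runs
    print(n, round(analytic_sd(n), 4), round(empirical_sd(n, 200, rng), 4))
```

Each fourfold increase in the number of throws halves the closed-form value exactly, and the measured spread should follow it approximately.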

In [1]:
%matplotlib inline
import math
import matplotlib.pyplot as plt

# x, y and errors are defined in the previous cell, so run that cell first;
# in a fresh kernel this cell fails with a NameError
plt.figure(figsize=(15, 5))
plt.scatter(x, y, s=15)
plt.errorbar(x, y, yerr=errors, linestyle="None", color="green", elinewidth=2)
plt.axhline(y=math.pi, color="red")
plt.ylabel('Pi Estimate')
plt.xlabel('Sample Size')
plt.show()

NameError: name 'x' is not defined


4(1). I tried to make the code simple and readable by choosing descriptive variable names and writing comments. I also separated the different pieces of code (the simulation itself, the standard error calculation, and the plot) into different cells so that the viewer can focus on a specific part without being distracted by the rest of the code.

4(2). There are two ways to improve the accuracy of the approximation. First, we can increase the sample size for a fixed number of throws, as I did in the second question. The mean of a larger number of approximations, each made with the same number of throws, gives us a better approximation of the real value. Even if the number of throws is very low, we can repeat the simulation many times and the mean will be close to pi, because the model is randomized and unbiased, so overestimates and underestimates tend to cancel out.

Another way to increase the accuracy, if the number of samples is fixed, is to increase the number of throws. As the number of throws approaches infinity, we get a far better estimate of the ratio of the circle area to the square area. If the number of throws is extremely large, even a single run of the simulation will give us a very reliable, if not perfect, estimate of pi. However, because computational power and time are limited, an arbitrarily large number of throws is not achievable in practice.
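
The two strategies can be compared in a small experiment. The sketch below measures the average absolute error of the final approximation when either the number of averaged runs or the number of throws per run is increased tenfold. The helper names `estimate_pi`, `averaged_estimate` and `mean_abs_error` are hypothetical, and the run, throw and trial counts are arbitrary choices for illustration.

```python
import math
import random


def estimate_pi(num_throws, rng):
    """One dartboard run: 4 * (fraction of darts inside the inscribed circle)."""
    hits = 0
    for _ in range(num_throws):
        x, y = rng.random(), rng.random()
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
            hits += 1
    return 4.0 * hits / num_throws


def averaged_estimate(num_throws, num_runs, rng):
    """Mean of num_runs independent estimates, each using num_throws darts."""
    return sum(estimate_pi(num_throws, rng) for _ in range(num_runs)) / num_runs


def mean_abs_error(num_throws, num_runs, rng, trials=50):
    """Average |approximation - pi| over repeated trials."""
    total = 0.0
    for _ in range(trials):
        total += abs(averaged_estimate(num_throws, num_runs, rng) - math.pi)
    return total / trials


rng = random.Random(0)  # fixed seed so the comparison is repeatable
baseline = mean_abs_error(100, 10, rng)      # 10 runs of 100 throws each
more_runs = mean_abs_error(100, 100, rng)    # strategy 1: ten times more runs
more_throws = mean_abs_error(1000, 10, rng)  # strategy 2: ten times more throws
print(baseline, more_runs, more_throws)
```

Both strategies spend the same total number of darts per approximation here (10,000), and the mean of equal-sized runs is the same quantity as the hit fraction of all darts pooled together, so the two improved errors should come out similar.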

5. To get one more decimal digit of accuracy, we need a standard error that is 1/10 as large, which requires a 10^2-fold increase in computation. Getting three more digits of accuracy requires 1000^2 times as much computation. For example, if the researcher needs at least 10 digits, the standard error has to be less than 10^-10. As the standard error is O(n^(-1/2)), n has to be larger than about 10^20. It takes my computer ≅0.1 sec to calculate the estimate when the number of throws is 10^5, so multiplying that time by 10^15 gives about 10^14 sec, or approximately 3,000,000 years. This shows that the model is poorly suited to cases that require high precision.
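
The arithmetic can be reproduced with a few lines of code. This is a back-of-the-envelope sketch that, like the text, drops the constant factor and treats the standard error as n^(-1/2). It also assumes that the running time scales linearly with the number of throws, based on the timing printed earlier (0.098958 sec for 10^5 throws). The function names `throws_for_standard_error` and `runtime_years` are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def throws_for_standard_error(target_se):
    """Throws needed if the standard error behaves like n ** -0.5."""
    return target_se ** -2


def runtime_years(num_throws, seconds_per_throw):
    """Running time in years, assuming the cost per throw stays constant."""
    return num_throws * seconds_per_throw / SECONDS_PER_YEAR


seconds_per_throw = 0.098958 / 1e5  # measured time for 10**5 throws
throws = throws_for_standard_error(1e-10)
years = runtime_years(throws, seconds_per_throw)
print("throws needed: %.1e" % throws)
print("years needed: %.2e" % years)
```

With these inputs the result is about 10^20 throws and roughly 3 million years, in line with the estimate above.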

HC modeling - I created a model that empirically calculates the value of pi and analyzed the accuracy of this model using standard errors.

HC dataviz - I plotted the pi estimates with error bars for different sample sizes. The figure demonstrates how the error decreases as the sample size gets bigger and illustrates that the approximations get better.

HC sampling - I generated multiple samples to calculate the standard error for the approximations.