### Gaussian Distribution Formulas
Probability density function


$\large f(x|\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$


where:

$\mu$ : mean
<br>
$\sigma$ : standard deviation
<br>
$\sigma^2$ : variance
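
As a quick check of the density formula, the following minimal sketch evaluates it at a single point. The function name `gaussian_pdf` and its default arguments are illustrative assumptions, not part of the original notes.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Evaluate the Gaussian density f(x | mu, sigma^2) at x (hypothetical helper)."""
    variance = sigma ** 2
    coefficient = 1.0 / math.sqrt(2 * math.pi * variance)
    exponent = -((x - mu) ** 2) / (2 * variance)
    return coefficient * math.exp(exponent)

# Density of the standard normal (mu = 0, sigma = 1) at its mean
peak = gaussian_pdf(0.0)
print(round(peak, 4))  # 0.3989
```

Note that the result is a density, not a probability: it only becomes a probability once it is integrated over a range of x values.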



### Binomial Distribution Formulas

* **mean** 

$\large \mu = n * p$

> A fair coin has a probability of a positive outcome (heads) of p = 0.5. If you flip a coin 20 times, the mean is 20 * 0.5 = 10, so you'd expect to get 10 heads.

<br>

* **variance**

$\large \sigma^2 = n * p * (1-p)$

> Continuing with the coin example, n would be the number of coin tosses and p would be the probability of getting heads.

<br>

* **standard deviation**

$\large \sigma = \sqrt{n*p*(1-p)}$

<br>

* **probability mass function** (the binomial distribution is discrete, so this gives the probability of exactly k successes)

$\large f(k,n,p) = \frac{n!}{k!(n-k)!}p^k(1-p)^{(n-k)}$
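
The binomial formulas above can be sketched in a few lines of Python using the coin example (n = 20, p = 0.5). The helper names `binomial_stats` and `binomial_pmf` are assumptions made for illustration, and `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_stats(n, p):
    """Return (mean, variance, standard deviation) for n trials with success probability p."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, math.sqrt(variance)

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Fair coin flipped 20 times
mean, variance, stdev = binomial_stats(20, 0.5)
print(mean, variance)                       # 10.0 5.0
print(round(stdev, 3))                      # 2.236
print(round(binomial_pmf(10, 20, 0.5), 4))  # 0.1762
```

Even though 10 heads is the expected count, the chance of getting exactly 10 heads is only about 18%.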

Assume the average weight of an American adult male is 180 pounds with a standard deviation of 34 pounds. The distribution of weights follows a normal distribution. What is the probability that a man weighs exactly 185 pounds?

> When finding probabilities using a continuous distribution, the probability of obtaining an exact value is zero. If the question had instead asked for the probability that a man's weight is between 184.99 and 185.01 pounds, the answer would be a small but positive value of about 0.0002.
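
One way to check the narrow-interval figure is to subtract two values of the normal cumulative distribution function (CDF). The sketch below builds the CDF from `math.erf`; the helper name `normal_cdf` is an assumption for illustration.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

mu, sigma = 180.0, 34.0

# An interval of zero width carries zero probability
p_exact = normal_cdf(185.0, mu, sigma) - normal_cdf(185.0, mu, sigma)
print(p_exact)  # 0.0

# A very narrow interval around 185 pounds carries a small positive probability
p_interval = normal_cdf(185.01, mu, sigma) - normal_cdf(184.99, mu, sigma)
print(round(p_interval, 4))  # 0.0002
```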

As in the previous question, assume the average weight of an American adult male is 180 pounds with a standard deviation of 34 pounds, and that weights follow a normal distribution. What is the probability that a man weighs somewhere between 120 and 155 pounds?

> The area under the Gaussian curve represents the probability. For this particular Gaussian distribution, the area between 120 and 155 is about 0.19.
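
To make the "area under the curve" idea concrete, the area can be approximated numerically. This is a minimal sketch using the trapezoidal rule; the function names and the step count are illustrative assumptions, and a closed-form CDF would normally be preferred in practice.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian density at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def area_under_pdf(lower, upper, mu, sigma, steps=10_000):
    """Approximate the area under the density between lower and upper (trapezoidal rule)."""
    width = (upper - lower) / steps
    # The two endpoints get half weight, the interior points full weight
    total = 0.5 * (gaussian_pdf(lower, mu, sigma) + gaussian_pdf(upper, mu, sigma))
    for i in range(1, steps):
        total += gaussian_pdf(lower + i * width, mu, sigma)
    return total * width

area = area_under_pdf(120, 155, mu=180, sigma=34)
print(round(area, 2))  # 0.19
```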

Now consider a binomial distribution. Assume that 15% of the population is allergic to cats. If you randomly select 60 people for a medical trial, what is the probability that exactly 7 of those people are allergic to cats?

> 0.12
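
The answer follows from plugging n = 60, k = 7, and p = 0.15 into the binomial formula. A short sketch, assuming Python 3.8 or later for `math.comb`:

```python
import math

n, k, p = 60, 7, 0.15  # trial size, number allergic, allergy rate

# n! / (k! (n-k)!) * p^k * (1-p)^(n-k)
probability = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
print(round(probability, 2))  # 0.12
```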

In [2]:
import numpy as np

In [5]:
import math
import matplotlib.pyplot as plt

class Gaussian:
    """Gaussian distribution that can be fit to a data set and plotted against it."""
    def __init__(self,mu=0,sigma=1):
        self.mean = mu
        self.stdev = sigma
        self.data = []
        
    def calculate_mean(self):
        avg = 1.0 * sum(self.data)/len(self.data)
        self.mean = avg
        return self.mean
    
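    def calculate_stdev(self,sample=True):
        # A sample divides by n - 1 (Bessel's correction); a population divides by n
        if sample:
            n = len(self.data) - 1
        else:
            n = len(self.data)
            
        # Assumes self.mean already reflects self.data (read_data updates it first)
        mean = self.mean
        sigma = 0
        for d in self.data:
            sigma += (d-mean) ** 2
            
        sigma = math.sqrt(sigma/n)
        self.stdev = sigma
        return self.stdev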
    
    
    def read_data(self,file_name,sample=True):
        # Expects one integer per line; blank lines are skipped.
        # The with block closes the file automatically, so no explicit close() is needed.
        with open(file_name) as file:
            data_list = [int(line) for line in file if line.strip()]
        
        self.data = data_list
        self.mean = self.calculate_mean()
        self.stdev = self.calculate_stdev(sample)
        
    def plot_histogram(self):
        plt.hist(self.data)
        plt.title('Histogram of Data')
        plt.xlabel('data')
        plt.ylabel('count')
        
    def pdf(self,x):
        
        return (1.0/(self.stdev * math.sqrt(2*math.pi))) * math.exp(-0.5*((x - self.mean)/self.stdev)**2)
    
    
    def plot_histogram_pdf(self,n_spaces=50):
        mu = self.mean
        sigma = self.stdev
        min_range = min(self.data)
        max_range = max(self.data)
        
        interval = 1.0 * (max_range-min_range)/n_spaces
        
        x = []
        y = []
        
        for i in range(n_spaces):
            tmp = min_range + interval*i
            x.append(tmp)
            y.append(self.pdf(tmp))
            
        fig,axes = plt.subplots(2,sharex=True)
        fig.subplots_adjust(hspace=.5)
        axes[0].hist(self.data,density=True)
        axes[0].set_title('Normed Histogram of Data')
        axes[0].set_ylabel('Density')
        
        axes[1].plot(x,y)
        axes[1].set_title('Normal Distribution for \n Sample Mean and Sample Standard Deviation')
        axes[1].set_ylabel('Density')
        
        plt.show()
        
        return x,y
    
    

In [3]:
gaussian_one = Gaussian()
gaussian_one.read_data('.txt')