### Uniform Distribution
Lets start with generating some fake random data. You can get a random number between 0 and 1 using the python random module as follow:

In [1]:
import random
x=random.random()
print("The Value of x is", x)

The Value of x is 0.4221584444119929


Everytime you call random, you will get a new number.

*Exercise 1:* Using random, write a function `generate_uniform(N, mymin, mymax)`, that returns a python list containing N random numbers between specified minimum and maximum value. Note that you may want to quickly work out on paper how to turn numbers between 0 and 1 to between other values. 

In [2]:
# Skeleton
def generate_uniform(N,x_min,x_max):
    out = []
    for i in range(N):
        value = random.random()
        scaled_value = value * (x_max - x_min) + x_min
        out.append(scaled_value)
    return out

In [3]:
# Test your solution here
data=generate_uniform(1000,-10,10)
print ("Data Type:", type(data))
print ("Data Length:", len(data))
if len(data)>0: 
    print ("Type of Data Contents:", type(data[0]))
    print ("Data Minimum:", min(data))
    print ("Data Maximum:", max(data))

Data Type: <class 'list'>
Data Length: 1000
Type of Data Contents: <class 'float'>
Data Minimum: -9.999238081057033
Data Maximum: 9.971674561312401


*Exercise 2a:* 
Write a function that computes the mean of values in a list. Recall the equation for the mean of a random variable $\bf{x}$ computed on a data set of $n$ values $\{ x_i \} = \{x_1, x_2, ..., x_n\}$  is ${\bf\bar{x}} = \frac{1}{n} \sum_i^n x_i$.

In [4]:
# Skeleton
def mean(Data):
    m=0.
    
    total = sum(Data)
    m = total / len(Data)
    
    return m

In [5]:
# Test your solution here
print ("Mean of Data:", mean(data))

Mean of Data: -0.0381457767569186


*Exercise 2b:* 
Write a function that computes the variance of values in a list. Recall the equation for the variance of a random variable $\bf{x}$ computed on a data set of $n$ values $\{ x_i \} = \{x_1, x_2, ..., x_n\}$  is ${\bf\langle x \rangle} = \frac{1}{n} \sum_i^n (x_i - {\bf\bar{x}})$.

In [16]:
# Skeleton
def variance(Data):
    m = sum((x - mean(Data)) ** 2 for x in Data) / (len(Data))
    return m

In [17]:
# Test your solution here
print ("Variance of Data:", variance(data))

Variance of Data: 32.93775118033848


## Histogramming

*Exercise 3:* Write a function that bins the data so that you can create a histogram. An example of how to implement histogramming is the following logic:

* User inputs a list of values `x` and optionally `n_bins` which defaults to 10.
* If not supplied, find the minimum and maximum (`x_min`,`x_max`) of the values in x.
* Determine the bin size (`bin_size`) by dividing the range of the function by the number of bins.
* Create an empty list of zeros of size `n_bins`, call it `hist`.
* Loop over the values in `x`
    * Loop over the values in `hist` with index `i`:
        * If x is between `x_min+i*bin_size` and `x_min+(i+1)*bin_size`, increment `hist[i].` 
        * For efficiency, try to use continue to goto the next bin and data point.
* Return `hist` and the list corresponding of the bin edges (i.e. of `x_min+i*bin_size`).    

In [18]:
# Solution
def histogram(x,n_bins=10,x_min=None,x_max=None):
    x_min = min(x)
    x_max = max(x)
    bin_size = (x_max - x_min) / n_bins
    hist = [0] * n_bins
    bin_edges = [x_min + i*bin_size for i in range(n_bins)]
    for j in x:
        for i in range(n_bins):
            if j >= bin_edges[i] and j < bin_edges[i] + bin_size:
                hist[i] += 1
                continue
    return hist,bin_edges

In [19]:
# Test your solution here
h,b=histogram(data,100)
print(h)

[11, 11, 9, 7, 10, 9, 10, 17, 12, 13, 11, 10, 11, 4, 6, 8, 10, 9, 11, 5, 19, 12, 8, 8, 8, 9, 5, 9, 11, 14, 11, 11, 8, 14, 10, 5, 9, 11, 10, 11, 10, 7, 15, 12, 9, 9, 8, 11, 7, 12, 15, 7, 16, 8, 9, 10, 8, 7, 8, 6, 10, 12, 11, 20, 8, 7, 7, 18, 13, 11, 8, 9, 7, 16, 8, 9, 13, 9, 5, 15, 10, 10, 4, 8, 18, 11, 9, 8, 9, 13, 7, 13, 11, 7, 10, 6, 12, 8, 8, 10]


*Exercise 4:* Write a function that uses the histogram function in the previous exercise to create a text-based "graph". For example the output could look like the following:
```
[  0,  1] : ######
[  1,  2] : #####
[  2,  3] : ######
[  3,  4] : ####
[  4,  5] : ####
[  5,  6] : ######
[  6,  7] : #####
[  7,  8] : ######
[  8,  9] : ####
[  9, 10] : #####
```

Where each line corresponds to a bin and the number of `#`'s are proportional to the value of the data in the bin. 

In [25]:
# Solution
def draw_histogram(x,n_bins,x_min=None,x_max=None,character="#",max_character_per_line=20):
    hist, bin_edges = histogram(data, n_bins)
    for i in range(n_bins):
        lower_edge = bin_edges[i]
        upper_edge = bin_edges[i+1] if i < n_bins-1 else bin_edges[i]+(bin_edges[i]-bin_edges[i-1])
        print("[{:>2}, {:<2}] : {}".format(int(lower_edge), int(upper_edge), "#" * hist[i]))


In [26]:
# Test your solution here
h,b=histogram(data,20)
draw_histogram(data,20)

[-9, -9] : ################################################
[-9, -8] : #############################################################
[-8, -7] : ##########################################
[-7, -6] : ###########################################
[-6, -5] : #######################################################
[-5, -4] : ################################################
[-4, -3] : ######################################################
[-3, -2] : ##############################################
[-2, -1] : #####################################################
[-1, 0 ] : ###############################################
[ 0, 0 ] : #######################################################
[ 0, 1 ] : #######################################
[ 1, 2 ] : #############################################################
[ 2, 3 ] : ########################################################
[ 3, 4 ] : ################################################
[ 4, 5 ] : ###################################################
[

## Functional Programming

*Exercise 5:* Write a function the applies a booling function (that returns true/false) to every element in data, and return a list of indices of elements where the result was true. Use this function to find the indices of entries greater than 0.5. 

In [27]:
def where(mylist,myfunc):
    out= []
    for i, value in enumerate(mylist):
        if myfunc(value):
            out.append(i)
    return out

def greater_than_half(value):
    return value > 0.5

In [30]:
# Test your solution here
k = [0.5, 0.2, 0.1, 0.9, 0.8, 0.1]
true_indices = where(k, greater_than_half)
print(true_indices)

[3, 4]


*Exercise 6:* The inrange(mymin,mymax) function below returns a function that tests if it's input is between the specified values. Write corresponding functions that test:
* Even
* Odd
* Greater than
* Less than
* Equal
* Divisible by

In [32]:
def inrange(mymin,mymax):
    def testrange(x):
        return x<mymax and x>=mymin
    return testrange

# Examples:
F1=inrange(0,10)
F2=inrange(10,20)

# Test of in_range
print (F1(0), F1(1), F1(10), F1(15), F1(20))
print (F2(0), F2(1), F2(10), F2(15), F2(20))

print ("Number of Entries passing F1:", len(where(data,F1)))
print ("Number of Entries passing F2:", len(where(data,F2)))

True True False False False
False False True True False
Number of Entries passing F1: 502
Number of Entries passing F2: 0


In [33]:
### BEGIN SOLUTION

def is_even(num):
    return num % 2 == 0
def is_odd(num):
    return num % 2 != 0
def is_greater_than(num, limit):
    return num > limit
def is_less_than(num, limit):
    return num < limit
def is_equal_to(num, value):
    return num == value
def is_divisible_by(num, divisor):
    return num % divisor == 0
      
    
### END SOLUTION

In [41]:
# Test your solution
print(is_even(26),is_odd(38),is_greater_than(35,26),is_less_than(35,26),is_equal_to(35,26),is_divisible_by(35,26))

True False True False False False


*Exercise 7:* Repeat the previous exercise using `lambda` and the built-in python functions sum and map instead of your solution above. 

In [44]:
### BEGIN SOLUTION

is_even = lambda num: num % 2 == 0
is_odd = lambda num: num % 2 != 0
is_greater_than = lambda num, limit: num > limit
is_less_than = lambda num, limit: num < limit
is_equal_to = lambda num, value: num == value
is_div_by = lambda num, divisor: num % divisor == 0
    
### END SOLUTION

In [45]:
is_div_by(21,7)

True

## Monte Carlo

*Exercise 7:* Write a "generator" function called `generate_function(func,x_min,x_max,N)`, that instead of generating a flat distribution, generates a distribution with functional form coded in `func`. Note that `func` will always be > 0.  

Use the test function below and your histogramming functions above to demonstrate that your generator is working properly.

Hint: A simple, but slow, solution is to a draw random number test_x within the specified range and another number p between the min and max of the function (which you will have to determine). If p<=function(test_x), then place test_x on the output. If not, repeat the process, drawing two new numbers. Repeat until you have the specified number of generated numbers, N. For this problem, it's OK to determine the min and max by numerically sampling the function.  

In [77]:
def generate_function(func,x_min,x_max,N=1000):
    out = list()
    
    x_range = x_max - x_min
    
    p_min = p_max = func(x_min)
    for i in range(10000):
        x = x_min + random.uniform(0, 1) * x_range
        p = func(x)
        if p < p_min:
            p_min = p
        elif p > p_max:
            p_max = p
    
    
    while len(out) < N:
        test_x = random.uniform(x_min, x_max)
        test_p = random.uniform(p_min, p_max)
        if test_p <= func(test_x):
            out.append(test_x)
    
    return out

In [78]:
# A test function
def test_func(x,a=1,b=1):
    return abs(a*x+b)

In [82]:
#draw_histogram(data,20)

d=generate_function(test_func,-5,5)
h, b = histogram(d, 20)
draw_histogram(d,20)

[-9, -9] : ################################################
[-9, -8] : #############################################################
[-8, -7] : ##########################################
[-7, -6] : ###########################################
[-6, -5] : #######################################################
[-5, -4] : ################################################
[-4, -3] : ######################################################
[-3, -2] : ##############################################
[-2, -1] : #####################################################
[-1, 0 ] : ###############################################
[ 0, 0 ] : #######################################################
[ 0, 1 ] : #######################################
[ 1, 2 ] : #############################################################
[ 2, 3 ] : ########################################################
[ 3, 4 ] : ################################################
[ 4, 5 ] : ###################################################
[

*Exercise 8:* Use your function to generate 1000 numbers that are normal distributed, using the `gaussian` function below. Confirm the mean and variance of the data is close to the mean and variance you specify when building the Gaussian. Histogram the data. 

In [58]:
import math

def gaussian(mean, sigma):
    def f(x):
        return math.exp(-((x-mean)**2)/(2*sigma**2))/math.sqrt(math.pi*sigma)
    return f

# Example Instantiation
g1=gaussian(0,1)
g2=gaussian(10,3)
samples = generate_function(g1, -5, 5, 10000)
h, b = histogram(samples, 50)
draw_histogram(samples,50)


[-9, -9] : ######################
[-9, -9] : ################
[-9, -8] : ###################
[-8, -8] : ###########################
[-8, -8] : #########################
[-8, -7] : #####################
[-7, -7] : ###############
[-7, -6] : ##############
[-6, -6] : ###################
[-6, -6] : ################
[-6, -5] : ###############################
[-5, -5] : ################
[-5, -4] : #################
[-4, -4] : ##############
[-4, -4] : #########################
[-4, -3] : ######################
[-3, -3] : ######################
[-3, -2] : ###############
[-2, -2] : ####################
[-2, -2] : #####################
[-2, -1] : #################
[-1, -1] : ###########################
[-1, 0 ] : ##################
[ 0, 0 ] : ###################
[ 0, 0 ] : ###################
[ 0, 0 ] : ######################
[ 0, 0 ] : ########################
[ 0, 1 ] : ###################
[ 1, 1 ] : ###############
[ 1, 1 ] : ##############
[ 1, 2 ] : ######################
[ 2, 2 ] : ####

*Exercise 9:* Combine your `generate_function`, `where`, and `in_range` functions above to create an integrate function. Use your integrate function to show that approximately 68% of Normal distribution is within one variance.

In [65]:
def integrate(func, x_min, x_max, N=10000):
    in_range = lambda x: x_min <= x <= x_max
    x_values = generate_function(func, x_min, x_max, N)
    y_values = map(func, x_values)
    area = sum(y_values) * (x_max - x_min) / float(N)
    count = len(list(filter(in_range, x_values)))
    integral = area * count / float(N)
    return integral

In [69]:
from math import exp, sqrt, pi

def gaussian(x):
    return exp(-x**2/2)/sqrt(2*pi)

integral = integrate(gaussian, -1, 1, 100000)
print("Proportion within one standard deviation:", integral)


Proportion within one standard deviation: 0.6828240474959013
