### Uniform Distribution
Lets start with generating some fake random data. You can get a random number between 0 and 1 using the python random module as follow:

In [72]:
import random
x=random.random()
print("The Value of x is", x)

The Value of x is 0.12475315681614196


Everytime you call random, you will get a new number.

*Exercise 1:* Using random, write a function `generate_uniform(N, mymin, mymax)`, that returns a python list containing N random numbers between specified minimum and maximum value. Note that you may want to quickly work out on paper how to turn numbers between 0 and 1 to between other values. 

In [73]:
# Skeleton
def generate_uniform(N,x_min,x_max):
    out = []
    
    ### BEGIN SOLUTION
    
    for n in range(N):
        x = random.random()*(x_max-x_min)+x_min
        out.append(x)
        
    ### END SOLUTION
    
    return out

In [74]:
# Test your solution here
data=generate_uniform(1000,-10,10)
print ("Data Type:", type(data))
print ("Data Length:", len(data))
if len(data)>0: 
    print ("Type of Data Contents:", type(data[0]))
    print ("Data Minimum:", min(data))
    print ("Data Maximum:", max(data))

Data Type: <class 'list'>
Data Length: 1000
Type of Data Contents: <class 'float'>
Data Minimum: -9.994796377673584
Data Maximum: 9.984748954024802


*Exercise 2a:* 
Write a function that computes the mean of values in a list. Recall the equation for the mean of a random variable $\bf{x}$ computed on a data set of $n$ values $\{ x_i \} = \{x_1, x_2, ..., x_n\}$  is ${\bf\bar{x}} = \frac{1}{n} \sum_i^n x_i$.

In [75]:
# Skeleton
def mean(Data):
    m=0.
    
    ### BEGIN SOLUTION  
    
    total=sum(Data)
    n = len(Data)
    m=total/(n-1)
    
    ### END SOLUTION
    
    return m

In [76]:
# Test your solution here
print ("Mean of Data:", mean(data))

Mean of Data: 0.2696941463437307


*Exercise 2b:* 
Write a function that computes the variance of values in a list. Recall the equation for the variance of a random variable $\bf{x}$ computed on a data set of $n$ values $\{ x_i \} = \{x_1, x_2, ..., x_n\}$  is ${\bf\langle x \rangle} = \frac{1}{n} \sum_i^n (x_i - {\bf\bar{x}})$.

In [77]:
# Skeleton
def variance(Data):
    m=0.
    
    ### BEGIN SOLUTION

    total=sum(Data)
    n=len(Data)
    m=total/n
    sq_diff = sum((x-m)**2 for x in Data)
    var=sq_diff/(n-1)
    
    ### END SOLUTION
    
    return var

In [78]:
# Test your solution here
print ("Variance of Data:", variance(data))

Variance of Data: 34.559585972759336


## Histogramming

*Exercise 3:* Write a function that bins the data so that you can create a histogram. An example of how to implement histogramming is the following logic:

* User inputs a list of values `x` and optionally `n_bins` which defaults to 10.
* If not supplied, find the minimum and maximum (`x_min`,`x_max`) of the values in x.
* Determine the bin size (`bin_size`) by dividing the range of the function by the number of bins.
* Create an empty list of zeros of size `n_bins`, call it `hist`.
* Loop over the values in `x`
    * Loop over the values in `hist` with index `i`:
        * If x is between `x_min+i*bin_size` and `x_min+(i+1)*bin_size`, increment `hist[i].` 
        * For efficiency, try to use continue to goto the next bin and data point.
* Return `hist` and the list corresponding of the bin edges (i.e. of `x_min+i*bin_size`).    

In [79]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

In [80]:
# Solution
def histogram(x,n_bins=10,x_min=None,x_max=None):
    ### BEGIN SOLUTION

    if x_min==None:
        x_min=min(x)
    if x_max==None:
        x_max=max(x)
        
    bin_size=(x_max-x_min)/n_bins
    bin_edges=[x_min+i * bin_size for i in range(n_bins+1)]
    hist=[0]*n_bins
    
    for value in x:
        for i in range(n_bins):
            if bin_edges[i] <= value < bin_edges[i+1]:
                hist[i]+=1
                break
    
    ### END SOLUTION

    return hist,bin_edges

In [81]:
# Test your solution here
h,b=histogram(data,100)
print(h)

[6, 11, 10, 7, 10, 15, 10, 6, 7, 15, 16, 7, 5, 7, 5, 10, 17, 10, 15, 13, 10, 12, 9, 6, 10, 10, 3, 8, 8, 11, 9, 6, 10, 10, 11, 12, 11, 9, 11, 11, 10, 12, 12, 8, 7, 9, 7, 5, 13, 10, 12, 14, 10, 10, 7, 8, 2, 7, 9, 11, 9, 5, 17, 13, 10, 11, 10, 8, 9, 14, 14, 7, 5, 12, 13, 9, 11, 13, 8, 5, 11, 11, 10, 13, 10, 9, 9, 9, 14, 15, 12, 10, 10, 6, 9, 8, 16, 14, 12, 16]


*Exercise 4:* Write a function that uses the histogram function in the previous exercise to create a text-based "graph". For example the output could look like the following:
```
[  0,  1] : ######
[  1,  2] : #####
[  2,  3] : ######
[  3,  4] : ####
[  4,  5] : ####
[  5,  6] : ######
[  6,  7] : #####
[  7,  8] : ######
[  8,  9] : ####
[  9, 10] : #####
```

Where each line corresponds to a bin and the number of `#`'s are proportional to the value of the data in the bin. 

In [82]:
# Solution
def draw_histogram(x,n_bins,x_min=None,x_max=None,character="#",max_character_per_line=20):
    ### BEGIN SOLUTION

    hist, bin_edges=histogram(x, n_bins)
    
    for i in range(n_bins):
        bin_range=f"[{bin_edges[i]:}, {bin_edges[i+1]:}]"
        bar='#' * hist[i]
        print(f"{bin_range}:{bar}")
    
    ### END SOLUTION

    return hist,bin_edges

In [83]:
# Test your solution here
h,b=draw_histogram(data,20)
print(h,b)

[-9.994796377673584, -8.995819111088664]:############################################
[-8.995819111088664, -7.996841844503745]:#####################################################
[-7.996841844503745, -6.997864577918826]:########################################
[-6.997864577918826, -5.998887311333907]:#################################################################
[-5.998887311333907, -4.9999100447489875]:###############################################
[-4.9999100447489875, -4.000932778164069]:########################################
[-4.000932778164069, -3.0019555115791494]:##############################################
[-3.0019555115791494, -2.00297824499423]:######################################################
[-2.00297824499423, -1.0040009784093105]:#################################################
[-1.0040009784093105, -0.005023711824390986]:############################################
[-0.005023711824390986, 0.9939535547605285]:###############################################

## Functional Programming

*Exercise 5:* Write a function the applies a booling function (that returns true/false) to every element in data, and return a list of indices of elements where the result was true. Use this function to find the indices of entries greater than 0.5. 

In [84]:
def where(mylist,myfunc):
    out= []
    
    ### BEGIN SOLUTION

    out = [value for value in mylist if myfunc(value)]
    
    ### END SOLUTION
    
    return out
filtered_values=where(data, lambda x: x>0.5)

In [85]:
# Test your solution here
filtered_values

[7.45305789086952,
 7.4892070841424925,
 1.13148807870685,
 2.6025836325731095,
 9.179728891917541,
 3.247848336324605,
 8.904162170181756,
 6.403511672623036,
 4.99042373146148,
 1.97222314262911,
 2.6372466043191896,
 3.094309896257778,
 9.830508813007665,
 3.5217171130309133,
 7.953687213353067,
 7.276106229081115,
 9.674756516209385,
 0.532419936776023,
 9.826172945959158,
 4.760061270798033,
 6.396260806178745,
 1.2412629666212531,
 1.7775702723694398,
 5.635134516530924,
 1.6536868555447555,
 7.970979278089821,
 7.652153319896328,
 0.8049042475422379,
 4.149456992857047,
 8.335691747371158,
 3.781881850572262,
 2.1950784482515733,
 4.931194765493094,
 5.282588955497937,
 5.810393823018924,
 6.2872494107272665,
 2.1361370307351706,
 7.915833073516584,
 9.96933170452371,
 5.706878445186469,
 7.995936523051871,
 2.9872376533270124,
 3.7265088388826655,
 9.786470124396814,
 9.63566171171886,
 4.362452295100255,
 0.6473688463186171,
 3.0411160863172686,
 6.682051436895339,
 5.40541963

*Exercise 6:* The `inrange(mymin,mymax)` function below returns a function that tests if it's input is between the specified values. Write corresponding functions that test:
* Even
* Odd
* Greater than
* Less than
* Equal
* Divisible by

In [86]:
def in_range(mymin,mymax):
    def testrange(x):
        return x<mymax and x>=mymin
    return testrange

# Examples:
F1=inrange(0,10)
F2=inrange(10,20)

# Test of in_range
print (F1(0), F1(1), F1(10), F1(15), F1(20))
print (F2(0), F2(1), F2(10), F2(15), F2(20))

print ("Number of Entries passing F1:", len(where(data,F1)))
print ("Number of Entries passing F2:", len(where(data,F2)))

NameError: name 'inrange' is not defined

In [None]:
### BEGIN SOLUTION

def even(x):
    if x%2==0:
        return f"True, {x} is even"
    else:
        return f"False, {x} is not even"

def odd(x):
    if x%2!=0:
        return f"True, {x} is odd"
    else:
        return f"False, {x} is not odd"

def greater_than(value,x):
    if x>value:
        return f"True, {x} is greater than {value}"
    else:
        return f"False, {x} is not greater than {value}"

def less_than(value, x):
    if x<value:
        return f"True, {x} is less than {value}"
    else:
        return f"False, {x} is not less than {value}"

def equal(value, x):
    if x==value:
        return f"True, {x} is equal to {value}"
    else:
        return f"False, {x} is not equal to {value}"

def divisible_by(value, x):
    if x%value==0:
        return f"True, {x} is divisible by {value}"
    else:
        return f"False, {x} is not divisible by {value}"
    
### END SOLUTION

In [None]:
# Test your solution
print(even(6))
print(odd(3))
print(greater_than(4, 2))
print(less_than(7, 9))
print(equal(1, 1))
print(divisible_by(3, 9))

*Exercise 7:* Repeat the previous exercise using `lambda` and the built-in python functions sum and map instead of your solution above. 

In [None]:
### BEGIN SOLUTION

def even():
    return lambda x: x%2==0

def odd():
    return lambda x: x%2!=0

def greater_than(value):
    return lambda x: x>value

def less_than(value):
    return lambda x: x<value

def equal(value):
    return lambda x: x==value

def divisible_by(value):
    return lambda x: x%value==0        
    
### END SOLUTION

## Monte Carlo

*Exercise 7:* Write a "generator" function called `generate_function(func,x_min,x_max,N)`, that instead of generating a flat distribution, generates a distribution with functional form coded in `func`. Note that `func` will always be > 0.  

Use the test function below and your histogramming functions above to demonstrate that your generator is working properly.

Hint: A simple, but slow, solution is to a draw random number `test_x` within the specified range and another number `p` between the `min` and `max` of the function (which you will have to determine). If `p<=function(test_x)`, then place `test_x` on the output. If not, repeat the process, drawing two new numbers. Repeat until you have the specified number of generated numbers, `N`. For this problem, it's OK to determine the `min` and `max` by numerically sampling the function.  

In [None]:
def generate_function(func,x_min,x_max,N=1000):
    out = list()
    ### BEGIN SOLUTION
    
    min_func=float('inf')
    max_func=float('-inf')

    for x in range(x_min, x_max+1):
        value=func(x)
        if value<min_func:
            min_func=value
        if value>max_func:
            max_func=value
    
    while len(out)<N:
        test_x=random.uniform(x_min, x_max)
        p=random.uniform(min_func, max_func)
        if p<=func(test_x):
            out.append(test_x)
    
    ### END SOLUTION
    
    return out

In [None]:
# A test function
def test_func(x,a=1,b=1):
    return abs(a*x+b)
data=generate_function(test_func, 10, 20)
h,b=draw_histogram(data,20)
print(h,b)

*Exercise 8:* Use your function to generate 1000 numbers that are normal distributed, using the `gaussian` function below. Confirm the mean and variance of the data is close to the mean and variance you specify when building the Gaussian. Histogram the data. 

In [None]:
import math

def gaussian(mean, sigma):
    def f(x):
        return math.exp(-((x-mean)**2)/(2*sigma**2))/math.sqrt(math.pi*sigma)
    return f

# Example Instantiation
g1=gaussian(0,1)
g2=gaussian(10,3)

*Exercise 9:* Combine your `generate_function`, `where`, and `in_range` functions above to create an integrate function. Use your integrate function to show that approximately 68% of Normal distribution is within one variance.

In [89]:
def integrate(func, x_min, x_max, n_points=1000):
    out=generate_function(func, x_min, x_max, n_points)
    range_count=len(where(out, lambda x: in_range(x_min, x_max)))
    integral=(range_count/n_points)*(x_max-x_min)
    return integral

In [92]:
g1=gaussian(0,1)
integral=integrate(g1, -1, 1)
print(integral)

2.0
