# Lecture 21

## Scaling/Shifting

In the beginning of Lab 4 you are asked to take random numbers between 0 and 1 and scale and shift them to be between $x_{min}$ and $x_{max}$. The formula is pretty basic. If $x_0$ is between 0 and 1 then $x$ computed as:
$$
x= (x_{max}-x_{min}) x_0 + x_{min}
$$
will be between $x_{min}$ and $x_{max}$. 

In your solution, you'll most likely generate $x_0$ one by one, compute $x$, and store $x$ into a list to be returned from your function.
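
The loop described above can be sketched as follows. This is only one possible layout, not the required lab solution; the function name `scaled_randoms` and its argument order are assumptions made for illustration.

```python
import random

def scaled_randoms(n, x_min, x_max):
    """Return n random numbers scaled and shifted to lie between x_min and x_max."""
    out = []
    for _ in range(n):
        x_0 = random.random()                # uniform random number in [0, 1)
        x = (x_max - x_min) * x_0 + x_min    # scale by the range, then shift by x_min
        out.append(x)
    return out

values = scaled_randoms(1000, -2.0, 5.0)
print(min(values), max(values))              # both should lie between -2.0 and 5.0
```

Scaling first and shifting second mirrors the formula term by term, which makes the code easy to check against the math.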


## Mean/Variance

Also for Lab 4, remember the equations for mean/variance. If you have a data sample $\{x_1, x_2, \ldots, x_N\}$ the mean is:

$$ 
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i
$$

and the variance is:

$$
\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2
$$
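
A minimal sketch of these two formulas in plain Python, assuming the data sample is stored in a list with at least two entries. The function names `mean` and `variance` are chosen here for illustration and are not prescribed by the lab.

```python
def mean(x):
    """Sum of the sample divided by the number of entries N."""
    return sum(x) / len(x)

def variance(x):
    """Sum of squared deviations from the mean, divided by N - 1."""
    x_bar = mean(x)
    return sum((x_i - x_bar) ** 2 for x_i in x) / (len(x) - 1)

sample = [1.0, 2.0, 3.0, 4.0]
print(mean(sample))      # 2.5
print(variance(sample))  # 5/3, about 1.667
```

Note that the mean is computed once and reused, rather than recomputed inside the sum.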


## Min, Max, ArgMin, ArgMax

Consider a list of random numbers:

In [1]:
import random
data = [random.random() for _ in range(100)]

In [9]:
data

[0.3714741752528551,
 0.37020741561247006,
 0.09718712005733443,
 0.09728770147950605,
 0.5470211906309793,
 0.259093984499544,
 0.5091774103972622,
 0.920652063608099,
 0.3537106521831658,
 0.8535299241329874,
 0.47748365120492586,
 0.14978876272204655,
 0.9909624004708119,
 0.35808175734042025,
 0.36243527395838815,
 0.15613107448960362,
 0.698668470127755,
 0.9220726934442105,
 0.29242856159332087,
 0.32374396322373067,
 0.7532203624830164,
 0.5762385322802838,
 0.655406750338706,
 0.5445177420468859,
 0.961064591288397,
 0.6695614408021253,
 0.8584022777348199,
 0.9848181680704408,
 0.15368876578762924,
 0.8006577423638463,
 0.10662674509841952,
 0.13821264326949279,
 0.17868802707303,
 0.6941254048327877,
 0.752222492655567,
 0.3402314262904865,
 0.44143233599976495,
 0.01364884184799331,
 0.5870669169344229,
 0.28367240818207773,
 0.4127628265636737,
 0.7602866876586336,
 0.10472217169202647,
 0.3529531225918847,
 0.2263129461468424,
 0.1708973387416557,
 0.4778513111465653,
 0.8

You can find the largest and smallest numbers in the list with the built-in `max` and `min` functions:

In [2]:
max(data),min(data)

(0.9909624004708119, 0.004395744060134099)

It is convenient that `max` and `min` are available in Python, but let's think about how we would implement one of these functions:

In [3]:
def find_max(d):
    a_max=d[0]
    for e in d:
        if e>a_max:
            a_max=e
    return a_max

In [4]:
find_max(data)

0.9909624004708119

While `max` gives us the largest value, we may instead want to know which element in the list is the largest (i.e. the index of the largest value). This is where `argmax` comes in:

In [5]:
def find_argmax(d):
    a_max=d[0]
    i_max=0
    for i,e in enumerate(d):
        if e>a_max:
            a_max=e
            i_max=i
    return i_max

In [6]:
find_argmax(data)

12
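
For comparison, the same index can be obtained from the built-in `max` by searching over the indices instead of the values. This is a sketch of an alternative, not a replacement for the loop above; the name `find_argmax_builtin` is made up for this example.

```python
def find_argmax_builtin(d):
    # Take the max over the indices 0..len(d)-1, comparing them by the value stored there
    return max(range(len(d)), key=lambda i: d[i])

sample = [0.3, 0.9, 0.1, 0.9]
print(find_argmax_builtin(sample))   # 1, the first of the two tied maxima
print(sample.index(max(sample)))     # 1, same answer using two passes over the list
```

Both versions return the first index when the largest value appears more than once, which matches the behavior of a loop that only updates on a strict `>` comparison.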

## Numerical Manipulation of Mathematical Functions 

Recall that we can easily make a list of sequential integers using `range`.

In [10]:
list(range(5,20,3))

[5, 8, 11, 14, 17]

What if we wanted to do something similar but with non-integers, for example with a step size of 1/2:

In [12]:
list(range(5.,20.,.5))

TypeError: 'float' object cannot be interpreted as an integer

Let's implement what we need:

In [7]:
def arange(x_min,x_max,step_size=1.):
    if x_max<x_min:
        return list()
    
    x=x_min
    out = list()
    while x<x_max:
        out.append(x)
        x+=step_size
    return out
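
One thing to watch for with this approach is floating-point round-off: repeatedly adding a step such as `0.1` does not land exactly on the expected values, so the loop can produce one more point than intended. The sketch below repeats `arange` so that it runs on its own and compares it with a hypothetical index-based variant, `arange_by_index`, which computes each point directly from `x_min`. The variant is an illustration only; its point count still relies on a floating-point division.

```python
import math

def arange(x_min, x_max, step_size=1.):
    # Same accumulate-and-append loop as in the cell above
    if x_max < x_min:
        return list()
    x = x_min
    out = list()
    while x < x_max:
        out.append(x)
        x += step_size
    return out

def arange_by_index(x_min, x_max, step_size=1.):
    # Hypothetical variant: decide the number of points first, then compute each one
    if x_max < x_min:
        return list()
    n = math.ceil((x_max - x_min) / step_size)
    return [x_min + i * step_size for i in range(n)]

accumulated = arange(0., 1., 0.1)
print(len(accumulated))   # 11
print(accumulated[-1])    # 0.9999999999999999, just below x_max so it is kept

indexed = arange_by_index(0., 1., 0.1)
print(len(indexed))       # 10
```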

An alternative similar function is:

In [13]:
def linspace(x_min,x_max,steps=10):
    step_size=(x_max-x_min)/steps
    x=x_min
    out = list()
    for i in range(steps):
        out.append(x)
        x+=step_size
    return out

Now let's use what we wrote to investigate a mathematical function:

In [8]:
def a_function(x):
    return (1+x)**2

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
def find_min(f,x_min,x_max,steps=10):
    step_size=(x_max-x_min)/steps
    x=x_min
    y_min=f(x_min)
    x_min_val=x_min

    for i in range(steps):
        y=f(x)
        if y<y_min:
            x_min_val=x
            y_min=y
        x+=step_size
    
    return x_min_val

In [None]:
find_min(lambda x: (1+x)**2,-10.,10.)

## Histogram

In Lab 4 you are asked to write a histogram function:

* User inputs a list of values `x` and optionally `n_bins` which defaults to 10.
* If `x_min` and `x_max` are not supplied, find the minimum and maximum of the values in `x`.
* Determine the bin size (`bin_size`) by dividing the range of the values (`x_max - x_min`) by the number of bins.
* Create a list of zeros of size `n_bins`, call it `hist`.
* Loop over the values in `x`
    * Loop over the values in `hist` with index `i`:
        * If the value is between `x_min+i*bin_size` and `x_min+(i+1)*bin_size`, increment `hist[i]`.
        * For efficiency, try to use `continue` to skip to the next bin, and stop checking bins once a data point has been counted.
* Return `hist` and the corresponding list of bin edges (i.e. of `x_min+i*bin_size`).
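
The steps above can be sketched as follows. Treat this as one possible reading of the recipe rather than the expected lab answer. The keyword arguments `x_min` and `x_max`, the choice to return `n_bins + 1` edges, and the rule that the last bin also includes `x_max` are all assumptions made here.

```python
def histogram(x, n_bins=10, x_min=None, x_max=None):
    # Fall back to the data's own range when no limits are supplied
    if x_min is None:
        x_min = min(x)
    if x_max is None:
        x_max = max(x)

    bin_size = (x_max - x_min) / n_bins
    hist = [0] * n_bins

    for value in x:
        for i in range(n_bins):
            low = x_min + i * bin_size
            high = x_min + (i + 1) * bin_size
            is_last = (i == n_bins - 1)
            # Bins are half-open [low, high); the last bin also accepts x_max itself
            in_bin = (low <= value < high) or (is_last and low <= value <= x_max)
            if not in_bin:
                continue      # not this bin, try the next one
            hist[i] += 1
            break             # counted, move on to the next data point

    # Assumption: report both ends of every bin, giving n_bins + 1 edges
    edges = [x_min + i * bin_size for i in range(n_bins + 1)]
    return hist, edges

hist, edges = histogram([0.0, 0.1, 0.5, 0.9, 1.0], n_bins=2)
print(hist)    # [2, 3]
print(edges)   # [0.0, 0.5, 1.0]
```

Values outside the range from `x_min` to `x_max` are silently ignored in this sketch, which matters only when the limits are supplied by the caller.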




## Functional Programming

In Lab 3 you built a tic-tac-toe game by implementing a series of functions that performed various tasks, which you then combined in various ways to implement the game logic. What you wrote was a *structured program*, which consists of sequences of instructions, utilizing control flow (if/then/else), repetition (while and for), block structures, and function calls. 

*Functional Programming* is another style of programming that is not well suited to writing games, but is well suited to manipulating data. A functional program performs computation by evaluating mathematical functions, where the output depends only on the input. Data passes through as inputs/outputs of functions, but is otherwise never changed. This paradigm is often used in data science because manipulation of data can often be viewed as a composition of functions:

$$
D_{result} = f_n(f_{n-1}(...(f_0(D_{input}))))
$$
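
As a small illustration of this formula, the sketch below chains three functions so that the output of each becomes the input of the next. The helper `compose` and the functions `shift` and `square` are invented for this example, and none of them modifies the list it receives.

```python
from functools import reduce

def compose(*funcs):
    """Return a function that applies funcs[0] first, then funcs[1], and so on."""
    def composed(data):
        return reduce(lambda acc, f: f(acc), funcs, data)
    return composed

def shift(values):
    return [v + 1 for v in values]     # builds a new list, input is untouched

def square(values):
    return [v ** 2 for v in values]    # builds a new list, input is untouched

pipeline = compose(shift, square, sum) # equivalent to sum(square(shift(data)))

data_in = [0, 1, 2]
result = pipeline(data_in)
print(result)    # 14, from [1, 2, 3] -> [1, 4, 9] -> 14
print(data_in)   # [0, 1, 2], the input data is unchanged
```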

Consider the `find_min` example:

In [15]:
def a_function(x):
    return (1+x)**2

def find_min_0(f,x_min,x_max,steps=10):
    step_size=(x_max-x_min)/steps
    x=x_min
    y_min=f(x_min)
    x_min_val=x_min

    for i in range(steps):
        y=f(x)
        if y<y_min:
            x_min_val=x
            y_min=y
        x+=step_size
    
    return x_min_val

In [22]:
find_min_0(a_function,-10,10,100)

-1.000000000000002

Let's write the same thing in a more functional way by realizing that we can perform the same task as a composition of functions:

In [14]:
def a_range(x_min,x_max,steps=10):
    # Evenly spaced points starting at x_min; x_max itself is not included
    x=x_min
    step_size=(x_max-x_min)/steps
    out=list()
    while x<x_max:
        out.append(x)
        x+=step_size
    return out

def arg_min(lst):
    # Index of the smallest element in lst
    min_val=lst[0]
    min_index=0
    for i,val in enumerate(lst):
        if val<min_val:
            min_val=val
            min_index=i

    return min_index

def find_min(f,x_min,x_max,steps=10):
    # Compose the pieces: sample x, evaluate f at each point, locate the smallest y
    x_vals=a_range(x_min,x_max,steps)
    y_vals=list(map(f,x_vals))
    index=arg_min(y_vals)
    return x_vals[index]



In [45]:
find_min(a_function,-10,10,100)

-1.000000000000002

Note that `find_min` can be written as a single expression:

In [20]:
def find_min(f,x_min,x_max,steps=10):
    return a_range(x_min,x_max,steps)[arg_min(list(map(f,a_range(x_min,x_max,steps))))]


We could have implemented `a_range` and `arg_min` the same way, but using recursion instead of loops. For example, `a_range` becomes:

In [35]:
def a_range(x_min,x_max,steps=10):
    if steps>1:
        return [x_min] + a_range(x_min+((x_max-x_min)/steps),x_max,steps-1)
    else:
        return [x_min]
        

We are not going to write functions this way, but the idea is to get familiar with seeing data manipulations as a composition of functions.
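
For completeness, here is a sketch of how `arg_min` might look in the same recursive style. The name `arg_min_recursive` is made up for this example, and the list is assumed to be non-empty. Each call slices off the first element and copies the rest, and Python's default recursion limit (about 1000 nested calls) caps the list length, so this is for illustration only.

```python
def arg_min_recursive(lst):
    # Base case: a single element is its own minimum
    if len(lst) == 1:
        return 0
    # Index of the minimum of the tail, shifted by one to index into lst
    rest = arg_min_recursive(lst[1:]) + 1
    # Keep the first element on ties so the earliest minimum wins
    if lst[0] <= lst[rest]:
        return 0
    return rest

print(arg_min_recursive([3, 1, 2, 1]))   # 1
```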