# Project 4 - Polynomial Regression from Scratch

In this notebook, you will write code that will:
-  generate a dataset, 
-  model it using a polynomial regression, 
-  and find the optimal model parameters for it (writing your own optimization routine).

All parts of this project are meant to be done "from scratch", so do not use sklearn (numpy is fine).  
Note: this is a continuation of the exercise we started in class previously.  

In [148]:
import numpy as np
import sys

## Dataset
Create a dataset with 1,000 1-D X values and corresponding y values, each with a small amount of added uniform random noise (max value = 0.1).  (Unlike the in-class example, which had a surface and 2 features, this is like the book example: a curve with 1 feature.)

-  make X go from -5 to 5 (plus noise)
-  make $y = 5 + 7x + 2x^2 - 0.5x^3$   (plus noise)

In [183]:
#Note: I'm making this pretty verbose because it helps a lot with debugging.
#Create the evenly spaced, noise-free x values
x_values = np.linspace(-5.0,5.0,1000)

#be verbose
print("first 10 values of x_values:")
print("")
print(x_values[0:10])
print("")

#This function adds a number between -0.1 and 0.1 to numInput. np.random.uniform samples from the half-open
#interval [low, high), so the upper bound is moved to the next representable float above 0.1; that makes
#0.1 itself a possible draw, since the MAX noise value is 0.1.
def add_noise(numInput, verbose = False):
    
    lowest_add = -(0.1)
    highest_add = np.nextafter(0.1, 1.0)
    
    add_num = np.random.uniform(lowest_add, highest_add)
    
    if(verbose):
        print("added " + str(add_num) + " as noise")
        
    output = numInput + add_num
    return output

#add noise to our data, and only be verbose for part of the set.
for i in range(0, len(x_values)):
    
    if(i < 5):
        x_values[i] = add_noise(x_values[i], verbose=True)#be verbose
    else:
        x_values[i] = add_noise(x_values[i])

#be verbose
print("")
print("first 10 noisy x_values:")
print("")
print(x_values[0:10])
print("")
       
#This calculates an exact y value given an x. We'll add noise in a bit.
def calculate_y(x):
    y = ((5) + (7*x) + (2*(x**2)) + (-0.5*(x**3)))#maybe a little overkill on the parentheses
    return y
    
#Create a blank array that's the same length as x_values, and fill with zeroes for now.
y_values = np.zeros(len(x_values))

#calculate the value of y that goes in each slot of the array.
for i in range(0, len(x_values)):
    y_values[i] += calculate_y(x_values[i])
    
#be verbose
print("first 10 values of y_values:")
print("")
print(y_values[0:10])
print("")

#Add noise to our y_values. This won't affect much, as y values tend to be a LOT bigger than x values.
for i in range(0, len(y_values)):
    
    if(i < 5):
        y_values[i] = add_noise(y_values[i], verbose=True)#be verbose
    else:
        y_values[i] = add_noise(y_values[i])
        
#be verbose
print("")
print("first 10 noisy y_values:")
print("")
print(y_values[0:10])
print("")
        
#make our container for everything! Also append our arrays to it.       
matrix = []
matrix.append(x_values)
matrix.append(y_values)

#Be verbose
print("")
print("matrix shape is: " + str(np.shape(matrix)) + " and the [0][0] value is: " + str(matrix[0][0]))
print("")

print("now transposing matrix")
print("")

#transpose so that each row is one sample and the columns are (x, y); print an entry that should stay the same to make sure it worked
matrix = np.transpose(matrix)

print("matrix shape is: " + str(np.shape(matrix)) + " and the [0][0] value is: " + str(matrix[0][0]))
print("")

first 10 values of x_values:

[-5.         -4.98998999 -4.97997998 -4.96996997 -4.95995996 -4.94994995
 -4.93993994 -4.92992993 -4.91991992 -4.90990991]

added -0.04014074904760592 as noise
added 0.03384012863295921 as noise
added 0.04750877934612252 as noise
added 0.013394016114333399 as noise
added 0.07863653426609693 as noise

first 10 noisy x_values:

[-5.04014075 -4.95614986 -4.9324712  -4.95657595 -4.88132343 -5.00756595
 -5.01264848 -4.87451333 -5.01612164 -4.82082778]

first 10 values of y_values:

[84.54244732 80.30379277 79.13296298 80.32495838 76.6397965  82.8826247
 83.1402691  76.31153078 83.31661402 73.75390232]

added -0.022398825036718015 as noise
added 0.010068683787437901 as noise
added 0.021359856879399897 as noise
added 0.06717950372088366 as noise
added -0.07940396424722139 as noise

first 10 noisy y_values:

[84.5200485  80.31386145 79.15432284 80.39213789 76.56039253 82.86775276
 83.19319144 76.21617264 83.3143578  73.78396463]


matrix shape is: (2, 1000) and th
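
For comparison, the same dataset specification can be sketched without explicit loops. This is only an illustration under assumptions: the function name `make_dataset`, its parameters, and the fixed seed are hypothetical and not part of the notebook above. It follows the same order of operations as the cell above (noise on x first, then y computed from the noisy x, then noise on y).

```python
import numpy as np

def make_dataset(m=1000, noise=0.1, seed=0):
    """Return noisy x and y arrays of length m for y = 5 + 7x + 2x^2 - 0.5x^3."""
    rng = np.random.default_rng(seed)  # seeded only to make the sketch repeatable
    # Evenly spaced x in [-5, 5], plus one uniform noise draw per element.
    x = np.linspace(-5.0, 5.0, m) + rng.uniform(-noise, noise, size=m)
    # Evaluate the cubic on the noisy x, then add independent noise to y.
    y = 5 + 7 * x + 2 * x**2 - 0.5 * x**3
    y = y + rng.uniform(-noise, noise, size=m)
    return x, y

x, y = make_dataset()
data = np.column_stack((x, y))  # one row per sample, columns are (x, y)
print(data.shape)
```

`np.column_stack` builds the one-row-per-sample layout directly, so no transpose step is needed.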

## Augment X
Write a function that adds new columns to the dataset of powers of X, up to and including $X^5$ (and don't forget the ones for the $\theta_0$ term).

In [179]:
#My interpretation is that I am adding columns in which x is raised to the nth power. These columns will
#later be multiplied by our fifth-degree polynomial's weights for testing. I don't see this
#actually happening in real use cases, because it seems a bit expensive for large data sets.

#Grab our x_values array for easy reference and use here.
x_values = matrix[:,0]

#raises all values in an array to the specified power, and returns the new array
def raise_array_values_to_power(array, power):
    return_array = np.zeros(len(array))#make a new array to put everything in to stay safe
    for i in range(0, len(array)):
        return_array[i] += array[i]**power
    return return_array#return the new array, not the input

#raises all x values to n. Be careful not to corrupt reference array!
x_to_zero = raise_array_values_to_power(x_values, 0)
x_to_one = raise_array_values_to_power(x_values, 1)
x_to_two = raise_array_values_to_power(x_values, 2)
x_to_three = raise_array_values_to_power(x_values, 3)
x_to_four = raise_array_values_to_power(x_values, 4)
x_to_five = raise_array_values_to_power(x_values, 5)

#be verbose
print("first 5 values of x_values:")
print(x_values[0:5])
print("")
print("first 5 values of x_to_zero")
print(x_to_zero[0:5])#should all be 1

first 5 values of x_values:
[-4.95929687 -5.0471254  -4.94798953 -5.02139682 -5.02132325]

first 5 values of x_to_zero
[1. 1. 1. 1. 1.]
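
The augmentation step can also be written as a single function that returns all the power columns at once, including the column of ones for $\theta_0$. This is a minimal sketch under assumptions: the name `augment` and its `degree` parameter are hypothetical, and the three input values are made up for illustration.

```python
import numpy as np

def augment(x, degree=5):
    """Return an (m, degree + 1) matrix whose column k holds x**k."""
    x = np.asarray(x, dtype=float)
    # Broadcasting: shape (m, 1) raised to shape (degree + 1,) gives (m, degree + 1).
    # Column 0 is x**0, i.e. the ones that multiply theta_0.
    return x[:, np.newaxis] ** np.arange(degree + 1)

X_aug = augment(np.array([-2.0, 0.5, 3.0]))
print(X_aug.shape)   # (3, 6)
print(X_aug[:, 0])   # [1. 1. 1.]
```

`np.vander(x, degree + 1, increasing=True)` produces the same matrix, if using that numpy helper counts as "from scratch" for this project.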


## Fit a Polynomial Regression model to the training data
Assume that we have a polynomial regression model
\begin{equation*}
y(X;\theta) = \theta_0 + \theta_1 x + \theta_2 x^2 +\theta_3 x^3 +\theta_4 x^4 +\theta_5 x^5 
\end{equation*}

Assume that we're using a mean squared error function.
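
For reference, these are the standard matrix forms of the mean squared error, its gradient, and the gradient descent update. The notation is assumed here rather than taken from the text above: $X$ is the augmented $m \times 6$ matrix (one row $x^{(i)}$ per sample, one column per power of $x$), $y^{(i)}$ are the $m$ observed targets, and $\alpha$ is the learning rate.
\begin{equation*}
\mathrm{MSE}(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(\theta^{T}x^{(i)} - y^{(i)}\right)^{2}, \qquad
\nabla_{\theta}\,\mathrm{MSE}(\theta) = \frac{2}{m}\,X^{T}\left(X\theta - y\right), \qquad
\theta \leftarrow \theta - \alpha\,\nabla_{\theta}\,\mathrm{MSE}(\theta)
\end{equation*}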

-  Find the optimal value of theta (i.e., $\theta_0$ through $\theta_5$). Note: refer to Ch. 4, Eqs. 4.6 and 4.7.

-  Try this for a variety of different values of alpha (the learning rate).  Note how long it takes for the optimization to converge (or whether it converges at all).
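
One possible way to approach both bullets is plain batch gradient descent on the MSE. The sketch below is not a reference solution, and it makes several assumptions that are not in the text above: the function name `gradient_descent`, the stopping tolerance, the iteration cap, the divergence threshold, and the three alpha values are all hypothetical choices. It also standardizes the power columns before fitting (an added step), because raw powers up to $x^5$ differ in scale by several orders of magnitude and make gradient descent impractically slow; the fitted weights are mapped back to the raw powers at the end.

```python
import numpy as np

def gradient_descent(X, y, alpha, tol=1e-6, max_iters=100_000):
    """Batch gradient descent on the MSE. Returns (theta, iterations, converged)."""
    m, n = X.shape
    theta = np.zeros(n)
    for it in range(1, max_iters + 1):
        # Gradient of the MSE: (2/m) * X^T (X theta - y)
        gradient = X.T @ (X @ theta - y) * (2.0 / m)
        grad_norm = np.linalg.norm(gradient)
        if not np.isfinite(grad_norm) or grad_norm > 1e12:
            return theta, it, False        # blew up: alpha is too large
        if grad_norm < tol:
            return theta, it, True         # gradient is (nearly) zero
        theta = theta - alpha * gradient   # gradient descent step
    return theta, max_iters, False         # ran out of iterations

# Hypothetical stand-in for the notebook's dataset (same recipe, fixed seed).
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 1000) + rng.uniform(-0.1, 0.1, 1000)
y = 5 + 7 * x + 2 * x**2 - 0.5 * x**3 + rng.uniform(-0.1, 0.1, 1000)

# Columns x^1 .. x^5, standardized (assumption), plus the column of ones.
powers = x[:, np.newaxis] ** np.arange(1, 6)
mu, sigma = powers.mean(axis=0), powers.std(axis=0)
X = np.column_stack((np.ones_like(x), (powers - mu) / sigma))

results = {}
for alpha in (1.0, 0.1, 0.01):
    theta_std, iters, converged = gradient_descent(X, y, alpha)
    results[alpha] = (theta_std, iters, converged)
    print(f"alpha={alpha}: iterations={iters}, converged={converged}")

# Map the alpha=0.1 weights back to coefficients of the raw powers of x.
theta_std = results[0.1][0]
theta = np.empty(6)
theta[1:] = theta_std[1:] / sigma
theta[0] = theta_std[0] - np.sum(theta_std[1:] * mu / sigma)
print(np.round(theta, 3))  # compare against 5, 7, 2, -0.5, 0, 0
```

The design choice worth noting is the three-way return value: an alpha that is too large is reported as diverged after a few hundred steps, while an alpha that is too small simply exhausts the iteration budget, which makes the two failure modes easy to tell apart when comparing learning rates.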


## Plot the Final Model
Make a plot showing the (X,y) data points of the training set, and superimpose the line for the model on the same plot.
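
A sketch of how the plot could be assembled, assuming the fitted weights are stored lowest power first. The helper names `model_curve` and `plot_fit` are hypothetical, and `ax` is assumed to be a matplotlib `Axes` object created by the caller.

```python
import numpy as np

def model_curve(theta, x_min, x_max, num=200):
    """Evaluate sum_k theta[k] * x**k on an even grid. Returns (grid, y_hat)."""
    theta = np.asarray(theta, dtype=float)
    grid = np.linspace(x_min, x_max, num)
    # Same augmentation as for the training data, applied to the grid points.
    y_hat = (grid[:, np.newaxis] ** np.arange(theta.size)) @ theta
    return grid, y_hat

def plot_fit(ax, x, y, theta):
    """Scatter the training points on `ax` and draw the model curve over them."""
    ax.scatter(x, y, s=4, alpha=0.4, label="training data")
    grid, y_hat = model_curve(theta, np.min(x), np.max(x))
    ax.plot(grid, y_hat, color="red", label="model")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
```

In a notebook cell, `fig, ax = plt.subplots()` followed by `plot_fit(ax, x_values, y_values, theta)` would draw the figure, where `plt` is `matplotlib.pyplot` and `theta` is whatever weight vector the fit produced. Evaluating the model on a dense, sorted grid rather than on the noisy training x values keeps the curve smooth.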

## Different Model Degrees
Try the model for different polynomial degrees n, specifically n = 2, 5, and 10.  Plot the resulting models.
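
To get a quick feel for how the degree changes the fit, the sketch below compares the training error for the three degrees. Note the swap: it uses numpy's closed-form least-squares solver `np.linalg.lstsq` instead of a hand-written optimizer, purely to keep the comparison short, so it does not replace the optimization routine the project asks for. The helper names, the fixed seed, and the rescaling of x by its largest absolute value (which keeps $x^{10}$ numerically manageable) are assumptions.

```python
import numpy as np

def fit_polynomial(x, y, degree, scale):
    """Least-squares weights for powers of (x / scale), lowest power first."""
    X = (x[:, np.newaxis] / scale) ** np.arange(degree + 1)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def predict(theta, x, scale):
    """Evaluate the fitted polynomial at x, using the same scaling as the fit."""
    return ((x[:, np.newaxis] / scale) ** np.arange(len(theta))) @ theta

# Hypothetical stand-in for the notebook's dataset (same recipe, fixed seed).
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 1000) + rng.uniform(-0.1, 0.1, 1000)
y = 5 + 7 * x + 2 * x**2 - 0.5 * x**3 + rng.uniform(-0.1, 0.1, 1000)

scale = np.abs(x).max()  # every power of x / scale stays within [-1, 1]
mse = {}
for n in (2, 5, 10):
    theta = fit_polynomial(x, y, n, scale)
    mse[n] = np.mean((predict(theta, x, scale) - y) ** 2)
    print(f"degree {n}: training MSE = {mse[n]:.5f}")
```

Because the data come from a cubic, a degree-2 model cannot capture the curve and its error stays large, while degrees 5 and 10 both contain the cubic as a special case. Training error alone cannot show whether the degree-10 model is overfitting; that would need held-out data.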