# Project 4 - Polynomial Regression from Scratch

In this notebook, you will write code that will:
-  generate a dataset, 
-  model it using a polynomial regression, 
-  and find the optimal model parameters for it (writing your own optimization routine).

Every part is meant to be done "from scratch", so do not use sklearn (NumPy is fine).  
Note: this continues the exercise we started in class.  

In [13]:
import numpy as np
import sys
import matplotlib.pyplot as plt

## Dataset
Create a dataset of 1,000 one-dimensional X values and corresponding y values, each with a small amount of added uniform random noise (max magnitude 0.1). (So unlike the in-class example, which had a surface and 2 features, this is like the book example: a curve with 1 feature.)

-  make X go from -5 to 5 (plus noise)
-  make $y = 5 + 7x + 2x^2 - 0.5x^3$   (plus noise)
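
As a compact cross-check for the recipe above, the whole dataset can be built with vectorized NumPy calls. This is only a sketch: the seed, the use of `np.random.default_rng`, and the names `n_points`, `noise_max`, and `data` are assumptions made for illustration, and y is computed from the already-noisy x.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility
n_points = 1000
noise_max = 0.1

# evenly spaced x in [-5, 5], each nudged by uniform noise in [-0.1, 0.1)
x = np.linspace(-5.0, 5.0, n_points) + rng.uniform(-noise_max, noise_max, n_points)

# cubic from the assignment, evaluated on the noisy x, plus its own uniform noise
y = 5 + 7 * x + 2 * x**2 - 0.5 * x**3 + rng.uniform(-noise_max, noise_max, n_points)

# one row per sample: column 0 is x, column 1 is y
data = np.column_stack([x, y])
print(data.shape)
```

Note that `rng.uniform` draws from a half-open interval, so the upper bound 0.1 itself is excluded unless the bound is nudged upward.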

In [5]:
#Note: I'm making this pretty verbose because it helps a lot with debugging.
#Create our blank X dataset
x_values = np.linspace(-5.0,5.0,1000)

#be verbose
print("first 10 values of x_values:")
print("")
print(x_values[0:10])
print("")

#This function adds a number between -0.1 and 0.1 to numInput. random.uniform doesn't include the end number,
#and the MAX number is 0.1, so I bump the upper bound to the next float above 0.1 with np.nextafter.
#(Adding sys.float_info.min to 0.1 doesn't work: it is far too small to change 0.1 at all.)
def add_noise(numInput, verbose = False):
    
    lowest_add = -(0.1)
    highest_add = np.nextafter(0.1, 1.0)
    
    add_num = np.random.uniform(lowest_add, highest_add)
    
    if(verbose):
        print("added " + str(add_num) + " as noise")
        
    output = numInput + add_num
    return output

#add noise to our data, and only be verbose for part of the set.
for i in range(0, len(x_values)):
    
    if(i < 5):
        x_values[i] = add_noise(x_values[i], verbose=True)#be verbose
    else:
        x_values[i] = add_noise(x_values[i])

#be verbose
print("")
print("first 10 noisy x_values:")
print("")
print(x_values[0:10])
print("")
       
#This calculates an exact y value given an x. We'll add noise in a bit.
def calculate_y(x):
    y = ((5) + (7*x) + (2*(x**2)) + (-0.5*(x**3)))#maybe a little overkill on the parentheses
    return y
    
#Create a blank array that's the same length as x_values, and fill with zeroes for now.
y_values = np.zeros(len(x_values))

#calculate the value of y that goes in each slot of the array.
for i in range(0, len(x_values)):
    y_values[i] += calculate_y(x_values[i])
    
#be verbose
print("first 10 values of y_values:")
print("")
print(y_values[0:10])
print("")

#Add noise to our y_values. This won't affect much, as y values tend to be a LOT bigger than x values.
for i in range(0, len(y_values)):
    
    if(i < 5):
        y_values[i] = add_noise(y_values[i], verbose=True)#be verbose
    else:
        y_values[i] = add_noise(y_values[i])
        
#be verbose
print("")
print("first 10 noisy y_values;")
print("")
print(y_values[0:10])
print("")
        
#make our container for everything! Also append our arrays to it.       
matrix = []
matrix.append(x_values)
matrix.append(y_values)

#Be verbose
print("")
print("matrix shape is: " + str(np.shape(matrix)) + " and the [0][0] value is: " + str(matrix[0][0]))
print("")

print("now transposing matrix")
print("")

#put it in column-standard configuration, and print out an index that should stay the same to make sure it worked
matrix = np.transpose(matrix)

print("matrix shape is: " + str(np.shape(matrix)) + " and the [0][0] value is: " + str(matrix[0][0]))
print("")

first 10 values of x_values:

[-5.         -4.98998999 -4.97997998 -4.96996997 -4.95995996 -4.94994995
 -4.93993994 -4.92992993 -4.91991992 -4.90990991]

added 0.05687665735347158 as noise
added 0.09759005128513398 as noise
added 0.01023281253616451 as noise
added -0.007461573855120779 as noise
added -0.01504691882345463 as noise

first 10 noisy x_values:

[-4.94312334 -4.89239994 -4.96974717 -4.97743154 -4.97500688 -4.9369689
 -4.98281958 -4.98737766 -4.83075106 -4.92528407]

first 10 values of y_values:

[79.65836887 77.17556287 80.98091283 81.3651259  81.24377381 79.35454685
 81.63519021 81.86408446 74.22263395 78.77967058]

added 0.09684158252833383 as noise
added -0.0933985821782894 as noise
added 0.019314937686318714 as noise
added 0.05113621269848806 as noise
added 0.02712874030098894 as noise

first 10 noisy y_values;

[79.75521045 77.08216428 81.00022777 81.41626211 81.27090255 79.31211411
 81.67561844 81.76584321 74.29679342 78.80415005]


matrix shape is: (2, 1000) and the [

## Augment X
Write a function that adds new columns to the dataset of powers of X, up to and including $X^5$ (and don't forget the ones for the $\theta_0$ term).
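
One possible shape for such a function, as a sketch only: `np.vander` with `increasing=True` builds the columns $x^0, x^1, \dots, x^n$ in one call, so the ones column for $\theta_0$ comes for free. The name `augment_with_powers` and the three sample values are hypothetical.

```python
import numpy as np

def augment_with_powers(x, highest_power):
    """Return a matrix whose column k holds x**k, for k = 0..highest_power."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, highest_power + 1, increasing=True)

# small illustrative input, not the notebook's data
X_aug = augment_with_powers(np.array([-2.0, 0.5, 3.0]), 5)
print(X_aug.shape)
print(X_aug[0])
```

Unlike appending columns to the existing matrix, this returns only the power columns, so x and y stay separate from the features.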

In [9]:
#My interpretation is that I am adding columns of x in which x is raised to the nth power. These columns will
#be placeholders that get multiplied by our fifth-degree polynomial's weights later for testing. I don't see this 
#actually happening in real use cases, because it seems a bit expensive for large data sets.

#Grab our x_values array for easy reference and use here.
x_values = matrix[:,0]
                  
print("matrix shape:")
print(str(np.shape(matrix)))
print("")

#raises all values in an array to the specified power, and returns the new array
def raise_array_values_to_power(array, power):
    return_array = np.zeros(len(array))#make a new array to put everything in to stay safe
    for i in range(0, len(array)):
        return_array[i] += array[i]**power
    return return_array

#Raises all x values to n. Be careful not to corrupt the reference array! I'm using all this as a test of the power-raising
#function. The real function that can add n power columns comes below.
x_to_zero = raise_array_values_to_power(x_values, 0)
x_to_one = raise_array_values_to_power(x_values, 1)
x_to_two = raise_array_values_to_power(x_values, 2)
x_to_three = raise_array_values_to_power(x_values, 3)
x_to_four = raise_array_values_to_power(x_values, 4)
x_to_five = raise_array_values_to_power(x_values, 5)

#be verbose
print("first 5 values of y_values:")#our labels
print("")
print(y_values[0:5])
print("")
print("first 5 values of x_values:")
print(x_values[0:5])
print("")
print("first 5 values of x_to_zero")
print(x_to_zero[0:5])#should all be 1
print("")
print("first 5 values of x_to_one")
print(x_to_one[0:5])

#Now, if I wasn't using a function after this, I'd add these as COLUMNS on the "right" side of the matrix
#Desired Matrix Column Layout: x_values, y_values, x_to_zero, x_to_one, x_to_two, x_to_three, x_to_four, x_to_five

def add_column_powers(reference_array_to_raise, matrix_to_add_to, highest_power):
    
    for i in range(0, highest_power+1):#for each power, make a new array and slap it on the end.
        array = reference_array_to_raise
        array = raise_array_values_to_power(array, i)#make the array
        array = array[:, np.newaxis]#make it a column vector!
        matrix_to_add_to = np.concatenate([matrix_to_add_to, array], axis=1)#Stick it on the end of the matrix
   
    return matrix_to_add_to
    
newMatrix = add_column_powers(x_values, matrix, 5)

#be verbose
print("")
print("newMatrix shape")
print(np.shape(newMatrix))
print("")
print("first row of matrix:")
print("")
print(newMatrix[0])

matrix shape:
(1000, 8)

first 5 values of y_values:

[79.75521045 77.08216428 81.00022777 81.41626211 81.27090255]

first 5 values of x_values:
[-4.94312334 -4.89239994 -4.96974717 -4.97743154 -4.97500688]

first 5 values of x_to_zero
[1. 1. 1. 1. 1.]

first 5 values of x_to_one
[-4.94312334 -4.89239994 -4.96974717 -4.97743154 -4.97500688]

newMatrix shape
(1000, 14)

first row of matrix:

[-4.94312334e+00  7.97552105e+01  1.00000000e+00 -4.94312334e+00
  2.44344684e+01 -1.20782591e+02  5.97043245e+02 -2.95125840e+03
  1.00000000e+00 -4.94312334e+00  2.44344684e+01 -1.20782591e+02
  5.97043245e+02 -2.95125840e+03]


## Fit a Polynomial Regression model to the training data
Assume that we have a polynomial regression model
\begin{equation*}
y(X;\theta) = \theta_0 + \theta_1 x + \theta_2 x^2 +\theta_3 x^3 +\theta_4 x^4 +\theta_5 x^5 
\end{equation*}

Assume that we're using a mean squared error function.
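
Written out under that assumption, with $m$ training examples and $\mathbf{x}^{(i)} = (1, x_i, x_i^2, \dots, x_i^5)$ the augmented row for example $i$, the cost and its gradient take the standard form below. The symbols $m$ and $X$ (the matrix whose rows are the $\mathbf{x}^{(i)}$) are notation introduced here for illustration.

\begin{equation*}
\mathrm{MSE}(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(\theta^T\mathbf{x}^{(i)} - y^{(i)}\right)^2,
\qquad
\nabla_\theta\,\mathrm{MSE}(\theta) = \frac{2}{m}X^T\left(X\theta - y\right)
\end{equation*}

A gradient descent step with step size $\alpha$ then moves against the gradient: $\theta \leftarrow \theta - \alpha\,\nabla_\theta\,\mathrm{MSE}(\theta)$.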

-  Find the optimal value of theta (i.e. $\theta_0$ through $\theta_5$). Note: refer to Ch. 4, Eqns. 4.6 and 4.7.

-  Try this for a variety of different values of alpha.  Note how long it takes for the optimization to converge (or if it doesn't).
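
One way to carry out the optimization is batch gradient descent on the mean squared error. The sketch below assumes that approach; `mse_cost`, `gradient_descent`, the stopping tolerance, and the toy quadratic data are all illustrative choices, not part of the assignment. It returns the cost history so that different alpha values can be compared.

```python
import numpy as np

def mse_cost(X, y, theta):
    """Mean squared error of the predictions X @ theta against y."""
    residual = X @ theta - y
    return float(np.mean(residual ** 2))

def gradient_descent(X, y, alpha, n_iters=1000, tol=1e-10):
    """Batch gradient descent from theta = 0; returns (theta, cost history)."""
    m, n = X.shape
    theta = np.zeros(n)
    history = [mse_cost(X, y, theta)]
    for _ in range(n_iters):
        grad = (2.0 / m) * X.T @ (X @ theta - y)  # gradient of the MSE
        theta = theta - alpha * grad              # step against the gradient
        cost = mse_cost(X, y, theta)
        if not np.isfinite(cost):                 # overflowed: stop early
            break
        history.append(cost)
        if abs(history[-2] - cost) < tol:         # cost stopped changing
            break
    return theta, history

# toy problem (assumed for illustration): y = 1 + 2x - 3x^2 on [-1, 1], no noise
x_toy = np.linspace(-1.0, 1.0, 200)
X_toy = np.vander(x_toy, 3, increasing=True)
y_toy = 1 + 2 * x_toy - 3 * x_toy**2

theta_small, hist_small = gradient_descent(X_toy, y_toy, alpha=0.5)
theta_large, hist_large = gradient_descent(X_toy, y_toy, alpha=1.5, n_iters=200)
print(theta_small, len(hist_small))
print(hist_large[0], hist_large[-1])  # alpha = 1.5 is too large here: the cost grows
```

On the notebook's own data the $x^5$ column reaches magnitudes in the thousands, so the range of stable alpha values is likely to be far smaller than on this toy problem, and rescaling the feature columns may be needed for convergence in a reasonable number of steps.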


In [12]:
#calculate the mean squared error between two vectors. (arrays.)
def mse(guess, truth):
    error = 0
    for i in range(0, len(guess)):
        error += (guess[i] - truth[i])**2#square each difference so positive and negative errors don't cancel
    error = error*(1/len(guess))
    return error

#This will create an array of evenly spaced values from which to start searching.
def get_starting_locations_1D(matrix, split_num):

    x_values = matrix[:,0]

    #Find out how long the splits are (integer division, because the result is used as an index)
    split_length = len(x_values)//split_num 

    #create an array of values from which to start evenly spaced
    x_splits = np.zeros(split_num)
    
    for i in range(0, split_num):
        x_splits[i] = x_values[i*split_length]
        
    return x_splits
            

#assumes that the input matrix is in our format (x, y, then the power columns), and returns best theta weights for the powers of x.
def calculate_best_theta(matrix):
    
    x_values = matrix[:,0]
    y_values = matrix[:,1]
    
    #number of power columns: subtract 2 because of the x and y columns
    amount_of_powers = len(matrix[0]) - 2
    
    #make our theta array and populate it with zeros.
    theta = np.zeros(amount_of_powers) 
    
    #get our starting locations. I chose to have five.
    x_splits = get_starting_locations_1D(matrix, 5)
    
    #TODO: the search from each starting location is not written yet, so theta is still all zeros here.
    return theta

## Plot the Final Model
Make a plot showing the (X,y) data points of the training set, and superimpose the line for the model on the same plot.
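
A minimal sketch of such a plot, assuming theta is ordered so that `theta[k]` multiplies $x^k$. The helper name `plot_model` is hypothetical, and the demo reuses the generating coefficients as a stand-in for a fitted theta.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_model(x, y, theta, ax=None):
    """Scatter the (x, y) points and overlay the polynomial defined by theta."""
    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(x, y, s=4, alpha=0.4, label="training data")
    # evaluate the model on a dense, sorted grid so the curve is smooth
    x_line = np.linspace(np.min(x), np.max(x), 200)
    y_line = np.vander(x_line, len(theta), increasing=True) @ theta
    ax.plot(x_line, y_line, color="red", label="model")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    return ax

# illustrative data only: noise-free points from the assignment's cubic
x_demo = np.linspace(-5.0, 5.0, 50)
theta_demo = np.array([5.0, 7.0, 2.0, -0.5])
y_demo = np.vander(x_demo, 4, increasing=True) @ theta_demo
ax = plot_model(x_demo, y_demo, theta_demo)
```

Call `plt.show()` afterwards if the figure does not render automatically.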

## Different Model Degrees
Try the model for different polynomial degrees n, specifically n = 2, 5, and 10. Plot the resulting models.
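
To preview roughly what each degree should achieve, the sketch below fits the three degrees with `np.linalg.lstsq`. This is a swapped-in least-squares solver, not the hand-written optimization routine the assignment asks for, and it is meant only as a reference to compare against. The data regeneration, the seed, and the names `fit_polynomial_lstsq` and `model_mse` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
x = np.linspace(-5.0, 5.0, 1000)
y = 5 + 7 * x + 2 * x**2 - 0.5 * x**3 + rng.uniform(-0.1, 0.1, x.size)

def fit_polynomial_lstsq(x, y, degree):
    """Least-squares theta for columns x**0 .. x**degree (reference fit only)."""
    X = np.vander(x, degree + 1, increasing=True)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def model_mse(x, y, theta):
    """Mean squared error of the polynomial with coefficients theta."""
    X = np.vander(x, len(theta), increasing=True)
    return float(np.mean((X @ theta - y) ** 2))

results = {}
for degree in (2, 5, 10):
    theta = fit_polynomial_lstsq(x, y, degree)
    results[degree] = (theta, model_mse(x, y, theta))
    print(degree, results[degree][1])
```

Because the generating curve is cubic, a degree-2 model cannot capture it and should show a much larger error than the degree-5 and degree-10 fits.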