<a href="https://colab.research.google.com/github/iaagulo/Machine-Learning-Basics/blob/main/Ch02_MultivariateLR.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#**2.0 Gradient Descent**

##**2.5 Functions to be used for this chapter**

This function computes the cost function, $J(\theta_0,\theta_1,...,\theta_n)$, with $x$, $y$, and $\theta$ as input variables and $Jcost$ as the output variable.

In [2]:
def costfunction(x,y,theta):
  m = len(y)
  Jcost = (1/(2*m))*np.sum(np.power((np.dot(x,theta)-y.reshape(m,1)),2))
  return Jcost
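
As a quick sanity check, here is a minimal sketch that evaluates the same cost formula on a tiny data set. The arrays `x_toy` and `y_toy` are made up for illustration and are not part of the car data. A parameter vector that fits the data exactly should give zero cost, while a poor guess should give a larger one.

```python
import numpy as np

def costfunction(x, y, theta):
    # Same formula as above: J = (1/(2m)) * sum((x.theta - y)^2)
    m = len(y)
    return (1/(2*m))*np.sum(np.power((np.dot(x, theta) - y.reshape(m, 1)), 2))

# Made-up toy data: a column of ones plus one feature, with y equal to that feature
x_toy = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_toy = np.array([1.0, 2.0, 3.0])

J_perfect = costfunction(x_toy, y_toy, np.array([[0.0], [1.0]]))  # theta that fits exactly
J_zero = costfunction(x_toy, y_toy, np.array([[0.0], [0.0]]))     # theta that predicts 0 everywhere
print(J_perfect, J_zero)
```

For the all-zero guess, the cost works out by hand to $(1^2+2^2+3^2)/(2\cdot 3) = 14/6$.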

This function computes the new values of the parameter $\theta$ using the gradient descent method. It also records the value of the cost function after each epoch.

In [3]:
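def gradient_descent(x,y,theta,a,epoch):
  # cost after each epoch (sized by the epoch argument, not a global variable)
  J = np.zeros(epoch,dtype=float)
  m = len(y)
  for k in range(epoch):
    # gradient of the cost function with respect to theta
    grad_theta = (1/m)*np.dot(np.transpose(x),((np.dot(x,theta)-y.reshape(m,1))))
    # step against the gradient, scaled by the learning rate a
    theta = theta - a*grad_theta
    J[k] = costfunction(x,y,theta)
  # return the updated parameters together with the cost history
  return theta,J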
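
To see the update rule at work, here is a minimal, self-contained sketch on made-up data. The name `gradient_descent_sketch`, the toy arrays, the learning rate of 0.1, and the 2000 epochs are all assumptions chosen for illustration. The sketch returns the updated $\theta$ and the cost per epoch.

```python
import numpy as np

def gradient_descent_sketch(x, y, theta, a, epoch):
    # Hypothetical stand-alone variant: same update rule, returns the updated theta
    m = len(y)
    y = y.reshape(m, 1)
    J = np.zeros(epoch)
    for k in range(epoch):
        grad = (1/m) * x.T @ (x @ theta - y)            # gradient of the cost
        theta = theta - a * grad                        # gradient descent step
        J[k] = (1/(2*m)) * np.sum((x @ theta - y)**2)   # cost after the step
    return theta, J

# Made-up toy data generated from y = 1 + 2x, with a leading column of ones
x_toy = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_toy = np.array([1.0, 3.0, 5.0, 7.0])

theta_fit, J_hist = gradient_descent_sketch(x_toy, y_toy, np.zeros((2, 1)), 0.1, 2000)
print(theta_fit.ravel())
print(J_hist[0], J_hist[-1])
```

Because the toy data are noise-free, the fitted parameters should approach $\theta_0 = 1$ and $\theta_1 = 2$, and the cost should shrink from the first epoch to the last.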

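This function rescales a feature by subtracting its mean and dividing by its standard deviation, so that the rescaled values have zero mean and unit standard deviation.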

In [4]:
def FeatureScale(X):
  Xmean = np.mean(X)
  Xstd = np.std(X)
  Xfeat = (X - Xmean)/Xstd
  return Xfeat
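
As a small illustration, the sketch below applies the same scaling to three made-up values and checks the resulting mean and standard deviation. The array `weights` is hypothetical and only serves as an example.

```python
import numpy as np

def FeatureScale(X):
    # Same scaling as above: subtract the mean, divide by the standard deviation
    Xmean = np.mean(X)
    Xstd = np.std(X)
    return (X - Xmean)/Xstd

weights = np.array([2000.0, 3000.0, 4000.0])  # made-up values
scaled = FeatureScale(weights)
print(scaled)
print(scaled.mean(), scaled.std())
```

The middle value equals the mean, so it maps to zero, and the other two values map to equal and opposite numbers.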

##**2.4 Multivariate Linear Regression**

####**2.4.1 Importing the modules**

We first import the necessary modules.

In [5]:
import numpy as np
import matplotlib.pyplot as plt

import pandas as pd

from sklearn import linear_model

####**2.4.2 Importing the Data**

The data that we need is stored in Google Drive, so we first need to allow Google Colab to access it.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive/', force_remount=True)

In [None]:
ls

In [None]:
cd gdrive/MyDrive/'Colab Notebooks'/Book

In [None]:
ls

####**2.4.3 Removal of non-numerical elements**

We now read the data using the pandas module.

In [10]:
data = pd.read_csv('carbig.csv', index_col=None)

The data describes how the miles per gallon (MPG) of a car relates to its various characteristics, such as the number of cylinders, the displacement (cubic in.), the horsepower, the weight (lbs.), and the acceleration (from 0 to 60 mph in seconds). Thus, each of these characteristics is an independent variable and the MPG is the dependent variable. We now have a data set with multiple features, and these five features determine the car's performance in terms of MPG.<br>
This command displays the first five rows of the data set.

In [None]:
data.head()

Let's clean the data by removing the columns that are non-numerical or that we will not use as features.

In [None]:
data.drop(['Model','Origin','Model_Year','cyl4','org','when','Mfg'], axis=1, inplace=True)
data.head()

We rearrange the columns so that the five features come first and the dependent variable, MPG, comes last.

In [None]:
cols = ['Acceleration','Cylinders','Displacement','Horsepower','Weight','MPG']
data = data[cols]
data.head()

Now, let's remove rows with values of NaN. We also determine the number of rows and columns of the data set and assign these numbers to variables $m$ and $n$, respectively.

In [14]:
data = data.dropna().reset_index(drop=True)
[m,n] = data.shape
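
The effect of this step can be seen on a small example. The sketch below uses a made-up DataFrame named `toy`, with hypothetical values, to show that `dropna` removes every row containing a NaN and that `reset_index(drop=True)` renumbers the remaining rows from zero.

```python
import numpy as np
import pandas as pd

# Made-up data: rows 1 and 2 each contain one missing value
toy = pd.DataFrame({
    'Horsepower': [130.0, np.nan, 150.0, 95.0],
    'MPG': [18.0, 15.0, np.nan, 24.0],
})

clean = toy.dropna().reset_index(drop=True)
m_toy, n_toy = clean.shape   # number of rows and number of columns
print(clean)
print(m_toy, n_toy)
```

Only the two complete rows remain, so the number of rows drops from four to two while the number of columns is unchanged.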

Plot the MPG against each of the five features.

In [None]:
fig, axs = plt.subplots(1,5,figsize=(20,4))

# one panel per feature, with MPG on the vertical axis
features = ['Acceleration','Cylinders','Displacement','Horsepower','Weight']
for ax, feature in zip(axs, features):
  ax.plot(data[feature],data.MPG,'o')
  ax.set_xlabel(feature)
  ax.set_ylabel('MPG')

####**2.4.4 Feature Scaling**

In [None]:
columns = ['Acceleration','Cylinders','Displacement','Horsepower','Weight']
data[columns] = data[columns].apply(FeatureScale)
print(data)
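
Note that `apply` passes one column at a time to the function, so each feature is scaled with its own mean and standard deviation, and MPG is left untouched. Here is a minimal sketch of this behavior on a made-up DataFrame. The values in `toy` are hypothetical.

```python
import numpy as np
import pandas as pd

def FeatureScale(X):
    # Same scaling as defined earlier in the chapter
    return (X - np.mean(X))/np.std(X)

# Made-up data with two features on very different scales
toy = pd.DataFrame({
    'Horsepower': [100.0, 150.0, 200.0],
    'Weight': [2000.0, 3000.0, 4000.0],
    'MPG': [30.0, 20.0, 10.0],
})

feature_cols = ['Horsepower', 'Weight']
toy[feature_cols] = toy[feature_cols].apply(FeatureScale)  # scales column by column
print(toy)
```

After scaling, both features lie on the same scale, which helps gradient descent take comparably sized steps along each parameter.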

####**2.4.5 Adding the bias column**

Let's add the column that multiplies the parameter $\theta_0$. We'll call this column the Bias column, for reasons that will become clear later when we discuss neural networks. The values of this column, $x_0$, are all ones. It needs to have the same number of rows as the variable $data$.

In [None]:
data.insert(loc=0,column='Bias',value=1)
print(data)
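
With the column of ones in place, the prediction for every row becomes a single matrix product, because $\theta_0$ is multiplied by $x_0 = 1$. Below is a minimal sketch on made-up numbers. The DataFrame `toy` and the parameters in `theta_toy` are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up data with one scaled feature
toy = pd.DataFrame({'Weight': [-1.0, 0.0, 1.0], 'MPG': [30.0, 20.0, 10.0]})
toy.insert(loc=0, column='Bias', value=1)   # column of ones as the first column

X_toy = toy[['Bias', 'Weight']].to_numpy()
theta_toy = np.array([[20.0], [-10.0]])     # made-up theta_0 = 20 and theta_1 = -10

# Each prediction is theta_0*1 + theta_1*Weight
pred = X_toy @ theta_toy
print(pred.ravel())
```

Without the Bias column, $\theta_0$ would have to be added separately to every prediction.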

####**2.4.6 Implementation of the Gradient Descent Method**

We now define an initial value for the parameter, $\theta$. We also set the first six columns (the Bias column and the five features) as the input data set and the last column as the expected output. Furthermore, let's define the options for the gradient descent, i.e. the learning rate and the number of epochs.<br>
Let's run the function $gradient\_descent$. This function first solves for the gradient based on the current parameter, $\theta$, and then uses this gradient to obtain new values of $\theta$. It repeats this for the specified number of epochs. The final value of $\theta$ after all epochs is assigned to the variable $opt\_theta$ and displayed. Finally, the cost function per epoch is plotted.

In [None]:
theta = np.array([[20],[1],[-1],[2],[3],[0.1]])
X = data[data.columns[0:6]].to_numpy()
y = data[data.columns[6:]].to_numpy()
epochs = 5000
learning_rate = np.array([0.001])

[opt_theta,Jcost] = gradient_descent(X,y,theta,learning_rate,epochs)
print(opt_theta)

In [None]:
plt.plot(Jcost)

####**2.4.7 Expected Result**

We solve for the expected result for $\theta$ using the $LinearRegression$ function from the $sklearn$ module.

In [None]:
model = linear_model.LinearRegression()
model.fit(X,y)
print(model.intercept_)
print(model.coef_)
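
Since $LinearRegression$ fits its own intercept by default, the value of $\theta_0$ is expected to appear in `intercept_`, while the coefficient of the Bias column is expected to be zero. As an alternative cross-check that returns all parameters in one vector, here is a minimal sketch using NumPy's least-squares solver instead of sklearn. The data are randomly generated for illustration only, and the names `X_toy`, `y_toy`, and `theta_ls` are hypothetical.

```python
import numpy as np

# Made-up, noise-free data generated from theta_0 = 3 and theta_1 = 2
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
X_toy = np.column_stack([np.ones(50), x1])   # bias column plus one feature
y_toy = (3.0 + 2.0*x1).reshape(-1, 1)

# Least-squares solution of X.theta = y, comparable to the theta from gradient descent
theta_ls, *_ = np.linalg.lstsq(X_toy, y_toy, rcond=None)
print(theta_ls.ravel())
```

If gradient descent has converged, its final $\theta$ should be close to the least-squares solution computed on the same $X$ and $y$.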

####**2.4.8 Implementation using Machine Learning**

In [None]:
from keras.models import Sequential
from keras.layers import Dense

# create model
model = Sequential()
model.add(Dense(128, activation="tanh", input_dim=6, kernel_initializer="uniform"))
model.add(Dense(64, activation="tanh"))
model.add(Dense(1, activation="linear", kernel_initializer="uniform"))

# Compile model (accuracy is a classification metric, so we track the mean absolute error for this regression task)
model.compile(loss='mse', optimizer='adam', metrics=['mae'])

# Fit the model
history = model.fit(X, y, epochs=100, batch_size=10,  verbose=0)

plt.plot(history.history['loss'])