#  **Simple and Multiple Linear Regression**
Lab Exercises - Week 2

----------

## **Notebook Contents:**
1. Introduction to NumPy functions. (Scope: Lab Exercise)
2. Introduction to Scikit-Learn functions. (Scope: Lab Exercise)
3. Simple Linear Regression using NumPy functions.
4. Simple Linear Regression using Scikit-Learn.
5. Multiple Linear Regression using NumPy functions.
6. Multiple Linear Regression using Scikit-Learn.
7. Exercise questions

----------

### **Python Libraries:**

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import linear_model as lm

### **1. Introduction to NumPy:**

In [None]:
#Declaring an array
arr = np.array([[1,2,3],[4,5,6]])

print("Array dimensions:\n", arr.shape)
print("Array preview:\n", arr)

In [None]:
#Function to generate a matrix with all values as 1 (a matrix of ones, not an identity matrix).
onesMatrix = np.ones((2,2))
print("Matrix of ones:\n",onesMatrix)

#Function to stack arrays horizontally into a single matrix.
x = np.hstack((onesMatrix,arr))
print("Stacking arrays:\n",x)

In [None]:
#Dot Product 
#Calculation: [[7*11+8*13, 7*12+8*14],[9*11+10*13, 9*12+10*14]]
a = np.array([[7,8],[9,10]]) 
b = np.array([[11,12],[13,14]]) 
print(np.dot(a,b))
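
As a small aside, the matrix product above can be written in a few equivalent ways for 2-D arrays, and it is easy to confuse with the element-wise product. The sketch below reuses the same `a` and `b`; the names `via_dot`, `via_operator`, `via_matmul` and `elementwise` are only illustrative.

```python
import numpy as np

a = np.array([[7, 8], [9, 10]])
b = np.array([[11, 12], [13, 14]])

# Three equivalent ways to write the matrix product of two 2-D arrays
via_dot = np.dot(a, b)
via_operator = a @ b
via_matmul = np.matmul(a, b)

# Element-wise product: multiplies matching entries, NOT a matrix product
elementwise = a * b

print("Matrix product:\n", via_operator)
print("Element-wise product:\n", elementwise)
```

The `@` operator tends to be easier to read when several products are chained, as in the regression formula used later in this notebook.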

In [None]:
#Transpose
mat = np.array([[7,8],[9,10],[11,12],[13,14]]) 

print("Original Matrix:\n", mat)
print("Transposed Matrix:\n", np.transpose(mat))

In [None]:
# Function to calculate the inverse of a matrix
mat = np.array([[7,8],[9,10]])
print("Matrix Inverse:\n", np.linalg.inv(mat))
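
One way to sanity-check an inverse is to multiply it back against the original matrix and compare the result with the identity matrix. The sketch below does that for the same `mat`; `mat_inv` and `product` are illustrative names, and the comparison uses a tolerance because floating-point results are rarely exact.

```python
import numpy as np

mat = np.array([[7, 8], [9, 10]])
mat_inv = np.linalg.inv(mat)

# A matrix times its inverse should equal the identity matrix,
# up to floating-point rounding error.
product = mat @ mat_inv
print("Is mat @ mat_inv the identity?", np.allclose(product, np.eye(2)))

# The determinant is non-zero here, which is why the inverse exists.
print("Determinant:", np.linalg.det(mat))
```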

### **2. Introduction to Scikit-Learn:**

Scikit-learn, also known as sklearn, will be used to create your models. It is one of the most popular libraries for modeling the kind of tabular data typically stored in DataFrames.

The steps to building and using a model are:<br>

1. **Define**: Identifying the type of model that will be used. <br>
2. **Fit**: Capture patterns from provided data. This is the heart of modeling.<br>
3. **Predict**: Model created will be used to make predictions.<br>
4. **Evaluate**: Determine how accurate the model's predictions are.<br>

Below is an example of these steps for Linear Regression, covering the import, Define, Fit, and Predict.

**Import model:** <br>
from sklearn import linear_model as lm

**Define model:** <br>
model = lm.LinearRegression()

**Fit model:** <br>
model.fit(X, y)

**Predict:** <br>
model.predict(X.head())
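
The snippets above assume that `X` and `y` already exist. A minimal runnable sketch of all four steps might look like the following. The data is made up for illustration (it follows y = 2x + 1 exactly), and using `model.score`, which returns the R² on the data passed to it, is just one possible choice for the Evaluate step.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model as lm

# Made-up data for illustration only: y = 2*x + 1 with no noise
X = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0, 4.0]})
y = 2 * X["x"] + 1

model = lm.LinearRegression()     # Define
model.fit(X, y)                   # Fit
preds = model.predict(X.head())   # Predict
r2 = model.score(X, y)            # Evaluate: R^2 on the training data

print("Predictions:", preds)
print("R^2:", r2)
```

Because the data here is perfectly linear, the fit is exact; real data would give an R² below 1.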

### **3. Simple Linear Regression using NumPy functions:**

In [None]:
# 'x1' functions as an independent variable and 'y' as a dependent variable 
y = np.array([[1.55],[0.42],[1.29],[0.73],[0.76],[-1.09],[1.41],[-0.32]])
x1 = np.array([[1.13],[-0.73],[0.12],[0.52],[-0.54],[-1.15],[0.20],[-1.09]])

**Input Dataframe:**
![Input Dataframe](https://i.ibb.co/m50j7HW/MLP1.png)

**Regression Coefficients:**
![Regression Coefficients](https://i.ibb.co/19j2rjt/MLP2.png)

In [None]:
#Generating regression coefficients: beta = (X^T X)^-1 X^T y
ones = np.ones((8,1))  # intercept column; named 'ones' to avoid shadowing the built-in id()
x = np.hstack((ones,x1))
beta = np.linalg.inv(x.T @ x) @ x.T @ y
print(beta)
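
The cell above computes $\hat{\beta} = (X^\top X)^{-1} X^\top y$ with an explicit matrix inverse. As a hedged alternative sketch, the same coefficients can be obtained without forming the inverse, either by solving the linear system with `np.linalg.solve` or by calling the least-squares routine `np.linalg.lstsq`. The names `beta_inv`, `beta_solve` and `beta_lstsq` are illustrative and not part of the lab.

```python
import numpy as np

y = np.array([[1.55],[0.42],[1.29],[0.73],[0.76],[-1.09],[1.41],[-0.32]])
x1 = np.array([[1.13],[-0.73],[0.12],[0.52],[-0.54],[-1.15],[0.20],[-1.09]])
X = np.hstack((np.ones((8, 1)), x1))  # column of ones for the intercept

# 1) Normal equation with an explicit inverse
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y

# 2) Solve (X^T X) beta = X^T y directly, without forming the inverse
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# 3) General least-squares solver
beta_lstsq, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(beta_inv, beta_solve), np.allclose(beta_inv, beta_lstsq))
```

On a small, well-conditioned problem like this one all three agree; avoiding the explicit inverse is generally considered the more numerically robust habit.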

In [None]:
#Fitted values from the NumPy coefficients; columns printed: x1, y, predicted y
yp1 = beta[0]+beta[1]*x1
print(np.hstack((x1,y,yp1)))

### **4. Simple Linear Regression using Scikit-Learn:**

In [None]:
#Input Dataframe
d = pd.DataFrame(np.hstack((x1,y)))
d.columns = ["x1","y"]
print(d)

In [None]:
#Linear Regression - model fitting
model = lm.LinearRegression()
results = model.fit(x1,y)
print(model.intercept_, model.coef_)

In [None]:
#Result: Scikit-Learn
yp2 = model.predict(x1)
print(yp2)
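
Since sections 3 and 4 fit the same model to the same data, the NumPy and Scikit-Learn results should agree. The sketch below checks this numerically; it rebuilds the data so that it runs on its own, and `sk_beta` is an illustrative name for the Scikit-Learn intercept and slope stacked into one column.

```python
import numpy as np
from sklearn import linear_model as lm

y = np.array([[1.55],[0.42],[1.29],[0.73],[0.76],[-1.09],[1.41],[-0.32]])
x1 = np.array([[1.13],[-0.73],[0.12],[0.52],[-0.54],[-1.15],[0.20],[-1.09]])

# NumPy: solve the normal equations on a design matrix with an intercept column
X = np.hstack((np.ones((8, 1)), x1))
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Scikit-Learn: the intercept is handled internally, so only x1 is passed
model = lm.LinearRegression().fit(x1, y)
sk_beta = np.vstack((model.intercept_.reshape(-1, 1),
                     model.coef_.reshape(-1, 1)))

print("Coefficients agree:", np.allclose(beta, sk_beta))
print("Predictions agree:", np.allclose(X @ beta, model.predict(x1)))
```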

In [None]:
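#Linear Regression representation: observed points with the fitted line
plt.scatter(x1,y, label="Observed")
plt.plot(x1,yp2, color="red", label="Fitted line")
plt.xlabel("x1")
plt.ylabel("y")
plt.legend()
plt.show()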

In [None]:
#Prediction for new values
#The model was fitted on a NumPy array, so new inputs are passed as a NumPy column vector too.
x1new = np.array([[1],[0],[-0.12],[0.52]])
yp2new = model.predict(x1new)
print(yp2new)

### **5. Multiple Linear Regression using NumPy functions:**

In [None]:
# Input Dataframe
y = np.array([[1.55],[0.42],[1.29],[0.73],[0.76],[-1.09],[1.41],[-0.32]])
x1 = np.array([[1.13],[-0.73],[0.12],[0.52],[-0.54],[-1.15],[0.20],[-1.09]])
x2 = np.array([[1],[0],[1],[1],[0],[1],[0],[1]])

In [None]:
#Design matrix: a column of ones (intercept) followed by x1 and x2
ones = np.ones((8,1))
x = np.hstack((ones,x1,x2))
print(x)

In [None]:
# Calculating regression coefficients 
beta=(np.dot(np.dot(np.linalg.inv(np.dot(x.transpose(),x)),x.transpose()),y))
print(beta)

In [None]:
#Fitted values; columns printed: intercept column, x1, x2, y, predicted y
yp1 = beta[0]+beta[1]*x1+beta[2]*x2
print(np.hstack((x,y,yp1)))
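
To get a rough sense of how well these fitted values match `y`, one option is to compute the residuals and the coefficient of determination (R²) by hand. This is only a sketch of one possible evaluation; the names `fitted`, `residuals`, `sse`, `sst` and `r2` are illustrative, and the data is repeated so the block runs on its own.

```python
import numpy as np

y = np.array([[1.55],[0.42],[1.29],[0.73],[0.76],[-1.09],[1.41],[-0.32]])
x1 = np.array([[1.13],[-0.73],[0.12],[0.52],[-0.54],[-1.15],[0.20],[-1.09]])
x2 = np.array([[1],[0],[1],[1],[0],[1],[0],[1]])

X = np.hstack((np.ones((8, 1)), x1, x2))
beta = np.linalg.solve(X.T @ X, X.T @ y)

fitted = X @ beta
residuals = y - fitted

sse = float(np.sum(residuals ** 2))        # sum of squared errors
sst = float(np.sum((y - y.mean()) ** 2))   # total sum of squares
r2 = 1 - sse / sst                         # share of variance explained

print("Residuals:\n", residuals)
print("R^2:", r2)
```

With an intercept in the model, the residuals sum to (numerically) zero, which makes a quick check that the coefficients were computed correctly.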

### **6. Multiple Linear Regression using Scikit-Learn:**

In [None]:
#Input Dataframe
d = pd.DataFrame(np.hstack((x1,x2,y)))
d.columns = ["x1","x2","y"]
print(d)

In [None]:
#Multiple Linear Regression - Model Fitting
inputDF = d[["x1","x2"]]
model = lm.LinearRegression()
results = model.fit(inputDF,y)

print(model.intercept_, model.coef_)

In [None]:
#Result: Scikit-Learn
yp2 = model.predict(inputDF)
print(yp2)

In [None]:
#Prediction for new values
x1new = pd.DataFrame({"x1":[1,0,-0.12,0.52],"x2":[1,-1,2,0.77]})
yp2new = model.predict(x1new)
print(np.hstack((x1new,yp2new)))

### **7. Exercise Questions:**

Q1. Using **survey.csv**, build a simple linear regression model with "Height" as the dependent variable and "WrHnd" as the independent variable.

In [None]:
#Load the survey data and keep only the two columns of interest
d=pd.read_csv("../input/survey.csv")
d=d.rename(index=str,columns={"Wr.Hnd":"WrHnd"})
d = d[["WrHnd","Height"]]
#Checking for Null/NaN values
print(d.isnull().values.any())
print(d.isnull().sum())

In [None]:
#Dropping rows with Null/NaN values, then re-checking
d = d.dropna()
print("Check for NaN/null values:\n",d.isnull().values.any())
print("Number of NaN/null values:\n",d.isnull().sum())
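
For readers without access to survey.csv, the effect of `dropna` can be seen on a tiny stand-in table. The rows below are made up for illustration and are not taken from the survey data; `df` and `clean` are illustrative names.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; not taken from survey.csv
df = pd.DataFrame({"WrHnd": [18.5, np.nan, 17.0, 19.2],
                   "Height": [173.0, 165.0, np.nan, 180.0]})

print(df.isnull().sum())   # number of missing values per column

# dropna() keeps only the rows in which every column has a value
clean = df.dropna()
print(len(df), "rows before,", len(clean), "rows after")
```

Dropping rows is the simplest option and is what this lab uses; it does discard any row where even one of the two columns is missing.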

In [None]:
# Simple Linear Regression 
inputDF = d[["WrHnd"]]
outcomeDF = d[["Height"]]
model = lm.LinearRegression()
results = model.fit(inputDF,outcomeDF)

print(model.intercept_, model.coef_)

Q2. Using **clock.csv**, build a multiple linear regression model with "Price" as the dependent variable and "Bidders" and "Age" as the independent variables.

In [None]:
d = pd.read_csv("../input/clock.csv")
print(d.head())
print("Check for NaN/null values:\n",d.isnull().values.any())
print("Number of NaN/null values:\n",d.isnull().sum())

In [None]:
#Multiple Linear Regression
inputDF = d[["Bidders","Age"]]
outputDF = d[["Price"]]

model = lm.LinearRegression()
results = model.fit(inputDF,outputDF)

print(model.intercept_, model.coef_)

**Voilà! This is the end of the lab session for week 2.** <br>
Do not forget to commit your notebook and set the access to private. Share the notebook with Prof. Karim (Kaggle id: karimshaikh) and Manish Varma (Kaggle id: manishvarma).