# Multiple Linear Regression

## Importing the libraries

In [9]:
# np, plt and pd are aliases (just like in SQL) that cut down on typing
# numpy lets you work with arrays
import numpy as np
# matplotlib.pyplot handles chart plotting
import matplotlib.pyplot as plt
# pandas lets us import the dataset and also create the matrix of features
# and the dependent variable vector
import pandas as pd

## Importing the dataset

In [10]:
dataset = pd.read_csv('50_Startups.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
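
To make the slicing concrete, here is a minimal sketch on a made-up three-row frame. The column names and values are invented for illustration and are not taken from `50_Startups.csv`; only the two `iloc` expressions match the cell above.

```python
import pandas as pd

# Hypothetical miniature dataset: three feature columns, target in the last column
toy = pd.DataFrame({
    "spend_a": [10.0, 20.0, 30.0],
    "spend_b": [1.0, 2.0, 3.0],
    "state": ["New York", "California", "Florida"],
    "profit": [100.0, 200.0, 300.0],
})

X_toy = toy.iloc[:, :-1].values  # every row, every column except the last
y_toy = toy.iloc[:, -1].values   # every row, only the last column

print(X_toy.shape)  # (3, 3)
print(y_toy.shape)  # (3,)
```

Because one feature column holds text, `X_toy` comes back as an object array, which is why the categorical column has to be encoded before training.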

## Encoding categorical data

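Feature scaling is not needed for multiple linear regression: each coefficient adjusts to the scale of its feature, so rescaling a column changes that column's coefficient but not the predictions.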
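
As a rough check of the scaling remark, the sketch below fits the same model twice on synthetic data (all values are invented), once with the first feature divided by 1000. The names `X_demo`, `y_demo` and `X_scaled` are placeholders, not part of the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: two features and a noisy linear target (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(30, 2))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 5.0 + rng.normal(0, 1, size=30)

# Same data, but the first feature is expressed in units 1000 times larger
X_scaled = X_demo.copy()
X_scaled[:, 0] /= 1000.0

model_raw = LinearRegression().fit(X_demo, y_demo)
model_scaled = LinearRegression().fit(X_scaled, y_demo)

# The predictions agree; only the first coefficient changes, by the scaling factor
same_predictions = np.allclose(model_raw.predict(X_demo), model_scaled.predict(X_scaled))
coef_ratio = model_scaled.coef_[0] / model_raw.coef_[0]
print(same_predictions, round(coef_ratio))
```

This holds for plain least squares; models with a penalty on the coefficients, such as ridge or lasso, do depend on feature scale.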

In [11]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# the 4th column (index 3) holds the categorical variable and needs to be one-hot encoded
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [3])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
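
The effect of the transformer is easier to see on a tiny array. This sketch uses invented rows with the categorical value in column 1 instead of column 3, and it sets `sparse_threshold=0` (an addition, not in the cell above) so the result is always a dense array.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical rows: one numeric column followed by one categorical column
X_toy = np.array([
    [100.0, "New York"],
    [200.0, "California"],
    [300.0, "Florida"],
    [400.0, "New York"],
], dtype=object)

ct_toy = ColumnTransformer(
    transformers=[("encoder", OneHotEncoder(), [1])],
    remainder="passthrough",
    sparse_threshold=0,  # assumption for this sketch: force dense output
)
encoded = np.array(ct_toy.fit_transform(X_toy))

print(encoded.shape)  # (4, 4): three dummy columns plus the numeric column
print(encoded[0])     # dummy columns come first, in alphabetical category order
```

Note that the encoded columns are placed before the passthrough columns, so the column order of `X` changes after this step.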

## Splitting the dataset into the Training set and Test set

In [4]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
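
A quick sketch of what `test_size=0.2` and `random_state=0` do, on a synthetic 50-row array standing in for the dataset (the values are placeholders): 20% of the rows go to the test set, and a fixed seed reproduces the same split each time.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 50 rows, 2 features
X_demo = np.arange(100).reshape(50, 2)
y_demo = np.arange(50)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
print(X_tr.shape, X_te.shape)  # (40, 2) (10, 2)

# Repeating the call with the same random_state gives the same test rows
_, X_te_again, _, _ = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
same_split = np.array_equal(X_te, X_te_again)
print(same_split)  # True
```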

## Training the Multiple Linear Regression model on the Training set

The class that we're about to call to build our Multiple Linear Regression model fits an ordinary least squares model on all of the features we give it, in order to predict the dependent variable (profit). It does not select features for us and it does not compute P values; if you want to keep only the statistically significant features (the ones with the lowest P values), that has to be done as a separate step, for example with backward elimination.

It will be trained to learn the relationship between all your features and the profit, which is your dependent variable. We also don't have to drop a dummy column by hand: the redundant dummy variable (the dummy variable trap) does not change the predictions, although it does make the individual dummy coefficients harder to interpret.
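
A small sketch on synthetic data (all values invented) of how `LinearRegression` treats the columns it is given: it keeps one coefficient per column, and the predictions come out the same whether or not one dummy column is dropped. The variable names here are placeholders.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: a 3-level category (one-hot encoded) plus one numeric feature
rng = np.random.default_rng(42)
n = 60
category = rng.integers(0, 3, size=n)
dummies = np.eye(3)[category]
spend = rng.uniform(0, 100, size=(n, 1))
profit = 2.0 * spend[:, 0] + np.array([10.0, 20.0, 30.0])[category] + rng.normal(0, 1, size=n)

X_all = np.hstack([dummies, spend])          # all three dummy columns kept
X_drop = np.hstack([dummies[:, 1:], spend])  # first dummy column dropped

model_all = LinearRegression().fit(X_all, profit)
model_drop = LinearRegression().fit(X_drop, profit)

# One coefficient per input column: nothing is removed automatically
n_coefs = model_all.coef_.shape[0]
# The redundant dummy column does not change the fitted predictions
same_predictions = np.allclose(model_all.predict(X_all), model_drop.predict(X_drop))
print(n_coefs, same_predictions)
```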

In [5]:
from sklearn.linear_model import LinearRegression
# LinearRegression is a CLASS; a class bundles instructions with tools called METHODS that perform actions,
# like training a model on a training set or predicting results on the test set.
rg = LinearRegression()  # no parameters needed, the defaults are fine
# to call a METHOD we take the object, add a . and then the METHOD name
# the fit method trains the multiple linear regression model on the training set
# X_train contains the independent variables (features) of the training set and y_train the dependent variable vector
rg.fit(X_train, y_train)


LinearRegression()

## Predicting the Test set results

In [6]:
# first we put the predictions for the test set into a new variable (y_pred)
y_pred = rg.predict(X_test)

# display numerical values with only 2 decimal places
np.set_printoptions(precision=2)

# Display the predicted profits and the real profits next to each other with np.concatenate,
# a NumPy function that joins arrays either vertically or horizontally; the arrays must
# match in every dimension except the one being joined.
# First argument: the tuple of arrays to concatenate, here y_pred and y_test.
# .reshape is an array method; reshape(len(y_pred), 1) turns the flat vector into a single
# column with len(y_pred) rows, and the same is done for y_test.
# Second argument: the axis, 0 for a vertical concatenation (stacking rows) and
# 1 for a horizontal concatenation (side by side).
print(np.concatenate((y_pred.reshape(len(y_pred), 1), y_test.reshape(len(y_test), 1)), 1))


[[103015.2  103282.38]
 [132582.28 144259.4 ]
 [132447.74 146121.95]
 [ 71976.1   77798.83]
 [178537.48 191050.39]
 [116161.24 105008.31]
 [ 67851.69  81229.06]
 [ 98791.73  97483.56]
 [113969.44 110352.25]
 [167921.07 166187.94]]
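
In the printout above, the left column holds the predicted profits and the right column the real ones. The reshape-and-concatenate step can be tried in isolation on two short made-up vectors; `np.column_stack` is shown as one possible shortcut, not something the notebook uses.

```python
import numpy as np

# Made-up values standing in for y_pred and y_test
predicted = np.array([1.5, 2.5, 3.5])
actual = np.array([1.0, 3.0, 3.0])

# Turn each flat vector into a single column
pred_col = predicted.reshape(len(predicted), 1)
actual_col = actual.reshape(len(actual), 1)

side_by_side = np.concatenate((pred_col, actual_col), 1)  # axis 1: horizontal
stacked = np.concatenate((pred_col, actual_col), 0)       # axis 0: vertical

print(side_by_side.shape)  # (3, 2)
print(stacked.shape)       # (6, 1)

# column_stack builds the same side-by-side array straight from the 1-D vectors
alt = np.column_stack((predicted, actual))
print(np.array_equal(alt, side_by_side))  # True
```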
