# Multiple Linear Regression

### Multiple Linear Regression also follows a linear equation

Unlike Simple Linear Regression, it has more than one independent variable.

$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$
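To make the equation concrete, here is a minimal sketch that evaluates it for a single observation. The intercept, coefficients and feature values are made up for illustration and are not taken from the startup data.

```python
# Hypothetical intercept b0 and coefficients b1..b3, chosen only for illustration.
b0 = 10.0
b = [2.0, 0.5, -1.0]

# One observation with three independent variables x1..x3.
x_obs = [3.0, 4.0, 5.0]

# y = b0 + b1*x1 + b2*x2 + b3*x3
y_hat = b0 + sum(bi * xi for bi, xi in zip(b, x_obs))
print(y_hat)  # 13.0
```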

Our dataset describes 50 anonymous startups: three spending figures (R&D, Administration and Marketing), the state each one is located in, and its profit.

## Importing the libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## Importing the dataset

In [2]:
dataset = pd.read_csv('50_Startups.csv')
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

## Encoding categorical data

In [3]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers = [('encoder', OneHotEncoder(), [3])], remainder = 'passthrough')
x = np.array(ct.fit_transform(x))
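The sketch below shows what this encoding step does to the column layout, using three made-up rows rather than the real CSV. It assumes the same column order as the notebook (three spend columns, then the state at index 3). The names `toy_x`, `toy_ct` and `new_row` are hypothetical, and `sparse_threshold=0` is added here only to guarantee a dense result on such a small array.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Made-up rows in the notebook's feature order: three spends, then the state.
toy_x = np.array([
    [100.0, 50.0, 30.0, 'New York'],
    [200.0, 60.0, 40.0, 'California'],
    [300.0, 70.0, 50.0, 'Florida'],
], dtype=object)

# sparse_threshold=0 forces a dense array regardless of how many zeros there are.
toy_ct = ColumnTransformer(
    transformers=[('encoder', OneHotEncoder(), [3])],
    remainder='passthrough',
    sparse_threshold=0,
)
toy_encoded = np.array(toy_ct.fit_transform(toy_x), dtype=float)

# The encoder sorts its categories, so the dummy columns come out as
# California, Florida, New York, followed by the passthrough spend columns.
categories = list(toy_ct.named_transformers_['encoder'].categories_[0])
print(categories)
print(toy_encoded)

# Encoding a new row with the fitted transformer avoids writing dummies by hand.
new_row = np.array([[160000, 130000, 300000, 'California']], dtype=object)
new_encoded = np.array(toy_ct.transform(new_row), dtype=float)
print(new_encoded)
```

Two things are worth noticing: the one-hot columns are moved to the front of the array, and their order follows the sorted category names rather than the order in which the states appear in the data.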

## Splitting the dataset into the Training set and Test set

In [4]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2, random_state = 0)
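As a quick check of what `test_size = 0.2` and `random_state = 0` do, the following sketch splits 50 rows of stand-in data (not the startup data) twice with the same settings. The array names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 50 rows, matching the number of startups in the dataset.
demo_x = np.arange(100).reshape(50, 2)
demo_y = np.arange(50)

a_train, a_test, ya_train, ya_test = train_test_split(
    demo_x, demo_y, test_size=0.2, random_state=0)
b_train, b_test, yb_train, yb_test = train_test_split(
    demo_x, demo_y, test_size=0.2, random_state=0)

# 20% of 50 rows go to the test set, the remaining 40 to the training set.
print(a_train.shape, a_test.shape)

# The same random_state gives the same split on every run.
print(np.array_equal(a_test, b_test))
```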

## Training the Multiple Linear Regression model on the Training set

In [5]:
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
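`LinearRegression` estimates the intercept and coefficients by ordinary least squares. As a rough illustration of what that means, the sketch below solves a least-squares problem directly with NumPy on synthetic, noise-free data. The data and the "true" coefficients are invented for the example, so the recovered values can be compared with the ones used to generate it. It is not a description of scikit-learn's internals.

```python
import numpy as np

# Synthetic features and a target built from known (made-up) coefficients.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 3))
true_b0 = 4.0
true_b = np.array([1.5, -2.0, 0.5])
target = true_b0 + features @ true_b

# Prepend a column of ones so the intercept is estimated with the coefficients.
design = np.column_stack([np.ones(len(features)), features])
solution, *_ = np.linalg.lstsq(design, target, rcond=None)
est_b0, est_b = solution[0], solution[1:]

print(est_b0, est_b)
```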

## Predicting the Test set results

In [6]:
y_pred = regressor.predict(x_test)
np.set_printoptions(precision=2)
print(np.concatenate((y_pred.reshape(len(y_pred), 1), y_test.reshape(len(y_test), 1)), 1))

[[103015.2  103282.38]
 [132582.28 144259.4 ]
 [132447.74 146121.95]
 [ 71976.1   77798.83]
 [178537.48 191050.39]
 [116161.24 105008.31]
 [ 67851.69  81229.06]
 [ 98791.73  97483.56]
 [113969.44 110352.25]
 [167921.07 166187.94]]
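The side-by-side printout can be summarised with a couple of error measures. The sketch below computes the coefficient of determination (R²) and the mean absolute error by hand from the ten pairs printed above. In the notebook itself, `sklearn.metrics.r2_score(y_test, y_pred)` would be the usual shortcut.

```python
# (predicted, actual) pairs copied from the printed output.
pairs = [
    (103015.20, 103282.38),
    (132582.28, 144259.40),
    (132447.74, 146121.95),
    (71976.10, 77798.83),
    (178537.48, 191050.39),
    (116161.24, 105008.31),
    (67851.69, 81229.06),
    (98791.73, 97483.56),
    (113969.44, 110352.25),
    (167921.07, 166187.94),
]

actual = [a for _, a in pairs]
mean_actual = sum(actual) / len(actual)

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = sum((a - p) ** 2 for p, a in pairs)
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r2 = 1 - ss_res / ss_tot

# Mean absolute error, in the same units as profit.
mae = sum(abs(a - p) for p, a in pairs) / len(pairs)

print(f'R^2 = {r2:.3f}, MAE = {mae:.2f}')
```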


## Making a single prediction
###### (for example the profit of a startup with R&D Spend = 160000, Administration Spend = 130000, Marketing Spend = 300000 and State = 'California')

In [7]:
print('\nThe profit if state is California, Spend for R&D = 160000, Administration = 130000, Marketing = 300000 is')
print(regressor.predict([[1, 0, 0, 160000, 130000, 300000]])[0])


The profit if state is California, Spend for R&D = 160000, Administration = 130000, Marketing = 300000 is
181566.92389385228


## Getting the final linear regression equation with the values of the coefficients

In [8]:
co = regressor.coef_
b = regressor.intercept_

print('\nEquation for this regression is')
print(f'Profit = {co[0]} x State 1 + {co[1]} x State 2 + {co[2]} x State 3 + {co[3]} x R&D + {co[4]} x Administration + {co[5]} x Marketing + {b}')


Equation for this regression is
Profit = 86.63836918478817 x State 1 + -872.6457908797435 x State 2 + 786.0074216944598 x State 3 + 0.7734671927326701 x R&D + 0.03288459753628347 x Administration + 0.03661002586393644 x Marketing + 42467.52924854249
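As a sanity check, the printed equation should reproduce the single prediction made earlier. The sketch below plugs the California example (dummies 1, 0, 0; R&D = 160000, Administration = 130000, Marketing = 300000) into the coefficients copied from the output above and compares the result with the value returned by `regressor.predict`.

```python
# Coefficients and intercept copied from the printed equation.
coefficients = [
    86.63836918478817,     # State 1
    -872.6457908797435,    # State 2
    786.0074216944598,     # State 3
    0.7734671927326701,    # R&D
    0.03288459753628347,   # Administration
    0.03661002586393644,   # Marketing
]
intercept = 42467.52924854249

# Same feature vector that was passed to regressor.predict in the notebook.
startup = [1, 0, 0, 160000, 130000, 300000]

profit = intercept + sum(c * v for c, v in zip(coefficients, startup))
print(profit)

# Value printed by regressor.predict for the same startup.
predicted_by_model = 181566.92389385228
print(abs(profit - predicted_by_model) < 1e-6)
```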
