## Assumptions in Multiple Linear Regression
1. Linearity: The relationship between the predictors and the response is linear.
2. Independence: Observations are independent of each other.
3. Homoscedasticity: The residuals ($Y - \hat{Y}$) exhibit constant variance at all levels of the predictors.
4. Normal Distribution of Errors: The residuals of the model are normally distributed.
5. No multicollinearity: The independent variables should not be too highly correlated with each other.

Violations of these assumptions may lead to biased or inefficient estimates of the regression parameters and to unreliable predictions.
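
One of the assumptions above, the absence of multicollinearity, is commonly checked with the variance inflation factor (VIF). The following is a minimal sketch under stated assumptions: the helper `vif` and the synthetic predictors `x1`, `x2`, `x3` are hypothetical and exist only for illustration; they are not part of this notebook or of the Cars data.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        # R^2 of that auxiliary regression; VIF = 1 / (1 - R^2).
        r2 = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic predictors (assumed for illustration only).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)  # nearly a copy of x1
x3 = rng.normal(size=200)                  # unrelated to x1 and x2

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

A VIF close to 1 suggests that a predictor is nearly uncorrelated with the others. Values above roughly 5 to 10 are often treated as a warning sign, although the cut-off is a rule of thumb rather than a strict test.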
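The general formula for multiple linear regression is:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \epsilon$$

where $Y$ is the response, $X_1, \dots, X_p$ are the predictors, $\beta_0$ is the intercept, $\beta_1, \dots, \beta_p$ are the regression coefficients, and $\epsilon$ is the error term.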
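
As a minimal sketch of how the coefficients of such a model can be estimated, the example below fits a two-predictor regression by ordinary least squares using NumPy. The data are synthetic, and the names `true_beta`, `beta_hat` and `residuals` are assumptions made for illustration; this is not the fitting code used later in the notebook.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Two synthetic predictors and known coefficients (intercept, b1, b2).
X = rng.normal(size=(n, 2))
true_beta = np.array([3.0, 1.5, -2.0])

# Response generated as y = b0 + b1*x1 + b2*x2 + noise.
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.1, size=n)

# Add a column of ones so that the first coefficient is the intercept.
A = np.column_stack([np.ones(n), X])

# Ordinary least squares: minimise the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ beta_hat

print(beta_hat)
```

With low noise and 500 observations, the estimates in `beta_hat` should land close to `true_beta`. The residuals average to zero because the model includes an intercept.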

In [4]:
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
from statsmodels.graphics.regressionplots import influence_plot
import numpy as np

In [6]:
cars = pd.read_csv("Cars.csv")
cars.head()

|   | HP | MPG | VOL | SP | WT |
|---|----|-----|-----|----|----|
| 0 | 49 | 53.700681 | 89 | 104.185353 | 28.762059 |
| 1 | 55 | 50.013401 | 92 | 105.461264 | 30.466833 |
| 2 | 55 | 50.013401 | 92 | 105.461264 | 30.193597 |
| 3 | 70 | 45.696322 | 92 | 113.461264 | 30.632114 |
| 4 | 53 | 50.504232 | 92 | 104.461264 | 29.889149 |


In [14]:
# Reorder the columns so that the response (MPG) comes last.
cars = pd.DataFrame(cars, columns=["HP","VOL","SP","WT","MPG"])
cars.head()

|   | HP | VOL | SP | WT | MPG |
|---|----|-----|----|----|-----|
| 0 | 49 | 89 | 104.185353 | 28.762059 | 53.700681 |
| 1 | 55 | 92 | 105.461264 | 30.466833 | 50.013401 |
| 2 | 55 | 92 | 105.461264 | 30.193597 | 50.013401 |
| 3 | 70 | 92 | 113.461264 | 30.632114 | 45.696322 |
| 4 | 53 | 92 | 104.461264 | 29.889149 | 50.504232 |


### Description of columns
- MPG: Mileage of the car (miles per gallon) (this is the Y column to be predicted)
- HP: Horsepower of the car (X1 column)
- VOL: Volume of the car (size) (X2 column)
- SP: Top speed of the car (miles per hour) (X3 column)
- WT: Weight of the car (pounds) (X4 column)
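
To show how these columns map onto a regression, here is a hedged sketch using the `statsmodels` formula interface that the notebook imports as `smf`. It runs on a synthetic frame named `demo`, a hypothetical stand-in so the example is self-contained. The coefficients used to generate `MPG` are invented for illustration and say nothing about the real Cars data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200

# Synthetic stand-in with the same column names as the Cars data (assumed values).
demo = pd.DataFrame({
    "HP": rng.uniform(50, 300, n),
    "VOL": rng.uniform(50, 160, n),
    "SP": rng.uniform(100, 170, n),
    "WT": rng.uniform(15, 55, n),
})

# Invented relationship, used only so the fit has something to recover.
demo["MPG"] = (
    60
    - 0.1 * demo["HP"]
    - 0.05 * demo["VOL"]
    + 0.02 * demo["SP"]
    - 0.1 * demo["WT"]
    + rng.normal(scale=1.0, size=n)
)

# MPG is the response (Y); HP, VOL, SP and WT are the predictors (X1..X4).
model = smf.ols("MPG ~ HP + VOL + SP + WT", data=demo).fit()
print(model.params)
```

The formula string names the response on the left of `~` and the predictors on the right. An intercept is included by default, so `model.params` holds five values.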

In [17]:
cars.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 81 entries, 0 to 80
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   HP      81 non-null     int64  
 1   VOL     81 non-null     int64  
 2   SP      81 non-null     float64
 3   WT      81 non-null     float64
 4   MPG     81 non-null     float64
dtypes: float64(3), int64(2)
memory usage: 3.3 KB


In [19]:
cars.isna().sum()

HP     0
VOL    0
SP     0
WT     0
MPG    0
dtype: int64

### Observations
- There are no missing values
- Ther e