# **Mileage Prediction - Regression Analysis**

This project develops an interpretable regression model that estimates vehicle mileage (miles per gallon, `mpg`) from engine displacement, horsepower, weight, and acceleration.

The dataset was taken from the StatLib library, which is maintained at Carnegie Mellon University. It was used in the 1983 American Statistical Association Exposition.

# **Import Libraries**

In [None]:
import pandas as pd

In [None]:
import numpy as np

In [None]:
import matplotlib.pyplot as plt

In [None]:
import seaborn as sns

# **Import Data**

In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/YBI-Foundation/Dataset/main/MPG.csv')

In [None]:
df.head()

In [None]:
df.nunique()

# **Data Preprocessing**

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
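# Restrict to numeric columns; text columns such as the car name cannot be correlated
df.corr(numeric_only=True)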

# **Remove Missing Values**

In [None]:
df = df.dropna()


In [None]:
df.info()
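
Before dropping rows, it can help to see how many values are missing and in which columns. The following is a minimal sketch on a made-up four-row frame (the `toy` DataFrame and its values are illustrative assumptions, not rows from the MPG data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing horsepower value
toy = pd.DataFrame({
    "mpg": [18.0, 25.0, 30.0, 22.0],
    "horsepower": [130.0, np.nan, 70.0, 95.0],
})

# Count missing values per column
missing_per_column = toy.isna().sum()

# dropna() removes every row that contains at least one missing value
cleaned = toy.dropna()

print(missing_per_column)
print(len(toy), "rows before,", len(cleaned), "rows after")
```

Running the same two calls on `df` shows how many rows `dropna()` discards.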

# **Data Visualization**

In [None]:
sns.pairplot(df, x_vars = ['displacement', 'horsepower', 'weight', 'acceleration', 'mpg'], y_vars = 'mpg')

In [None]:
sns.regplot(x = 'displacement', y = 'mpg', data = df)

# **Define Target Variable y and Feature X**

In [None]:
df.columns

In [None]:
y = df['mpg']

In [None]:
y.shape

In [None]:
x = df[['displacement', 'horsepower', 'weight', 'acceleration']]

In [None]:
x.shape

# **Scaling Data**

In [None]:
from sklearn.preprocessing import StandardScaler

In [None]:
ss = StandardScaler()

In [None]:
x = ss.fit_transform(x)

In [None]:
pd.DataFrame(x).describe()
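
As a rough illustration of what the scaling step does to each column, the sketch below standardizes a small made-up list by hand. The `standardize` helper and the sample weights are assumptions for illustration; `StandardScaler` applies the same formula column by column, using the population standard deviation.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)  # population std (ddof=0), matching StandardScaler
    return [(v - mu) / sigma for v in values]

# Made-up vehicle weights, not taken from the dataset
weights = [2000.0, 3000.0, 4000.0]
z = standardize(weights)

print(z)
print(fmean(z), pstdev(z))
```

After scaling, every column has mean 0 and standard deviation 1 (up to floating-point rounding), which is what the `describe()` output above should show.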

# **Train Test Split Data**

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
xtrain, xtest, ytrain, ytest = train_test_split(x, y, train_size = 0.7)

In [None]:
xtrain.shape, xtest.shape, ytrain.shape, ytest.shape
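
One possible variation, sketched below under stated assumptions: split first and let a pipeline fit the scaler on the training rows only, so that no statistics from the test rows reach the model. The data here is synthetic (`X_demo`, `y_demo` are made up, not the MPG dataset), and `random_state=0` is an arbitrary choice that makes the split repeatable.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 4 features, linear target plus noise
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
true_coef = np.array([-1.0, -2.0, -4.0, 0.5])
y_demo = 23.0 + X_demo @ true_coef + rng.normal(scale=0.5, size=200)

# random_state fixes the shuffle so the same rows land in each set on every run
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, train_size=0.7, random_state=0
)

# The pipeline fits the scaler on X_tr only and reuses those statistics on X_te
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_tr, y_tr)

print(X_tr.shape, X_te.shape)
print(model.score(X_te, y_te))  # R^2 on the held-out rows
```

With a fixed `random_state`, the intercept and coefficients are also the same on every run, which makes reported numbers easier to reproduce.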

# **Modeling**

### **Linear Regression Model**

In [None]:
from sklearn.linear_model import LinearRegression

In [None]:
lr = LinearRegression()

### **Train or Fit model**

In [None]:
lr.fit(xtrain, ytrain)

In [None]:
lr.intercept_

In [None]:
lr.coef_

Mileage = 23.6 - 0.13 Displacement - 1.42 Horsepower - 5.23 Weight + 0.22 Acceleration + error

These values come from one run; they change with the random train/test split.
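
Because the features were standardized, each coefficient is the change in predicted mpg for a one-standard-deviation change in that feature. The sketch below plugs in the values quoted above, read here as -0.13, -1.42, -5.23, and 0.22 (the signs are an assumption about how the equation is meant to be read). The `predict_mpg` helper is hypothetical.

```python
# Intercept and coefficients as quoted in the text; signs are an assumption
intercept = 23.6
coef = {
    "displacement": -0.13,
    "horsepower": -1.42,
    "weight": -5.23,
    "acceleration": 0.22,
}

def predict_mpg(z):
    """z maps feature name -> standardized value (0 means dataset average)."""
    return intercept + sum(coef[name] * z.get(name, 0.0) for name in coef)

# A car that is average on every feature gets the intercept
average_car = predict_mpg({})

# A car one standard deviation heavier than average, otherwise average
heavy_car = predict_mpg({"weight": 1.0})

print(average_car, heavy_car)
```

Under this reading, weight has by far the largest effect: one standard deviation of extra weight lowers the prediction by about 5.23 mpg.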

# **Prediction**

### **Predict Test Data**

In [None]:
y_pred = lr.predict(xtest)

In [None]:
y_pred

# **Model Accuracy**

In [None]:
from sklearn.metrics import mean_absolute_error, r2_score, mean_absolute_percentage_error

In [None]:
mean_absolute_error(ytest, y_pred)

In [None]:
mean_absolute_percentage_error(ytest, y_pred)

In [None]:
r2_score(ytest, y_pred)
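
To make the three scores concrete, here is a hand-rolled sketch of the formulas on four made-up values (the helper functions and numbers are illustrative, not the library implementation). Note that scikit-learn's `mean_absolute_percentage_error` returns a fraction, so 0.1 means 10%.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average size of the miss, in mpg."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction (not multiplied by 100)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """R^2: 1 - (residual sum of squares / total sum of squares)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up actual and predicted mileage values
actual = [20.0, 30.0, 25.0, 15.0]
predicted = [22.0, 27.0, 25.0, 18.0]

print(mae(actual, predicted))   # 2.0
print(mape(actual, predicted))  # approximately 0.1
print(r2(actual, predicted))    # approximately 0.824
```

Lower is better for the two error measures, while R² is better the closer it gets to 1.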

# **Polynomial Regression**

In [None]:
from sklearn.preprocessing import PolynomialFeatures

In [None]:
poly = PolynomialFeatures(degree = 2, interaction_only = True, include_bias = False)

In [None]:
x_train2 = poly.fit_transform(xtrain)

In [None]:
# Reuse the transformer fitted on the training data; do not refit on the test set
x_test2 = poly.transform(xtest)
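
With `degree = 2`, `interaction_only = True`, and `include_bias = False`, the transformer keeps the four original columns and adds one product for each pair of features. It adds no squared terms. The sketch below reproduces that expansion by hand for a single made-up row; `interaction_features` is a hypothetical helper, not part of scikit-learn.

```python
from itertools import combinations

names = ["displacement", "horsepower", "weight", "acceleration"]

def interaction_features(row):
    """Original values followed by every pairwise product (no squares)."""
    return list(row) + [a * b for a, b in combinations(row, 2)]

# Column labels for the expanded feature set
expanded_names = names + [f"{a} {b}" for a, b in combinations(names, 2)]

# Made-up row, chosen so the products are easy to check by eye
row = [1.0, 2.0, 3.0, 4.0]
expanded = interaction_features(row)

for label, value in zip(expanded_names, expanded):
    print(f"{label}: {value}")
```

Four features give 4 + 6 = 10 columns, so the second model has ten coefficients instead of four.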

In [None]:
lr.fit(x_train2, ytrain)

In [None]:
lr.intercept_

In [None]:
lr.coef_

In [None]:
y_pred_poly = lr.predict(x_test2)

# **Model Accuracy**

In [None]:
mean_absolute_error(ytest, y_pred_poly)

In [None]:
mean_absolute_percentage_error(ytest, y_pred_poly)

In [None]:
r2_score(ytest, y_pred_poly)

# **Explanation**

The "Mileage Prediction - Regression Analysis" project builds a model that estimates how many miles a vehicle can travel per gallon of fuel (mpg). It fits a linear regression to four attributes of each car: engine displacement, horsepower, weight, and acceleration. It then adds pairwise interaction terms between those attributes to see whether the fit improves. Both models are scored on held-out test data using mean absolute error, mean absolute percentage error, and R². A model like this can help people compare cars by expected fuel efficiency. It's like having a mileage fortune teller that helps you pick a car that goes a long way on a single tank of gas!