# <font color=darkblue> Machine Learning model deployment with Flask framework</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. Build a machine learning regression model that predicts the selling price of used cars from input features such as fuel type, kilometers driven, and transmission type.
2. Deploy the machine learning model with the Flask framework.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com
- **Car_Name**: Name of the car
- **Year**: Year of Purchase
- **Selling_Price (target)**: Selling price of the car in lakhs
- **Present_Price**: Present price of the car in lakhs
- **Kms_Driven**: Kilometers driven
- **Fuel_Type**: Petrol, Diesel, or CNG
- **Seller_Type**: Dealer or Individual
- **Transmission**: Manual or Automatic
- **Owner**: Number of previous owners, stored as an integer (0 to 3 in this data)


### 1. Import required libraries

In [1]:
# Data Manipulation 
import pandas as pd 
import numpy as np 
# Data Visualization 
import matplotlib.pyplot as plt 
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')

# Machine Learning Models 
from sklearn.model_selection import train_test_split 
from sklearn.preprocessing import OneHotEncoder, StandardScaler 
from sklearn.linear_model import LinearRegression 
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor 
# Model Evaluation 
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score 
# Model Saving and Loading 
import joblib
import pickle

### 2. Load the dataset

In [2]:
df = pd.read_csv(r"C:\Users\canit\OneDrive\Desktop\car\car+data.csv")
df.head()

|   | Car_Name | Year | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ritz | 2014 | 3.35 | 5.59 | 27000 | Petrol | Dealer | Manual | 0 |
| 1 | sx4 | 2013 | 4.75 | 9.54 | 43000 | Diesel | Dealer | Manual | 0 |
| 2 | ciaz | 2017 | 7.25 | 9.85 | 6900 | Petrol | Dealer | Manual | 0 |
| 3 | wagon r | 2011 | 2.85 | 4.15 | 5200 | Petrol | Dealer | Manual | 0 |
| 4 | swift | 2014 | 4.6 | 6.87 | 42450 | Diesel | Dealer | Manual | 0 |


### 3. Check the shape and basic information of the dataset.

In [3]:
df.shape

(301, 9)

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 301 entries, 0 to 300
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Car_Name       301 non-null    object 
 1   Year           301 non-null    int64  
 2   Selling_Price  301 non-null    float64
 3   Present_Price  301 non-null    float64
 4   Kms_Driven     301 non-null    int64  
 5   Fuel_Type      301 non-null    object 
 6   Seller_Type    301 non-null    object 
 7   Transmission   301 non-null    object 
 8   Owner          301 non-null    int64  
dtypes: float64(2), int64(3), object(4)
memory usage: 21.3+ KB


In [5]:
df.describe().T

|   | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| Year | 301.0 | 2013.627907 | 2.891554 | 2003.0 | 2012.0 | 2014.0 | 2016.0 | 2018.0 |
| Selling_Price | 301.0 | 4.661296 | 5.082812 | 0.1 | 0.9 | 3.6 | 6.0 | 35.0 |
| Present_Price | 301.0 | 7.628472 | 8.644115 | 0.32 | 1.2 | 6.4 | 9.9 | 92.6 |
| Kms_Driven | 301.0 | 36947.20598 | 38886.883882 | 500.0 | 15000.0 | 32000.0 | 48767.0 | 500000.0 |
| Owner | 301.0 | 0.043189 | 0.247915 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 |


### 4. Check for duplicate records in the dataset and drop any that are present

In [6]:
len(df[df.duplicated()])

2

In [7]:
df.drop_duplicates(inplace= True)

In [8]:
df.shape

(299, 9)

### 5. Drop any columns that are redundant for the analysis

No columns need to be dropped.

### 6. Extract a new feature called 'age_of_the_car' from the feature 'Year' and drop the feature 'Year'

In [9]:
current_year = pd.to_datetime('now').year
df['age_of_the_car'] = current_year-df['Year']
df= df.drop('Year', axis =1)
df.head()

|   | Car_Name | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner | age_of_the_car |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ritz | 3.35 | 5.59 | 27000 | Petrol | Dealer | Manual | 0 | 10 |
| 1 | sx4 | 4.75 | 9.54 | 43000 | Diesel | Dealer | Manual | 0 | 11 |
| 2 | ciaz | 7.25 | 9.85 | 6900 | Petrol | Dealer | Manual | 0 | 7 |
| 3 | wagon r | 2.85 | 4.15 | 5200 | Petrol | Dealer | Manual | 0 | 13 |
| 4 | swift | 4.6 | 6.87 | 42450 | Diesel | Dealer | Manual | 0 | 10 |
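
Because `pd.to_datetime('now').year` changes with the run date, rerunning the notebook in a later year shifts every age by one, and the deployed app has to use the same reference year as training. A minimal sketch of pinning that year, assuming a hypothetical constant `REFERENCE_YEAR` and helper `car_age` (neither is in the original notebook); the value 2024 is inferred from the output above, where a 2014 car has age 10:

```python
# Hypothetical constant: the year the model was trained
# (inferred from the output above: a 2014 car has age 10).
REFERENCE_YEAR = 2024


def car_age(year: int, reference_year: int = REFERENCE_YEAR) -> int:
    """Return the car's age in years relative to a fixed reference year."""
    if year > reference_year:
        raise ValueError(f"year {year} is after reference year {reference_year}")
    return reference_year - year


# Purchase years of the first five rows of the dataset.
years = [2014, 2013, 2017, 2011, 2014]
ages = [car_age(y) for y in years]
print(ages)  # [10, 11, 7, 13, 10]
```

The same helper could be reused in the Flask app so that a year entered by the user is converted exactly as it was during training.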


### 7. Encode the categorical columns

In [10]:
categorical_columns = df.select_dtypes(include=['object']).columns
print(categorical_columns)

Index(['Car_Name', 'Fuel_Type', 'Seller_Type', 'Transmission'], dtype='object')


In [11]:
from sklearn.preprocessing import LabelEncoder

In [12]:
#Encode the categorical variables in the dataset
le = LabelEncoder()
categorical_columns = ['Car_Name', 'Fuel_Type', 'Seller_Type','Transmission']
for col in categorical_columns:
    df[col] = le.fit_transform(df[col])

In [13]:
df.head(2)

|   | Car_Name | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner | age_of_the_car |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 90 | 3.35 | 5.59 | 27000 | 2 | 0 | 1 | 0 | 10 |
| 1 | 93 | 4.75 | 9.54 | 43000 | 1 | 0 | 1 | 0 | 11 |
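
`LabelEncoder` numbers the class labels in sorted order, which is why Petrol becomes 2 and Diesel becomes 1 in the output above. Since the loop refits a single encoder for every column, the fitted mappings are not kept, yet the app later needs them to turn user input into the same codes. A rough sketch that rebuilds such a mapping in plain Python; `build_codes` is a hypothetical helper and the sample values are made up for illustration:

```python
def build_codes(values):
    """Map each distinct label to its position in sorted order.

    This mirrors how LabelEncoder numbers string classes (sorted, starting at 0).
    """
    return {label: code for code, label in enumerate(sorted(set(values)))}


# Illustrative sample; the real column has many more rows.
fuel_types = ["Petrol", "Diesel", "Petrol", "CNG", "Diesel"]
fuel_codes = build_codes(fuel_types)
print(fuel_codes)  # {'CNG': 0, 'Diesel': 1, 'Petrol': 2}

# Encode a new value at prediction time with the stored mapping.
encoded = fuel_codes["Diesel"]
print(encoded)  # 1
```

Keeping one mapping per categorical column (or one fitted encoder per column) makes it possible to encode form input consistently at prediction time.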


### 8. Separate the target and independent features.

In [14]:
# Separate the target variable and independent features 
X = df.drop('Selling_Price', axis=1) 
y = df['Selling_Price']
# Display the first few rows of independent features and target variable to verify the changes 
print("Independent Features (X):") 
print(X.head()) 
print("\nTarget Variable (y):") 
print(y.head())

Independent Features (X):
   Car_Name  Present_Price  Kms_Driven  Fuel_Type  Seller_Type  Transmission  \
0        90           5.59       27000          2            0             1   
1        93           9.54       43000          1            0             1   
2        68           9.85        6900          2            0             1   
3        96           4.15        5200          2            0             1   
4        92           6.87       42450          1            0             1   

   Owner  age_of_the_car  
0      0              10  
1      0              11  
2      0               7  
3      0              13  
4      0              10  

Target Variable (y):
0    3.35
1    4.75
2    7.25
3    2.85
4    4.60
Name: Selling_Price, dtype: float64


### 9. Split the data into train and test.

In [15]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print("Training set shape: ", X_train.shape, y_train.shape) 
print("Testing set shape: ", X_test.shape, y_test.shape)

Training set shape:  (239, 8) (239,)
Testing set shape:  (60, 8) (60,)


### 10. Build a Random Forest Regressor model and check the R² score for train and test

In [16]:
# Build the Random Forest Regressor model 
model = RandomForestRegressor(random_state=42) 
model.fit(X_train, y_train)

In [17]:
# Predict and calculate the R² score for train and test sets 
train_predictions = model.predict(X_train) 
test_predictions = model.predict(X_test) 
r2_train = r2_score(y_train, train_predictions) 
r2_test = r2_score(y_test, test_predictions)

In [18]:
print(f"R² score for the training set: {r2_train:.4f}") 
print(f"R² score for the testing set: {r2_test:.4f}")

R² score for the training set: 0.9863
R² score for the testing set: 0.6480
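
R² compares the model's squared error with the error of always predicting the mean of the target: R² = 1 - SS_res / SS_tot. A small sketch of that computation in plain Python, using made-up numbers rather than the notebook's data; `r2` is a hypothetical helper written for illustration, not scikit-learn's implementation:

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - ss_res / ss_tot


# Made-up prices in lakhs, for illustration only.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.5, 5.0, 7.5]
score = r2(y_true, y_pred)
print(score)  # 0.9375
```

A training score (0.9863) far above the test score (0.6480) suggests that the model fits the training rows much more closely than unseen ones, which is worth keeping in mind before deployment.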


### 11. Save the model as a pickle file with the .pkl extension

In [22]:
with open('model.pkl', 'wb') as file: 
    pickle.dump(model, file) 
print("model.pkl")

model.pkl
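
The pickle stores only the estimator, so the app has no record of the column order or the category codes used in training. One option is to pickle that metadata alongside the model. The sketch below uses an in-memory round trip and a placeholder instead of the fitted model; the `bundle` layout and its keys are assumptions, not part of the original notebook:

```python
import pickle

# Hypothetical bundle layout; in the notebook, "model" would hold the fitted regressor.
bundle = {
    "model": None,  # placeholder for the fitted RandomForestRegressor
    "feature_order": [
        "Car_Name", "Present_Price", "Kms_Driven", "Fuel_Type",
        "Seller_Type", "Transmission", "Owner", "age_of_the_car",
    ],
    "fuel_codes": {"CNG": 0, "Diesel": 1, "Petrol": 2},
}

# Serialize to bytes and restore, as pickle.dump / pickle.load do with a file.
payload = pickle.dumps(bundle)
restored = pickle.loads(payload)
print(restored["feature_order"])
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.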


### 12. Create a new folder/project in Visual Studio or PyCharm that contains the "model.pkl" file. *Make sure you are using a virtual environment and install the required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the `templates` folder and copy the following code into it.

<html>

<head>
    <title>Car Details Form</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            background-color: #f8f9fa;
            margin: 0;
            padding: 20px;
        }

        form {
            max-width: 500px;
            margin: auto;
            padding: 20px;
            background-color: #fff;
            border-radius: 5px;
            box-shadow: 0 0 10px rgba(0, 0, 0, 0.1);
        }

        label {
            display: block;
            margin-bottom: 10px;
        }

        input,
        select {
            width: 100%;
            padding: 10px;
            margin-bottom: 20px;
            border-radius: 5px;
            border: 1px solid #ccc;
        }

        button {
            display: block;
            width: 100%;
            padding: 10px;
            background-color: #007bff;
            border: none;
            border-radius: 5px;
            color: #fff;
            font-size: 16px;
            cursor: pointer;
        }

        button:hover {
            background-color: #0056b3;
        }
    </style>
</head>

<body bgcolor="#d4a3ae">
    <center>
        <h2>Car Details Form</h2>
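        <form action="/predict" method="post">
            <!-- Field names match the model's training columns; categorical
                 values use the integer codes produced by LabelEncoder -->
            <label for="Car_Name">Car Name (label-encoded integer code)</label>
            <input type="number" id="Car_Name" name="Car_Name" required min="0" step="1">
            <label for="Present_Price">Present Price (lakhs)</label>
            <input type="number" id="Present_Price" name="Present_Price" required min="0" step="0.01">
            <label for="Kms_Driven">Kms Driven</label>
            <input type="number" id="Kms_Driven" name="Kms_Driven" required min="0" step="1">
            <label for="Fuel_Type">Fuel Type</label>
            <select id="Fuel_Type" name="Fuel_Type">
                <option value="0">CNG</option>
                <option value="1">Diesel</option>
                <option value="2">Petrol</option>
            </select>
            <label for="Seller_Type">Seller Type</label>
            <select id="Seller_Type" name="Seller_Type">
                <option value="0">Dealer</option>
                <option value="1">Individual</option>
            </select>
            <label for="Transmission">Transmission</label>
            <select id="Transmission" name="Transmission">
                <option value="0">Automatic</option>
                <option value="1">Manual</option>
            </select>
            <label for="Owner">Number of previous owners</label>
            <input type="number" id="Owner" name="Owner" required min="0" step="1">
            <label for="age_of_the_car">Age of the car (years)</label>
            <input type="number" id="age_of_the_car" name="age_of_the_car" required min="0" step="1">
            <button type="submit">Submit</button>
        </form>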
    </center>
</body>

</html>

### b) Create app.py file and write the predict function

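from flask import Flask, render_template, request
import pickle
import numpy as np

# Load the trained model once at startup
with open('model.pkl', 'rb') as file:
    model = pickle.load(file)

app = Flask(__name__)

# Column order used when the model was trained (X in step 8)
FEATURES = ['Car_Name', 'Present_Price', 'Kms_Driven', 'Fuel_Type',
            'Seller_Type', 'Transmission', 'Owner', 'age_of_the_car']


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/predict', methods=['POST'])
def home():
    # The form must post one numeric (already encoded) value per training column
    values = [float(request.form[name]) for name in FEATURES]
    arr = np.array([values])
    pred = model.predict(arr)
    # after.html must exist in the templates folder and display `data`
    return render_template('after.html', data=pred)


if __name__ == "__main__":
    app.run(debug=True)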
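
Form values arrive as strings, and the model expects one numeric value per training column, in the original column order. A minimal sketch of a conversion helper that could sit in front of `model.predict`; `FEATURE_ORDER` and `form_to_row` are hypothetical names, and the form is assumed to post already-encoded numeric values:

```python
# Column order of X at training time (step 8 of the notebook).
FEATURE_ORDER = [
    "Car_Name", "Present_Price", "Kms_Driven", "Fuel_Type",
    "Seller_Type", "Transmission", "Owner", "age_of_the_car",
]


def form_to_row(form):
    """Convert a mapping of form strings into a list of floats in training order."""
    missing = [name for name in FEATURE_ORDER if name not in form]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # float() raises ValueError for non-numeric input such as "abc".
    return [float(form[name]) for name in FEATURE_ORDER]


# First row of the encoded dataset, as it would arrive from an HTML form.
sample_form = {
    "Car_Name": "90", "Present_Price": "5.59", "Kms_Driven": "27000",
    "Fuel_Type": "2", "Seller_Type": "0", "Transmission": "1",
    "Owner": "0", "age_of_the_car": "10",
}
row = form_to_row(sample_form)
print(row)  # [90.0, 5.59, 27000.0, 2.0, 0.0, 1.0, 0.0, 10.0]
```

In a Flask view, `request.form` can be passed to such a helper directly, and a caught `ValueError` can be turned into an error message for the user instead of a server error.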

### 13. Run app.py, which serves the index.html page; then enter the input values to get the prediction

python app.py

### Happy Learning :)