# <font color=darkblue> Machine Learning model deployment with Flask framework</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. Build a machine learning regression model that predicts the selling price of used cars from input features such as fuel type, kilometers driven, and transmission type.
2. Deploy the trained model with the Flask framework.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com
- **Car_Name**: Name of the car
- **Year**: Year of Purchase
- **Selling Price (target)**: Selling price of the car in lakhs
- **Present Price**: Present price of the car in lakhs
- **Kms_Driven**: kilometers driven
- **Fuel_Type**: Petrol, Diesel, or CNG
- **Seller_Type**: Dealer or Individual
- **Transmission**: Manual or Automatic
- **Owner**: first, second or third owner


### 1. Import required libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import pickle

import joblib
from flask import Flask, request, jsonify

import warnings
warnings.filterwarnings('ignore')

### 2. Load the dataset

In [2]:
df = pd.read_csv('car+data.csv')
df.head(5)

Unnamed: 0,Car_Name,Year,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner
0,ritz,2014,3.35,5.59,27000,Petrol,Dealer,Manual,0
1,sx4,2013,4.75,9.54,43000,Diesel,Dealer,Manual,0
2,ciaz,2017,7.25,9.85,6900,Petrol,Dealer,Manual,0
3,wagon r,2011,2.85,4.15,5200,Petrol,Dealer,Manual,0
4,swift,2014,4.6,6.87,42450,Diesel,Dealer,Manual,0


### 3. Check the shape and basic information of the dataset.

In [3]:
#the shape of the data
df.shape

(301, 9)

In [4]:
# Basic information
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 301 entries, 0 to 300
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Car_Name       301 non-null    object 
 1   Year           301 non-null    int64  
 2   Selling_Price  301 non-null    float64
 3   Present_Price  301 non-null    float64
 4   Kms_Driven     301 non-null    int64  
 5   Fuel_Type      301 non-null    object 
 6   Seller_Type    301 non-null    object 
 7   Transmission   301 non-null    object 
 8   Owner          301 non-null    int64  
dtypes: float64(2), int64(3), object(4)
memory usage: 21.3+ KB


### 4. Check for duplicate records in the dataset and drop them if present

In [5]:
# Check for duplicate 
len(df[df.duplicated()])

2

In [6]:
# Drop duplicate rows
df.drop_duplicates(inplace=True)

In [7]:
len(df[df.duplicated()])

0
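
As a small illustration of the duplicate check above, the sketch below uses a made-up three-row frame (the values are illustrative, not taken from the full dataset). `duplicated()` flags every repeat of an earlier row, so summing the boolean mask gives the same count as `len(df[df.duplicated()])`.

```python
import pandas as pd

# Toy frame: the third row repeats the first one exactly (illustrative values)
toy = pd.DataFrame({
    "Car_Name": ["ritz", "sx4", "ritz"],
    "Year": [2014, 2013, 2014],
    "Kms_Driven": [27000, 43000, 27000],
})

# duplicated() marks a row True when an identical row appeared earlier
n_dupes = int(toy.duplicated().sum())

# drop_duplicates() keeps the first occurrence; reset_index tidies the row labels
deduped = toy.drop_duplicates().reset_index(drop=True)

print(n_dupes, len(deduped))
```

Returning a new frame instead of using `inplace=True` is only a style choice here; both approaches drop the same rows.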

### 5. Drop the columns that are redundant for the analysis

In [8]:
df.drop('Car_Name', axis=1, inplace=True)
df.head()

Unnamed: 0,Year,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner
0,2014,3.35,5.59,27000,Petrol,Dealer,Manual,0
1,2013,4.75,9.54,43000,Diesel,Dealer,Manual,0
2,2017,7.25,9.85,6900,Petrol,Dealer,Manual,0
3,2011,2.85,4.15,5200,Petrol,Dealer,Manual,0
4,2014,4.6,6.87,42450,Diesel,Dealer,Manual,0


### 6. Extract a new feature called 'age_of_the_car' from the feature 'Year' and drop the feature 'Year'

In [9]:
# Reference year used to compute the age of each car (hard-coded here)

current_year = 2024

# Create the 'age_of_the_car' feature

df['age_of_the_car'] = current_year - df['Year']

df = df.drop('Year', axis = 1)
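
The age calculation can also be wrapped in a small helper. The sketch below is an assumption-laden variant, not part of the original notebook: `car_age` and its `reference_year` parameter are hypothetical names, and the default falls back to today's year from the standard library.

```python
from datetime import date

def car_age(purchase_year, reference_year=None):
    """Return the age of a car in years (hypothetical helper)."""
    # Fall back to the current calendar year when no reference year is given
    if reference_year is None:
        reference_year = date.today().year
    return reference_year - purchase_year

# With the fixed reference year 2024, a 2014 car is 10 years old
print(car_age(2014, reference_year=2024))
```

Pinning the reference year keeps the training data reproducible. Whatever year is used for training should also be the basis for the age entered in the web form, otherwise the model sees ages on a different scale.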

In [10]:
df.head()

Unnamed: 0,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner,age_of_the_car
0,3.35,5.59,27000,Petrol,Dealer,Manual,0,10
1,4.75,9.54,43000,Diesel,Dealer,Manual,0,11
2,7.25,9.85,6900,Petrol,Dealer,Manual,0,7
3,2.85,4.15,5200,Petrol,Dealer,Manual,0,13
4,4.6,6.87,42450,Diesel,Dealer,Manual,0,10


### 7. Encode the categorical columns

In [11]:
#Create a list of categorical columns

categorical_cols = ['Fuel_Type', 'Seller_Type', 'Transmission']

# Create a ColumnTransformer object

ct = ColumnTransformer(transformers=[('encode', OneHotEncoder(), categorical_cols)], remainder = 'passthrough')

#Transform the data
X = ct.fit_transform(df)

encode_cols = ct.transformers_[0][1].get_feature_names_out(categorical_cols)
df = pd.DataFrame(X, columns=np.r_[encode_cols, df.columns.drop(categorical_cols)])

df.head(5)

Unnamed: 0,Fuel_Type_CNG,Fuel_Type_Diesel,Fuel_Type_Petrol,Seller_Type_Dealer,Seller_Type_Individual,Transmission_Automatic,Transmission_Manual,Selling_Price,Present_Price,Kms_Driven,Owner,age_of_the_car
0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,3.35,5.59,27000.0,0.0,10.0
1,0.0,1.0,0.0,1.0,0.0,0.0,1.0,4.75,9.54,43000.0,0.0,11.0
2,0.0,0.0,1.0,1.0,0.0,0.0,1.0,7.25,9.85,6900.0,0.0,7.0
3,0.0,0.0,1.0,1.0,0.0,0.0,1.0,2.85,4.15,5200.0,0.0,13.0
4,0.0,1.0,0.0,1.0,0.0,0.0,1.0,4.6,6.87,42450.0,0.0,10.0
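
One possible alternative, sketched here under assumptions, is to put the encoder and the regressor into a single `Pipeline` so that the saved object accepts raw category strings at prediction time. The toy rows below reuse the first five rows shown above plus one made-up row (so that every category appears), and the hyperparameters are arbitrary.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy data with the notebook's column names; the last row is invented for illustration
toy = pd.DataFrame({
    "Present_Price": [5.59, 9.54, 9.85, 4.15, 6.87, 0.95],
    "Kms_Driven": [27000, 43000, 6900, 5200, 42450, 12000],
    "Fuel_Type": ["Petrol", "Diesel", "Petrol", "Petrol", "Diesel", "CNG"],
    "Seller_Type": ["Dealer", "Dealer", "Dealer", "Dealer", "Dealer", "Individual"],
    "Transmission": ["Manual", "Manual", "Manual", "Manual", "Manual", "Automatic"],
    "Owner": [0, 0, 0, 0, 0, 1],
    "age_of_the_car": [10, 11, 7, 13, 10, 12],
})
toy_target = [3.35, 4.75, 7.25, 2.85, 4.6, 0.5]

categorical_cols = ["Fuel_Type", "Seller_Type", "Transmission"]

# Encode the categorical columns and pass the numeric ones through unchanged
preprocess = ColumnTransformer(
    transformers=[("encode", OneHotEncoder(handle_unknown="ignore"), categorical_cols)],
    remainder="passthrough",
)

# Chain preprocessing and model so both are fitted (and later saved) together
pipe = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestRegressor(n_estimators=10, random_state=0)),
])
pipe.fit(toy, toy_target)

# A raw, un-encoded row in the same column order as the training frame
new_car = pd.DataFrame(
    [[6.0, 30000, "Petrol", "Dealer", "Manual", 0, 9]],
    columns=toy.columns,
)
pred = pipe.predict(new_car)
n_encoded_cols = pipe.named_steps["preprocess"].transform(new_car).shape[1]
print(pred.shape, n_encoded_cols)
```

With this layout the web application would not need to repeat the one-hot encoding by hand, at the cost of pickling a slightly larger object.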


### 8. Separate the target and independent features.

In [12]:
X = df.drop('Selling_Price', axis=1)
y = df['Selling_Price']

print("Independent Features (X):")
print(X.head(5))

print("\nTarget Variable (y):")
print(y.head(5))

Independent Features (X):
   Fuel_Type_CNG  Fuel_Type_Diesel  Fuel_Type_Petrol  Seller_Type_Dealer  \
0            0.0               0.0               1.0                 1.0   
1            0.0               1.0               0.0                 1.0   
2            0.0               0.0               1.0                 1.0   
3            0.0               0.0               1.0                 1.0   
4            0.0               1.0               0.0                 1.0   

   Seller_Type_Individual  Transmission_Automatic  Transmission_Manual  \
0                     0.0                     0.0                  1.0   
1                     0.0                     0.0                  1.0   
2                     0.0                     0.0                  1.0   
3                     0.0                     0.0                  1.0   
4                     0.0                     0.0                  1.0   

   Present_Price  Kms_Driven  Owner  age_of_the_car  
0           5.59   

### 9. Split the data into train and test.

In [13]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=42)

print("X_train shape:", X_train.shape)
print("X_test shape:", X_test.shape)
print("y_train shape:", y_train.shape)
print("y_test shape:", y_test.shape)

X_train shape: (239, 11)
X_test shape: (60, 11)
y_train shape: (239,)
y_test shape: (60,)
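
The shapes above follow from simple arithmetic: 301 rows minus 2 duplicates leaves 299, and for a fractional `test_size` scikit-learn rounds the test count up. A short worked check:

```python
import math

n_rows = 301 - 2          # rows remaining after dropping the 2 duplicates
test_size = 0.2

# The test set gets ceil(test_size * n_rows) rows; the rest go to training
n_test = math.ceil(test_size * n_rows)
n_train = n_rows - n_test

print(n_train, n_test)
```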


### 10. Build a Random Forest Regressor model and check the R2 score for train and test

In [14]:
## Create a Random Forest Regressor model
rf = RandomForestRegressor()

# Fit the model to the training data
rf.fit(X_train, y_train)

In [15]:
# Make predictions on the training and test sets

y_train_pred = rf.predict(X_train)
y_test_pred = rf.predict(X_test)

In [16]:
# Calculate r2-score for train and test sets

r2_train = r2_score(y_train, y_train_pred)
r2_test = r2_score(y_test, y_test_pred)

In [17]:
print("R-squared (train):", r2_train)
print("R-squared (test):", r2_test)

R-squared (train): 0.9852488652677954
R-squared (test): 0.5856411805994808
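
The gap between the train and test scores above suggests the model may be overfitting, and because `RandomForestRegressor()` is created without a `random_state`, the exact numbers will vary from run to run. The error metrics imported in step 1 can complement R2. The sketch below uses three made-up values purely to show the calls; they are not results from this dataset.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up targets and predictions, in lakhs, for illustration only
y_true = [3.0, 5.0, 7.0]
y_pred = [2.5, 5.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)          # average absolute error
rmse = mean_squared_error(y_true, y_pred) ** 0.5   # root of the mean squared error
r2 = r2_score(y_true, y_pred)                      # 1 - SS_res / SS_tot

print(f"MAE:  {mae:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"R2:   {r2:.3f}")
```

MAE and RMSE are expressed in the same unit as the target (lakhs), which makes them easier to interpret for a price prediction than R2 alone.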


### 11. Create a pickle file with an extension as .pkl

In [20]:
# Save the trained model as a pickle file (the with-block closes the file handle)
with open('random_forest_model.pkl', 'wb') as file:
    pickle.dump(rf, file)
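
Before moving the file into the Flask project, it can be worth checking that the saved model loads back and predicts identically. The sketch below does this round trip on synthetic data in a temporary directory; `demo_model.pkl` and the random features are assumptions for illustration, not the notebook's model.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data standing in for the real training set
rng = np.random.default_rng(0)
X_demo = rng.random((20, 3))
y_demo = X_demo.sum(axis=1)

model = RandomForestRegressor(n_estimators=5, random_state=0).fit(X_demo, y_demo)

# Write the model to a temporary file, then read it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions
same = bool(np.allclose(model.predict(X_demo), restored.predict(X_demo)))
print(same)
```

Pickle files should be loaded with the same scikit-learn version that created them, and only from trusted sources, since unpickling can execute arbitrary code.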

### 12. Create a new folder/project in Visual Studio/PyCharm that contains the "random_forest_model.pkl" file. *Make sure you are using a virtual environment and install the required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the templates folder and copy the following code.

In [None]:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Car Price Prediction</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            background-color: #f0f0f0; /* Light gray background */
            color: #333;
            text-align: center;
            height: 100vh; /* Ensures full viewport height coverage */
            margin: 0; /* Removes default margin */
            padding: 0; /* Removes default padding */
        }

        h1 {
            margin-top: 50px;
            font-size: 2.5em;
            color: #333;
            text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5);
        }

        form {
            background: rgba(255, 255, 255, 0.8);
            padding: 20px;
            margin: 40px auto;
            border-radius: 10px;
            width: 350px;
            box-shadow: 0 0 10px rgba(0, 0, 0, 0.5);
        }

        label {
            display: block;
            margin: 10px 0 5px;
            color: #333;
        }

        input[type="number"],
        select {
            width: 100%;
            padding: 8px;
            margin-bottom: 10px;
            border: 1px solid #ccc;
            border-radius: 5px;
        }

        input[type="submit"] {
            background: #007BFF;
            color: #fff;
            padding: 10px 15px;
            border: none;
            border-radius: 5px;
            cursor: pointer;
            transition: background 0.3s;
        }

        input[type="submit"]:hover {
            background: #0056b3;
        }
    </style>
</head>
<body>
    <h1>Car Price Prediction</h1>
    <form action="/predict" method="post">
        <label for="Present_Price">Present Price (in lakhs):</label>
        <input type="number" step="0.01" id="Present_Price" name="Present_Price"><br><br>

        <label for="Kms_Driven">Kilometers Driven:</label>
        <input type="number" id="Kms_Driven" name="Kms_Driven"><br><br>

        <label for="Age_of_the_car">Age of the car (in years):</label>
        <input type="number" id="Age_of_the_car" name="Age_of_the_car"><br><br>

        <label for="Fuel_Type">Fuel Type:</label>
        <select id="Fuel_Type" name="Fuel_Type">
            <option value="Petrol">Petrol</option>
            <option value="Diesel">Diesel</option>
            <option value="CNG">CNG</option>
        </select><br><br>

        <label for="Seller_Type">Seller Type:</label>
        <select id="Seller_Type" name="Seller_Type">
            <option value="Dealer">Dealer</option>
            <option value="Individual">Individual</option>
        </select><br><br>

        <label for="Transmission">Transmission:</label>
        <select id="Transmission" name="Transmission">
            <option value="Manual">Manual</option>
            <option value="Automatic">Automatic</option>
        </select><br><br>

        <label for="Owner">Owner:</label>
        <select id="Owner" name="Owner">
            <option value="0">1</option>
            <option value="1">2</option>
            <option value="2">3</option>
        </select><br><br>

        <input type="submit" value="Predict">
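    </form>
    <!-- Shows the prediction passed in by app.py; empty on the first page load -->
    <p>{{ prediction_text }}</p>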
</body>
</html>


### b) Create app.py file and write the predict function

In [None]:
from flask import Flask, request, jsonify, render_template
import pickle
import pandas as pd
import sklearn

app = Flask(__name__)

# Load the trained model
with open('random_forest_model.pkl', 'rb') as file:
    model = pickle.load(file)


@app.route('/', methods=['GET'])
def home():
    return render_template('index.html')


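@app.route('/predict', methods=['POST'])
def predict():
    # Extract data from the form and convert the numeric fields
    present_price = float(request.form['Present_Price'])
    kms_driven = int(request.form['Kms_Driven'])
    owner = int(request.form['Owner'])
    age_of_the_car = int(request.form['Age_of_the_car'])
    fuel_type = request.form['Fuel_Type']
    seller_type = request.form['Seller_Type']
    transmission = request.form['Transmission']

    # One-hot encode the categorical inputs, keeping the column order used in training
    features = pd.DataFrame([{
        'Fuel_Type_CNG': float(fuel_type == 'CNG'),
        'Fuel_Type_Diesel': float(fuel_type == 'Diesel'),
        'Fuel_Type_Petrol': float(fuel_type == 'Petrol'),
        'Seller_Type_Dealer': float(seller_type == 'Dealer'),
        'Seller_Type_Individual': float(seller_type == 'Individual'),
        'Transmission_Automatic': float(transmission == 'Automatic'),
        'Transmission_Manual': float(transmission == 'Manual'),
        'Present_Price': present_price,
        'Kms_Driven': kms_driven,
        'Owner': owner,
        'age_of_the_car': age_of_the_car,
    }])

    prediction = model.predict(features)
    output = round(prediction[0], 2)
    return render_template('index.html', prediction_text="You can sell your car at {} lakhs".format(output))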


if __name__ == '__main__':
    app.run(debug=True)
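
The model saved in step 11 was trained on 11 numeric columns (the one-hot columns followed by the numeric features), so the raw form values need to be turned into that layout before calling `model.predict`. A minimal sketch of such a conversion is shown below; `encode_form`, `FEATURE_ORDER`, and `CATEGORIES` are hypothetical names, and the column order is assumed to match the encoded frame from step 7.

```python
# Column order assumed to match the training frame from step 7 (without Selling_Price)
FEATURE_ORDER = [
    "Fuel_Type_CNG", "Fuel_Type_Diesel", "Fuel_Type_Petrol",
    "Seller_Type_Dealer", "Seller_Type_Individual",
    "Transmission_Automatic", "Transmission_Manual",
    "Present_Price", "Kms_Driven", "Owner", "age_of_the_car",
]

# Categories offered by the HTML form for each categorical field
CATEGORIES = {
    "Fuel_Type": ["CNG", "Diesel", "Petrol"],
    "Seller_Type": ["Dealer", "Individual"],
    "Transmission": ["Automatic", "Manual"],
}

def encode_form(form):
    """Turn raw form values (strings) into one ordered feature row."""
    row = {}
    for field, allowed in CATEGORIES.items():
        value = form[field]
        if value not in allowed:
            raise ValueError(f"Unexpected {field}: {value!r}")
        # One-hot encode: 1.0 for the selected category, 0.0 for the others
        for category in allowed:
            row[f"{field}_{category}"] = 1.0 if value == category else 0.0
    # Numeric fields arrive as strings and must be converted
    row["Present_Price"] = float(form["Present_Price"])
    row["Kms_Driven"] = float(form["Kms_Driven"])
    row["Owner"] = float(form["Owner"])
    row["age_of_the_car"] = float(form["Age_of_the_car"])
    return [row[name] for name in FEATURE_ORDER]

# Example input shaped like the submitted form (first row of the dataset)
example = {
    "Present_Price": "5.59", "Kms_Driven": "27000", "Owner": "0",
    "Age_of_the_car": "10", "Fuel_Type": "Petrol",
    "Seller_Type": "Dealer", "Transmission": "Manual",
}
encoded = encode_form(example)
print(encoded)
```

Keeping this logic in a plain function makes it possible to unit-test the encoding without starting the Flask server.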


### 13. Run the app.py file, which renders the index.html page, then enter the input values to get the prediction.

### Happy Learning :)