# <font color=darkblue> Machine Learning model deployment with Flask framework</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. To build a Machine learning regression model to predict the selling price of the used cars based on the different input features like fuel_type, kms_driven, type of transmission etc.
2. Deploy the machine learning model with the help of the flask framework.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com
- **Car_Name**: Name of the car
- **Year**: Year of Purchase
- **Selling Price (target)**: Selling price of the car in lakhs
- **Present Price**: Present price of the car in lakhs
- **Kms_Driven**: kilometers driven
- **Fuel_Type**: Petrol/diesel/CNG
- **Seller_Type**: Dealer or Indiviual
- **Transmission**: Manual or Automatic
- **Owner**: first, second or third owner


### 1. Import required libraries

In [1]:
## importing libraries 

import numpy as np
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style = "ticks")
import warnings
warnings.filterwarnings('ignore')

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder

from sklearn.metrics import accuracy_score, r2_score
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

import pickle

### 2. Load the dataset

In [3]:
## reading dataset

data = pd.read_csv('car+data.csv')

## setting max columns to none
pd.set_option('display.max_columns', None)


data.head()

Unnamed: 0,Car_Name,Year,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner
0,ritz,2014,3.35,5.59,27000,Petrol,Dealer,Manual,0
1,sx4,2013,4.75,9.54,43000,Diesel,Dealer,Manual,0
2,ciaz,2017,7.25,9.85,6900,Petrol,Dealer,Manual,0
3,wagon r,2011,2.85,4.15,5200,Petrol,Dealer,Manual,0
4,swift,2014,4.6,6.87,42450,Diesel,Dealer,Manual,0


### 3. Check the shape and basic information of the dataset.

In [4]:
## checking the shape of dataset

data.shape

(301, 9)

In [5]:
## getting general information of the dataset using info

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 301 entries, 0 to 300
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Car_Name       301 non-null    object 
 1   Year           301 non-null    int64  
 2   Selling_Price  301 non-null    float64
 3   Present_Price  301 non-null    float64
 4   Kms_Driven     301 non-null    int64  
 5   Fuel_Type      301 non-null    object 
 6   Seller_Type    301 non-null    object 
 7   Transmission   301 non-null    object 
 8   Owner          301 non-null    int64  
dtypes: float64(2), int64(3), object(4)
memory usage: 21.3+ KB


### 4. Check for the presence of the duplicate records in the dataset? If present drop them

In [6]:
## getting duplicate records in the dataset

len(data[data.duplicated()])

2

In [7]:
## dropping duplicates

data.drop_duplicates(keep='first', inplace=True)

In [8]:
len(data[data.duplicated()])

0

### 5. Drop the columns which you think redundant for the analysis.

In [9]:
## getting all columns

data.columns

Index(['Car_Name', 'Year', 'Selling_Price', 'Present_Price', 'Kms_Driven',
       'Fuel_Type', 'Seller_Type', 'Transmission', 'Owner'],
      dtype='object')

In [10]:
## dropping the redundant columns

data.drop('Car_Name', axis=1, inplace=True)

data.head()

Unnamed: 0,Year,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner
0,2014,3.35,5.59,27000,Petrol,Dealer,Manual,0
1,2013,4.75,9.54,43000,Diesel,Dealer,Manual,0
2,2017,7.25,9.85,6900,Petrol,Dealer,Manual,0
3,2011,2.85,4.15,5200,Petrol,Dealer,Manual,0
4,2014,4.6,6.87,42450,Diesel,Dealer,Manual,0


### 6. Extract a new feature called 'age_of_the_car' from the feature 'year' and drop the feature year

In [11]:
## Calculating age using 'Year' :: current-year - given-year and Dropping the 'Year' column

from datetime import date

data['age_of_the_car'] = date.today().year - data['Year']
data.drop('Year', axis=1, inplace=True)

data.head()

Unnamed: 0,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner,age_of_the_car
0,3.35,5.59,27000,Petrol,Dealer,Manual,0,10
1,4.75,9.54,43000,Diesel,Dealer,Manual,0,11
2,7.25,9.85,6900,Petrol,Dealer,Manual,0,7
3,2.85,4.15,5200,Petrol,Dealer,Manual,0,13
4,4.6,6.87,42450,Diesel,Dealer,Manual,0,10


### 7. Encode the categorical columns

In [12]:
## perform label encoding for 'Object' type columns

lbl_encoder = LabelEncoder()
for i in data.select_dtypes('object'):
    data[i] = lbl_encoder.fit_transform(data[i])
    
data.head()

Unnamed: 0,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner,age_of_the_car
0,3.35,5.59,27000,2,0,1,0,10
1,4.75,9.54,43000,1,0,1,0,11
2,7.25,9.85,6900,2,0,1,0,7
3,2.85,4.15,5200,2,0,1,0,13
4,4.6,6.87,42450,1,0,1,0,10


### 8. Separate the target and independent features.

In [None]:
## storing traget column 'Selling_Price' in variable Y and other independant features in variable X

X = data.drop('Selling_Price', axis = 1)
Y = data['Selling_Price']

In [14]:
X.head()

Unnamed: 0,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner,age_of_the_car
0,5.59,27000,2,0,1,0,10
1,9.54,43000,1,0,1,0,11
2,9.85,6900,2,0,1,0,7
3,4.15,5200,2,0,1,0,13
4,6.87,42450,1,0,1,0,10


In [15]:
Y.head()

0    3.35
1    4.75
2    7.25
3    2.85
4    4.60
Name: Selling_Price, dtype: float64

### 9. Split the data into train and test.

In [16]:
## splitting the data

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.30, random_state=1)
print(X_train.shape, X_test.shape)
print(Y_train.shape, Y_test.shape)

(209, 7) (90, 7)
(209,) (90,)


In [17]:
## scaling the data using StandardScaler

ss = StandardScaler()

X_train = ss.fit_transform(X_train.iloc[:,:])
X_test = ss.fit_transform(X_test.iloc[:,:])

### 10. Build a Random forest Regressor model and check the r2-score for train and test.

In [19]:
## defining a function for returning the accuracy score for a model and its data-inputs

def fit_predict_print(model, X_train, X_test, Y_train, Y_test):
    model.fit(X_train, Y_train)                                   # fit the model using training data
    prediction = model.predict(X_test)                            # make predictions for the model using test data
    accuracy = r2_score(Y_test, prediction)                       # compute r2 score
    return accuracy                                               # return accuracy metric


In [20]:
## declaring Randomforest Regressor training method

rf = RandomForestRegressor()
rs = pd.DataFrame()


## training the model and evaluating their r2_score

result_ = fit_predict_print(rf, X_train, X_test, Y_train, Y_test)
result_

0.8235926075415646

In [21]:
## assigning and printing the r2_score

rs['random_forest'] = pd.Series(result_)
rs

Unnamed: 0,random_forest
0,0.823593


### 11. Create a pickle file with an extension as .pkl

In [22]:
## saving model to disk by creating a pickle file

pickle.dump(rf, open('model.pk', 'wb'))

### 12. Create new folder/new project in visual studio/pycharm that should contain the "model.pkl" file *make sure you are using a virutal environment and install required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the templates folder and copy the following code.

In [None]:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Lab 5 Solution - Python for DS</title>

    <style>
        body {
            margin: 2px 25px;
            text-align: center;
            min-width: 650px;
            font-family: Sans-serif;
        }

        /* Header css property */
        header {
            background-color: springgreen;
            font-size: 1.3em;
            min-height: 100px;
            max-height: 120px;
        }
        header h1 {
            margin: 0%;
            padding: 2% 5%;
            font-style: italic;
            text-shadow: 1px 2px #f5f5f5;
        }

        /* Prediction text css property */
        h3 {
            font-weight: bold;
            font-size: 25px;
            color: steelblue;
        }

        /* Input fields css property */
        input {
            border: none;
            border-bottom: 2px solid rgb(181, 180, 180);
            padding: 15px;
            margin-bottom: 6px;
            width: 40%;
        }
        select {
            margin-bottom: 6px;
            padding: 15px;
            width: 45%;
        }

        table {
            width: 60%;
            margin: 0px auto;
        }

        /* Submit button css property */
        .submit {
            background-color: steelblue;
            color: white;
            width: 16%;
            font-weight: bold;
            font-size: 15px;
            cursor: pointer;
            border-radius: 5px;
        }
    </style>
</head>
<body>
    <header><h1>Car Price Predictor</h1></header>
    <div>
        <h3>{{ prediction_text }}</h3>
    </div>
    <div>
        <form method="POST" action="{{ url_for('predict')}}" >
            <table>
                <tr>
                    <td><p>Age of the car</p></td>
                    <td><input placeholder="Age of the car" type="text" name="age_of_the_car" required="required"></td>
                </tr>
                <tr>
                    <td><p>Present price of car</p></td>
                    <td><input placeholder="Present price of car" type="text" name="Present_Price" required="required"></td>
                </tr>
                <tr>
                    <td><p>Kms Driven</p></td>
                    <td><input placeholder="Kms Driven" type="text" name="Kms_Driven" required="required"></td>
                </tr>
                <tr>
                    <td><p>Fuel Type</p></td>
                    <td><select name="Fuel_Type" id="fuel" required="required">
                            <option value="0">Petrol</option>
                            <option value="1">Diesel</option>
                        </select>
                    </td>
                </tr>
                <tr>
                    <td><p>Seller Type</p></td>
                    <td><select name="Seller_Type" id="seller" required="required">
                            <option value="0">Dealer</option>
                            <option value="1">Individual</option>
                        </select>
                    </td>
                </tr>
                <tr>
                    <td><p>Transmission Type</p></td>
                    <td><select name="Transmission" id="transmission" required="required">
                            <option value="0">Manual car</option>
                            <option value="1">Automatic car</option>
                        </select>
                    </td>
                </tr>
                <tr>
                    <td><p>Previous Owner count</p></td>
                    <td><input placeholder="Previous Owner count" type="text" name="Owner" required="required"></td>
                </tr>
            </table>

            <p><input class="submit" type="submit" value="Predict Selling Price"></p>
        </form>
    </div>
</body>
</html>

### b) Create app.py file and write the predict function

In [None]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
from flask import Flask, request, jsonify, render_template
import pickle

app = Flask(__name__)
# open and load the pickle file provided in read mode.
model = pickle.load(open('model.pk', 'rb'))


@app.route('/', methods=['GET'])
def home():
    return render_template('index.html')
# In the above code, are returning the output of the function "render_template()".
# This function accepts an input of an HTML page.
# This function looks for the mentioned page under templates folder.
# This is the reason we created this index.html form under templates folder.


# Predict function to read the values from the UI and predict the price value.
@app.route('/predict', methods=['POST'])
def predict():
    if request.method == 'POST':
        age_of_the_car = request.form['age_of_the_car']
        Present_Price = float(request.form['Present_Price'])
        Kms_Driven = int(request.form['Kms_Driven'])
        Fuel_Type = request.form['Fuel_Type']
        Seller_Type = request.form['Seller_Type']
        Transmission = request.form['Transmission']
        Owner = request.form['Owner']

        prediction = model.predict([[age_of_the_car, Present_Price, Kms_Driven, Fuel_Type, Seller_Type, Transmission, Owner]])
        output = round(prediction[0], 2)
        return render_template('index.html', prediction_text='You can sell your car at {} lakhs'.format(output))


if __name__ == "__main__":
    app.run(debug=True)
    app.config['TEMPLATES_AUTO_RELOAD'] = True

### 13. Run the app.py python file which will render to index html page then enter the input values and get the prediction.

#### By following the below mentioned steps, you can run the application, give inputs and receive the prediction.
* Run the Application: Execute the 'app.py' file to launch the application.
* Access the Web Page: Open your web browser and visit the application at http://127.0.0.1:5000/ or http://localhost:5000/. You will be directed to a web page titled 'Used Car Price Predictor'.
* Enter Input Values: On the web page, provide the necessary input values as required.
* Get Prediction Result: After entering the input values, submit the form. The application will process your input and provide you with the prediction result on the same web page.

### Happy Learning :)