# <font color=darkblue> Machine Learning model deployment with Flask framework</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. Build a machine learning regression model to predict the selling price of used cars from input features such as fuel type, kilometers driven, and transmission type.
2. Deploy the machine learning model with the Flask framework.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com
- **Car_Name**: Name of the car
- **Year**: Year of Purchase
- **Selling Price (target)**: Selling price of the car in lakhs
- **Present Price**: Present price of the car in lakhs
- **Kms_Driven**: Kilometers driven
- **Fuel_Type**: Petrol, Diesel, or CNG
- **Seller_Type**: Dealer or Individual
- **Transmission**: Manual or Automatic
- **Owner**: first, second or third owner


### 1. Import required libraries

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.model_selection import KFold,cross_validate
from sklearn.metrics import r2_score
import warnings
warnings.filterwarnings(action='ignore')

In [None]:
#from google.colab import drive
#drive.mount('/content/drive')

### 2. Load the dataset

Loading dataset and displaying top 5 rows of it.

In [29]:
df = pd.read_csv('cardata.csv')
df.head()

| | Car_Name | Year | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ritz | 2014 | 3.35 | 5.59 | 27000 | Petrol | Dealer | Manual | 0 |
| 1 | sx4 | 2013 | 4.75 | 9.54 | 43000 | Diesel | Dealer | Manual | 0 |
| 2 | ciaz | 2017 | 7.25 | 9.85 | 6900 | Petrol | Dealer | Manual | 0 |
| 3 | wagon r | 2011 | 2.85 | 4.15 | 5200 | Petrol | Dealer | Manual | 0 |
| 4 | swift | 2014 | 4.6 | 6.87 | 42450 | Diesel | Dealer | Manual | 0 |


### 3. Check the shape and basic information of the dataset.

In [3]:
#checking the shape of the data
df.shape

(301, 9)

The data has 301 rows and 9 columns.

In [4]:
#checking on the basic information
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 301 entries, 0 to 300
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Car_Name       301 non-null    object 
 1   Year           301 non-null    int64  
 2   Selling_Price  301 non-null    float64
 3   Present_Price  301 non-null    float64
 4   Kms_Driven     301 non-null    int64  
 5   Fuel_Type      301 non-null    object 
 6   Seller_Type    301 non-null    object 
 7   Transmission   301 non-null    object 
 8   Owner          301 non-null    int64  
dtypes: float64(2), int64(3), object(4)
memory usage: 21.3+ KB


There are no null values in any column.

The data has both numerical and categorical data types.



### 4. Check for duplicate records in the dataset and drop them if present

In [5]:
#Checking on the duplicated values
df.duplicated().sum()

2

There are 2 duplicated rows in the data.

In [6]:
#dropping duplicated values
df.drop_duplicates(inplace=True)

In [None]:
df.duplicated().sum()

0

**No Duplicated values present now**

### 5. Drop the columns that are redundant for the analysis.

In [7]:
## dropping the column "Car_Name" as it is not used as a feature for the prediction.
df.drop('Car_Name',axis=1,inplace=True)

### 6. Extract a new feature called 'age_of_car' from the feature 'Year' and drop the feature 'Year'

In [8]:
## Subtracting the year of purchase from the reference year (2024) to get the age of the car
df['age_of_car'] = 2024-df['Year']

In [9]:
#Dropping the year column
df.drop('Year',axis=1,inplace=True)

In [10]:
df.head()

| | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner | age_of_car |
|---|---|---|---|---|---|---|---|---|
| 0 | 3.35 | 5.59 | 27000 | Petrol | Dealer | Manual | 0 | 10 |
| 1 | 4.75 | 9.54 | 43000 | Diesel | Dealer | Manual | 0 | 11 |
| 2 | 7.25 | 9.85 | 6900 | Petrol | Dealer | Manual | 0 | 7 |
| 3 | 2.85 | 4.15 | 5200 | Petrol | Dealer | Manual | 0 | 13 |
| 4 | 4.6 | 6.87 | 42450 | Diesel | Dealer | Manual | 0 | 10 |
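The age feature depends on the reference year, which is hard-coded as 2024 in the cell above. The following is a minimal sketch of wrapping that step in a helper so the same reference year can be reused when preparing inputs later. The names `REFERENCE_YEAR` and `add_car_age` are hypothetical and not part of the original notebook, and the small frame is illustrative only.

```python
import pandas as pd

# Hypothetical constant: the year the notebook subtracts from (2024 above).
REFERENCE_YEAR = 2024


def add_car_age(frame, reference_year=REFERENCE_YEAR):
    """Return a copy with 'age_of_car' added and 'Year' dropped."""
    out = frame.copy()
    out['age_of_car'] = reference_year - out['Year']
    return out.drop(columns='Year')


# Tiny illustrative frame using two Year values from the rows shown earlier.
sample = pd.DataFrame({'Year': [2014, 2017], 'Kms_Driven': [27000, 6900]})
aged = add_car_age(sample)
print(aged)
```

Keeping the reference year in one place matters because the model learns from ages computed against that year; ages entered in the web form should be computed the same way.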


### 7. Encode the categorical columns

In [11]:
#will check the unique values on each categorical column
df['Fuel_Type'].unique()

array(['Petrol', 'Diesel', 'CNG'], dtype=object)

In [12]:
df['Seller_Type'].unique()

array(['Dealer', 'Individual'], dtype=object)

In [13]:
df['Transmission'].unique()

array(['Manual', 'Automatic'], dtype=object)

In [14]:
#Manual encoding
df['Fuel_Type'] = df['Fuel_Type'].replace({'Petrol':0, 'Diesel':1, 'CNG':2})
df['Seller_Type'] = df['Seller_Type'].replace({'Dealer':0, 'Individual':1})
df['Transmission'] = df['Transmission'].replace({'Manual':0, 'Automatic':1})

**Checking the variables after manually encoding them**

In [15]:
df['Fuel_Type'].unique()

array([0, 1, 2])

In [16]:
df['Seller_Type'].unique()

array([0, 1])

In [17]:
df['Transmission'].unique()

array([0, 1])

In [18]:
#checking the data first five rows again
df.head()

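| | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner | age_of_car |
|---|---|---|---|---|---|---|---|---|
| 0 | 3.35 | 5.59 | 27000 | 0 | 0 | 0 | 0 | 10 |
| 1 | 4.75 | 9.54 | 43000 | 1 | 0 | 0 | 0 | 11 |
| 2 | 7.25 | 9.85 | 6900 | 0 | 0 | 0 | 0 | 7 |
| 3 | 2.85 | 4.15 | 5200 | 0 | 0 | 0 | 0 | 13 |
| 4 | 4.6 | 6.87 | 42450 | 1 | 0 | 0 | 0 | 10 |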


In [19]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 299 entries, 0 to 300
Data columns (total 8 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Selling_Price  299 non-null    float64
 1   Present_Price  299 non-null    float64
 2   Kms_Driven     299 non-null    int64  
 3   Fuel_Type      299 non-null    int64  
 4   Seller_Type    299 non-null    int64  
 5   Transmission   299 non-null    int64  
 6   Owner          299 non-null    int64  
 7   age_of_car     299 non-null    int64  
dtypes: float64(2), int64(6)
memory usage: 21.0 KB
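The manual encoding in this section can also be written with named dictionaries and `Series.map`. This is a sketch under assumptions, not the notebook's method: the constant names and the `encode_categories` helper are hypothetical, while the codes are the ones used above. `map` returns NaN for any value missing from the dictionary, which makes unexpected categories easy to detect.

```python
import pandas as pd

# Same codes as the manual encoding above; the constant names are hypothetical.
FUEL_CODES = {'Petrol': 0, 'Diesel': 1, 'CNG': 2}
SELLER_CODES = {'Dealer': 0, 'Individual': 1}
TRANSMISSION_CODES = {'Manual': 0, 'Automatic': 1}


def encode_categories(frame):
    """Return a copy with the three categorical columns mapped to integers."""
    out = frame.copy()
    for column, codes in [('Fuel_Type', FUEL_CODES),
                          ('Seller_Type', SELLER_CODES),
                          ('Transmission', TRANSMISSION_CODES)]:
        mapped = out[column].map(codes)
        # map() yields NaN for values that are missing from the dictionary.
        if mapped.isna().any():
            unknown = out.loc[mapped.isna(), column].unique().tolist()
            raise ValueError(f'Unknown values in {column}: {unknown}')
        out[column] = mapped.astype(int)
    return out


# Illustrative frame covering every category listed above.
sample = pd.DataFrame({
    'Fuel_Type': ['Petrol', 'Diesel', 'CNG'],
    'Seller_Type': ['Dealer', 'Individual', 'Dealer'],
    'Transmission': ['Manual', 'Manual', 'Automatic'],
})
encoded = encode_categories(sample)
print(encoded)
```

The same dictionaries document which integer the HTML form has to send for each option in step 12.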


### 8. Separate the target and independent features.

In [20]:
#separating the target and independent variables
X = df.drop('Selling_Price', axis=1)
y = df['Selling_Price']

### 9. Split the data into train and test.

In [21]:
#splitting the data into train and test sets in a 70:30 ratio
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=0)

print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)

(209, 7) (90, 7)
(209,) (90,)


### 10. Build a Random forest Regressor model and check the r2-score for train and test.

In [22]:
## building a random forest regressor model
rf = RandomForestRegressor()
rf.fit(X_train,y_train)

In [23]:
## check the r2-score to see how our model is performing

y_train_pred = rf.predict(X_train)
y_test_pred = rf.predict(X_test)

r2_train = r2_score(y_train,y_train_pred)
r2_test = r2_score(y_test,y_test_pred)

print('r2-score train:',r2_train)
print('r2-score test',r2_test)

r2-score train: 0.9777644878789427
r2-score test 0.8959823806041962
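The train score (about 0.98) is higher than the test score (about 0.90), and with only 299 rows the test score depends on which rows land in the test split. `KFold` and `cross_validate` are imported in step 1 but not used; the sketch below shows one way they could give a more stable estimate. It runs on synthetic data so it is self-contained. The fold count, seeds, and `n_estimators` value are illustrative assumptions, not settings from the notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_validate

# Synthetic stand-in for X and y so the sketch runs on its own;
# in the notebook, pass the real X and y from step 8 instead.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(120, 7))
y_demo = X_demo[:, 0] * 3.0 + X_demo[:, 1] + rng.normal(scale=0.1, size=120)

# 5 folds and the fixed seeds are illustrative choices.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(
    RandomForestRegressor(n_estimators=50, random_state=0),
    X_demo, y_demo, cv=cv, scoring='r2', return_train_score=True,
)

print('mean train r2:', scores['train_score'].mean())
print('mean test r2:', scores['test_score'].mean())
```

A large, consistent gap between the mean train and mean test scores across folds would suggest overfitting rather than an unlucky split.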


### 11. Save the model as a pickle file with the .pkl extension

In [32]:
import pickle
# Saving model to disk; the with-block closes the file once the dump is done
with open('model.pkl', 'wb') as f:
    pickle.dump(rf, f)
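Before moving the pickle file into the Flask project, a quick round-trip check can confirm that the saved model loads and predicts the same values. The sketch below uses a small synthetic model as a stand-in for `rf` and writes to a temporary directory; the data and variable names are illustrative assumptions.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Small synthetic model standing in for `rf`; the data is illustrative only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 7))
y_demo = X_demo.sum(axis=1)
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)

# Write to a temporary directory so a real model.pkl is not overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model.pkl')
    with open(path, 'wb') as f:
        pickle.dump(model, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions.
same = np.allclose(model.predict(X_demo), restored.predict(X_demo))
print('round trip ok:', same)
```

Pickled scikit-learn models are generally expected to be loaded with the same scikit-learn version that saved them, so it helps to install matching versions in the virtual environment used for the app.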


### 12. Create a new folder/project in Visual Studio/PyCharm that contains the "model.pkl" file. *Make sure you are using a virtual environment and install the required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the templates folder and copy the following code.

In [None]:
<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name= "viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
  </head>

  <body>
  <div class="hero-image">
  <div class="hero-text">
      <h1 style="font-size:50px">Used car Price Predictor</h1>
      <br><br><h3>{{prediction_text}}</h3>
  </div>
  </div>

   <style>
       body, html{
           height: 100%;
           margin: 0;
           font-family: Arial, Helvetica, sans-serif;
       }

       .hero-image{
           background-image: linear-gradient(rgba(0,0,0,0.5), rgba(0,0,0,0.5)), url("/static/image.jpg");
           height: 50%;
           background-position: bottom;
           background-repeat: no-repeat;
           background-size: cover;
           position: relative;
       }

       .hero-text{
           text-align: center;
           position: absolute;
           top: 50%;
           left: 50%;
           transform: translate(-50%, -50%);
           color: aqua;
       }
   </style>

  <div style="...">
      <form action="{{ url_for('predict') }}" method="post">
          <h2> Enter Car details: </h2>
          <h3>Age of the Car (in years)</h3><br><input id="first" name="Age_of_Car" type="number" required="required">
          <h3>Present Showroom price (in lakhs)</h3><br><input id="second" name="Present_Price" required="required">
          <h3>Kilometers driven</h3><br><input id="third" name="kms_Driven" required="required">
          <h3>Owner type (0/1/3)</h3><br><input id="fourth" name="Owner" required="required">
          <h3>Fuel type</h3><br><select name="FuelType" id="fuel" required="required">
              <option value="0">Petrol</option>
              <option value="1">Diesel</option>
              <option value="2">CNG</option>
          </select>
          <h3>Seller Type</h3><br><select name="Seller_Type" id="resea" required="required">
              <option value="0">Dealer</option>
              <option value="1">Individual</option>
          </select>
          <h3>Transmission type</h3><br><select name="Transmission" id="research" required="required">
              <option value="0">Manual Car</option>
              <option value="1">Automatic Car</option>
          </select>
          <br><br><button id="sub" type="submit">predict selling price</button>
          <br>

      </form>
  </div>
  <style>
      body{
          background-color: rgb(101, 10, 20);
          text-align: center;
          padding: 0px;
          font-family: Helvetica;
      }
      #research {
          font-size: 18px;
          width: 200px;
          height: 23px;
          text-align: center;
      }
      #second{
          border-radius: 14px;
          height: 25px;
          font-size: 20px;
          text-align: center;
      }
      #third{
          border-radius: 14px;
          height: 25px;
          font-size:20px;
          text-align: center;
      }
      #fourth{
          border-radius: 14px;
          height: 25px;
          font-size: 20px;
          text-align: center;
      }
      </style>
  </body>
</html>


### b) Create app.py file and write the predict function

In [31]:
from flask import Flask, render_template, request
import pickle

app = Flask(__name__)

# Load the trained model saved in step 11
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)


@app.route('/', methods=['GET'])
def Home():
    # The template file is named index.html (file names are case-sensitive on Linux)
    return render_template('index.html')


@app.route('/predict', methods=['POST'])
def predict():
    # Form values arrive as strings, so convert each one to a number
    Present_Price = float(request.form['Present_Price'])
    kms_Driven = int(request.form['kms_Driven'])
    Owner = int(request.form['Owner'])
    Fuel_Type = int(request.form['FuelType'])
    Age_of_Car = int(request.form['Age_of_Car'])
    Seller_Type = int(request.form['Seller_Type'])
    Transmission = int(request.form['Transmission'])

    # Features must follow the column order of X used for training in step 8
    Prediction = model.predict([[Present_Price, kms_Driven, Fuel_Type, Seller_Type, Transmission, Owner, Age_of_Car]])
    output = round(Prediction[0], 2)
    return render_template('index.html', prediction_text="You can sell your car at {} lakhs".format(output))


if __name__ == "__main__":
    app.run(debug=True)
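The model was trained on the columns Present_Price, Kms_Driven, Fuel_Type, Seller_Type, Transmission, Owner, and age_of_car, in that order (see the `df.info()` output in step 7, minus the target). Values passed to `model.predict` have to follow the same order and be numeric. The sketch below shows one possible helper for that conversion. `FORM_FIELDS` and `build_feature_row` are hypothetical names; the form keys are the ones read by the `predict` function above.

```python
# (form field name, converter) pairs listed in the training column order:
# Present_Price, Kms_Driven, Fuel_Type, Seller_Type, Transmission, Owner, age_of_car
FORM_FIELDS = [
    ('Present_Price', float),
    ('kms_Driven', int),
    ('FuelType', int),
    ('Seller_Type', int),
    ('Transmission', int),
    ('Owner', int),
    ('Age_of_Car', int),
]


def build_feature_row(form):
    """Convert a dict-like form into one numeric row in training order."""
    row = []
    for name, convert in FORM_FIELDS:
        if name not in form:
            raise KeyError(f'Missing form field: {name}')
        row.append(convert(form[name]))
    return row


# Form values arrive as strings; these mirror the first encoded row in step 7.
example_form = {
    'Present_Price': '5.59', 'kms_Driven': '27000', 'Owner': '0',
    'FuelType': '0', 'Age_of_Car': '10', 'Seller_Type': '0', 'Transmission': '0',
}
print(build_feature_row(example_form))
```

Inside the route, the row could then be passed as `model.predict([build_feature_row(request.form)])`, assuming the form sends every field listed in `FORM_FIELDS`.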

### 13. Run the app.py file, which renders the index.html page; then enter the input values and get the prediction.

In [None]:
http://127.0.0.1:5000
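Besides opening the URL in a browser, a POST route can be exercised in-process with Flask's test client. The sketch below is self-contained and makes several assumptions: `StubModel` is a hypothetical stand-in for the pickled model, the route reads only one field, and `render_template_string` replaces the `index.html` template so no templates folder is needed.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)


class StubModel:
    """Hypothetical stand-in for the pickled model; always predicts 3.5."""

    def predict(self, rows):
        return [3.5 for _ in rows]


model = StubModel()


@app.route('/predict', methods=['POST'])
def predict():
    present_price = float(request.form['Present_Price'])
    prediction = model.predict([[present_price]])
    # render_template_string avoids needing a templates folder in this sketch.
    return render_template_string(
        '<h3>{{ prediction_text }}</h3>',
        prediction_text='You can sell your car at {} lakhs'.format(round(prediction[0], 2)),
    )


# The test client sends the POST in-process, so no server has to be running.
client = app.test_client()
response = client.post('/predict', data={'Present_Price': '5.59'})
print(response.status_code)
print(response.get_data(as_text=True))
```

The same pattern should work against the real `app.py` by importing its `app` object and posting all seven form fields.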

### Happy Learning :)