# <font color=darkblue> Machine Learning model deployment with Flask framework on Heroku</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. Build a machine learning regression model that predicts the selling price of used cars from input features such as fuel type, kilometers driven, and transmission type.
2. Deploy the machine learning model with the Flask framework on Heroku.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com.
- **Car_Name**: Name of the car
- **Year**: Year of purchase
- **Selling_Price (target)**: Selling price of the car in lakhs
- **Present_Price**: Present price of the car in lakhs
- **Kms_Driven**: Kilometers driven
- **Fuel_Type**: Petrol, Diesel, or CNG
- **Seller_Type**: Dealer or Individual
- **Transmission**: Manual or Automatic
- **Owner**: First, second, or third owner


### 1. Import required libraries

In [28]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.linear_model import LinearRegression,LogisticRegression
from sklearn.metrics import accuracy_score,confusion_matrix
from sklearn.preprocessing import LabelEncoder
import warnings
warnings.filterwarnings('ignore')

from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import StackingClassifier
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

from sklearn.impute import KNNImputer
from sklearn.metrics import r2_score

### 2. Load the dataset

In [2]:
df = pd.read_csv('C:/Users/Debaditya Chatterjee/Desktop/Python/for print out/lab 13.11.2022/car data.csv')

In [3]:
df.head()

| | Car_Name | Year | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ritz | 2014 | 3.35 | 5.59 | 27000 | Petrol | Dealer | Manual | 0 |
| 1 | sx4 | 2013 | 4.75 | 9.54 | 43000 | Diesel | Dealer | Manual | 0 |
| 2 | ciaz | 2017 | 7.25 | 9.85 | 6900 | Petrol | Dealer | Manual | 0 |
| 3 | wagon r | 2011 | 2.85 | 4.15 | 5200 | Petrol | Dealer | Manual | 0 |
| 4 | swift | 2014 | 4.6 | 6.87 | 42450 | Diesel | Dealer | Manual | 0 |


### 3. Check the shape and basic information of the dataset.

In [4]:
df.shape

(301, 9)

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 301 entries, 0 to 300
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Car_Name       301 non-null    object 
 1   Year           301 non-null    int64  
 2   Selling_Price  301 non-null    float64
 3   Present_Price  301 non-null    float64
 4   Kms_Driven     301 non-null    int64  
 5   Fuel_Type      301 non-null    object 
 6   Seller_Type    301 non-null    object 
 7   Transmission   301 non-null    object 
 8   Owner          301 non-null    int64  
dtypes: float64(2), int64(3), object(4)
memory usage: 21.3+ KB


### 4. Check for duplicate records in the dataset and drop any that are present

In [6]:
len(df[df.duplicated()])

2

In [7]:
df.drop_duplicates(keep='first',inplace=True)

In [8]:
len(df[df.duplicated()])

0

### 5. Drop the columns that are redundant for the analysis.

In [9]:
df = df.drop(columns=['Car_Name'],axis=1)
df.head()

| | Year | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner |
|---|---|---|---|---|---|---|---|---|
| 0 | 2014 | 3.35 | 5.59 | 27000 | Petrol | Dealer | Manual | 0 |
| 1 | 2013 | 4.75 | 9.54 | 43000 | Diesel | Dealer | Manual | 0 |
| 2 | 2017 | 7.25 | 9.85 | 6900 | Petrol | Dealer | Manual | 0 |
| 3 | 2011 | 2.85 | 4.15 | 5200 | Petrol | Dealer | Manual | 0 |
| 4 | 2014 | 4.6 | 6.87 | 42450 | Diesel | Dealer | Manual | 0 |


### 6. Extract a new feature called 'age_of_the_car' from the feature 'Year' and drop the feature 'Year'

In [None]:
#df['age_of_the_car']=dt.datetime.today().year-df['Year']

In [12]:
df['age_of_the_car']= 2022 -df['Year']
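
The hard-coded 2022 keeps the feature reproducible, whereas the commented-out variant above would need `import datetime as dt` and would give different ages depending on when the notebook is run. A minimal sketch of both options, assuming a hypothetical helper `add_car_age` and a small illustrative frame rather than the real dataset:

```python
import datetime as dt

import pandas as pd


def add_car_age(frame, reference_year=None):
    """Return a copy of `frame` with an 'age_of_the_car' column.

    `add_car_age` is a hypothetical helper used only for illustration.
    When `reference_year` is None, the current calendar year is used.
    """
    if reference_year is None:
        reference_year = dt.date.today().year
    out = frame.copy()
    out['age_of_the_car'] = reference_year - out['Year']
    return out


# Illustrative rows only (purchase years from the first rows shown earlier)
sample = pd.DataFrame({'Year': [2014, 2013, 2017]})
aged = add_car_age(sample, reference_year=2022)
print(aged['age_of_the_car'].tolist())  # [8, 9, 5]
```

Passing an explicit `reference_year` is the safer choice when the trained model will be reused later, since the age fed to the model at prediction time should be computed the same way.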


In [14]:
df = df.drop(columns=['Year'],axis=1)
df.head()

| | Selling_Price | Present_Price | Kms_Driven | Fuel_Type | Seller_Type | Transmission | Owner | age_of_the_car |
|---|---|---|---|---|---|---|---|---|
| 0 | 3.35 | 5.59 | 27000 | Petrol | Dealer | Manual | 0 | 8 |
| 1 | 4.75 | 9.54 | 43000 | Diesel | Dealer | Manual | 0 | 9 |
| 2 | 7.25 | 9.85 | 6900 | Petrol | Dealer | Manual | 0 | 5 |
| 3 | 2.85 | 4.15 | 5200 | Petrol | Dealer | Manual | 0 | 11 |
| 4 | 4.6 | 6.87 | 42450 | Diesel | Dealer | Manual | 0 | 8 |


### 7. Encode the categorical columns

In [16]:
df.describe(include ='O')

| | Fuel_Type | Seller_Type | Transmission |
|---|---|---|---|
| count | 299 | 299 | 299 |
| unique | 3 | 2 | 2 |
| top | Petrol | Dealer | Manual |
| freq | 239 | 193 | 260 |


In [None]:
# for col in df.select_dtypes('object').columns:
#     le = LabelEncoder()
#     df[col] = le.fit_transform(df[col])


In [18]:
object_type_variables = [i for i in df[['Fuel_Type','Seller_Type','Transmission']] if df.dtypes[i] == object]
object_type_variables 


le = LabelEncoder()

def encoder(df):
    for i in object_type_variables:
        q = le.fit_transform(df[i].astype(str))  
        df[i] = q                               
        df[i] = df[i].astype(int)
encoder(df)
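
`LabelEncoder` assigns integer codes in sorted (alphabetical) order of the labels, not in order of appearance, so any form or client that submits encoded values has to use the same codes. The sketch below prints the resulting mapping; the category lists are taken from the dataset description above, and the `mappings` dictionary is an illustrative name, not part of the notebook.

```python
from sklearn.preprocessing import LabelEncoder

# Category labels as listed in the dataset description
categories = {
    'Fuel_Type': ['Petrol', 'Diesel', 'CNG'],
    'Seller_Type': ['Dealer', 'Individual'],
    'Transmission': ['Manual', 'Automatic'],
}

mappings = {}
for column, labels in categories.items():
    fitted = LabelEncoder().fit(labels)
    # classes_ holds the labels in sorted order; the position is the code
    mappings[column] = {str(label): code for code, label in enumerate(fitted.classes_)}

for column, mapping in mappings.items():
    print(column, mapping)
# Fuel_Type {'CNG': 0, 'Diesel': 1, 'Petrol': 2}
# Seller_Type {'Dealer': 0, 'Individual': 1}
# Transmission {'Automatic': 0, 'Manual': 1}
```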

### 8. Separate the target and independent features.

In [19]:
df.columns

Index(['Selling_Price', 'Present_Price', 'Kms_Driven', 'Fuel_Type',
       'Seller_Type', 'Transmission', 'Owner', 'age_of_the_car'],
      dtype='object')

In [20]:
x = df.drop('Selling_Price',axis=1)
y = df['Selling_Price']

### 9. Split the data into train and test.

In [21]:
x_train, x_test,y_train, y_test =train_test_split(x,y,test_size=0.30,random_state=1)

In [49]:
x_train.shape

(209, 7)

### 10. Build a Random Forest Regressor model and check the R² score for train and test.

In [50]:
def fit_n_predict(model,x_train,x_test,y_train,y_test):
    
    # Fit the model with train data
    model.fit(x_train,y_train)
    
    # Making prediction on test data
    pred=model.predict(x_test)
    
    # Calculate the R² score on the test data
    accuracy=r2_score(y_test,pred)
    
    return accuracy
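
The function above returns only the test score, while the task asks for both train and test. One possible variant is sketched here; `fit_and_score` is a hypothetical name, and the data comes from `make_regression` as a synthetic stand-in, so the printed numbers say nothing about the car dataset.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def fit_and_score(model, x_train, x_test, y_train, y_test):
    """Fit `model` and return the R² score on both the train and test split."""
    model.fit(x_train, y_train)
    return {
        'train': r2_score(y_train, model.predict(x_train)),
        'test': r2_score(y_test, model.predict(x_test)),
    }


# Synthetic stand-in data (an assumption, not the car dataset)
X_demo, y_demo = make_regression(n_samples=200, n_features=7, noise=10.0, random_state=1)
splits = train_test_split(X_demo, y_demo, test_size=0.30, random_state=1)

scores = fit_and_score(RandomForestRegressor(random_state=1), *splits)
print(scores)
```

A large gap between the two scores would suggest overfitting, which is why reporting both is useful.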

In [51]:
# Testing the above function
from sklearn.ensemble import RandomForestRegressor
rf=RandomForestRegressor()

In [52]:
rs = pd.DataFrame()

In [53]:
result_ = fit_n_predict(rf,x_train,x_test,y_train,y_test)

In [54]:
result_

0.9213397009669374

In [55]:
rs['random_forest']= pd.Series(result_)

In [56]:
rs

| | random_forest |
|---|---|
| 0 | 0.92134 |
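
For reference, the R² score reported here is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the true values. The sketch below computes it by hand with NumPy; `r2_manual` is an illustrative name and the predicted values are made up for the example.

```python
import numpy as np


def r2_manual(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot, computed directly from the definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot


actual = [3.35, 4.75, 7.25, 2.85, 4.60]       # selling prices from df.head()
predicted = [3.50, 4.50, 7.00, 3.00, 4.80]    # made-up predictions
score = r2_manual(actual, predicted)
print(round(score, 4))
```

A perfect prediction gives 1.0, and always predicting the mean of the true values gives 0.0, so a test score near 0.92 means the model explains most of the variation in selling price on the held-out rows.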


### 11. Create a pickle file with the extension .pkl

In [58]:
import pickle
# Use a context manager so the file is closed after the model is written
with open('model.pkl','wb') as f:
    pickle.dump(rf,f)
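
Before copying a pickled model into the Flask project, it can be worth confirming that it reloads and predicts identically. The sketch below does this round trip under stated assumptions: the data is synthetic, and `demo_model.pkl` is a throwaway file in a temporary directory, not the notebook's `model.pkl`.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with 7 features, mirroring the shape of x_train (assumption)
rng = np.random.default_rng(0)
X_demo = rng.random((40, 7))
y_demo = X_demo @ np.arange(1, 8)

demo_model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo_model.pkl')
    with open(path, 'wb') as f:
        pickle.dump(demo_model, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions
same_predictions = bool(np.allclose(demo_model.predict(X_demo), restored.predict(X_demo)))
print(same_predictions)  # True
```

Pickled scikit-learn models are generally only safe to load with the same scikit-learn version that wrote them, so the version installed in the deployment environment should match the one used for training.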

### 12. Create a new folder/project in Visual Studio/PyCharm that contains the "model.pkl" file. *Make sure you are using a virtual environment and install the required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the templates folder and copy the following code.

In [None]:
<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <meta http-equiv="X-UA-Compatible" content="ie=edge" />
  <link rel="stylesheet" href="style.css" />
  <title>Predict the selling price of cars</title>
</head>

<body>
  <div class="hero-image">
    <div class="hero-text">
      <img src="../static/logo.jpeg" alt="Car" class="logo" width="400" height="200" />
      <h1 style="font-size: 50px;">Used car price predictor</h1>
      <br> <br>
      <h3>{{ prediction_text }}</h3>
    </div>
  </div>
  <div style="color: rgb(0,0,0)">
    <form action="{{ url_for('predict')}}" method="POST">
      <h2>Enter car details : </h2>
      <h3>Age of the car (in Years)</h3>
      <input id="first" name="Age_of_the_car" type="number">
      <h3>Present showroom price (In lakhs)</h3> <br> <input id="second" name="Present_Price" required="required">
      <h3>Kilometers Driven</h3> <br> <input id="third" name="Kms_Driven" required="required">
      <h3>Owner type (0/1/2)</h3> <br> <input id="fourth" name="Owner" required="required">
      <!-- Option values follow the LabelEncoder codes used in training (labels sorted alphabetically) -->
      <h3>Fuel Type</h3> <br> <select name="Fuel_Type" id="fuel" required="required">
        <option value="2">Petrol</option>
        <option value="1">Diesel</option>
        <option value="0">CNG</option>
      </select>
      <h3>Seller Type</h3> <br> <select name="Seller_Type" id="resea" required="required">
        <option value="0">Dealer</option>
        <option value="1">Individual</option>
      </select>
      <h3>Transmission Type</h3> <br> <select name="Transmission" id="research" required="required">
        <option value="1">Manual car</option>
        <option value="0">Automatic car</option>
      </select>
      <br> <br> <button id="sub" type="submit"> Predict Selling Price</button>
    </form>
  </div>
  <style>
    body,
    html {
      height: 100%;
      margin: 0;
      font-family: Verdana, Geneva, Tahoma, sans-serif;
    }

    .hero-image {
      background-image: linear-gradient(rgba(0, 0, 0, 0.5), rgba(0, 0, 0, 0.5));
      height: 50%;
      background-position: bottom;
      background-repeat: no-repeat;
      background-size: cover;
      position: relative;
    }

    .hero-text {
      text-align: center;
      position: absolute;
      top: 50%;
      left: 50%;
      transform: translate(-50%, -50%);
      color: rgb(53, 215, 183);
    }

    body {
      background-color: 101, 10, 20;
      text-align: center;
      padding: 0px;
      font-family: Verdana, Geneva, Tahoma, sans-serif;
    }

    #research {
      font-size: 18px;
      width: 200px;
      height: 23px;
      top: 23px;
    }

    #box {
      border-radius: 60px;
      border-color: 45px;
      border-style: solid;
      text-align: center;
      background-color: white;
      font-size: medium;
      position: absolute;
      width: 700px;
      bottom: 9%;
      height: 850px;
      right: 30%;
      padding: 0px;
      margin: 0px;
      font-size: 14px;
    }

    #fuel {
      width: 83px;
      height: 43px;
      text-align: center;
      border-radius: 14px;
      font-size: 20px;
    }

    #fuel:hover {
      background-color: aqua;
    }

    #research {
      width: 150px;
      height: 43px;
      text-align: center;
      border-radius: 14px;
      font-size: 18px;
    }

    #research:hover {
      background-color: brown;
    }

    #resea {
      width: 99px;
      height: 43px;
      text-align: center;
      border-radius: 14px;
      font-size: 18px;
    }

    #resea:hover {
      background-color: blanchedalmond;
    }

    #sub {
      background-color: purple;
      font-family: Verdana, Geneva, Tahoma, sans-serif;
      font-weight: bold;
      width: 180px;
      color: aquamarine;
      border-radius: 20px;
      height: 60px;
      font-size: 18px;
      text-align: center;

    }

    #sub:hover {
      background-color: greenyellow;
    }

    #first {
      border-radius: 14px;
      height: 25px;
      font-size: 20px;
      text-align: center;
    }

    #second {
      border-radius: 14px;
      height: 25px;
      font-size: 20px;
      text-align: center;
    }

    #third {
      border-radius: 14px;
      height: 25px;
      font-size: 20px;
      text-align: center;
    }

    #fourth {
      border-radius: 14px;
      height: 25px;
      font-size: 20px;
      text-align: center;
    }
  </style>
</body>

</html>

### b) Create app.py file and write the predict function

In [None]:
# importing necessary libraries and functions 
from flask import Flask,render_template,request,jsonify
import pickle
import numpy as np
import sklearn

#Initialize the flask App
app=Flask(__name__)
# loading the trained model  
with open('model.pkl','rb') as f:
    model=pickle.load(f)

@app.route('/') #,methods=['GET']
def Home():
    return render_template('index.html')

@app.route('/predict',methods=['POST']) 
def predict():
    if request.method == 'POST':
        Present_Price=float(request.form['Present_Price'])
        Kms_Driven=int(request.form['Kms_Driven'])
        Owner=int(request.form['Owner'])
        Fuel_Type=int(request.form['Fuel_Type'])
        Age_of_the_car=int(request.form['Age_of_the_car'])
        Seller_Type=int(request.form['Seller_Type'])
        Transmission=int(request.form['Transmission'])

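        # Features must be passed in the same column order the model was trained on:
        # Present_Price, Kms_Driven, Fuel_Type, Seller_Type, Transmission, Owner, age_of_the_car
        prediction=model.predict([np.array([Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner,Age_of_the_car])])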
        output=round(prediction[0],2)
        return render_template('index.html',prediction_text="You can sell your car at {} (In lakhs)".format(output))

if __name__=="__main__":
    app.run(debug=True)
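
Because `model.predict` receives a plain array, values are matched to features by position, so they have to follow the training column order (`Present_Price`, `Kms_Driven`, `Fuel_Type`, `Seller_Type`, `Transmission`, `Owner`, `age_of_the_car`). One way to make that ordering explicit is sketched below; `FEATURE_ORDER`, `FORM_TO_FEATURE`, and `build_feature_row` are hypothetical names, and the form values are examples only.

```python
# Column order of x as produced by df.drop('Selling_Price', axis=1) in the notebook
FEATURE_ORDER = [
    'Present_Price', 'Kms_Driven', 'Fuel_Type', 'Seller_Type',
    'Transmission', 'Owner', 'age_of_the_car',
]

# The HTML form uses 'Age_of_the_car' while the training column is 'age_of_the_car'
FORM_TO_FEATURE = {'Age_of_the_car': 'age_of_the_car'}


def build_feature_row(form):
    """Convert submitted form fields into a list ordered like the training columns."""
    values = {FORM_TO_FEATURE.get(name, name): float(raw) for name, raw in form.items()}
    return [values[name] for name in FEATURE_ORDER]


# Example submission (strings, as they would arrive in request.form)
example_form = {
    'Age_of_the_car': '8', 'Present_Price': '5.59', 'Kms_Driven': '27000',
    'Owner': '0', 'Fuel_Type': '2', 'Seller_Type': '0', 'Transmission': '1',
}
row = build_feature_row(example_form)
print(row)  # [5.59, 27000.0, 2.0, 0.0, 1.0, 0.0, 8.0]
```

With a helper like this, the route could call `model.predict([build_feature_row(request.form)])`, and a missing field would raise a `KeyError` instead of silently shifting values into the wrong column.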

### 13. Deploy your app on Heroku. (write commands for deployment)

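Before deploying, the project folder needs a `requirements.txt` listing the packages the app imports and a `Procfile` containing `web: gunicorn app:app` (gunicorn must also be listed in `requirements.txt`). Then deploy the changes:

heroku login

git init

git add .

git commit -am "Additional changes"

heroku git:remote -a carpredictionmodeldc

git branch -M main

git push heroku main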

### 14. Paste the URL of the Heroku application below. When submitting the solution, include this notebook along with the source code.

https://git.heroku.com/carpredictionmodeldc.git (the app could not be uploaded because of a version mismatch)

### Happy Learning :)