# <font color=darkblue> Machine Learning model deployment with Flask framework on Heroku</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. Build a machine learning regression model that predicts the selling price of used cars from input features such as fuel type, kilometres driven, and transmission type.
2. Deploy the model with the Flask framework on Heroku.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com
- **Car_Name**: Name of the car
- **Year**: Year of Purchase
- **Selling_Price (target)**: Selling price of the car in lakhs
- **Present_Price**: Present price of the car in lakhs
- **Kms_Driven**: Kilometres driven
- **Fuel_Type**: Petrol, Diesel, or CNG
- **Seller_Type**: Dealer or Individual
- **Transmission**: Manual or Automatic
- **Owner**: first, second or third owner


### 1. Import required libraries

In [23]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score  # regression metric used in step 10

import warnings
warnings.filterwarnings('ignore')

### 2. Load the dataset

In [2]:
df = pd.read_csv('/Users/vasu/Desktop/PDS_Lab_5/download.csv')
df.sample(5)

Unnamed: 0,Car_Name,Year,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner
259,amaze,2014,3.9,7.0,36054,Petrol,Dealer,Manual,0
64,fortuner,2017,33.0,36.23,6000,Diesel,Dealer,Automatic,0
69,corolla altis,2016,14.25,20.91,12000,Petrol,Dealer,Manual,0
62,fortuner,2014,18.75,35.96,78000,Diesel,Dealer,Automatic,0
92,innova,2005,3.51,13.7,75000,Petrol,Dealer,Manual,0


### 3. Check the shape and basic information of the dataset.

In [3]:
df.shape

(301, 9)

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 301 entries, 0 to 300
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Car_Name       301 non-null    object 
 1   Year           301 non-null    int64  
 2   Selling_Price  301 non-null    float64
 3   Present_Price  301 non-null    float64
 4   Kms_Driven     301 non-null    int64  
 5   Fuel_Type      301 non-null    object 
 6   Seller_Type    301 non-null    object 
 7   Transmission   301 non-null    object 
 8   Owner          301 non-null    int64  
dtypes: float64(2), int64(3), object(4)
memory usage: 21.3+ KB


### 4. Check for duplicate records in the dataset and drop them if present

In [8]:
df.drop_duplicates(inplace=True)

In [9]:
len(df[df.duplicated()])

0

### 5. Drop the columns that are redundant for the analysis

In [10]:
df.drop('Car_Name', axis=1, inplace=True)

### 6. Extract a new feature called 'age_of_the_car' from the feature 'Year' and drop the feature 'Year'

In [11]:
df['age_of_the_car']=2023 - df['Year']

In [12]:
df.drop('Year',axis=1, inplace=True)

In [13]:
df.head()

Unnamed: 0,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner,age_of_the_car
0,3.35,5.59,27000,Petrol,Dealer,Manual,0,9
1,4.75,9.54,43000,Diesel,Dealer,Manual,0,10
2,7.25,9.85,6900,Petrol,Dealer,Manual,0,6
3,2.85,4.15,5200,Petrol,Dealer,Manual,0,12
4,4.6,6.87,42450,Diesel,Dealer,Manual,0,9
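
The hard-coded 2023 ties the feature to the year this notebook was run, so any age entered at prediction time should be measured against the same year. A minimal sketch, assuming a hypothetical helper `car_age` and constant `REFERENCE_YEAR` (neither appears in the original notebook), keeps that reference year in one place:

```python
REFERENCE_YEAR = 2023  # assumption: the year used when the feature was built


def car_age(purchase_year, reference_year=REFERENCE_YEAR):
    """Return the car's age in years relative to the reference year."""
    if purchase_year > reference_year:
        raise ValueError("purchase year is after the reference year")
    return reference_year - purchase_year


# Purchase years implied by the ages 9, 10 and 6 in the first rows above
print([car_age(year) for year in (2014, 2013, 2017)])  # → [9, 10, 6]
```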


### 7. Encode the categorical columns

In [14]:
cat_features=df.select_dtypes(include=['object'])
cat_features.head(5)

Unnamed: 0,Fuel_Type,Seller_Type,Transmission
0,Petrol,Dealer,Manual
1,Diesel,Dealer,Manual
2,Petrol,Dealer,Manual
3,Petrol,Dealer,Manual
4,Diesel,Dealer,Manual


In [15]:
encoded_df = pd.get_dummies(df, columns=cat_features.columns, drop_first=True)
encoded_df

Unnamed: 0,Selling_Price,Present_Price,Kms_Driven,Owner,age_of_the_car,Fuel_Type_Diesel,Fuel_Type_Petrol,Seller_Type_Individual,Transmission_Manual
0,3.35,5.59,27000,0,9,0,1,0,1
1,4.75,9.54,43000,0,10,1,0,0,1
2,7.25,9.85,6900,0,6,0,1,0,1
3,2.85,4.15,5200,0,12,0,1,0,1
4,4.60,6.87,42450,0,9,1,0,0,1
...,...,...,...,...,...,...,...,...,...
296,9.50,11.60,33988,0,7,1,0,0,1
297,4.00,5.90,60000,0,8,0,1,0,1
298,3.35,11.00,87934,0,14,0,1,0,1
299,11.50,12.50,9000,0,6,1,0,0,1
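
With `drop_first=True`, the first category of each column (CNG, Dealer, Automatic) has no column of its own and is represented by zeros in the remaining dummies. The sketch below mirrors that encoding for a single car in plain Python. The helper `encode_record` and the list `FEATURE_ORDER` are hypothetical names introduced for illustration; the column order is copied from the output above.

```python
# Column order of the encoded frame without Selling_Price (from the output above)
FEATURE_ORDER = [
    "Present_Price", "Kms_Driven", "Owner", "age_of_the_car",
    "Fuel_Type_Diesel", "Fuel_Type_Petrol",
    "Seller_Type_Individual", "Transmission_Manual",
]


def encode_record(present_price, kms_driven, owner, age, fuel, seller, transmission):
    """Hypothetical helper: encode one car the way get_dummies(drop_first=True) did."""
    row = {
        "Present_Price": present_price,
        "Kms_Driven": kms_driven,
        "Owner": owner,
        "age_of_the_car": age,
        # CNG, Dealer and Automatic are the dropped baselines (all zeros)
        "Fuel_Type_Diesel": int(fuel == "Diesel"),
        "Fuel_Type_Petrol": int(fuel == "Petrol"),
        "Seller_Type_Individual": int(seller == "Individual"),
        "Transmission_Manual": int(transmission == "Manual"),
    }
    return [row[name] for name in FEATURE_ORDER]


# Row 0 of the table above
print(encode_record(5.59, 27000, 0, 9, "Petrol", "Dealer", "Manual"))
# → [5.59, 27000, 0, 9, 0, 1, 0, 1]
```

Any input passed to the trained model later has to follow this same column order and encoding.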


### 8. Separate the target and independent features.

In [19]:
X=encoded_df.drop('Selling_Price', axis=1)
Y=encoded_df['Selling_Price']

### 9. Split the data into train and test.

In [21]:
X_train,X_test,Y_train,Y_test = train_test_split (X,Y,test_size=0.3,random_state=0)
print (X_train.shape,X_test.shape)
print (Y_train.shape,Y_test.shape)

(209, 8) (90, 8)
(209,) (90,)
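
The shapes above add up to 299 rows, which is what remains of the original 301 after duplicates were dropped. A small sketch of the size arithmetic, assuming the test fraction is rounded up and the remainder goes to the training set (this matches the shapes printed above; `split_sizes` is a hypothetical helper, not part of scikit-learn):

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size, rounding the test part up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


print(split_sizes(299, 0.3))  # → (209, 90)
```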


### 10. Build a Random Forest regressor model and check the R2 score for train and test

In [25]:
rf = RandomForestRegressor()
rf.fit(X_train,Y_train)

In [26]:
Y_train_pred = rf.predict(X_train)
Y_test_pred = rf.predict(X_test)
r2_train = r2_score(Y_train, Y_train_pred)
r2_test = r2_score(Y_test, Y_test_pred)
print('r2-score train-', r2_train)
print('r2-score test-', r2_test)

r2-score train- 0.9812719927154625
r2-score test- 0.9032696041194817
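
The R2 score compares the model's squared error with that of a baseline that always predicts the mean: R2 = 1 - SS_res / SS_tot. A minimal sketch of that formula in plain Python follows. The function name `r2_manual` and the toy values are illustrative only and do not come from the car dataset.

```python
def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy values for illustration
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(round(r2_manual(y_true, y_pred), 4))  # → 0.9486
```

The gap between the train score (about 0.98) and the test score (about 0.90) suggests the forest fits the training data somewhat more closely than unseen data. Because `RandomForestRegressor()` is created without a `random_state`, the exact scores can differ from run to run.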


### 11. Save the trained model as a pickle file with the .pkl extension

In [27]:
import pickle
# Use a context manager so the file handle is closed after writing
with open('model.pkl', 'wb') as f:
    pickle.dump(rf, f)
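
Before moving the file into the Flask project, it can help to confirm that it loads back correctly. The sketch below shows the save and load round trip using a plain dictionary as a stand-in for the fitted model (an assumption made so the example runs without scikit-learn); with the real `rf` object the calls are the same.

```python
import os
import pickle
import tempfile

# Stand-in object; in the notebook this would be the fitted `rf` model
stand_in_model = {"name": "rf", "n_features": 8}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    with open(path, "wb") as f:   # write in binary mode
        pickle.dump(stand_in_model, f)
    with open(path, "rb") as f:   # read back in binary mode
        restored = pickle.load(f)

print(restored == stand_in_model)  # → True
```

A pickled scikit-learn model is generally expected to be loaded with the same scikit-learn version that created it, so the deployment environment should match the training one.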

### 12. Create a new folder/project in Visual Studio or PyCharm that contains the "model.pkl" file. *Make sure you are using a virtual environment and install the required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the templates folder and copy the following code.

In [None]:
<!DOCTYPE html>
<html >
<head>
  <meta charset="UTF-8">
  <title>ML Deployment</title>
  <link href='https://fonts.googleapis.com/css?family=Pacifico' rel='stylesheet' type='text/css'>
<link href='https://fonts.googleapis.com/css?family=Arimo' rel='stylesheet' type='text/css'>
<link href='https://fonts.googleapis.com/css?family=Hind:300' rel='stylesheet' type='text/css'>
<link href='https://fonts.googleapis.com/css?family=Open+Sans+Condensed:300' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="{{ url_for('static', filename='style.css') }}">
  
</head>
<body>
 <div>
    <form action="{{ url_for('predict') }}" method="post">
        <h2>Enter Car details:</h2>
        <h3>Age of the car:</h3>    
    	 
        <input id="first" name="age_of_the_car" type="number" required="required"/>
        <h3>Present Showroom Price:</h3><br><input id="second" name="Present_Price" required="required"/>
		<h3>Kms driven</h3><br><input id="third" name="Kms_Driven" required="required"/>
        <h3>Owner type (0/1/2)</h3><br><input id="fourth" name="Owner" required="required"/>
        <h3>Fuel type</h3><br><select id="fuel" name="Fuel_Type" required="required">
            <option value="0">Petrol</option>
            <option value="1">Diesel</option>
            <option value="2">CNG</option>
        </select>
        <h3>Seller type</h3><br><select id="selltype" name="Seller_Type" required="required">
            <option value="0">Dealer</option>
            <option value="1">Individual</option>
        </select>
         <h3>Transmission type</h3><br><select id="transtype" name="Transmission" required="required">
            <option value="0">Manual</option>
            <option value="1">Automatic</option>
        </select>
            
           
        <button id='btn' type="submit" class="btn btn-primary btn-block btn-large">Predict selling price</button>
    </form>

   <br>
   <br>
   {{ prediction_text }}
 </div>
</body>
</html>

### b) Create app.py file and write the predict function

In [None]:
import pickle

from flask import Flask, request, render_template

app = Flask(__name__)  # Initialize the Flask app

# Load the trained model saved in step 11
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)


@app.route('/',methods=['GET'])  
def home():
    return render_template('index.html')


@app.route('/predict', methods=['POST'])
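def predict():
    # Form values arrive as strings, so cast the numeric inputs
    Present_Price = float(request.form['Present_Price'])
    Kms_Driven = int(request.form['Kms_Driven'])
    Owner = int(request.form['Owner'])
    age_of_the_car = int(request.form['age_of_the_car'])
    Fuel_Type = request.form['Fuel_Type']        # '0' Petrol, '1' Diesel, '2' CNG
    Seller_Type = request.form['Seller_Type']    # '0' Dealer, '1' Individual
    Transmission = request.form['Transmission']  # '0' Manual, '1' Automatic

    # Rebuild the dummy columns created by get_dummies(drop_first=True) in step 7
    Fuel_Type_Diesel = int(Fuel_Type == '1')
    Fuel_Type_Petrol = int(Fuel_Type == '0')
    Seller_Type_Individual = int(Seller_Type == '1')
    Transmission_Manual = int(Transmission == '0')

    # Feature order must match the training columns of X
    features = [[Present_Price, Kms_Driven, Owner, age_of_the_car,
                 Fuel_Type_Diesel, Fuel_Type_Petrol,
                 Seller_Type_Individual, Transmission_Manual]]
    prediction = model.predict(features)  # making prediction
    output = round(prediction[0], 2)
    return render_template('index.html',
                           prediction_text="Selling price of the car is {} lakhs".format(output))  # rendering the predicted result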


if __name__ == "__main__":
    app.run(debug=True)


### 13. Deploy your app on Heroku. (write commands for deployment)
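
One possible way to prepare the project for Heroku is sketched below. It assumes gunicorn as the web server, a Flask object named `app` inside `app.py`, and an unpinned package list. The helper `write_deploy_files` and both file contents are illustrative assumptions, not part of the original notebook. The CLI steps are listed as comments because they run in a terminal rather than in Python.

```python
import os
import tempfile

# Assumed file contents; adjust package names and pin versions for your environment
PROCFILE = "web: gunicorn app:app\n"
REQUIREMENTS = "flask\ngunicorn\nnumpy\nscikit-learn\n"


def write_deploy_files(project_dir):
    """Write the Procfile and requirements.txt into the project folder."""
    paths = {}
    for name, text in (("Procfile", PROCFILE), ("requirements.txt", REQUIREMENTS)):
        paths[name] = os.path.join(project_dir, name)
        with open(paths[name], "w") as f:
            f.write(text)
    return paths


# Demo in a throwaway folder; pass the real project folder instead
with tempfile.TemporaryDirectory() as demo_dir:
    created = write_deploy_files(demo_dir)
    print(sorted(created))  # → ['Procfile', 'requirements.txt']

# Typical terminal steps afterwards, run from the project folder
# (<your-app-name> is a placeholder):
#   git init
#   git add .
#   git commit -m "Initial commit"
#   heroku login
#   heroku create <your-app-name>
#   git push heroku main
```

The project folder pushed to Heroku should contain `app.py`, `model.pkl`, the `templates` folder, and the two files written above.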

### 14. Paste the URL of the heroku application below, and while submitting the solution submit this notebook along with the source code.

In [None]:
http://127.0.0.1:5000

### Happy Learning :)