# <font color=darkblue> Machine Learning model deployment with Flask framework on Heroku</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. Build a machine learning regression model that predicts the selling price of a used car from input features such as fuel type, kilometers driven and transmission type.
2. Deploy the model with the Flask framework on Heroku.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com.
- **Car_Name**: Name of the car
- **Year**: Year of purchase
- **Selling_Price (target)**: Selling price of the car in lakhs
- **Present_Price**: Present (showroom) price of the car in lakhs
- **Kms_Driven**: Kilometers driven
- **Fuel_Type**: Petrol, Diesel or CNG
- **Seller_Type**: Dealer or Individual
- **Transmission**: Manual or Automatic
- **Owner**: Number of previous owners (0, 1 or 3 in this dataset)


### 1. Import required libraries

In [1]:
import numpy as np 
import pandas as pd 
import seaborn as sns 
import matplotlib.ticker as mtick 
import matplotlib.pyplot as plt
import os
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from scipy.stats import zscore
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier, AdaBoostClassifier, GradientBoostingClassifier
from sklearn.ensemble import StackingClassifier

### 2. Load the dataset

In [2]:
df = pd.read_csv('car+data.csv', encoding='unicode_escape')
df.head()

Unnamed: 0,Car_Name,Year,Selling_Price,Present_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner
0,ritz,2014,3.35,5.59,27000,Petrol,Dealer,Manual,0
1,sx4,2013,4.75,9.54,43000,Diesel,Dealer,Manual,0
2,ciaz,2017,7.25,9.85,6900,Petrol,Dealer,Manual,0
3,wagon r,2011,2.85,4.15,5200,Petrol,Dealer,Manual,0
4,swift,2014,4.6,6.87,42450,Diesel,Dealer,Manual,0


In [3]:
df['Car_Name'].value_counts()

city                        26
corolla altis               16
verna                       14
fortuner                    11
brio                        10
                            ..
Honda CB Trigger             1
Yamaha FZ S                  1
Bajaj Pulsar 135 LS          1
Activa 4g                    1
Bajaj Avenger Street 220     1
Name: Car_Name, Length: 98, dtype: int64

### 3. Check the shape and basic information of the dataset.

In [4]:
df.shape

(301, 9)

In [5]:
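df.info  # no parentheses, so this displays the bound method shown below; call df.info() for the column summary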

<bound method DataFrame.info of     Car_Name  Year  Selling_Price  Present_Price  Kms_Driven Fuel_Type  \
0       ritz  2014           3.35           5.59       27000    Petrol   
1        sx4  2013           4.75           9.54       43000    Diesel   
2       ciaz  2017           7.25           9.85        6900    Petrol   
3    wagon r  2011           2.85           4.15        5200    Petrol   
4      swift  2014           4.60           6.87       42450    Diesel   
..       ...   ...            ...            ...         ...       ...   
296     city  2016           9.50          11.60       33988    Diesel   
297     brio  2015           4.00           5.90       60000    Petrol   
298     city  2009           3.35          11.00       87934    Petrol   
299     city  2017          11.50          12.50        9000    Diesel   
300     brio  2016           5.30           5.90        5464    Petrol   

    Seller_Type Transmission  Owner  
0        Dealer       Manual      0  
1  

### 4. Check for duplicate records in the dataset and drop them if present

In [5]:
print("Duplicate Records : ",len(df[df.duplicated()]))

Duplicate Records :  2


In [6]:
df.drop_duplicates(inplace=True)
df.shape

(299, 9)

### 5. Drop the columns that are redundant for the analysis

In [7]:
df = df.drop(columns=["Car_Name", "Present_Price"])
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 299 entries, 0 to 300
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Year           299 non-null    int64  
 1   Selling_Price  299 non-null    float64
 2   Kms_Driven     299 non-null    int64  
 3   Fuel_Type      299 non-null    object 
 4   Seller_Type    299 non-null    object 
 5   Transmission   299 non-null    object 
 6   Owner          299 non-null    int64  
dtypes: float64(1), int64(3), object(3)
memory usage: 18.7+ KB


In [8]:
miss_val_per = df.isnull().sum()*100/len(df)
print(miss_val_per)

Year             0.0
Selling_Price    0.0
Kms_Driven       0.0
Fuel_Type        0.0
Seller_Type      0.0
Transmission     0.0
Owner            0.0
dtype: float64


### 6. Extract a new feature called 'Age_of_the_car' from the feature 'Year' and drop the feature 'Year'

In [9]:
from datetime import date

# Age is measured relative to the year in which the notebook is run
df['Age_of_the_car'] = date.today().year - df['Year']
df[['Year','Age_of_the_car']].head()

Unnamed: 0,Year,Age_of_the_car
0,2014,8
1,2013,9
2,2017,5
3,2011,11
4,2014,8


In [10]:
df = df.drop("Year", axis=1)
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 299 entries, 0 to 300
Data columns (total 7 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Selling_Price   299 non-null    float64
 1   Kms_Driven      299 non-null    int64  
 2   Fuel_Type       299 non-null    object 
 3   Seller_Type     299 non-null    object 
 4   Transmission    299 non-null    object 
 5   Owner           299 non-null    int64  
 6   Age_of_the_car  299 non-null    int64  
dtypes: float64(1), int64(3), object(3)
memory usage: 18.7+ KB
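
Because `date.today().year` changes with the run date, the ages shown above (2014 maps to 8, so the notebook was run in 2022) will shift if the notebook is executed again in a later year. A minimal sketch of pinning the reference year follows; `REFERENCE_YEAR` and `car_age` are hypothetical names and not part of the original notebook.

```python
# Hypothetical constant: the ages shown above (2014 -> 8) imply a 2022 run.
REFERENCE_YEAR = 2022


def car_age(purchase_year: int, reference_year: int = REFERENCE_YEAR) -> int:
    """Return the car's age in years relative to a fixed reference year."""
    if purchase_year > reference_year:
        raise ValueError("purchase year is after the reference year")
    return reference_year - purchase_year


# The five purchase years displayed in the table above.
years = [2014, 2013, 2017, 2011, 2014]
ages = [car_age(y) for y in years]
print(ages)  # [8, 9, 5, 11, 8]
```

With a DataFrame, the same idea would be `df['Age_of_the_car'] = REFERENCE_YEAR - df['Year']`, which keeps training data and later predictions on the same scale.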


### 7. Encode the categorical columns

In [11]:
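# Label encoding: each category is replaced by an integer code,
# assigned in sorted order of the class names (e.g. Fuel_Type: CNG=0, Diesel=1, Petrol=2)

category = ['Fuel_Type', 'Seller_Type', 'Transmission']

lbl_encode = LabelEncoder()

for col in category:
    df[col] = lbl_encode.fit_transform(df[col])

df.head()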

Unnamed: 0,Selling_Price,Kms_Driven,Fuel_Type,Seller_Type,Transmission,Owner,Age_of_the_car
0,3.35,27000,2,0,1,0,8
1,4.75,43000,1,0,1,0,9
2,7.25,6900,2,0,1,0,5
3,2.85,5200,2,0,1,0,11
4,4.6,42450,1,0,1,0,8
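
`LabelEncoder` assigns integer codes to the classes in sorted order, which is why Petrol became 2 and Diesel became 1 in the table above. Anything that later feeds the model (a form, an API) has to use the same codes. The sketch below reproduces that ordering in plain Python for illustration only; `label_codes` is a hypothetical helper, and the category names are taken from the dataset description.

```python
def label_codes(values):
    """Map each distinct value to its index in sorted order,
    which is the ordering LabelEncoder uses for its classes."""
    return {value: code for code, value in enumerate(sorted(set(values)))}


fuel_codes = label_codes(["Petrol", "Diesel", "CNG"])
seller_codes = label_codes(["Dealer", "Individual"])
transmission_codes = label_codes(["Manual", "Automatic"])

print(fuel_codes)          # {'CNG': 0, 'Diesel': 1, 'Petrol': 2}
print(seller_codes)        # {'Dealer': 0, 'Individual': 1}
print(transmission_codes)  # {'Automatic': 0, 'Manual': 1}
```

Since the loop above refits a single encoder for every column, only the last column's classes remain on the encoder object afterwards. Keeping one mapping per column, as here, makes it possible to encode new inputs consistently.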


In [13]:
# One-hot encode the data using pandas get_dummies
#features = pd.get_dummies(df)
# Display the first 5 rows of the last 12 columns
#features.iloc[:,:].head(5)

In [15]:
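# Labels are the values we want to predict
labels = np.array(df['Selling_Price'])
# NOTE: the drop below is commented out, so 'Selling_Price' stays in df and
# therefore in `features` (it is the first column of the array below).
# This leaks the target into the model inputs; remove it before training,
# e.g. features = np.array(df.drop(columns='Selling_Price')).
##features= features.drop('Selling_Price', axis = 1)
# Saving feature names for later use
feature_list = list(df.columns)
# Convert to numpy array
features = np.array(df)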

array([[3.3500e+00, 2.7000e+04, 2.0000e+00, ..., 0.0000e+00, 0.0000e+00,
        0.0000e+00],
       [4.7500e+00, 4.3000e+04, 1.0000e+00, ..., 0.0000e+00, 0.0000e+00,
        0.0000e+00],
       [7.2500e+00, 6.9000e+03, 2.0000e+00, ..., 0.0000e+00, 0.0000e+00,
        0.0000e+00],
       ...,
       [3.3500e+00, 8.7934e+04, 2.0000e+00, ..., 0.0000e+00, 0.0000e+00,
        0.0000e+00],
       [1.1500e+01, 9.0000e+03, 1.0000e+00, ..., 0.0000e+00, 0.0000e+00,
        0.0000e+00],
       [5.3000e+00, 5.4640e+03, 2.0000e+00, ..., 0.0000e+00, 0.0000e+00,
        0.0000e+00]])

### 8. Separate the target and independent features.

In [16]:
df.corrwith(df['Selling_Price'])
#features.corrwith(features['Selling_Price'])

Selling_Price     1.000000
Kms_Driven        0.028566
Fuel_Type        -0.500292
Seller_Type      -0.553851
Transmission     -0.348869
Owner            -0.087880
Age_of_the_car   -0.234369
dtype: float64

In [49]:
#x_data = df[['Car_Name', 'Kms_Driven', 'Fuel_Type', 'Seller_Type', 'Transmission', 'age_of_the_car']]
#y_data = df['Selling_Price']
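
A minimal sketch of the separation this step describes, assuming the encoded `df` from step 7. The small frame below is rebuilt from the five encoded rows shown in that step so the block runs on its own; `X` and `y` are illustrative names.

```python
import pandas as pd

# The five encoded rows displayed in step 7, used here only as stand-in data.
df = pd.DataFrame({
    "Selling_Price": [3.35, 4.75, 7.25, 2.85, 4.60],
    "Kms_Driven": [27000, 43000, 6900, 5200, 42450],
    "Fuel_Type": [2, 1, 2, 2, 1],
    "Seller_Type": [0, 0, 0, 0, 0],
    "Transmission": [1, 1, 1, 1, 1],
    "Owner": [0, 0, 0, 0, 0],
    "Age_of_the_car": [8, 9, 5, 11, 8],
})

y = df["Selling_Price"]                  # target
X = df.drop(columns=["Selling_Price"])   # independent features only
feature_list = list(X.columns)           # column order the model will expect

print(feature_list)
print(X.shape, y.shape)  # (5, 6) (5,)
```

Keeping the target out of `X` matters: if `Selling_Price` stays among the features, the model can read the answer directly and the test metrics stop being meaningful.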

### 9. Split the data into train and test.

In [17]:
#x_train, x_test, y_train, y_test = train_test_split(x_data, y_data ,test_size = 0.2, shuffle=False)

# Split the data into training and testing sets
train_features, test_features, train_labels, test_labels = train_test_split(features, labels, test_size = 0.2, random_state = 42)
print("Training Features Shape : ",train_features.shape)
print("\nTraining Labels Shape : ",train_labels.shape)
print("\nTesting Features Shape : ",test_features.shape)
print("\nTesting Labels Shape : ",test_labels.shape)

Training Features Shape :  (239, 7)

Training Labels Shape :  (239,)

Testing Features Shape :  (60, 7)

Testing Labels Shape :  (60,)


In [18]:
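# Establishing a baseline
# NOTE: this reads the 'Selling_Price' column straight out of the feature matrix,
# i.e. the labels themselves, so the error is trivially 0. A meaningful baseline
# would predict, for example, the mean of the training labels.
baseline_preds = test_features[:, feature_list.index('Selling_Price')]
# Baseline errors, and display average baseline error
baseline_errors = abs(baseline_preds - test_labels)
print('Average baseline error: ', round(np.mean(baseline_errors), 2))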

Average baseline error:  0.0
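
The error of 0.0 arises because the baseline above takes `Selling_Price` itself from the feature matrix. A more informative baseline predicts the mean of the training labels for every test row. The sketch below uses made-up numbers in lakhs; they are illustrative and are not results from this dataset's split.

```python
from statistics import mean

# Illustrative values only (lakhs); not the notebook's actual train/test split.
train_labels = [3.35, 4.75, 7.25, 2.85]
test_labels = [4.60, 9.50]

# Predict the training mean for every test example.
baseline_pred = mean(train_labels)
baseline_errors = [abs(baseline_pred - y) for y in test_labels]
baseline_mae = mean(baseline_errors)

print("Average baseline error:", round(baseline_mae, 2))
```

A model is only worth deploying if its mean absolute error on the test set is clearly below this number.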


### 10. Build a Random forest Regressor model and check the r2-score for train and test.

In [19]:
from sklearn.ensemble import RandomForestRegressor

# Instantiate model with 1000 decision trees
rf = RandomForestRegressor(n_estimators = 1000, random_state = 42)
# Train the model on training data
rf.fit(train_features, train_labels);

In [20]:
# Use the forest's predict method on the test data
predictions = rf.predict(test_features)
# Calculate the absolute errors
errors = abs(predictions - test_labels)
# Print out the mean absolute error (mae)
print('Mean Absolute Error:', round(np.mean(errors), 2), 'Prices')

Mean Absolute Error: 0.17 Prices


In [21]:
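# Absolute percentage error of each test prediction
mape = 100 * (errors / test_labels)
# "Accuracy" here means 100 minus the mean absolute percentage error (MAPE);
# it is not a classification accuracy, and it is inflated in this run because
# the feature matrix still contains 'Selling_Price'
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')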

Accuracy: 98.22 %.
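
The heading of this step asks for the r2-score on train and test, which the cells above do not compute. The sketch below shows R² and MAPE in plain Python on made-up numbers; `r2` and `mape_percent` are hypothetical helpers written for illustration, and `sklearn.metrics.r2_score(y_true, y_pred)` computes the same R² on real arrays.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mape_percent(y_true, y_pred):
    """Mean absolute percentage error in percent (y_true must be non-zero)."""
    return 100 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values in lakhs, not results from this notebook.
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.5, 7.5]

print("R2:", round(r2(y_true, y_pred), 3))
print("MAPE (%):", round(mape_percent(y_true, y_pred), 2))
```

With the fitted forest, `r2(train_labels, rf.predict(train_features))` and the same call on the test split would give the two scores the heading asks for. A large gap between them suggests overfitting.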


### 11. Create a pickle file with an extension as .pkl

In [22]:
import pickle
# Saving model to disk (the with-block closes the file once the dump completes)
with open('model.pkl', 'wb') as f:
    pickle.dump(rf, f)

# Loading model to compare the results
##model = pickle.load(open('model.pkl','rb'))
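
Loading the file back and comparing it with the original is a cheap check that the saved artifact is usable. The sketch below round-trips a stand-in object through a temporary file; in the notebook the object would be `rf` and the path `model.pkl`. The stand-in dictionary is purely illustrative.

```python
import os
import pickle
import tempfile

# Stand-in for the trained model; any picklable Python object behaves the same way.
model = {"name": "stand-in", "weights": [0.1, 0.2, 0.3]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    # Write the object to disk, then read it back.
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == model)  # True
```

For a scikit-learn model, the equivalent check is to compare `restored.predict(...)` with the original predictions. The serving environment should also use a scikit-learn version compatible with the one used for training, since pickled estimators are not guaranteed to load across versions.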


### 12. Create a new folder/project in Visual Studio/PyCharm that contains the "model.pkl" file. *Make sure you are using a virtual environment and install the required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the templates folder and copy the following code.

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Used Car Price Predictor</title>
</head>

<body>

    <div class="hero-image">
      <div class="hero-text">

        <h1 style="font-size:50px">Used Car Price Predictor</h1>
         <h3>{{ prediction_text }}</h3>
      </div>
    </div>

     <style>

        body, html {
          height: 100%;
          margin: 0;
          font-family: Arial, Helvetica, sans-serif;
        }

        .hero-image {
          background-image: linear-gradient(rgba(0, 0, 0, 0.5), rgba(0, 0, 0, 0.5)), url('/static/image.jpg');
          height: 25%; /* 50% */
          background-position: bottom;
          background-repeat: no-repeat;
          background-size: cover;
          position: relative;
        }

        .hero-text {
          text-align: center;
          position: absolute;
          top: 50%;
          left: 50%;
          transform: translate(-50%, -50%);
          color: white;
        }

    </style>


    <div style="color:	rgb(0, 0, 0)">
        <form action="{{ url_for('predict')}}" method="post">
            <h2>Enter Car Details: </h2>
            <h3>Age of the car (In years) :
            <input id="first" name="Age_of_the_car" type="number" min="0" required="required"></h3>
            <h3>Present Showroom Price (In lakhs) : <input id="second" name="Present_Price" type="number" step="any" min="0" required="required"></h3>
            <h3>Kilometers Driven : <input id="third" name="Kms_Driven" type="number" min="0" required="required"></h3>
            <h3>Owner Type (0/1/3) : <input id="fourth" name="Owner" type="number" min="0" required="required"></h3>
            <!-- Option values follow the LabelEncoder codes used in training (classes in sorted order) -->
            <h3>Fuel type : <select name="Fuel_Type" id="fuel" required="required">
                <option value="2">Petrol</option>
                <option value="1">Diesel</option>
                <option value="0">CNG</option>
            </select></h3>
            <h3>Seller Type : <select name="Seller_Type" id="resea" required="required">
                <option value="0">Dealer</option>
                <option value="1">Individual</option>
            </select></h3>
            <h3>Transmission type : <select name="Transmission" id="research" required="required">
                <option value="1">Manual Car</option>
                <option value="0">Automatic Car</option>
            </select></h3>
            <br><br><button id="sub" type="submit">Predict Selling Price</button>
            <br>


        </form>

    </div>

    <style>
	body {
            background-color: 101, 10, 20;
            text-align: center;
            padding: 0px;
	    font-family: Helvetica;
        }

        #research {
            font-size: 18px;
            width: 200px;
            height: 23px;
            top: 23px;
        }

        #box {
            border-radius: 60px;
            border-color: 45px;
            border-style: solid;
            text-align: center;
            background-color: white;
            font-size: medium;
            position: absolute;
            width: 700px;
            bottom: 9%;
            height: 800px; /* 850px; */
            right: 30%;
            padding: 0px;
            margin: 0px;
            font-size: 14px;
        }

        #fuel {
            width: 83px;
            height: 43px;
            text-align: center;
            border-radius: 14px;
            font-size: 20px;
        }

        #fuel:hover {
            background-color: white;
        }

        #research {
            width: 150px;
            height: 43px;
            text-align: center;
            border-radius: 14px;
            font-size: 18px;
        }

        #research:hover {
            background-color: white;
        }

        #resea {
            width: 99px;
            height: 43px;
            text-align: center;
            border-radius: 14px;
            font-size: 18px;
        }

        #resea:hover {
            background-color: white;
        }

        #sub {
            background-color: Blue;
            font-family: 'Helvetica', monospace;
            font-weight: bold;
            width: 180px;
            height: 60px;
            text-align: center;
            border-radius: 20px;
            font-size: 18px;
            color: Black;
        }

        #sub:hover {
            background-color: orange;
        }

        #first {
            border-radius: 14px;
            height: 25px;
            font-size: 20px;
            text-align: center;
        }

        #second {
            border-radius: 14px;
            height: 25px;
            font-size: 20px;
            text-align: center;
        }

        #third {
            border-radius: 14px;
            height: 25px;
            font-size: 20px;
            text-align: center;
        }

        #fourth {
            border-radius: 14px;
            height: 25px;
            font-size: 20px;
            text-align: center;
        }
    </style>
</body>

</html>

### b) Create app.py file and write the predict function


from flask import Flask, render_template, request
import pickle

app = Flask(__name__)

# Load the trained model once at startup
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

@app.route('/', methods=['GET'])
def Home():
    return render_template('index.html')

@app.route("/predict", methods=['POST'])
def predict():
    # Form values arrive as strings, so convert each one to a number
    Present_Price = float(request.form['Present_Price'])
    Kms_Driven = int(request.form['Kms_Driven'])
    Owner = int(request.form['Owner'])
    Fuel_Type = int(request.form['Fuel_Type'])
    Age_of_the_car = int(request.form['Age_of_the_car'])
    Seller_Type = int(request.form['Seller_Type'])
    Transmission = int(request.form['Transmission'])

    # The model takes features by position, so the order must match the training columns.
    # NOTE: the notebook dropped Present_Price in step 5 and trained with Selling_Price
    # as the first column; retrain the model on the inputs collected here before relying on it.
    prediction = model.predict([[Present_Price, Kms_Driven, Fuel_Type, Seller_Type,
                                 Transmission, Owner, Age_of_the_car]])
    output = round(prediction[0], 2)
    return render_template('index.html', prediction_text="You can sell your car at {} lakhs".format(output))

if __name__ == "__main__":
    app.run(debug=True)
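
The model receives its features by position, so the order and types passed to `predict` must match the training columns. One way to keep that in a single place is a small helper that converts the submitted form into a numeric row. This is a sketch under assumptions: `FEATURE_ORDER` and `form_to_row` are hypothetical names, and the order shown assumes the model has been retrained on the six non-target columns listed in step 6 (without `Present_Price`, which step 5 drops).

```python
# Assumed training column order: the non-target columns from step 6.
FEATURE_ORDER = ["Kms_Driven", "Fuel_Type", "Seller_Type",
                 "Transmission", "Owner", "Age_of_the_car"]


def form_to_row(form):
    """Convert a mapping of form fields (strings) into a list of ints in training order."""
    row = []
    for name in FEATURE_ORDER:
        raw = form.get(name, "")
        try:
            value = int(raw)
        except (TypeError, ValueError):
            raise ValueError(f"{name} must be a whole number, got {raw!r}")
        if value < 0:
            raise ValueError(f"{name} must not be negative")
        row.append(value)
    return row


# Example with a plain dict; Flask's request.form offers the same .get() access.
example = {"Kms_Driven": "27000", "Fuel_Type": "2", "Seller_Type": "0",
           "Transmission": "1", "Owner": "0", "Age_of_the_car": "8"}
print(form_to_row(example))  # [27000, 2, 0, 1, 0, 8]
```

Inside `predict`, the call would then be `model.predict([form_to_row(request.form)])`, and a `ValueError` can be caught to show a friendly message instead of a server error.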
        

### 13. Deploy your app on Heroku. (write commands for deployment)

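heroku login  (log in to Heroku with your ID and password)

heroku create used-car-check  (create the app; it will appear in the Heroku dashboard)

Create a Procfile that tells Heroku how to run the application, for example with the line: web: gunicorn app:app  (this requires gunicorn, together with the app's other packages, in requirements.txt)

git init  (initialize a git repository)

git add .  (add the files to the repository)

git commit -m "initial commit"

heroku git:remote -a used-car-check  (link the repository to the Heroku app)

git push heroku master  (push the files from the repository to the Heroku app)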

### 14. Paste the URL of the Heroku application below. When submitting the solution, submit this notebook along with the source code.

In [None]:
https://used-car-check.herokuapp.com/

### Happy Learning :)