# <font color=darkblue> Machine Learning model deployment with Flask framework</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. Build a machine learning regression model that predicts the selling price of used cars from input features such as fuel type, kilometers driven, and transmission type.
2. Deploy the trained model with the Flask framework.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com.
- **Car_Name**: Name of the car
- **Year**: Year of purchase
- **Selling_Price (target)**: Selling price of the car in lakhs
- **Present_Price**: Present price of the car in lakhs
- **Kms_Driven**: Kilometers driven
- **Fuel_Type**: Petrol, Diesel, or CNG
- **Seller_Type**: Dealer or Individual
- **Transmission**: Manual or Automatic
- **Owner**: First, second, or third owner


### 1. Import required libraries

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
import pickle


### 2. Load the dataset

In [2]:
file_path = "C:\\Users\\anjal\\Downloads\\car+data.csv"
df = pd.read_csv(file_path)


### 3. Check the shape and basic information of the dataset.

In [3]:
print(df.shape)
print(df.info())


(301, 9)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 301 entries, 0 to 300
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Car_Name       301 non-null    object 
 1   Year           301 non-null    int64  
 2   Selling_Price  301 non-null    float64
 3   Present_Price  301 non-null    float64
 4   Kms_Driven     301 non-null    int64  
 5   Fuel_Type      301 non-null    object 
 6   Seller_Type    301 non-null    object 
 7   Transmission   301 non-null    object 
 8   Owner          301 non-null    int64  
dtypes: float64(2), int64(3), object(4)
memory usage: 21.3+ KB
None


### 4. Check for duplicate records in the dataset and drop them if present

In [4]:
df.drop_duplicates(inplace=True)
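
The cell above drops duplicates without first reporting whether any exist. A minimal sketch of the check, using a small made-up frame (`toy` and its values are illustrative assumptions, not rows from the real dataset):

```python
import pandas as pd

# Toy frame standing in for the car data (values are made up for illustration)
toy = pd.DataFrame({
    "Car_Name": ["ritz", "sx4", "ritz", "ciaz"],
    "Kms_Driven": [27000, 43000, 27000, 6900],
})

# duplicated() marks each repeat of an earlier row as True
n_duplicates = int(toy.duplicated().sum())
print("Duplicate rows:", n_duplicates)

# Drop only when duplicates exist, then rebuild a clean 0..n-1 index
if n_duplicates > 0:
    toy = toy.drop_duplicates().reset_index(drop=True)

print("Shape after dropping:", toy.shape)
```

The same two calls, `duplicated().sum()` and `drop_duplicates()`, should apply unchanged to `df`.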


### 5. Drop the columns that are redundant for the analysis

In [5]:
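# 'Car_Name' is free text that the regressor cannot use directly, so drop it along with 'Owner'
df.drop(['Owner', 'Car_Name'], axis=1, inplace=True)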


### 6. Extract a new feature called 'age_of_the_car' from the feature 'Year' and drop the feature 'Year'

In [6]:
current_year = 2024
df['age_of_the_car'] = current_year - df['Year']
df.drop(['Year'], axis=1, inplace=True)
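
Because the web form later asks for the car's age, the age has to be computed the same way at training time and at prediction time. One way to keep the two consistent is a small helper that takes the reference year as an explicit argument. This is a sketch; `add_car_age` and the toy values are hypothetical names introduced here for illustration.

```python
import pandas as pd


def add_car_age(frame: pd.DataFrame, reference_year: int) -> pd.DataFrame:
    """Return a copy with 'age_of_the_car' added and 'Year' removed."""
    out = frame.copy()
    out["age_of_the_car"] = reference_year - out["Year"]
    return out.drop(columns=["Year"])


# Toy purchase years (made up for illustration)
toy = pd.DataFrame({"Year": [2014, 2017, 2011]})
aged = add_car_age(toy, reference_year=2024)
print(aged)
```

Passing the year in, rather than hard-coding it inside the function, makes it harder for training and serving code to drift apart.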


### 7. Encode the categorical columns

In [7]:
df = pd.get_dummies(df, columns=['Fuel_Type', 'Seller_Type', 'Transmission'], drop_first=True)
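
To see which columns `drop_first=True` produces, here is a small sketch on a made-up frame (the rows are illustrative assumptions). `get_dummies` sorts the categories and drops the first one, so CNG and Automatic become the implicit baseline:

```python
import pandas as pd

# Toy rows covering every category of the two columns (values are made up)
toy = pd.DataFrame({
    "Kms_Driven": [27000, 43000, 6900],
    "Fuel_Type": ["Petrol", "Diesel", "CNG"],
    "Transmission": ["Manual", "Manual", "Automatic"],
})

# Categories are sorted alphabetically; the first one in each column is dropped
encoded = pd.get_dummies(toy, columns=["Fuel_Type", "Transmission"], drop_first=True)
print(list(encoded.columns))
```

A CNG car is therefore represented by `Fuel_Type_Diesel` and `Fuel_Type_Petrol` both being 0, which is the encoding the Flask app has to reproduce by hand.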


### 8. Separate the target and independent features.

In [9]:
X = df.drop('Selling_Price', axis=1)
y = df['Selling_Price']
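
The deployed app has to pass features to the model in exactly the column order of `X`. A minimal sketch of one way to enforce that: record the training column order and reindex each incoming row against it. `X_toy`, `build_input_row`, and the sample values are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Hypothetical stand-in for the encoded training features X (illustrative values)
X_toy = pd.DataFrame({
    "Present_Price": [5.59, 9.54],
    "Kms_Driven": [27000, 43000],
    "age_of_the_car": [10, 11],
    "Fuel_Type_Diesel": [0, 1],
    "Fuel_Type_Petrol": [1, 0],
})
feature_columns = list(X_toy.columns)


def build_input_row(raw: dict, columns: list) -> pd.DataFrame:
    """Return a one-row frame ordered like the training features; absent columns become 0."""
    return pd.DataFrame([raw]).reindex(columns=columns, fill_value=0)


# Keys arrive in arbitrary order and one dummy column is missing
row = build_input_row(
    {"Kms_Driven": 30000, "Fuel_Type_Petrol": 1, "Present_Price": 6.0, "age_of_the_car": 8},
    feature_columns,
)
print(row)
```

In the real notebook, `list(X.columns)` could be saved next to the model and reused in the same way.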


### 9. Split the data into train and test.

In [10]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


### 10. Build a Random Forest Regressor model and check the R² score for train and test.

In [18]:
# 1. Import required libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# 2. Load the dataset
file_path = "C:\\Users\\anjal\\Downloads\\car+data.csv"
df = pd.read_csv(file_path)

# 3. Check for duplicate records and drop them if present
df.drop_duplicates(inplace=True)

# 4. Drop unnecessary columns (e.g., 'Owner', 'Car_Name')
df.drop(['Owner', 'Car_Name'], axis=1, inplace=True)

# 5. Extract 'age_of_the_car' and drop 'Year'
current_year = 2024
df['age_of_the_car'] = current_year - df['Year']
df.drop(['Year'], axis=1, inplace=True)

# 6. Encode categorical columns using one-hot encoding
df = pd.get_dummies(df, columns=['Fuel_Type', 'Seller_Type', 'Transmission'], drop_first=True)

# 7. Separate target and independent features
X = df.drop('Selling_Price', axis=1)  # Features
y = df['Selling_Price']                 # Target variable

# 8. Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 9. Build a Random Forest Regressor model and check the R² score for train and test sets
model = RandomForestRegressor(n_estimators=100, random_state=42)

# Fit the model on the training data
model.fit(X_train, y_train)

# Make predictions on the training set
y_train_pred = model.predict(X_train)

# Make predictions on the testing set
y_test_pred = model.predict(X_test)

# Calculate R² score for training set
train_r2_score = r2_score(y_train, y_train_pred)
print("Train R² Score:", train_r2_score)

# Calculate R² score for testing set
test_r2_score = r2_score(y_test, y_test_pred)
print("Test R² Score:", test_r2_score)


Train R² Score: 0.985247376961392
Test R² Score: 0.5532706844056646
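
The gap between the train score (about 0.985) and the test score (about 0.553) may point to overfitting, or to a single test split of roughly 60 rows being a noisy estimate. K-fold cross-validation gives a steadier picture. The sketch below uses synthetic data from `make_regression` as a stand-in, so its scores say nothing about the car dataset; `X_demo`, `y_demo`, and `model_demo` are illustrative names.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in with a size similar to the car data (not the real dataset)
X_demo, y_demo = make_regression(n_samples=300, n_features=7, noise=10.0, random_state=42)

model_demo = RandomForestRegressor(n_estimators=100, random_state=42)

# Five folds: each fold serves once as the held-out set
scores = cross_val_score(model_demo, X_demo, y_demo, cv=5, scoring="r2")
print("R² per fold:", scores.round(3))
print("Mean R²:", round(float(scores.mean()), 3))
```

On the real data, the same call with `X` and `y` would show how much the test score varies from split to split.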


### 11. Save the model as a pickle file with a .pkl extension

In [22]:
# Import the pickle library
import pickle

# Assuming 'model' is your trained Random Forest Regressor
with open('model.pkl', 'wb') as file:
    pickle.dump(model, file)

print("Model saved as model.pkl")


Model saved as model.pkl
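
Before moving the file into the Flask project, it can be worth confirming that the pickled model reloads and predicts identically. A minimal round-trip sketch on a small synthetic model (`model_demo`, `X_demo`, and the temporary file name are assumptions for illustration):

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Small synthetic regression problem (not the car data)
rng = np.random.default_rng(42)
X_demo = rng.random((50, 3))
y_demo = X_demo @ np.array([1.0, 2.0, 3.0])

model_demo = RandomForestRegressor(n_estimators=10, random_state=42).fit(X_demo, y_demo)

# Write the model to a temporary file and read it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_demo.pkl")
    with open(path, "wb") as file:
        pickle.dump(model_demo, file)
    with open(path, "rb") as file:
        restored = pickle.load(file)

# The restored model should reproduce the original predictions
same_predictions = bool(np.allclose(model_demo.predict(X_demo), restored.predict(X_demo)))
print("Predictions match after reload:", same_predictions)
```

Pickled scikit-learn models are generally only safe to load with the same scikit-learn version that saved them, so the Flask project's virtual environment should pin that version.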


### 12. Create a new folder/project in Visual Studio or PyCharm that contains the "model.pkl" file. *Make sure you are using a virtual environment and install the required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the templates folder and copy the following code.

In [20]:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Used Car Price Prediction</title>
</head>
<body>
    <h1>Used Car Price Prediction</h1>
    <form action="/predict" method="POST">
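        <!-- One input per feature; the name attributes must match the keys read in app.py -->
        <label for="present_price">Present Price (lakhs):</label>
        <input type="number" step="any" min="0" id="present_price" name="present_price" required><br>
        <label for="kms_driven">Kms Driven:</label>
        <input type="number" step="any" min="0" id="kms_driven" name="kms_driven" required><br>
        <label for="age_of_the_car">Age of the Car (years):</label>
        <input type="number" step="1" min="0" id="age_of_the_car" name="age_of_the_car" required><br>
        <label for="fuel_type">Fuel Type:</label>
        <select id="fuel_type" name="fuel_type">
            <option value="Petrol">Petrol</option>
            <option value="Diesel">Diesel</option>
            <option value="CNG">CNG</option>
        </select><br>
        <label for="seller_type">Seller Type:</label>
        <select id="seller_type" name="seller_type">
            <option value="Dealer">Dealer</option>
            <option value="Individual">Individual</option>
        </select><br>
        <label for="transmission">Transmission:</label>
        <select id="transmission" name="transmission">
            <option value="Manual">Manual</option>
            <option value="Automatic">Automatic</option>
        </select><br>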
        <input type="submit" value="Predict">
    </form>
</body>
</html>



### b) Create app.py file and write the predict function

In [25]:
!pip install Flask






In [37]:
from flask import Flask, request, render_template
import pickle
import numpy as np
import os

# Set the working directory to where your files are located
os.chdir(r"C:\Users\anjal\Downloads")

app = Flask(__name__)

# Load the model from disk; the context manager closes the file afterwards
with open('model.pkl', 'rb') as model_file:
    model = pickle.load(model_file)

@app.route('/')
def home():
    return render_template('index.html')

@app.route('/predict', methods=['POST'])
def predict():
    # Get input values from the form (names must match the fields in index.html)
    present_price = float(request.form['present_price'])
    kms_driven = float(request.form['kms_driven'])
    fuel_type = request.form['fuel_type']
    seller_type = request.form['seller_type']
    transmission = request.form['transmission']
    # Age in years, relative to the same reference year used in training (2024)
    age_of_the_car = int(request.form['age_of_the_car'])

    # Reproduce the training-time one-hot encoding by hand
    # (drop_first=True dropped CNG, Dealer and Automatic as baselines)
    fuel_type_diesel = 1 if fuel_type == 'Diesel' else 0
    fuel_type_petrol = 1 if fuel_type == 'Petrol' else 0
    seller_type_individual = 1 if seller_type == 'Individual' else 0
    transmission_manual = 1 if transmission == 'Manual' else 0

    # The model was trained on seven features; the order must match X.columns:
    # Present_Price, Kms_Driven, age_of_the_car, Fuel_Type_Diesel,
    # Fuel_Type_Petrol, Seller_Type_Individual, Transmission_Manual
    input_data = np.array([[present_price, kms_driven, age_of_the_car,
                            fuel_type_diesel, fuel_type_petrol,
                            seller_type_individual, transmission_manual]])

    # Make prediction
    try:
        prediction = model.predict(input_data)
        return f'The predicted selling price is: {prediction[0]:.2f} lakhs'
    except ValueError as e:
        return f"Error in prediction: {e}"

if __name__ == '__main__':
    app.run(debug=True)


 * Serving Flask app '__main__'
 * Debug mode: on


 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
 * Restarting with stat



### 13. Run the app.py file from a terminal; it serves the index.html page, where you can enter the input values and get the prediction.

In [36]:
python app.py


### Happy Learning :)