# <font color=darkblue> Machine Learning model deployment with Flask framework on Heroku</font>

## <font color=Blue>Used Cars Price Prediction Application</font>

### Objective:
1. Build a machine learning regression model that predicts the selling price of used cars from input features such as fuel type, kilometers driven, and transmission type.
2. Deploy the machine learning model with the Flask framework on Heroku.

### Dataset Information:
#### Dataset Source: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=CAR+DETAILS+FROM+CAR+DEKHO.csv
This dataset contains information about used cars listed on www.cardekho.com. The columns are:
- **Car_Name**: Name of the car
- **Year**: Year of Purchase
- **Selling_Price (target)**: Selling price of the car in lakhs
- **Present_Price**: Present price of the car in lakhs
- **Kms_Driven**: Kilometers driven
- **Fuel_Type**: Petrol, Diesel, or CNG
- **Seller_Type**: Dealer or Individual
- **Transmission**: Manual or Automatic
- **Owner**: First, second, or third owner


### 1. Import required libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

### 2. Load the dataset

In [None]:
df = pd.read_csv('car+data.csv')
df.head()

### 3. Check the shape and basic information of the dataset.

In [None]:
df.shape

In [None]:
df.info()

### 4. Check for duplicate records in the dataset and drop them if present

In [None]:
len(df[df.duplicated()])

In [None]:
df.drop_duplicates(inplace=True)

In [None]:
len(df[df.duplicated()])

### 5. Drop the columns that are redundant for the analysis.

In [None]:
df.drop('Car_Name',axis=1,inplace=True)

### 6. Extract a new feature called 'age_of_the_car' from the feature 'Year' and drop the feature 'Year'

In [None]:
# Age is measured relative to 2023, the reference year fixed in this notebook
df['age_of_the_car'] = 2023 - df['Year']

In [None]:
df.drop('Year',axis=1,inplace=True)

In [None]:
df.head(2)

### 7. Encode the categorical columns

In [None]:
from sklearn.preprocessing import LabelEncoder

# Initialize LabelEncoder
label_encoder = LabelEncoder()

# Encode categorical columns using label encoding
df['Fuel_Type'] = label_encoder.fit_transform(df['Fuel_Type'])
df['Seller_Type'] = label_encoder.fit_transform(df['Seller_Type'])
df['Transmission'] = label_encoder.fit_transform(df['Transmission'])
df['Owner'] = label_encoder.fit_transform(df['Owner'])

# Print the encoded dataset
print(df)
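The cell above refits one `LabelEncoder` for each column, so only the mapping for the last column stays available afterwards. A deployed form needs to know which integer stands for which category, so it can help to keep one encoder per column. The sketch below illustrates this on a small made-up frame; the `demo` data and the `encoders` and `mappings` names are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Small made-up sample standing in for the real dataset (assumption)
demo = pd.DataFrame({
    "Fuel_Type": ["Petrol", "Diesel", "CNG", "Petrol"],
    "Transmission": ["Manual", "Automatic", "Manual", "Manual"],
})

# Keep a separate fitted encoder for every categorical column
encoders = {}
for col in demo.columns:
    enc = LabelEncoder()
    demo[col] = enc.fit_transform(demo[col])
    encoders[col] = enc

# LabelEncoder assigns codes in sorted order of the class labels
mappings = {
    col: {str(label): code for code, label in enumerate(enc.classes_)}
    for col, enc in encoders.items()
}
print(mappings)
```

The same label-to-code mapping would then have to be applied to the values submitted through the web form before they are passed to the model.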

### 8. Separate the target and independent features.

In [None]:
X = df.drop('Selling_Price', axis=1)
y = df['Selling_Price']

### 9. Split the data into train and test.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

print(X_train.shape,X_test.shape)
print(y_train.shape,y_test.shape)

### 10. Build a Random Forest Regressor model and check the R2 score for train and test.

In [None]:
from sklearn.ensemble import RandomForestRegressor

# The categorical columns were already label-encoded in step 7,
# so no further encoding is needed here.

# Re-split the data; this replaces the 70/30 split from step 9 with an 80/20 split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Random Forest Regressor model
rf_model = RandomForestRegressor()

# Fit the model on the training data
rf_model.fit(X_train, y_train)

# Make predictions on the training and testing data
train_predictions = rf_model.predict(X_train)
test_predictions = rf_model.predict(X_test)

# Calculate R2 score for train and test
train_r2_score = r2_score(y_train, train_predictions)
test_r2_score = r2_score(y_test, test_predictions)

# Print the R2 scores
print("R2 score for train:", train_r2_score)
print("R2 score for test:", test_r2_score)
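Step 1 imports `mean_squared_error`, but the cell above reports only R2. As a complementary check, the root mean squared error expresses the typical prediction error in the units of the target (lakhs). The sketch below uses made-up values rather than the notebook's predictions; in the notebook, `y_test` and `test_predictions` would take the place of `y_true` and `y_pred`.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up prices in lakhs, for illustration only (assumption)
y_true = np.array([3.0, 5.0, 7.5])
y_pred = np.array([2.5, 5.5, 7.0])

# RMSE is the square root of the mean squared error
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print("RMSE:", rmse)
```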

### 11. Create a pickle file with an extension as .pkl

In [None]:
## Uploaded separately
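The original pickling code was uploaded separately and is not shown here. A minimal sketch of the usual approach with the standard `pickle` module follows. It trains a small stand-in model on synthetic data so the block runs on its own; `demo_model`, `X_demo`, and `y_demo` are assumptions, and in the notebook the fitted `rf_model` would be dumped instead.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-in model on synthetic data (assumption); use rf_model in the notebook
rng = np.random.default_rng(0)
X_demo = rng.random((20, 3))
y_demo = X_demo.sum(axis=1)
demo_model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)

# Serialize the fitted model to model.pkl
with open("model.pkl", "wb") as f:
    pickle.dump(demo_model, f)

# Load it back and confirm the predictions are unchanged
with open("model.pkl", "rb") as f:
    loaded_model = pickle.load(f)

same = np.allclose(demo_model.predict(X_demo), loaded_model.predict(X_demo))
print("Predictions match after reload:", same)
```

A pickled scikit-learn model is best loaded with the same scikit-learn version that created it, so pinning that version in the deployment environment is a reasonable precaution.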

### 12. Create a new folder/project in Visual Studio or PyCharm that contains the "model.pkl" file. *Make sure you are using a virtual environment and install the required packages.*

### a) Create a basic HTML form for the frontend

Create a file **index.html** in the templates folder and copy the following code.

In [None]:
## Uploaded separately

### b) Create app.py file and write the predict function

In [None]:
## Uploaded separately
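The original `app.py` was uploaded separately. The following is only a sketch of what a predict function might look like, not the author's implementation. It assumes the form fields are named after the model's input columns in the order listed in `FEATURES`, that categorical values arrive already encoded as numbers, and it returns JSON instead of rendering `index.html` so that it stays self-contained. The names `create_app`, `FEATURES`, and the `/predict` route are hypothetical.

```python
import pandas as pd
from flask import Flask, jsonify, request

# Assumed column order of X after steps 5, 6, and 8 (hypothetical)
FEATURES = [
    "Present_Price", "Kms_Driven", "Fuel_Type", "Seller_Type",
    "Transmission", "Owner", "age_of_the_car",
]


def create_app(model):
    """Build a Flask app around any object with a predict() method."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Read every feature from the submitted form and convert it to a number
        try:
            row = [float(request.form[name]) for name in FEATURES]
        except (KeyError, ValueError):
            return jsonify(error="All fields are required and must be numeric."), 400

        # Predict on a one-row frame with the training column names
        price = float(model.predict(pd.DataFrame([row], columns=FEATURES))[0])
        return jsonify(selling_price_lakhs=round(price, 2))

    return app
```

To serve the real model, one would load `model.pkl` with `pickle.load`, pass the result to `create_app`, and call `.run()` on the returned app.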

### 13. Deploy your app on Heroku (write the commands for deployment).

In [None]:
## Uploaded separately

### 14. Paste the URL of the Heroku application below. When submitting the solution, submit this notebook along with the source code.

In [None]:
## Uploaded separately

### Happy Learning :)