# 01_CO₂ Emission Prediction 🚗💨

---


### 🔗 Live Demo:
[Click here to test the model](https://your-streamlit-app-link.com)  

### 📌 How to Use:
1. Open the **live demo link** above.
2. Enter the required values (Engine Size, Cylinders, Fuel Consumption, etc.).
3. Click the **Predict** button to get the CO₂ emission estimate.
4. The model will process your input and display the prediction instantly.

---
🔍 **For Code & Implementation Details:**  
Check out the **GitHub Repository** [here](https://your-github-repo-link.com).  

⚡ **Note:** This notebook contains the full workflow (data exploration, training, and inference). You can walk through the entire process or simply run inference through the live demo link above.  

---




In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# Read the dataset
df = pd.read_csv("CO2_Emissions_Canada.csv")
df.head(20)

Unnamed: 0,Make,Model,Vehicle Class,Engine Size(L),Cylinders,Transmission,Fuel Type,Fuel Consumption City (L/100 km),Fuel Consumption Hwy (L/100 km),Fuel Consumption Comb (L/100 km),Fuel Consumption Comb (mpg),CO2 Emissions(g/km)
0,ACURA,ILX,COMPACT,2.0,4,AS5,Z,9.9,6.7,8.5,33,196
1,ACURA,ILX,COMPACT,2.4,4,M6,Z,11.2,7.7,9.6,29,221
2,ACURA,ILX HYBRID,COMPACT,1.5,4,AV7,Z,6.0,5.8,5.9,48,136
3,ACURA,MDX 4WD,SUV - SMALL,3.5,6,AS6,Z,12.7,9.1,11.1,25,255
4,ACURA,RDX AWD,SUV - SMALL,3.5,6,AS6,Z,12.1,8.7,10.6,27,244
5,ACURA,RLX,MID-SIZE,3.5,6,AS6,Z,11.9,7.7,10.0,28,230
6,ACURA,TL,MID-SIZE,3.5,6,AS6,Z,11.8,8.1,10.1,28,232
7,ACURA,TL AWD,MID-SIZE,3.7,6,AS6,Z,12.8,9.0,11.1,25,255
8,ACURA,TL AWD,MID-SIZE,3.7,6,M6,Z,13.4,9.5,11.6,24,267
9,ACURA,TSX,COMPACT,2.4,4,AS5,Z,10.6,7.5,9.2,31,212


In [None]:
df.shape

(7385, 12)

In [None]:
df.info()  # call the method; bare `df.info` only returns the bound method object

In [None]:
## Summary of the dataset
df.describe()

Unnamed: 0,Engine Size(L),Cylinders,Fuel Consumption City (L/100 km),Fuel Consumption Hwy (L/100 km),Fuel Consumption Comb (L/100 km),Fuel Consumption Comb (mpg),CO2 Emissions(g/km)
count,7385.0,7385.0,7385.0,7385.0,7385.0,7385.0,7385.0
mean,3.160068,5.61503,12.556534,9.041706,10.975071,27.481652,250.584699
std,1.35417,1.828307,3.500274,2.224456,2.892506,7.231879,58.512679
min,0.9,3.0,4.2,4.0,4.1,11.0,96.0
25%,2.0,4.0,10.1,7.5,8.9,22.0,208.0
50%,3.0,6.0,12.1,8.7,10.6,27.0,246.0
75%,3.7,6.0,14.6,10.2,12.6,32.0,288.0
max,8.4,16.0,30.6,20.6,26.1,69.0,522.0


In [None]:
## Missing values
df.isnull().sum()

Unnamed: 0,0
Make,0
Model,0
Vehicle Class,0
Engine Size(L),0
Cylinders,0
Transmission,0
Fuel Type,0
Fuel Consumption City (L/100 km),0
Fuel Consumption Hwy (L/100 km),0
Fuel Consumption Comb (L/100 km),0


### Insights and observations
The dataset has no missing values.
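The per-column counts can also be collapsed into a single yes/no check. The sketch below uses a small made-up frame (`sample` and its values are illustrative, not taken from the dataset) to show what the check reports when a value is actually missing.

```python
import numpy as np
import pandas as pd

# Illustrative frame with one deliberately missing value (not real dataset rows)
sample = pd.DataFrame({
    "Engine Size(L)": [2.0, 2.4, np.nan],
    "Cylinders": [4, 4, 6],
})

# Same per-column count as df.isnull().sum() in the notebook
missing_per_column = sample.isnull().sum()

# Single boolean: is anything missing anywhere in the frame?
has_missing = bool(sample.isnull().values.any())

print(missing_per_column)
print("Any missing values:", has_missing)
```

On the notebook's `df`, the same expression would be expected to report `False`, given the all-zero counts above.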

In [None]:
df.head(2)

Unnamed: 0,Make,Model,Vehicle Class,Engine Size(L),Cylinders,Transmission,Fuel Type,Fuel Consumption City (L/100 km),Fuel Consumption Hwy (L/100 km),Fuel Consumption Comb (L/100 km),Fuel Consumption Comb (mpg),CO2 Emissions(g/km)
0,ACURA,ILX,COMPACT,2.0,4,AS5,Z,9.9,6.7,8.5,33,196
1,ACURA,ILX,COMPACT,2.4,4,M6,Z,11.2,7.7,9.6,29,221


In [None]:
## List all column names
df.columns

Index(['Make', 'Model', 'Vehicle Class', 'Engine Size(L)', 'Cylinders',
       'Transmission', 'Fuel Type', 'Fuel Consumption City (L/100 km)',
       'Fuel Consumption Hwy (L/100 km)', 'Fuel Consumption Comb (L/100 km)',
       'Fuel Consumption Comb (mpg)', 'CO2 Emissions(g/km)'],
      dtype='object')

In [None]:
df['Make'].unique()

array(['ACURA', 'ALFA ROMEO', 'ASTON MARTIN', 'AUDI', 'BENTLEY', 'BMW',
       'BUICK', 'CADILLAC', 'CHEVROLET', 'CHRYSLER', 'DODGE', 'FIAT',
       'FORD', 'GMC', 'HONDA', 'HYUNDAI', 'INFINITI', 'JAGUAR', 'JEEP',
       'KIA', 'LAMBORGHINI', 'LAND ROVER', 'LEXUS', 'LINCOLN', 'MASERATI',
       'MAZDA', 'MERCEDES-BENZ', 'MINI', 'MITSUBISHI', 'NISSAN',
       'PORSCHE', 'RAM', 'ROLLS-ROYCE', 'SCION', 'SMART', 'SRT', 'SUBARU',
       'TOYOTA', 'VOLKSWAGEN', 'VOLVO', 'GENESIS', 'BUGATTI'],
      dtype=object)

In [None]:
df['Vehicle Class'].unique()

array(['COMPACT', 'SUV - SMALL', 'MID-SIZE', 'TWO-SEATER', 'MINICOMPACT',
       'SUBCOMPACT', 'FULL-SIZE', 'STATION WAGON - SMALL',
       'SUV - STANDARD', 'VAN - CARGO', 'VAN - PASSENGER',
       'PICKUP TRUCK - STANDARD', 'MINIVAN', 'SPECIAL PURPOSE VEHICLE',
       'STATION WAGON - MID-SIZE', 'PICKUP TRUCK - SMALL'], dtype=object)

In [None]:
df['Cylinders'].unique()

array([ 4,  6, 12,  8, 10,  3,  5, 16])

In [None]:
df['Transmission'].unique()

array(['AS5', 'M6', 'AV7', 'AS6', 'AM6', 'A6', 'AM7', 'AV8', 'AS8', 'A7',
       'A8', 'M7', 'A4', 'M5', 'AV', 'A5', 'AS7', 'A9', 'AS9', 'AV6',
       'AS4', 'AM5', 'AM8', 'AM9', 'AS10', 'A10', 'AV10'], dtype=object)



### **Understanding the Code Format**  
Each transmission code consists of:  
- **A / M / AS / AM / AV** → The type of transmission  
- **Number (4 to 10)** → The number of gears (may be absent for a CVT, as in plain `AV`)  

### **Transmission Type Breakdown**  
- **A** → Automatic transmission (traditional torque-converter automatic)  
- **M** → Manual transmission  
- **AS** → Automatic with select shift (an automatic that also lets the driver pick gears manually)  
- **AM** → Automated manual transmission (typically covers single-clutch and dual-clutch designs)  
- **AV** → Continuously variable transmission (CVT)  

### **Examples and Their Meanings**  
- **M6** → 6-speed manual transmission  
- **A6** → 6-speed automatic transmission  
- **AS5** → 5-speed automatic with select shift  
- **AM6** → 6-speed automated manual transmission  
- **AV7** → CVT with 7 preset (simulated) gear ratios  
- **A10** → 10-speed automatic transmission  
- **AS8** → 8-speed automatic with select shift  
- **AV10** → CVT with 10 preset (simulated) gear ratios  

### **Summary**  
- **M** → Manual  
- **A** → Automatic  
- **AS** → Automatic with select shift  
- **AM** → Automated manual (e.g., single-clutch or dual-clutch)  
- **AV** → CVT  

In [None]:
df['Fuel Type'].unique()

array(['Z', 'D', 'X', 'E', 'N'], dtype=object)

Z → Premium gasoline

X → Regular gasoline

D → Diesel fuel

E → Ethanol (E85)

N → Natural gas
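For plots and reports, the single-letter codes can be mapped to readable labels. The sketch below assumes the code meanings from the dataset's published description (X regular gasoline, Z premium gasoline, D diesel, E ethanol/E85, N natural gas); `FUEL_LABELS` and `sample` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Assumed code-to-label mapping, based on the dataset's published description
FUEL_LABELS = {
    "X": "Regular gasoline",
    "Z": "Premium gasoline",
    "D": "Diesel",
    "E": "Ethanol (E85)",
    "N": "Natural gas",
}

# The five codes returned by df['Fuel Type'].unique()
sample = pd.Series(["Z", "D", "X", "E", "N"], name="Fuel Type")

# map() returns NaN for any code missing from the dictionary,
# so an unexpected code is easy to spot afterwards.
labels = sample.map(FUEL_LABELS)

print(labels.tolist())
```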

In [None]:
## Duplicate records
df[df.duplicated()]

Unnamed: 0,Make,Model,Vehicle Class,Engine Size(L),Cylinders,Transmission,Fuel Type,Fuel Consumption City (L/100 km),Fuel Consumption Hwy (L/100 km),Fuel Consumption Comb (L/100 km),Fuel Consumption Comb (mpg),CO2 Emissions(g/km)
1075,ACURA,RDX AWD,SUV - SMALL,3.5,6,AS6,Z,12.1,8.7,10.6,27,244
1076,ACURA,RLX,MID-SIZE,3.5,6,AS6,Z,11.9,7.7,10.0,28,230
1081,ALFA ROMEO,4C,TWO-SEATER,1.8,4,AM6,Z,9.7,6.9,8.4,34,193
1082,ASTON MARTIN,DB9,MINICOMPACT,5.9,12,A6,Z,18.0,12.6,15.6,18,359
1084,ASTON MARTIN,V8 VANTAGE,TWO-SEATER,4.7,8,AM7,Z,17.4,11.3,14.7,19,338
...,...,...,...,...,...,...,...,...,...,...,...,...
7356,TOYOTA,Tundra,PICKUP TRUCK - STANDARD,5.7,8,AS6,X,17.7,13.6,15.9,18,371
7365,VOLKSWAGEN,Golf GTI,COMPACT,2.0,4,M6,X,9.8,7.3,8.7,32,203
7366,VOLKSWAGEN,Jetta,COMPACT,1.4,4,AS8,X,7.8,5.9,7.0,40,162
7367,VOLKSWAGEN,Jetta,COMPACT,1.4,4,M6,X,7.9,5.9,7.0,40,163


In [None]:
## Remove the duplicates
df.drop_duplicates(inplace=True)

In [None]:
df.shape

(6282, 12)
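The shape went from (7385, 12) to (6282, 12), so 7385 - 6282 = 1103 duplicate rows were removed. A minimal sketch of checking that count before dropping, on a small frame built from rows shown in the outputs above (`sample`, `n_dupes`, and `deduped` are illustrative names):

```python
import pandas as pd

# Three distinct rows plus one exact repeat of the first
sample = pd.DataFrame({
    "Make": ["ACURA", "ACURA", "ACURA", "ALFA ROMEO"],
    "Model": ["RDX AWD", "RLX", "RDX AWD", "4C"],
    "CO2 Emissions(g/km)": [244, 230, 244, 193],
})

n_before = len(sample)

# duplicated() flags every repeat after the first occurrence
n_dupes = int(sample.duplicated().sum())

# Non-inplace variant: returns a new frame and leaves `sample` untouched
deduped = sample.drop_duplicates().reset_index(drop=True)

print(f"{n_dupes} duplicate row(s) removed, {len(deduped)} of {n_before} kept")
```

Resetting the index is optional; it only avoids gaps in the row labels after the drop.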

In [None]:
print(df.dtypes)  # Check column data types


Make                                 object
Model                                object
Vehicle Class                        object
Engine Size(L)                      float64
Cylinders                             int64
Transmission                         object
Fuel Type                            object
Fuel Consumption City (L/100 km)    float64
Fuel Consumption Hwy (L/100 km)     float64
Fuel Consumption Comb (L/100 km)    float64
Fuel Consumption Comb (mpg)           int64
CO2 Emissions(g/km)                   int64
dtype: object


In [None]:
print(df.select_dtypes(include=['object']).nunique())  # Show unique counts for object (categorical) columns


Make               42
Model            2053
Vehicle Class      16
Transmission       27
Fuel Type           5
dtype: int64


In [None]:
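## Correlation
# Calling df.corr() on the full frame raises
# "ValueError: could not convert string to float: 'ACURA'"
# because of the string columns, so keep only the numeric columns first.
correlation_matrix = df.select_dtypes(include="number").corr()
print(correlation_matrix)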
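Calling `corr()` on a frame that still holds string columns such as `Make` raises a `ValueError` in recent pandas versions. A minimal sketch of selecting the numeric columns first and then ranking features by their correlation with the target, using a few rows copied from the `df.head()` output above (`sample`, `numeric`, and `target_corr` are illustrative names):

```python
import pandas as pd

# First five rows of the dataset, restricted to a few columns
sample = pd.DataFrame({
    "Make": ["ACURA"] * 5,
    "Engine Size(L)": [2.0, 2.4, 1.5, 3.5, 3.5],
    "Cylinders": [4, 4, 4, 6, 6],
    "Fuel Consumption Comb (L/100 km)": [8.5, 9.6, 5.9, 11.1, 10.6],
    "CO2 Emissions(g/km)": [196, 221, 136, 255, 244],
})

target = "CO2 Emissions(g/km)"

# Drop the string column so that corr() only sees numeric data
numeric = sample.select_dtypes(include="number")
corr = numeric.corr()

# Correlation of each feature with the target, strongest first
target_corr = corr[target].drop(target).sort_values(ascending=False)

print(target_corr)
```

Five rows are far too few to draw conclusions from; the point is only the column selection and the ranking step, which apply unchanged to the full `df`.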