## 💡 Learn more

The following DataCamp courses can help you review the skills needed for this challenge:

* [Introduction to Python](https://www.datacamp.com/courses/introduction-to-python)
* [Introduction to SQL](https://www.datacamp.com/courses/introduction-to-sql)

## 1️⃣ Python 🐍 - CO2 Emissions

Now let's move on to the competition and challenge.

## 📖 Background
You volunteer for a public policy advocacy organization in Canada, and your colleague asked you to help her draft recommendations for guidelines on CO2 emissions rules. 

After researching emissions data for a wide range of Canadian vehicles, she would like you to investigate which vehicles produce lower emissions.



## 💾 The data I

### You have access to seven years of CO2 emissions data for Canadian vehicles ([source](https://open.canada.ca/data/en/dataset/98f1a129-f628-4ce4-b24d-6f16bf24dd64#wb-auto-6)):

- "Make" - The company that manufactures the vehicle.
- "Model" - The vehicle's model.
- "Vehicle Class" - Vehicle class by utility, capacity, and weight.
- "Engine Size(L)" - The engine's displacement in liters.
- "Cylinders" - The number of cylinders.
- "Transmission" - The transmission type: A = Automatic, AM = Automatic Manual, AS = Automatic with select shift, AV = Continuously variable, M = Manual, 3 - 10 = the number of gears.
- "Fuel Type" - The fuel type: X = Regular gasoline, Z = Premium gasoline, D = Diesel, E = Ethanol (E85), N = Natural gas.
- "Fuel Consumption Comb (L/100 km)" - Combined city/highway (55%/45%) fuel consumption in liters per 100 km (L/100 km).
- "CO2 Emissions(g/km)" - The tailpipe carbon dioxide emissions in grams per kilometer for combined city and highway driving. 

The data comes from the Government of Canada's open data [website](https://open.canada.ca/en).
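
The "Transmission" codes described above combine a letter prefix with an optional gear count, so a value shaped like `AS6` would mean "Automatic with select shift, 6 gears". Below is a minimal sketch of how such a code could be split into its two parts. The helper name `decode_transmission` and the lookup table `TRANSMISSION_TYPES` are hypothetical and not part of the original notebook; the labels are copied from the column description.

```python
import re

# Hypothetical lookup table built from the column description above
TRANSMISSION_TYPES = {
    "A": "Automatic",
    "AM": "Automatic Manual",
    "AS": "Automatic with select shift",
    "AV": "Continuously variable",
    "M": "Manual",
}


def decode_transmission(code):
    """Split a code such as 'AS6' into (type label, number of gears or None)."""
    match = re.fullmatch(r"([A-Z]+)(\d+)?", code.strip())
    if match is None:
        raise ValueError(f"Unrecognized transmission code: {code!r}")
    letters, gears = match.groups()
    label = TRANSMISSION_TYPES.get(letters, "Unknown")
    # Codes without a trailing number carry no gear count
    n_gears = int(gears) if gears is not None else None
    return label, n_gears


print(decode_transmission("AS6"))
print(decode_transmission("AV"))
```

Applied to the real column, this could look like `cars['Transmission'].map(decode_transmission)`, which would make it possible to compare emissions by transmission type or by gear count.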


In [None]:
# Import the pandas and numpy packages
import pandas as pd
import numpy as np

# Load the data
cars = pd.read_csv('data/co2_emissions_canada.csv')

# Create NumPy arrays from the DataFrame columns
cars_makes = cars['Make'].to_numpy()
cars_models = cars['Model'].to_numpy()
cars_classes = cars['Vehicle Class'].to_numpy()
cars_engine_sizes = cars['Engine Size(L)'].to_numpy()
cars_cylinders = cars['Cylinders'].to_numpy()
cars_transmissions = cars['Transmission'].to_numpy()
cars_fuel_types = cars['Fuel Type'].to_numpy()
cars_fuel_consumption = cars['Fuel Consumption Comb (L/100 km)'].to_numpy()
cars_co2_emissions = cars['CO2 Emissions(g/km)'].to_numpy()

# Preview the dataframe
cars

In [None]:
# Look at the first ten items in the CO2 emissions array
cars_co2_emissions[:10]

## 💪 Challenge I
Help your colleague gain insights into the types of vehicles that have lower CO2 emissions. Include:

1. What is the median engine size in liters?
2. What is the average fuel consumption for regular gasoline (Fuel Type = X), premium gasoline (Z), ethanol (E), and diesel (D)?  
3. What is the correlation between fuel consumption and CO2 emissions?
4. Which vehicle class has lower average CO2 emissions, 'SUV - SMALL' or 'MID-SIZE'? 
5. What are the average CO2 emissions for all vehicles? For vehicles with an engine size of 2.0 liters or smaller?
6. Any other insights you found during your analysis?

In [None]:
# 1. What is the median engine size in liters?
median_engine_size = np.median(cars_engine_sizes)

print(f'Median engine size: {median_engine_size} L')

In [None]:
# 2. What is the average fuel consumption for regular gasoline (Fuel Type = X), premium gasoline (Z), ethanol (E), and diesel (D)?
fuel_labels = {'X': 'Regular gasoline', 'Z': 'Premium gasoline', 'E': 'Ethanol', 'D': 'Diesel'}

# Average combined fuel consumption for each fuel type of interest
for code, label in fuel_labels.items():
    avg = cars.loc[cars['Fuel Type'] == code, 'Fuel Consumption Comb (L/100 km)'].mean()
    print(f'{label} = {avg:.2f} L/100 km')
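
The per-fuel-type averages above can also be computed in a single grouped operation. The sketch below shows the idea with `groupby` on a tiny DataFrame whose rows are made up purely for illustration; they are not taken from the real dataset, and the variable names `toy` and `avg_by_fuel` are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration only; not values from the real dataset
toy = pd.DataFrame({
    "Fuel Type": ["X", "X", "Z", "Z", "D", "E"],
    "Fuel Consumption Comb (L/100 km)": [8.0, 10.0, 11.0, 13.0, 9.0, 16.0],
})

# One grouped mean instead of a separate boolean filter per fuel type
avg_by_fuel = (
    toy.groupby("Fuel Type")["Fuel Consumption Comb (L/100 km)"]
    .mean()
    .loc[["X", "Z", "E", "D"]]  # keep only these fuel types, in this order
)
print(avg_by_fuel)
```

With the real data, replacing `toy` with `cars` should give the same four averages as the filter-based version. The `.loc[...]` selection assumes every listed fuel type occurs at least once; a missing type would raise a `KeyError`.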

In [None]:
# 3. What is the correlation between fuel consumption and CO2 emissions?

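# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is the coefficient
correlation = np.corrcoef(cars_fuel_consumption, cars_co2_emissions)[0, 1]
print(f'Correlation between fuel consumption and CO2 emissions: {correlation:.3f}')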
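
To make the output of `np.corrcoef` easier to read: for two 1-D inputs it returns a 2x2 matrix with ones on the diagonal and the Pearson correlation coefficient in the off-diagonal positions. The sketch below illustrates this on made-up numbers (not values from the dataset) and checks the result against the textbook definition, covariance divided by the product of the standard deviations.

```python
import numpy as np

# Made-up values for illustration only; y is an exact linear function of x
x = np.array([5.0, 7.5, 10.0, 12.5])
y = 23.0 * x + 4.0

matrix = np.corrcoef(x, y)  # 2x2 correlation matrix
r = matrix[0, 1]            # coefficient between x and y

# Same coefficient from the definition: cov(x, y) / (std(x) * std(y))
r_manual = np.mean((x - x.mean()) * (y - y.mean())) / (x.std() * y.std())

print(matrix.shape)
print(round(r, 6), round(r_manual, 6))
```

Because `y` is constructed as a perfectly linear function of `x`, both values come out at 1 up to floating-point rounding. Real measurements would generally give a coefficient below 1 in absolute value.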

In [None]:
# 4. Which vehicle class has lower average CO2 emissions, 'SUV - SMALL' or 'MID-SIZE'?
car_types = ('SUV - SMALL', 'MID-SIZE')

# Average CO2 emissions for each of the two classes
class_means = {
    tp: cars.loc[cars['Vehicle Class'] == tp, 'CO2 Emissions(g/km)'].mean()
    for tp in car_types
}

# Pick the class with the lower average
car_class = min(class_means, key=class_means.get)
print(f'{car_class} has lower average CO2 emissions: {class_means[car_class]:.2f} g/km')

In [None]:
# 5. What are the average CO2 emissions for all vehicles? For vehicles with an engine size of 2.0 liters or smaller?

gnrl_avg = cars_co2_emissions.mean()
eng_size_small = cars[cars['Engine Size(L)'] <= 2]['CO2 Emissions(g/km)'].mean()

print(f'Average CO2 emissions for all vehicles: {gnrl_avg:.2f} g/km\nAverage CO2 emissions for engines of 2.0 L or smaller: {eng_size_small:.2f} g/km')
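
The same comparison can be done directly on the NumPy arrays created when the data was loaded, using a boolean mask instead of a DataFrame filter. The sketch below uses made-up numbers (not values from the dataset) and hypothetical variable names so that it runs on its own.

```python
import numpy as np

# Made-up values for illustration only
engine_sizes = np.array([1.5, 2.0, 2.4, 3.5, 5.0])
co2 = np.array([150.0, 170.0, 200.0, 250.0, 330.0])

# True where the engine is 2.0 L or smaller
small_mask = engine_sizes <= 2.0

overall_avg = co2.mean()            # average over every vehicle
small_avg = co2[small_mask].mean()  # average over small engines only
share_small = small_mask.mean()     # fraction of vehicles with a small engine

print(overall_avg, small_avg, share_small)
```

In the notebook, the equivalent expression would be `cars_co2_emissions[cars_engine_sizes <= 2.0].mean()`. Taking the mean of the mask itself is a quick way to see how much of the dataset the subset covers.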

In [None]:
# 6. Any other insights you found during your analysis?

# Average CO2 emissions per make
for car in np.unique(cars_makes):
    print(car + ' : ' + str(cars[cars['Make'] == car]['CO2 Emissions(g/km)'].mean()))

# Vehicles sorted by number of cylinders, from most to fewest
maximo = cars.sort_values('Cylinders', ascending=False)
print(maximo)
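
The per-make averages printed above are easier to compare once they are ranked. One possible approach is to group by make, take the mean, and sort the result. The sketch below demonstrates this on a made-up DataFrame; the makes `AAA`, `BBB`, and `CCC` and their values are invented for illustration.

```python
import pandas as pd

# Made-up rows for illustration only; the makes and values are invented
toy = pd.DataFrame({
    "Make": ["AAA", "AAA", "BBB", "BBB", "CCC"],
    "CO2 Emissions(g/km)": [200.0, 220.0, 150.0, 170.0, 300.0],
})

# Mean emissions per make, ordered from lowest to highest
avg_by_make = (
    toy.groupby("Make")["CO2 Emissions(g/km)"]
    .mean()
    .sort_values()
)
print(avg_by_make)
```

With the real data, `avg_by_make.head(10)` and `avg_by_make.tail(10)` would list the makes with the lowest and highest average emissions. Grouping by `'Cylinders'` instead of `'Make'` is a similar way to check how emissions change with cylinder count.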