# Pandas: grouping

In [4]:
import pandas as pd
import numpy as np

In [90]:
cars = pd.read_csv("vehicles.csv")

In [50]:
cars.head()

Unnamed: 0,Make,Model,Year,Engine Displacement,Cylinders,Transmission,Drivetrain,Vehicle Class,Fuel Type,Fuel Barrels/Year,City MPG,Highway MPG,Combined MPG,CO2 Emission Grams/Mile,Fuel Cost/Year
0,AM General,DJ Po Vehicle 2WD,1984,2.5,4.0,Automatic 3-spd,2-Wheel Drive,Special Purpose Vehicle 2WD,Regular,19.388824,18,17,17,522.764706,1950
1,AM General,FJ8c Post Office,1984,4.2,6.0,Automatic 3-spd,2-Wheel Drive,Special Purpose Vehicle 2WD,Regular,25.354615,13,13,13,683.615385,2550
2,AM General,Post Office DJ5 2WD,1985,2.5,4.0,Automatic 3-spd,Rear-Wheel Drive,Special Purpose Vehicle 2WD,Regular,20.600625,16,17,16,555.4375,2100
3,AM General,Post Office DJ8 2WD,1985,4.2,6.0,Automatic 3-spd,Rear-Wheel Drive,Special Purpose Vehicle 2WD,Regular,25.354615,13,13,13,683.615385,2550
4,ASC Incorporated,GNX,1987,3.8,6.0,Automatic 4-spd,Rear-Wheel Drive,Midsize Cars,Premium,20.600625,14,21,16,555.4375,2550


In [91]:
cars.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 35952 entries, 0 to 35951
Data columns (total 15 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   Make                     35952 non-null  object 
 1   Model                    35952 non-null  object 
 2   Year                     35952 non-null  int64  
 3   Engine Displacement      35952 non-null  float64
 4   Cylinders                35952 non-null  float64
 5   Transmission             35952 non-null  object 
 6   Drivetrain               35952 non-null  object 
 7   Vehicle Class            35952 non-null  object 
 8   Fuel Type                35952 non-null  object 
 9   Fuel Barrels/Year        35952 non-null  float64
 10  City MPG                 35952 non-null  int64  
 11  Highway MPG              35952 non-null  int64  
 12  Combined MPG             35952 non-null  int64  
 13  CO2 Emission Grams/Mile  35952 non-null  float64
 14  Fuel Cost/Year        

First exploration of the dataset:

- How many observations does it have?
- Look at all the columns: do you understand what they mean?
- Look at the raw data: do you see anything weird?
- Look at the data types: are they the expected ones for the information the column contains?

In [92]:
cars.shape

(35952, 15)

### Cleaning and wrangling data

- Some car brand names refer to the same brand. Replace all brand names that contain the word "Dutton" with simply "Dutton". If you find similar examples, clean their names too. Use `loc` with boolean indexing.

- Convert CO2 emissions from Grams/Mile to Grams/Km.

- Create a binary column that indicates only whether a car's transmission is automatic or manual. Use `pandas.Series.str.startswith`.

- Convert the MPG columns to km per liter.

In [93]:
cars['Make'].value_counts()

Chevrolet                             3643
Ford                                  2946
Dodge                                 2360
GMC                                   2347
Toyota                                1836
                                      ... 
Excalibur Autos                          1
S and S Coach Company  E.p. Dutton       1
Environmental Rsch and Devp Corp         1
E. P. Dutton, Inc.                       1
Lambda Control Systems                   1
Name: Make, Length: 127, dtype: int64

In [94]:
list(cars['Make'].unique())

['AM General',
 'ASC Incorporated',
 'Acura',
 'Alfa Romeo',
 'American Motors Corporation',
 'Aston Martin',
 'Audi',
 'Aurora Cars Ltd',
 'Autokraft Limited',
 'BMW',
 'BMW Alpina',
 'Bentley',
 'Bertone',
 'Bill Dovell Motor Car Company',
 'Bitter Gmbh and Co. Kg',
 'Bugatti',
 'Buick',
 'CCC Engineering',
 'CX Automotive',
 'Cadillac',
 'Chevrolet',
 'Chrysler',
 'Consulier Industries Inc',
 'Dabryan Coach Builders Inc',
 'Dacia',
 'Daewoo',
 'Daihatsu',
 'Dodge',
 'E. P. Dutton, Inc.',
 'Eagle',
 'Environmental Rsch and Devp Corp',
 'Evans Automobiles',
 'Excalibur Autos',
 'Federal Coach',
 'Ferrari',
 'Fiat',
 'Fisker',
 'Ford',
 'GMC',
 'General Motors',
 'Genesis',
 'Geo',
 'Goldacre',
 'Grumman Allied Industries',
 'Grumman Olson',
 'Honda',
 'Hummer',
 'Hyundai',
 'Import Foreign Auto Sales Inc',
 'Import Trade Services',
 'Infiniti',
 'Isis Imports Ltd',
 'Isuzu',
 'J.K. Motors',
 'JBA Motorcars, Inc.',
 'Jaguar',
 'Jeep',
 'Kia',
 'Laforza Automobile Inc',
 'Lambda Control

In [95]:
cars['Make'].unique()

array(['AM General', 'ASC Incorporated', 'Acura', 'Alfa Romeo',
       'American Motors Corporation', 'Aston Martin', 'Audi',
       'Aurora Cars Ltd', 'Autokraft Limited', 'BMW', 'BMW Alpina',
       'Bentley', 'Bertone', 'Bill Dovell Motor Car Company',
       'Bitter Gmbh and Co. Kg', 'Bugatti', 'Buick', 'CCC Engineering',
       'CX Automotive', 'Cadillac', 'Chevrolet', 'Chrysler',
       'Consulier Industries Inc', 'Dabryan Coach Builders Inc', 'Dacia',
       'Daewoo', 'Daihatsu', 'Dodge', 'E. P. Dutton, Inc.', 'Eagle',
       'Environmental Rsch and Devp Corp', 'Evans Automobiles',
       'Excalibur Autos', 'Federal Coach', 'Ferrari', 'Fiat', 'Fisker',
       'Ford', 'GMC', 'General Motors', 'Genesis', 'Geo', 'Goldacre',
       'Grumman Allied Industries', 'Grumman Olson', 'Honda', 'Hummer',
       'Hyundai', 'Import Foreign Auto Sales Inc',
       'Import Trade Services', 'Infiniti', 'Isis Imports Ltd', 'Isuzu',
       'J.K. Motors', 'JBA Motorcars, Inc.', 'Jaguar', 'Jeep', 'Ki

In [96]:
cars['Make'] = list(map(lambda x: "BMW" if ( "BMW" in x ) else x, cars['Make']))

In [97]:
cars['Make'] = list(map(lambda x: "AMG" if ( "AM" in x ) else x, cars['Make']))

In [98]:
cars['Make'] = list(map(lambda x: "ASC" if ( "ASC " in x ) else x, cars['Make']))

In [99]:
cars['Make'] = list(map(lambda x: "Grumman" if ( "Grumman " in x ) else x, cars['Make']))

In [100]:
cars['Make'] = list(map(lambda x: "PAS, Inc" if ( "PAS " in x ) else x, cars['Make']))

In [101]:
cars['Make'].value_counts()

Chevrolet              3643
Ford                   2946
Dodge                  2360
GMC                    2347
Toyota                 1836
                       ... 
ASC                       1
London Taxi               1
Mahindra                  1
Qvale                     1
London Coach Co Inc       1
Name: Make, Length: 124, dtype: int64
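
The cells above clean the names with `map` and a `lambda`, while the task asks for `loc` with boolean indexing. A minimal sketch of that approach follows. It assumes a toy DataFrame (`toy`) in place of `cars`; the two Dutton spellings are taken from the `value_counts()` output earlier in the notebook.

```python
import pandas as pd

# Toy stand-in for cars['Make'] (hypothetical frame, not the real dataset).
toy = pd.DataFrame({"Make": ["E. P. Dutton, Inc.",
                             "S and S Coach Company  E.p. Dutton",
                             "Ford"]})

# Boolean mask: True for every row whose make mentions "Dutton".
is_dutton = toy["Make"].str.contains("Dutton", regex=False)

# .loc with the mask selects those rows; the assignment overwrites them in place.
toy.loc[is_dutton, "Make"] = "Dutton"

print(toy["Make"].tolist())  # ['Dutton', 'Dutton', 'Ford']
```

The same pattern should work for the other brands by swapping the search string and the replacement value.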

Converting Grams/Mile to Grams/Km

1 Mile = 1.60934 Km

Grams/Mile * Mile/Km -> Grams/Mile * (1 Mile / 1.60934 Km)

$$ \frac{\text{Grams}}{\text{Mile}} \times \frac{\text{Mile}}{\text{Km}} $$

$$ \frac{\text{Grams}}{\text{Mile}} \times \frac{1\ \text{Mile}}{1.60934\ \text{Km}} $$
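
As a quick check of the formula, here is a small sketch that applies the division to two Grams/Mile values taken from the outputs in this notebook. The Series and constant names are illustrative assumptions.

```python
import pandas as pd

KM_PER_MILE = 1.60934  # 1 mile = 1.60934 km

# Two sample values from the notebook outputs (first row and the Lamborghini Countach).
co2_per_mile = pd.Series([522.764706, 1269.571429], name="CO2 Emission Grams/Mile")

# Grams/Mile * (1 Mile / 1.60934 Km) is a plain element-wise division.
co2_per_km = (co2_per_mile / KM_PER_MILE).rename("CO2 Emission Grams/Km")

print(co2_per_km)
```

The second value should come out close to the 788.877073 g/km shown for the Countach in the sorted output further down.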

In [102]:
list(cars.columns)

['Make',
 'Model',
 'Year',
 'Engine Displacement',
 'Cylinders',
 'Transmission',
 'Drivetrain',
 'Vehicle Class',
 'Fuel Type',
 'Fuel Barrels/Year',
 'City MPG',
 'Highway MPG',
 'Combined MPG',
 'CO2 Emission Grams/Mile',
 'Fuel Cost/Year']

In [103]:
# Vectorized division: same result as mapping a lambda over the column
cars['CO2 Emission Grams/Km'] = cars['CO2 Emission Grams/Mile'] / 1.60934

In [104]:
list(cars.columns)

['Make',
 'Model',
 'Year',
 'Engine Displacement',
 'Cylinders',
 'Transmission',
 'Drivetrain',
 'Vehicle Class',
 'Fuel Type',
 'Fuel Barrels/Year',
 'City MPG',
 'Highway MPG',
 'Combined MPG',
 'CO2 Emission Grams/Mile',
 'Fuel Cost/Year',
 'CO2 Emission Grams/Km']

In [105]:
cars = cars.drop(columns="CO2 Emission Grams/Mile")
#cars.drop(columns="CO2 Emission Grams/Mile", inplace=True)

In [106]:
list(cars.columns)

['Make',
 'Model',
 'Year',
 'Engine Displacement',
 'Cylinders',
 'Transmission',
 'Drivetrain',
 'Vehicle Class',
 'Fuel Type',
 'Fuel Barrels/Year',
 'City MPG',
 'Highway MPG',
 'Combined MPG',
 'Fuel Cost/Year',
 'CO2 Emission Grams/Km']

Replacing the values of the column `Transmission` with either Automatic or Manual

In [107]:
cars['Transmission'].head()

0    Automatic 3-spd
1    Automatic 3-spd
2    Automatic 3-spd
3    Automatic 3-spd
4    Automatic 4-spd
Name: Transmission, dtype: object

In [108]:
cars['Transmission'].unique()

array(['Automatic 3-spd', 'Automatic 4-spd', 'Manual 5-spd',
       'Automatic (S5)', 'Manual 6-spd', 'Automatic 5-spd', 'Auto(AM8)',
       'Auto(AM-S8)', 'Auto(AV-S7)', 'Automatic (S6)', 'Automatic (S9)',
       'Automatic (S4)', 'Auto(AM-S9)', 'Automatic (S7)', 'Auto(AM7)',
       'Auto(AM-S7)', 'Auto(AM6)', 'Automatic 6-spd', 'Manual 4-spd',
       'Automatic (S8)', 'Manual(M7)', 'Auto(AM-S6)',
       'Automatic (variable gear ratios)', 'Automatic (AV)',
       'Auto(AV-S8)', 'Automatic (AM6)', 'Automatic 8-spd', 'Auto(A1)',
       'Automatic (A1)', 'Automatic (A6)', 'Auto(AV-S6)', 'Manual 3-spd',
       'Manual 7-spd', 'Automatic 9-spd', 'Auto (AV)', 'Automatic 6spd',
       'Auto(L4)', 'Auto(L3)', 'Auto (AV-S6)', 'Auto (AV-S8)',
       'Automatic (AV-S6)', 'Automatic 7-spd', 'Manual 5 spd',
       'Auto(AM5)', 'Automatic (AM5)'], dtype=object)

In [109]:
cars['Transmission'] = list( map(lambda x: "Automatic" if ("Auto" in x) else "Manual",cars['Transmission']) )
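
The task suggests `pandas.Series.str.startswith`. A minimal sketch of that variant follows, using a handful of transmission labels from the `unique()` output above; the variable names are assumptions for illustration.

```python
import pandas as pd

# A few labels copied from the unique() output above.
trans = pd.Series(["Automatic 3-spd", "Manual 5-spd", "Auto(AM8)", "Manual(M7)"])

# Boolean column: every automatic label in the output starts with "Auto".
is_automatic = trans.str.startswith("Auto")

# Optional: turn the boolean flag back into a readable two-value label.
label = is_automatic.map({True: "Automatic", False: "Manual"})

print(label.tolist())  # ['Automatic', 'Manual', 'Automatic', 'Manual']
```

Keeping the boolean column as well can be handy, since it can be used directly as a mask or averaged to get the share of automatics.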

Convert the MPG columns to km per liter

MPG = Miles/Gallon -> Km/Liter

1 Mile = 1.60934 Km

1 Gallon = 3.78541 Liters

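$$ \frac{\text{Miles}}{\text{Gallon}} \rightarrow \frac{\text{Miles}}{\text{Gallon}} \times \frac{\text{Km}}{\text{Miles}} \times \frac{\text{Gallon}}{\text{Liters}} $$

$$ \frac{\text{Miles}}{\text{Gallon}} \rightarrow \frac{\text{Miles}}{\text{Gallon}} \times \frac{1.60934\ \text{Km}}{1\ \text{Mile}} \times \frac{1\ \text{Gallon}}{3.78541\ \text{Liters}} $$

So each MPG value is multiplied by 1.60934 / 3.78541 ≈ 0.4251.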
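
The three MPG columns share the same factor, so the conversion can be written once and looped over. The sketch below assumes a toy frame (`toy`) with two of the columns; the 6 and 10 MPG values are the Lamborghini Countach figures from the outputs in this notebook.

```python
import pandas as pd

KM_PER_MILE = 1.60934
LITERS_PER_GALLON = 3.78541
MPG_TO_KM_PER_LITER = KM_PER_MILE / LITERS_PER_GALLON

# Toy stand-in for cars (hypothetical frame with two of the MPG columns).
toy = pd.DataFrame({"City MPG": [6, 18], "Highway MPG": [10, 17]})

mpg_cols = ["City MPG", "Highway MPG"]
for col in mpg_cols:
    # "City MPG" -> "City Km/Liter", and so on.
    toy[col.replace("MPG", "Km/Liter")] = toy[col] * MPG_TO_KM_PER_LITER

# Drop the original MPG columns once the converted ones exist.
toy = toy.drop(columns=mpg_cols)

print(toy)
```

The first row should be close to 2.550857 and 4.251429 km/l, the values the notebook later shows for the Countach.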


In [110]:
list(cars.columns)

['Make',
 'Model',
 'Year',
 'Engine Displacement',
 'Cylinders',
 'Transmission',
 'Drivetrain',
 'Vehicle Class',
 'Fuel Type',
 'Fuel Barrels/Year',
 'City MPG',
 'Highway MPG',
 'Combined MPG',
 'Fuel Cost/Year',
 'CO2 Emission Grams/Km']

In [111]:
cars['City Km/Liter'] = cars['City MPG'] * (1.60934 / 3.78541)

In [112]:
cars.drop(columns="City MPG", inplace=True)

In [113]:
cars['Highway Km/Liter'] = cars['Highway MPG'] * (1.60934 / 3.78541)
cars.drop(columns="Highway MPG", inplace=True)

In [114]:
cars['Combined Km/Liter'] = cars['Combined MPG'] * (1.60934 / 3.78541)
cars.drop(columns="Combined MPG", inplace=True)

### Gathering insights:

- How many car makers are there? How many models? Which car maker has the most cars in the dataset?

- When were these cars made?

- How big is the engine of these cars?

- What's the frequency of different transmissions, drivetrains and fuel types?

- What's the car that consumes the least/most fuel?

How many makes?

In [115]:
len(cars['Make'].unique().tolist())

124

In [116]:
cars['Make'].value_counts()

Chevrolet              3643
Ford                   2946
Dodge                  2360
GMC                    2347
Toyota                 1836
                       ... 
ASC                       1
London Taxi               1
Mahindra                  1
Qvale                     1
London Coach Co Inc       1
Name: Make, Length: 124, dtype: int64

How many models?

In [77]:
### your code is here

3608
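
One possible way to count models is `nunique()`. The sketch below uses a toy frame with placeholder makes and models (`A`, `B`, `x`, `y` are made up) and also shows why counting distinct (Make, Model) pairs can give a different number when two makes share a model name.

```python
import pandas as pd

# Hypothetical data: makes A and B both sell a model called "x".
toy = pd.DataFrame({"Make":  ["A", "A", "B", "B"],
                    "Model": ["x", "x", "x", "y"]})

# Distinct model names, regardless of who makes them.
n_models = toy["Model"].nunique()

# Distinct (Make, Model) combinations.
n_pairs = toy[["Make", "Model"]].drop_duplicates().shape[0]

print(n_models, n_pairs)  # 2 3
```

Which of the two counts answers "how many models" depends on whether the same model name under two makes should count once or twice.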

Which car maker has the most cars?

In [117]:
make = cars['Make'].value_counts().index[0]
make

'Chevrolet'

Group the data by Make using the count function

In [79]:
### your code is here

Make,Model,Year,Engine Displacement,Cylinders,Transmission,Drivetrain,Vehicle Class,Fuel Type,Fuel Barrels/Year,Fuel Cost/Year,CO2 Emission Grams/Km,City Km/Liter,Highway Km/Liter,Combined Km/Liter
AMG,4,4,4,4,4,4,4,4,4,4,4,4,4,4
ASC,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Acura,302,302,302,302,302,302,302,302,302,302,302,302,302,302
Alfa Romeo,41,41,41,41,41,41,41,41,41,41,41,41,41,41
American Motors Corporation,22,22,22,22,22,22,22,22,22,22,22,22,22,22
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Volkswagen,1047,1047,1047,1047,1047,1047,1047,1047,1047,1047,1047,1047,1047,1047
Volvo,717,717,717,717,717,717,717,717,717,717,717,717,717,717
Wallace Environmental,32,32,32,32,32,32,32,32,32,32,32,32,32,32
Yugo,8,8,8,8,8,8,8,8,8,8,8,8,8,8


In [118]:
cars.groupby('Make')['Model'].count()

Make
AMG                               4
ASC                               1
Acura                           302
Alfa Romeo                       41
American Motors Corporation      22
                               ... 
Volkswagen                     1047
Volvo                           717
Wallace Environmental            32
Yugo                              8
smart                            20
Name: Model, Length: 124, dtype: int64

In [81]:
cars.groupby('Make').count()['Model']

Make
AMG                               4
ASC                               1
Acura                           302
Alfa Romeo                       41
American Motors Corporation      22
                               ... 
Volkswagen                     1047
Volvo                           717
Wallace Environmental            32
Yugo                              8
smart                            20
Name: Model, Length: 124, dtype: int64

When were the cars of the make with the most cars made?

In [44]:
cars[ cars['Make'] == "Chevrolet" ][['Make','Model','Year','Engine Displacement']] 

Unnamed: 0,Make,Model,Year,Engine Displacement
4275,Chevrolet,Astro 2WD (cargo),1985,2.5
4276,Chevrolet,Astro 2WD (cargo),1985,4.3
4277,Chevrolet,Astro 2WD (cargo),1985,4.3
4278,Chevrolet,Astro 2WD (cargo),1985,4.3
4279,Chevrolet,Astro 2WD (cargo),1985,2.5
...,...,...,...,...
7913,Chevrolet,Volt,2013,1.4
7914,Chevrolet,Volt,2014,1.4
7915,Chevrolet,Volt,2015,1.4
7916,Chevrolet,Volt,2016,1.5
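
The filtered table lists every row; to summarize the period in one line, the year range can be reduced to its minimum and maximum. This is a sketch on a toy frame with made-up makes and years, not the notebook's own solution.

```python
import pandas as pd

# Hypothetical data standing in for cars[['Make', 'Year']].
toy = pd.DataFrame({"Make": ["A", "A", "A", "B"],
                    "Year": [1985, 1999, 2016, 2001]})

# Make with the most rows.
top_make = toy["Make"].value_counts().idxmax()

# Earliest and latest year for that make.
year_span = toy.loc[toy["Make"] == top_make, "Year"].agg(["min", "max"])

print(top_make, year_span["min"], year_span["max"])  # A 1985 2016
```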


In [119]:
cars['Transmission'].value_counts()

Automatic    24290
Manual       11662
Name: Transmission, dtype: int64

In [120]:
cars.columns

Index(['Make', 'Model', 'Year', 'Engine Displacement', 'Cylinders',
       'Transmission', 'Drivetrain', 'Vehicle Class', 'Fuel Type',
       'Fuel Barrels/Year', 'Fuel Cost/Year', 'CO2 Emission Grams/Km',
       'City Km/Liter', 'Highway Km/Liter', 'Combined Km/Liter'],
      dtype='object')

In [121]:
cars['Drivetrain'].value_counts()

Front-Wheel Drive             13044
Rear-Wheel Drive              12726
4-Wheel or All-Wheel Drive     6503
All-Wheel Drive                2039
4-Wheel Drive                  1058
2-Wheel Drive                   423
Part-time 4-Wheel Drive         158
2-Wheel Drive, Front              1
Name: Drivetrain, dtype: int64

In [122]:
cars['Fuel Type'].value_counts()

Regular                        23587
Premium                         9921
Gasoline or E85                 1195
Diesel                           911
Premium or E85                   121
Midgrade                          74
CNG                               60
Premium and Electricity           20
Gasoline or natural gas           20
Premium Gas or Electricity        17
Regular Gas and Electricity       16
Gasoline or propane                8
Regular Gas or Electricity         2
Name: Fuel Type, dtype: int64

Cars that consume the most (max) or the least (min) fuel per year.

Fuel Barrels/Year

In [123]:
cars['Fuel Barrels/Year'].max()

47.08714285714285

In [32]:
cars[ cars['Fuel Barrels/Year'] == cars['Fuel Barrels/Year'].max()]

Unnamed: 0,Make,Model,Year,Engine Displacement,Cylinders,Transmission,Drivetrain,Vehicle Class,Fuel Type,Fuel Barrels/Year,City MPG,Highway MPG,Combined MPG,CO2 Emission Grams/Mile,Fuel Cost/Year
20894,Lamborghini,Countach,1986,5.2,12.0,Manual 5-spd,Rear-Wheel Drive,Two Seaters,Premium,47.087143,6,10,7,1269.571429,5800
20895,Lamborghini,Countach,1987,5.2,12.0,Manual 5-spd,Rear-Wheel Drive,Two Seaters,Premium,47.087143,6,10,7,1269.571429,5800
20896,Lamborghini,Countach,1988,5.2,12.0,Manual 5-spd,Rear-Wheel Drive,Two Seaters,Premium,47.087143,6,10,7,1269.571429,5800
20897,Lamborghini,Countach,1989,5.2,12.0,Manual 5-spd,Rear-Wheel Drive,Two Seaters,Premium,47.087143,6,10,7,1269.571429,5800
20898,Lamborghini,Countach,1990,5.2,12.0,Manual 5-spd,Rear-Wheel Drive,Two Seaters,Premium,47.087143,6,10,7,1269.571429,5800


In [33]:
cars[ cars['Fuel Barrels/Year'] == cars['Fuel Barrels/Year'].min()]

Unnamed: 0,Make,Model,Year,Engine Displacement,Cylinders,Transmission,Drivetrain,Vehicle Class,Fuel Type,Fuel Barrels/Year,City MPG,Highway MPG,Combined MPG,CO2 Emission Grams/Mile,Fuel Cost/Year
17395,Honda,Civic Natural Gas,2012,1.8,4.0,Automatic 5-spd,Front-Wheel Drive,Compact Cars,CNG,0.06,27,38,31,228.694355,1000
17396,Honda,Civic Natural Gas,2013,1.8,4.0,Automatic 5-spd,Front-Wheel Drive,Compact Cars,CNG,0.06,27,38,31,218.0,1000
17397,Honda,Civic Natural Gas,2014,1.8,4.0,Automatic 5-spd,Front-Wheel Drive,Compact Cars,CNG,0.06,27,38,31,218.0,1000
17398,Honda,Civic Natural Gas,2015,1.8,4.0,Automatic 5-spd,Front-Wheel Drive,Compact Cars,CNG,0.06,27,38,31,218.0,1000


Drop the column "Combined Km/Liter" (the converted "Combined MPG")

In [124]:
cars.drop(columns="Combined Km/Liter",inplace=True)

In [125]:
cars.columns

Index(['Make', 'Model', 'Year', 'Engine Displacement', 'Cylinders',
       'Transmission', 'Drivetrain', 'Vehicle Class', 'Fuel Type',
       'Fuel Barrels/Year', 'Fuel Cost/Year', 'CO2 Emission Grams/Km',
       'City Km/Liter', 'Highway Km/Liter'],
      dtype='object')

In [126]:
# Change column names to these ones:
col_names = ["Brand", "Model", "Year", "Engine_cc", "Cyl", "Trans", "Drivetrain", "Class", "Fuel_type", "Barrels_per_year", "City_MPG", "Highway_MPG", "CO2_grams_per_km", "Fuel_cost_per_year"]

In [127]:
col_names = [ item.replace(" ","_") for item in cars.columns ]
cars.columns = col_names

In [128]:
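# Note: the previous cell already replaced spaces with underscores, so keys that
# contain spaces (e.g. "Engine Displacement") no longer match any column; only
# Make, Cylinders and Transmission are effectively renamed by this mapping.
conversion = {"Make": "Brand", "Model":"Model","Year": "Year", "Engine Displacement": "Engine_cc", 
 "Cylinders":"Cyl", "Transmission":"Trans", "Drivetrain": "Drivetrain", "Vehicle Class":"Class",
 "Fuel Type":"Fuel_Type", "Fuel Barrels/Year": "Barrels_per_year"}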

In [129]:
cars.rename(columns=conversion, inplace = True)

In [131]:
cars.columns

Index(['Brand', 'Model', 'Year', 'Engine_Displacement', 'Cyl', 'Trans',
       'Drivetrain', 'Vehicle_Class', 'Fuel_Type', 'Fuel_Barrels/Year',
       'Fuel_Cost/Year', 'CO2_Emission_Grams/Km', 'City_Km/Liter',
       'Highway_Km/Liter'],
      dtype='object')

What brand has the most cars?

In [None]:
### your code is here

What brand has the worst CO2 emissions on average?

Hint: use the function `sort_values()`

In [132]:
cars.sort_values("CO2_Emission_Grams/Km")

Unnamed: 0,Brand,Model,Year,Engine_Displacement,Cyl,Trans,Drivetrain,Vehicle_Class,Fuel_Type,Fuel_Barrels/Year,Fuel_Cost/Year,CO2_Emission_Grams/Km,City_Km/Liter,Highway_Km/Liter
3071,BMW,i3 REX,2016,0.6,2.0,Automatic,Rear-Wheel Drive,Subcompact Cars,Premium Gas or Electricity,1.563190,1050,22.990791,17.430857,15.730285
3069,BMW,i3 REX,2014,0.6,2.0,Automatic,Rear-Wheel Drive,Subcompact Cars,Premium Gas or Electricity,1.563190,1050,24.854909,17.430857,15.730285
3070,BMW,i3 REX,2015,0.6,2.0,Automatic,Rear-Wheel Drive,Subcompact Cars,Premium Gas or Electricity,1.563190,1050,24.854909,17.430857,15.730285
7916,Chevrolet,Volt,2016,1.5,4.0,Automatic,Front-Wheel Drive,Compact Cars,Regular Gas or Electricity,2.006844,800,31.690010,18.281143,17.856000
7917,Chevrolet,Volt,2017,1.5,4.0,Automatic,Front-Wheel Drive,Compact Cars,Regular Gas or Electricity,2.006844,800,31.690010,18.281143,17.856000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
20897,Lamborghini,Countach,1989,5.2,12.0,Manual,Rear-Wheel Drive,Two Seaters,Premium,47.087143,5800,788.877073,2.550857,4.251429
20894,Lamborghini,Countach,1986,5.2,12.0,Manual,Rear-Wheel Drive,Two Seaters,Premium,47.087143,5800,788.877073,2.550857,4.251429
20898,Lamborghini,Countach,1990,5.2,12.0,Manual,Rear-Wheel Drive,Two Seaters,Premium,47.087143,5800,788.877073,2.550857,4.251429
20896,Lamborghini,Countach,1988,5.2,12.0,Manual,Rear-Wheel Drive,Two Seaters,Premium,47.087143,5800,788.877073,2.550857,4.251429


<b>Show the average CO2_Emission_Grams/Km by Brand</b>

In [66]:
### your code is here

Brand,CO2_Emission_Grams/Km
AMG,379.881345
ASC,345.133719
Acura,262.583000
Alfa Romeo,288.287195
American Motors Corporation,314.264744
...,...
Volkswagen,244.038998
Volvo,270.796572
Wallace Environmental,408.857065
Yugo,221.251107


<b>Show the average CO2_Emission_Grams/Km by Brand, sorted</b>

In [67]:
### your code is here

Brand,CO2_Emission_Grams/Km
Fisker,105.011992
smart,153.498052
Fiat,189.311494
Daihatsu,192.742404
MINI,194.935105
...,...
Rolls-Royce,475.397772
Dutton,476.419879
Laforza Automobile Inc,502.012683
Bugatti,542.497235


In [68]:
### your code is here

Brand,CO2_Emission_Grams/Km
Vector,651.919248
Bugatti,542.497235
Laforza Automobile Inc,502.012683
Dutton,476.419879
Rolls-Royce,475.397772
...,...
MINI,194.935105
Daihatsu,192.742404
Fiat,189.311494
smart,153.498052


Use `pd.cut` or `pd.qcut` to create 4 groups (bins) of cars, by Year. We want to explore how cars have evolved decade by decade.

In [133]:
cars['Year'].describe()

count    35952.00000
mean      2000.71640
std         10.08529
min       1984.00000
25%       1991.00000
50%       2001.00000
75%       2010.00000
max       2017.00000
Name: Year, dtype: float64

In [134]:
## your code here

In [135]:
cars[['Year','Decade']]

Unnamed: 0,Year,Decade
0,1984,80s
1,1984,80s
2,1985,80s
3,1985,80s
4,1987,80s
...,...,...
35947,2013,10s
35948,2014,10s
35949,2015,10s
35950,2016,10s


In [136]:
cars.loc[:,['Year','Decade']]

Unnamed: 0,Year,Decade
0,1984,80s
1,1984,80s
2,1985,80s
3,1985,80s
4,1987,80s
...,...,...
35947,2013,10s
35948,2014,10s
35949,2015,10s
35950,2016,10s


In [137]:
cars["Year_range"]= pd.cut(cars["Year"], 
                             bins = [1980,1989,1999,2009,2019],
                             labels=["80s", "90s", "00s", "10s"])

cars.loc[:,['Year','Decade','Year_range']]

Unnamed: 0,Year,Decade,Year_range
0,1984,80s,80s
1,1984,80s,80s
2,1985,80s,80s
3,1985,80s,80s
4,1987,80s,80s
...,...,...,...
35947,2013,10s,10s
35948,2014,10s,10s
35949,2015,10s,10s
35950,2016,10s,10s


### Did cars consume more gas in the eighties?

In [138]:
cars.columns

Index(['Brand', 'Model', 'Year', 'Engine_Displacement', 'Cyl', 'Trans',
       'Drivetrain', 'Vehicle_Class', 'Fuel_Type', 'Fuel_Barrels/Year',
       'Fuel_Cost/Year', 'CO2_Emission_Grams/Km', 'City_Km/Liter',
       'Highway_Km/Liter', 'Decade', 'Year_range'],
      dtype='object')

Show the average City_Km/Liter by Year_range

In [None]:
### your code is here
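
One possible shape for this step, sketched on a toy frame with made-up years and consumption values. `observed=True` is passed explicitly because `Year_range` is categorical; that choice is an assumption about the desired output, not something the notebook specifies.

```python
import pandas as pd

# Hypothetical data standing in for cars.
toy = pd.DataFrame({"Year": [1984, 1988, 1995, 2005, 2015],
                    "City_Km/Liter": [6.0, 8.0, 7.0, 7.5, 9.0]})

toy["Year_range"] = pd.cut(toy["Year"],
                           bins=[1980, 1989, 1999, 2009, 2019],
                           labels=["80s", "90s", "00s", "10s"])

# Average city consumption per decade; groups follow the category order.
avg_city = toy.groupby("Year_range", observed=True)["City_Km/Liter"].mean()

print(avg_city)
```

A lower km/l figure means more fuel per kilometer, so the decade with the smallest average is the one that consumed the most.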

Which brands are more environmentally friendly?

In [77]:
### your code is here

Decade,Brand,CO2_Emission_Grams/Km
80s,AMG,379.881345
80s,ASC,345.133719
80s,Acura,268.497682
80s,Alfa Romeo,286.715163
80s,American Motors Corporation,314.264744
...,...,...
10s,Volkswagen,219.440984
10s,Volvo,250.429309
10s,Wallace Environmental,
10s,Yugo,


Does the drivetrain affect fuel consumption?

In [139]:
# Sort by City_Km/Liter, best first.
# sort_values also accepts a list of columns (later columns only break ties in earlier ones)
cars.groupby("Drivetrain")[["Highway_Km/Liter","City_Km/Liter"]].mean().sort_values("City_Km/Liter",ascending=False)

Drivetrain,Highway_Km/Liter,City_Km/Liter
"2-Wheel Drive, Front",14.029714,10.628571
Front-Wheel Drive,12.16621,9.002214
All-Wheel Drive,10.882531,7.785598
4-Wheel Drive,9.668584,7.190861
2-Wheel Drive,8.222444,6.64248
Rear-Wheel Drive,9.023946,6.556574
4-Wheel or All-Wheel Drive,8.34713,6.392049
Part-time 4-Wheel Drive,8.115385,6.215696


Do cars with automatic transmission consume more fuel than cars with manual transmission?

In [140]:
cars.columns

Index(['Brand', 'Model', 'Year', 'Engine_Displacement', 'Cyl', 'Trans',
       'Drivetrain', 'Vehicle_Class', 'Fuel_Type', 'Fuel_Barrels/Year',
       'Fuel_Cost/Year', 'CO2_Emission_Grams/Km', 'City_Km/Liter',
       'Highway_Km/Liter', 'Decade', 'Year_range'],
      dtype='object')

In [80]:
cars.groupby("Trans")[["City_Km/Liter"]].mean().sort_values("City_Km/Liter",ascending=False)

Trans,City_Km/Liter
Manual,7.968348
Automatic,7.278292


Use `groupby` and `aggregate` with different aggregation measures for different columns:

Aggregate with the average City_Km/Liter and the count of Trans

In [143]:
## your code is here

Trans,City_Km/Liter,Trans
Automatic,7.278292,24290
Manual,7.968348,11662


Aggregate with the minimum City_Km/Liter for each Trans

In [144]:
### your code is here

Trans,City_Km/Liter
Automatic,2.976
Manual,2.550857
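
Several aggregation measures can be combined in one call. The sketch below uses named aggregation on a toy frame with made-up values; the output column names (`avg_city`, `min_city`, `n_cars`) are hypothetical. It counts `City_Km/Liter` rather than `Trans`, which gives the same number as long as that column has no missing values.

```python
import pandas as pd

# Hypothetical data standing in for cars.
toy = pd.DataFrame({"Trans": ["Automatic", "Automatic", "Manual"],
                    "City_Km/Liter": [6.0, 8.0, 9.0]})

# Each keyword is an output column: (input column, aggregation function).
summary = toy.groupby("Trans").agg(
    avg_city=("City_Km/Liter", "mean"),
    min_city=("City_Km/Liter", "min"),
    n_cars=("City_Km/Liter", "count"),
)

print(summary)
```

Named aggregation keeps the result columns flat and self-describing, which avoids the multi-level headers that a dict of lists would produce.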
