# Programming Project - Unit 1,2
*by Igor A. Brandão and Leandro Max*

**Goals**
- Analyse and visualize data from **Monitoring of CO2 emissions from passenger cars in the EU**
- Tell a story with the data using Bokeh and Python
- Practice data visualization skills

Dictionary
==============================

We use the following field definitions to support our analysis.

**Category:**
European data

**Records:**
286960

|#|Field name|Field definition|Data type|Primary key|
|---|---|---|---|---|
|1|ID|ID|integer|Yes|
|2|MS|Member state|varchar(2)|No|
|3|MP|Manufacturer pooling|varchar(120)|No|
|4|MH|Manufacturer harmonised|varchar(120)|No|
|5|Man|Manufacturer name OEM declaration|varchar(120)|No|
|6|MMS|Manufacturer name as in MS registry|varchar(120)|No|
|7|T|Type|varchar(120)|No|
|8|Va|Variant|varchar(120)|No|
|9|Ve|Version|varchar(120)|No|
|10|Mk|Make|varchar(120)|No|
|11|Cn|Commercial name|varchar(120)|No|
|12|Ct|Category of the vehicle type approved|varchar(2)|No|
|13|R|Total new registrations|Integer|No|
|14|M (kg)|Mass|Integer|No|
|15|E (g/km)|Specific CO2 Emissions|Integer|No|
|16|W (mm)|Wheel Base|Integer|No|
|17|At1 (mm)|Axle width steering axle|Integer|No|
|18|At2 (mm)|Axle width other axle|Integer|No|
|19|Ft|Fuel type|varchar(120)|No|
|20|Fm|Fuel mode|varchar(1)|No|
|21|Ec (cm3)|Engine capacity|Integer|No|
|22|Z (Wh/km)|Electric energy consumption|Integer|No|
|23|IT|Innovative technology or group of innovative technologies|varchar(255)|No|
|24|Er (g/km)|Emissions reduction through innovative technologies|varchar(255)|No|

The information above was taken from [Monitoring of CO2 emissions from passenger cars - Data](http://www.eea.europa.eu/data-and-maps/data/co2-cars-emission/monitoring-of-co2-emissions-from)
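The field codes in the dictionary do not always match the column headers used in the code cells of this notebook letter for letter (for example `E (g/km)` in the table and `e (g/km)` in the code). A minimal sketch of a case-insensitive lookup, assuming only a subset of the table is needed; `FIELD_DEFINITIONS` and `describe_field` are hypothetical names introduced for illustration:

```python
# Subset of the dictionary table above: field code -> field definition
FIELD_DEFINITIONS = {
    "MS": "Member state",
    "MH": "Manufacturer harmonised",
    "Cn": "Commercial name",
    "M (kg)": "Mass",
    "E (g/km)": "Specific CO2 Emissions",
    "Ft": "Fuel type",
}

# Lower-cased index so that 'e (g/km)' and 'E (g/km)' resolve to the same entry
_BY_LOWER = {code.lower(): text for code, text in FIELD_DEFINITIONS.items()}


def describe_field(column_name):
    """Return the definition for a column, ignoring case; fall back to the raw name."""
    return _BY_LOWER.get(column_name.strip().lower(), column_name)


print(describe_field("e (g/km)"))
print(describe_field("Mh"))
```

Such a helper could supply readable axis labels for the plots without renaming the DataFrame columns.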

Code starts here!!!
==============================

## Dataset reader

This section reads the dataset.

In [2]:
import pandas as pd

# Load CO2_passenger_cars_v12.csv; sep=None lets the python engine detect the delimiter
data = pd.read_csv("CO2_passenger_cars_v12.csv", encoding='latin2', decimal=',', sep=None, engine='python')

# Display the DataFrame
data

Unnamed: 0,ďťżid,MS,MP,Mh,Man,MMS,TAN,T,Va,Ve,...,w (mm),at1 (mm),at2 (mm),Ft,Fm,ec (cm3),ep (KW),z (Wh/km),It,Er (g/km)
0,346261,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2007/46*0623*09,AU,AC4CRBCX0,FD6FD6D9004N7MJOMLVR2,...,2620.0,1527.0,1496.0,DIESEL,M,1968.0,110.0,,,
1,346262,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2007/46*0623*17,AU,GAC4CHHBX0,FD6FD6D9011S7MMON1ML71VR2,...,2626.0,1527.0,1496.0,PETROL,M,1984.0,162.0,,,
2,346263,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2001/116*0356*17,2EC2,KN4D1350N,MEC24VD9,...,3665.0,1710.0,1716.0,DIESEL,M,1968.0,120.0,,,
3,346264,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2007/46*0539*13,16,AECTHDX0,FD7FD7AM006N7MJVIVR0,...,2538.0,1570.0,1548.0,PETROL,M,1390.0,118.0,,,
4,346265,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2007/46*0539*14,16,ABCFFBX0,FD6FD62E018N7MJVIVR0,...,2524.0,1570.0,1546.0,DIESEL,M,1968.0,103.0,,,
5,346266,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2001/116*0242*49,1K,AECFHCX0,FD6FD62E018E7MJVI,...,2577.0,1527.0,1500.0,DIESEL,M,1968.0,103.0,,,
6,346267,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2001/116*0242*49,1K,AECCZBX0,FD6FD62E016N7MJVI,...,2577.0,1527.0,1500.0,PETROL,M,1984.0,155.0,,,
7,346268,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2001/116*0307*40,3C,ACDFCAX0,FD6FD6D9002SH7MMVR261,...,2786.0,1578.0,1562.0,DIESEL,M,1968.0,140.0,,,
8,346269,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2001/116*0220*35,7HC,KCAAC300X0,LNFM6500816NVR07MJG0S/26,...,3400.0,1618.0,1618.0,DIESEL,M,1968.0,103.0,,,
9,346270,LU,VW GROUP PC,VOLKSWAGEN,VOLKSWAGEN AG,VOLKSWAGEN AG,E1*2001/116*0510*25,6R,ABCJZD,FD7FD7CW003N2VR27MM62,...,2456.0,1441.0,1434.0,PETROL,M,1197.0,81.0,,,
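
The first column label in the output above, `ďťżid`, looks like a UTF-8 byte-order mark (BOM) decoded as Latin-2. A minimal sketch of that effect and one possible remedy, assuming the file really is UTF-8 with a BOM (this was not checked against the real CSV); the in-memory sample below is made up and only stands in for the file:

```python
import io

import pandas as pd

# Made-up two-row sample: UTF-8 bytes with a leading BOM, standing in for the real CSV
raw = "\ufeffid,MS,e (g/km)\n346261,LU,119\n346262,LU,145\n".encode("utf-8")

# The three BOM bytes decoded as Latin-2 give the stray prefix seen in the header
garbled_label = raw[:5].decode("latin2")

# The utf-8-sig codec strips the BOM, so the first column is simply "id"
sample = pd.read_csv(io.BytesIO(raw), encoding="utf-8-sig")

print(garbled_label)
print(list(sample.columns))
```

If the assumption holds, passing `encoding='utf-8-sig'` to the `read_csv` call in the reader cell would be worth trying; any other non-ASCII text in the file would need checking as well.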


## Commercial name x Specific CO2 Emissions

Here we analyse the relation between cars and their CO2 emissions. The chart below shows the median specific CO2 emissions for each harmonised manufacturer (`Mh`).

In [3]:
# Import the Bokeh chart, output and hover helpers
from bokeh.charts import Bar
from bokeh.io import output_notebook, show
from bokeh.models import HoverTool

# Median CO2 emission for each harmonised manufacturer (Mh)
mh_median = data.groupby('Mh')['e (g/km)'].median()

# Manufacturer names and their median emissions, in matching order
mh_group = list(mh_median.index)
mh_group_value = list(mh_median.values)

dataSet = {
    'company_name': mh_group,
    'emission': mh_group_value
}

# Each manufacturer has a single value, so agg='mean' leaves the medians unchanged
bar = Bar(dataSet, values='emission', label='company_name', color='company_name', agg='mean',
          title="Company CO2 emission", plot_width=2000, tools='pan,wheel_zoom,box_zoom,reset,hover')

# Show the manufacturer and its emission value on hover
hover = bar.select(dict(type=HoverTool))
hover.tooltips = [('Company name:', ' $x'), ('CO2 emission:', ' $y')]

# Render inside the notebook
output_notebook()

# Display the bar chart
show(bar)
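
The `bokeh.charts` module used in the cell above is not part of current Bokeh releases. A sketch of a comparable bar chart built with `bokeh.plotting.figure` and `vbar`, assuming a recent Bokeh version; the small DataFrame is made-up sample data, not the real dataset:

```python
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Made-up sample rows; the notebook itself would use the full `data` frame
sample = pd.DataFrame({
    "Mh": ["VOLKSWAGEN", "VOLKSWAGEN", "TOYOTA", "TOYOTA", "PORSCHE"],
    "e (g/km)": [119.0, 145.0, 104.0, 110.0, 211.0],
})

# Median emission per harmonised manufacturer (groupby sorts the keys)
medians = sample.groupby("Mh")["e (g/km)"].median()
source = ColumnDataSource(data={
    "company_name": list(medians.index),
    "emission": medians.tolist(),
})

# A categorical x_range gives one bar slot per manufacturer
bar = figure(x_range=list(medians.index),
             title="Median CO2 emission by manufacturer",
             tools="pan,wheel_zoom,box_zoom,reset")
bar.vbar(x="company_name", top="emission", width=0.8, source=source)

# Tooltips read their values from the data source columns
bar.add_tools(HoverTool(tooltips=[("Company name", "@company_name"),
                                  ("CO2 emission (g/km)", "@emission")]))

# In a notebook: call output_notebook() and then show(bar)
```

Reading tooltip values from the data source with `@column` reports the manufacturer name and the median directly, rather than the cursor coordinates.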

In [48]:
# Print the specific CO2 emissions column
print(data['e (g/km)'])

0         119.0
1         145.0
2         209.0
3         148.0
4         140.0
5         134.0
6         180.0
7         119.0
8         198.0
9         109.0
10        129.0
11        104.0
12        114.0
13        154.0
14        135.0
15        211.0
16        135.0
17        126.0
18        101.0
19        109.0
20        145.0
21        107.0
22        139.0
23        137.0
24        140.0
25        125.0
26        159.0
27        106.0
28        161.0
29        107.0
          ...  
440615    298.0
440616    298.0
440617    298.0
440618    298.0
440619    298.0
440620    298.0
440621    335.0
440622    335.0
440623    298.0
440624    298.0
440625    298.0
440626    298.0
440627    298.0
440628    335.0
440629    298.0
440630    298.0
440631    298.0
440632    298.0
440633    335.0
440634    298.0
440635    335.0
440636    298.0
440637    335.0
440638    343.0
440639    343.0
440640    343.0
440641    343.0
440642    343.0
440643    343.0
440644    343.0
Name: e (g/km), dtype: f
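
Printing the whole column mostly shows individual rows. A short sketch of summarising the values instead with `Series.describe()`; the five numbers are the first five emissions from the output above, used here only as a small sample:

```python
import pandas as pd

# First five values of the 'e (g/km)' column, copied from the output above
emissions = pd.Series([119.0, 145.0, 209.0, 148.0, 140.0], name="e (g/km)")

# count, mean, std, min, quartiles and max in one call
summary = emissions.describe()

print(summary[["count", "min", "50%", "max"]])
```

On the real data the same call would be `data['e (g/km)'].describe()`.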

In [49]:
data.groupby(['Cn']).groups.keys()

dict_keys([' ', ' A4 ALLROAD', ' A6 ALLROAD', '*', '-', '- -', '.', '002', '0U5FS0/1', '1', '1 SERIES', '1.6I RC', '1.6I-S', '100 SPORT', '106 XN ZEST 2', '107', '107 ACTIVE', '107 ACTIVE S-A', '107 ALLURE', '108', '108 ACCESS', '108 ACT EVTI68 3T', '108 ACT PT 82 5T', '108 ACT VTI 68 5T', '108 ACT VTI68 5T', '108 ACT VTI68 STS 3T', '108 ACT VTI68ASG5 3T', '108 ACTIVE', '108 ACTIVE S-A', '108 ACTIVE TOP', '108 ACTIVE TOP S-A', '108 ALL PT 82 3T', '108 ALL PT 82 5T', '108 ALL PT82 3T', '108 ALL PT82 5T', '108 ALL VTI68 STS 5T', '108 ALL VTI68ASG5 3T', '108 ALL VTI68ASG5 5T', '108 ALLURE', '108 ALLURE TOP', '108 ENVY EVTI68 5T', '108 FELINE', '108 ROLAND GARROS TOP', '108 TOPACT EVTI68 3T', '108 TOPACT PT 82 3T', '108 TOPACT PT82 3T', '108 TOPACT VTI 68 3T', '108 TOPACT VTI68ASG5', '108 TOPALL EVT68OS5T', '108 TOPALL PT 82 3T', '108 TOPALL PT 82 5T', '108 TOPALL PT82 HFST', '108 TOPALL VT68ASG 5', '108 TOPALL VTI68 STS', '108 TOPALL VTI68ASG5', '108 TOPENVY PT82 5T', '108 VTI 68 3T', '10
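
The keys above include placeholder entries (`' '`, `'*'`, `'-'`, `'.'`) and names with leading spaces (`' A4 ALLROAD'`). A minimal sketch of one way to tidy such values before grouping, assuming that an entry without any letter or digit carries no information; the sample list is copied from the keys above:

```python
import pandas as pd

# A few commercial names copied from the keys above
names = pd.Series([" ", " A4 ALLROAD", "*", "-", "- -", ".", "107", "108 ACTIVE"], name="Cn")

# Trim surrounding whitespace
cleaned = names.str.strip()

# Keep entries containing at least one letter or digit; the rest become missing values
cleaned = cleaned.where(cleaned.str.contains(r"[A-Za-z0-9]", regex=True))

print(cleaned.dropna().tolist())
```

Near-duplicates such as `'108 ACT VTI 68 5T'` and `'108 ACT VTI68 5T'` would still need a separate normalisation rule.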

In [1]:
import pandas as pd
from bokeh.io import output_notebook, show
from bokeh.layouts import gridplot
from bokeh.plotting import figure

# Reload the dataset so that this cell also runs on a fresh kernel
# (without it, the cell fails with "NameError: name 'data' is not defined")
data = pd.read_csv("CO2_passenger_cars_v12.csv", encoding='latin2', decimal=',', sep=None, engine='python')

# Makes to compare
makes = ['VOLKSWAGEN', 'PORSCHE', 'TOYOTA', 'MERCEDES AMG']

# One mass x emission scatter plot per make, using its first 180 records
plots = []
for make in makes:
    subset = data[data['Mk'] == make].head(180)
    p = figure(title=make, x_axis_label='m (kg)', y_axis_label='e (g/km)')
    p.circle(subset['m (kg)'], subset['e (g/km)'])
    plots.append(p)

p1, p2, p3, p4 = plots

# Link the x_range and y_range of p2 to p1
p2.x_range = p1.x_range
p2.y_range = p1.y_range

# Link the x_range of p3 to p1
p3.x_range = p1.x_range

# Link the y_range of p4 to p1
p4.y_range = p1.y_range

# Arrange the four plots in a 2 x 2 grid
layout = gridplot([[p1, p2], [p3, p4]], sizing_mode='scale_width')

# Render inside the notebook
output_notebook()

# Display the plot
show(layout)