# Home Learning Task for Altair
• Go to: https://archive.ics.uci.edu/ml/datasets.php
• Select a dataset.
• Then create a data dashboard using Altair.
• Then create a markdown cell explaining why you decided on your design choices, what influenced your decisions, and what insights you found from the data.
• Aim to have linked charts.

I have chosen a dataset from Kaggle: The London bike sharing dataset.
This data contains the following columns:

"timestamp" - timestamp field for grouping the data
"cnt" - the count of new bike shares
"t1" - real temperature in C
"t2" - "feels like" temperature in C
"hum" - humidity in percent
"wind_speed" - wind speed in km/h
"weather_code" - category of the weather
"is_holiday" - boolean field - 1 holiday / 0 non-holiday
"is_weekend" - boolean field - 1 if the day is a weekend day
"season" - category field, meteorological seasons: 0 - spring; 1 - summer; 2 - fall; 3 - winter

"weather_code" category description:

1 = Clear; mostly clear, but some values include haze / fog / patches of fog / fog in vicinity
2 = Scattered clouds / few clouds
3 = Broken clouds
4 = Cloudy
7 = Rain / light rain shower / light rain
10 = Rain with thunderstorm
26 = Snowfall
94 = Freezing fog
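The numeric weather codes are hard to read in legends and tooltips, so it can help to map them to short labels before plotting. The sketch below is one way to do that; the `weather_label` column and the `add_weather_label` helper are hypothetical names introduced here, and the labels are shortened versions of the category description above. It assumes the column has no missing codes.

```python
import pandas as pd

# Shortened labels based on the category description above.
WEATHER_LABELS = {
    1: "Clear",
    2: "Scattered clouds",
    3: "Broken clouds",
    4: "Cloudy",
    7: "Rain",
    10: "Rain with thunderstorm",
    26: "Snowfall",
    94: "Freezing fog",
}


def add_weather_label(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a hypothetical 'weather_label' column."""
    out = df.copy()
    # The CSV stores the codes as floats (1.0, 2.0, ...), so cast before mapping.
    # Assumption: no missing values in 'weather_code'.
    out["weather_label"] = out["weather_code"].astype(int).map(WEATHER_LABELS)
    return out


# Tiny stand-in frame; the real data would come from london_merged.csv.
demo = pd.DataFrame({"weather_code": [1.0, 3.0, 7.0, 26.0], "cnt": [182, 138, 134, 72]})
labelled = add_weather_label(demo)
print(labelled)
```

The new column could then be used in place of `weather_code:N` in an encoding, for example `color='weather_label:N'`.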


In [1]:
import altair as alt
import pandas as pd
bikes=pd.read_csv('london_merged.csv')

In [115]:
bikes.head()

Unnamed: 0,timestamp,cnt,t1,t2,hum,wind_speed,weather_code,is_holiday,is_weekend,season
0,2015-01-04 00:00:00,182,3.0,2.0,93.0,6.0,3.0,0.0,1.0,3.0
1,2015-01-04 01:00:00,138,3.0,2.5,93.0,5.0,1.0,0.0,1.0,3.0
2,2015-01-04 02:00:00,134,2.5,2.5,96.5,0.0,1.0,0.0,1.0,3.0
3,2015-01-04 03:00:00,72,2.0,2.0,100.0,0.0,1.0,0.0,1.0,3.0
4,2015-01-04 04:00:00,47,2.0,0.0,93.0,6.5,1.0,0.0,1.0,3.0


In [66]:
bikes.shape

(17414, 10)

In [116]:
bikes['date']=pd.to_datetime(bikes['timestamp'])
bikes['year']=bikes['date'].dt.year
bikes['month']=bikes['date'].dt.month
# Keep only the rows from 2016.
bikes=bikes[bikes['year']==2016]
# alt.Chart raises an error if I use the full dataset:
# MaxRowsError: The number of rows in your dataset is greater than the maximum allowed (5000).
# Hence we filter the data appropriately.
# We look individually at two months of 2016 (one in summer and one in winter).
# We also look at the whole of 2016, but with every alternate (hourly) row removed.
# In this way we can analyse the seasons of the year and look at variation within a month.
bikes2=bikes[bikes.index%2==0]
bikes2.shape

filter1=(bikes['month']==8)
filter2=(bikes['month']==1)
august=bikes[filter1]
jan=bikes[filter2]
august.shape, jan.shape

((740, 13), (744, 13))
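Dropping alternate rows is one way to get under Altair's 5000-row limit. Another option, sketched below with pandas only, is to aggregate the hourly rows to one row per day, which keeps every observation in the totals. The `daily_totals` helper and the synthetic `hourly` frame are assumptions made for illustration; with the real data the input would be the 2016 `bikes` frame with its `date`, `cnt` and `t1` columns.

```python
import numpy as np
import pandas as pd


def daily_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse hourly rows to one row per day: total shares and mean temperature."""
    return (
        df.set_index("date")
        .resample("D")
        .agg({"cnt": "sum", "t1": "mean"})
        .reset_index()
    )


# Synthetic hourly data for 2016 (a leap year), standing in for the real frame.
hours = pd.date_range("2016-01-01", periods=366 * 24, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
hourly = pd.DataFrame(
    {
        "date": hours,
        "cnt": rng.integers(0, 4000, size=len(hours)),
        "t1": rng.normal(12, 6, size=len(hours)),
    }
)

daily = daily_totals(hourly)
print(len(hourly), "rows reduced to", len(daily))
```

Daily totals hide the variation within a day, so this suits the seasonal charts better than the month-level scatter plots.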

In [147]:
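# Total bike shares per weather category (2016, alternate rows).
# sum(cnt) makes the aggregation explicit instead of stacking one bar segment per row.
bar_cnt_weather=alt.Chart(bikes2).mark_bar().encode(
    y='weather_code:N',
    x='sum(cnt):Q',
    color=alt.Color('weather_code:N', scale=alt.Scale(scheme='blues'))
).properties(
    width=800,
    height=300)

bar_cnt_weather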

In [131]:
# Total bike shares per month (2016, alternate rows).
bar_cnt_month=alt.Chart(bikes2).mark_bar().encode(
    y='month:N',
    x='sum(cnt):Q',
    color='month:N'
).properties(
    width=800,
    height=300
)
bar_cnt_month

In [157]:
alt.Chart(bikes2).mark_rect().encode(
    x='t1:Q',
    y='cnt:Q',
    color='weather_code:N'
)

In [137]:
alt.Chart(bikes2).mark_circle(size=60).encode(
    x='t1:Q',
    y='cnt:Q',
    color='wind_speed:Q',
    tooltip=['weather_code:N']
).interactive()

In [163]:
brush=alt.selection_interval()
points_august=alt.Chart(august).mark_point().encode(
    x='date',y='cnt',
    #color='weather_code:N',
    color=alt.condition(brush,'weather_code:N',alt.value('lightgray')),
    tooltip=['t1', 'hum', 'wind_speed']
    ).add_selection(brush)

In [165]:
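bars_august=alt.Chart(august).mark_bar().encode(
    y='weather_code:N',
    # count() counts the rows (hours) inside the brushed interval;
    # 'Origin' is not a column in this dataset.
    x='count():Q',
    color='weather_code:N').transform_filter(brush)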

points_august & bars_august

In [166]:
brush=alt.selection_interval()
points_jan=alt.Chart(jan).mark_point().encode(
    x='date',y='cnt',
    #color='weather_code:N',
    color=alt.condition(brush,'weather_code:N',alt.value('lightgray')),
    tooltip=['t1', 'hum', 'wind_speed']
    ).add_selection(brush)

In [167]:
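bars_jan=alt.Chart(jan).mark_bar().encode(
    y='weather_code:N',
    # count() counts the rows (hours) inside the brushed interval;
    # 'Origin' is not a column in this dataset.
    x='count():Q',
    color='weather_code:N').transform_filter(brush)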

points_jan & bars_jan

# Summary of data analytics with Altair 

I looked at bike sharing in London using the Altair visualization library. 

The first figure shows that bike shares are highest when the weather is clear, fall a little when it is rainy, and fall further when rain is accompanied by thunderstorms or when there is fog. I used a bar chart in a single colour to show this trend. As there are only a few weather categories and counts are expected to vary systematically with weather, I thought a single colour was appropriate.

During the summer months there are roughly twice as many shares as during the winter months (Figure 2). I used a bar chart here to see how bike shares vary by month, with a different colour for each month to make the chart colourful.

A heatmap lets me show three variables at once: temperature, bike shares and weather. It shows that bike shares are generally high when the weather is clear, but can be low when it is clear and the temperature is too high (25-30 C). I also used a scatter plot with filled circles to look at how bike shares vary with temperature and wind speed.

When we zoom into one of the most popular and one of the least popular months (August and January) to look for further detail, the linked interactive charts show that the relationship between bike shares and weather differs between summer and winter. In summer, bike shares are more strongly affected by the details of the weather, while in winter it seems to matter less whether it is cloudy or even a little rainy, as long as the rain isn't heavy.

Beyond visualization, and guided by it, we could build a model to predict bike shares from features such as month, temperature, weather, wind speed, and whether the day is a weekday or a weekend day.
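As a rough illustration of the modelling idea, the sketch below fits an ordinary least-squares model with NumPy. The linear form, the helper names `fit_linear_model` and `predict`, and the synthetic data are all assumptions made here; no model was fitted in this notebook, and a real analysis would need the actual data and a held-out test set.

```python
import numpy as np
import pandas as pd


def fit_linear_model(df, feature_cols, target_col="cnt"):
    """Fit cnt ~ intercept + features by ordinary least squares."""
    X = df[feature_cols].to_numpy(dtype=float)
    X = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    y = df[target_col].to_numpy(dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict(coef, df, feature_cols):
    """Apply the fitted coefficients to new rows."""
    X = df[feature_cols].to_numpy(dtype=float)
    return coef[0] + X @ coef[1:]


# Synthetic, noise-free data with a known linear relationship (not real results).
rng = np.random.default_rng(2)
n = 200
demo = pd.DataFrame(
    {
        "t1": rng.uniform(0, 30, n),
        "wind_speed": rng.uniform(0, 40, n),
        "is_weekend": rng.integers(0, 2, n),
    }
)
demo["cnt"] = 500 + 60 * demo["t1"] - 5 * demo["wind_speed"] - 100 * demo["is_weekend"]

features = ["t1", "wind_speed", "is_weekend"]
coef = fit_linear_model(demo, features)
print(dict(zip(["intercept"] + features, coef.round(2))))
```

The charts suggest that the effect of weather differs between summer and winter, so a purely additive linear model like this one would probably miss part of the pattern.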

# Tableau

Here is a link to my Tableau interactive Dashboard: 

https://public.tableau.com/views/Co2Dashboard_16340435251120/Co2Dashboard?:language=en-GB&publish=yes&:display_count=n&:origin=viz_share_link

<table>
<tr>
<td><img src="Co2 Dashboard-3.jpg" width="1000" height="600" alt="Tableau Dashboard"> Tableau Dashboard on CO2 emissions</td>
</tr>
</table>

The CO2 emissions data were downloaded from https://ourworldindata.org/co2-emissions    
    
Summary of findings:
The dashboard shows that total CO2 emissions are increasing, though the increase has slowed somewhat. CO2 emissions of some individual countries are also shown. While China's emissions are higher than those of the USA or Europe, its per capita emissions are much lower than both. These are also territorial emissions, and China produces goods for consumption in the US and Europe. CO2 emissions are correlated with GDP, so the GDPs of countries are shown on an area map.

Here is another Tableau Dashboard with linked visualizations:
https://public.tableau.com/views/CO2chart2/Dashboard1?:language=enGB&publish=yes&:display_count=n&:origin=viz_share_link
<table>
<tr>
<td><img src="CO2 Tableau Dashboard 2.jpg" width="1000" height="600" alt="Tableau Dashboard"> Tableau Dashboard on CO2 emissions</td>
</tr>
</table>