# Voila Web App

## A website built out of a Jupyter notebook using Voila

In [357]:
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interactive
%matplotlib inline

GLOBAL WARMING: Is Earth really getting warmer? 

The dataset chosen for this project is "Climate Change: Earth Surface Temperature Data" from [Kaggle](https://www.kaggle.com/datasets/berkeleyearth/climate-change-earth-surface-temperature-data).

There are 5 files (by city, by country, by major city, by state, and global) with 39 columns in total.

We will be using the global file.

I would like to find out, visually, whether global warming is accelerating.

This project has three tasks:

- The first task explores the mean temperature for each calendar month, to get a general idea of the temperatures for each month. It is done by taking the mean of every month from the historical data.
- The second task takes the data for each month and looks at how it evolves over time. It uses the same file, transformed so that a specific month can be selected.
- The last task shows that evolution in a scatter plot with a linear regression line, which makes it clearer whether there is global warming or not.

All three tasks use the data in the “GlobalTemperatures.csv” file. I take the role of a data analyst who carries out these tasks and provides a clear visualization for a committee. My goal is to show the effect of global warming on temperatures so the committee can take action.
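To make the idea behind the regression task concrete, here is a minimal sketch of how a fitted straight line summarizes a temperature trend. It uses synthetic yearly values, not the Kaggle data, and the names `years`, `temps`, and `true_slope` are made up for illustration. A positive slope means the series is warming on average.

```python
import numpy as np

# Synthetic yearly temperatures (NOT the Kaggle data): a known upward
# trend of 0.01 °C per year plus a small, repeatable wiggle.
years = np.arange(1900, 2016)
true_slope = 0.01
temps = 8.0 + true_slope * (years - 1900) + 0.1 * np.sin(years)

# Least-squares straight line: temps ≈ slope * years + intercept
slope, intercept = np.polyfit(years, temps, deg=1)

# Express the trend per century so it is easier to read
warming_per_century = slope * 100
print(f"Estimated trend: {warming_per_century:.2f} °C per century")
```

A straight line only describes the average rate of change. Judging whether warming is accelerating needs a curve that can bend, which is why the charts later in the notebook also draw a LOESS line.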

In [351]:
import pandas as pd
import altair as alt

# Load the global monthly temperature file from the Kaggle dataset
data = pd.read_csv("GlobalTemperatures.csv")
# Optional: keep only the measurements with low uncertainty
# data = data[data['LandAverageTemperatureUncertainty'] < 0.5]
# Number of non-null values in each column
data.count()


dt                                           3192
LandAverageTemperature                       3180
LandAverageTemperatureUncertainty            3180
LandMaxTemperature                           1992
LandMaxTemperatureUncertainty                1992
LandMinTemperature                           1992
LandMinTemperatureUncertainty                1992
LandAndOceanAverageTemperature               1992
LandAndOceanAverageTemperatureUncertainty    1992
dtype: int64
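`count()` reports the non-null values per column, so the output above implies that 12 rows have no land average (3192 - 3180) and 1200 rows have no max, min, or land-and-ocean values (3192 - 1992). The sketch below shows the relationship between `count()` and the number of missing values on a tiny made-up frame; `toy` and its values are hypothetical and only illustrate the idea.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical frame (NOT the real file) with gaps in the temperature columns
toy = pd.DataFrame({
    "dt": ["1750-01-01", "1750-02-01", "1750-03-01"],
    "LandAverageTemperature": [3.0, np.nan, 5.6],
    "LandMaxTemperature": [np.nan, np.nan, 11.2],
})

non_null = toy.count()       # what data.count() reports
missing = toy.isna().sum()   # number of NaN values per column

# For every column, non-null plus missing equals the number of rows
print(pd.DataFrame({"non_null": non_null, "missing": missing}))
```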

In [352]:
data.tail()

Unnamed: 0,dt,LandAverageTemperature,LandAverageTemperatureUncertainty,LandMaxTemperature,LandMaxTemperatureUncertainty,LandMinTemperature,LandMinTemperatureUncertainty,LandAndOceanAverageTemperature,LandAndOceanAverageTemperatureUncertainty
3187,2015-08-01,14.755,0.072,20.699,0.11,9.005,0.17,17.589,0.057
3188,2015-09-01,12.999,0.079,18.845,0.088,7.199,0.229,17.049,0.058
3189,2015-10-01,10.801,0.102,16.45,0.059,5.232,0.115,16.29,0.062
3190,2015-11-01,7.433,0.119,12.892,0.093,2.157,0.106,15.252,0.063
3191,2015-12-01,5.518,0.1,10.725,0.154,0.287,0.099,14.774,0.062


In [353]:
# Convert the date column to datetime64
data['dt'] = pd.to_datetime(data['dt'], format='%Y-%m-%d')
# Add the calendar month (1-12) as a column for grouping and filtering
data['month'] = data['dt'].dt.month

# Keep only the records from 1900 onward
filtered_df = data.loc[data['dt'] >= '1900-01-01'].copy()

# Mean of every numeric column per calendar month;
# numeric_only=True skips the 'dt' column
g_by_month = filtered_df.groupby('month').mean(numeric_only=True)
g_by_month_df = g_by_month.reset_index()
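As a sanity check on the grouping step, the same idea can be tried on a tiny hand-made frame where the result is easy to verify by eye. This is only a sketch: `toy` and its values are hypothetical, not taken from the dataset.

```python
import pandas as pd

# Hypothetical mini dataset: two years of made-up values for January and July
toy = pd.DataFrame({
    "dt": pd.to_datetime(["1900-01-01", "1900-07-01", "1901-01-01", "1901-07-01"]),
    "LandAverageTemperature": [2.0, 14.0, 3.0, 15.0],
})
toy["month"] = toy["dt"].dt.month

# One row per calendar month, holding the mean across all years
monthly_mean = (
    toy.groupby("month")["LandAverageTemperature"]
    .mean()
    .reset_index()
)
print(monthly_mean)
```

Each January value is averaged with the other Januaries, and likewise for July, which is the shape of table the monthly bar chart needs.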


In [354]:
# Interval brush: drag across the bar chart to select a range of months
brush = alt.selection_interval()
base_plot = alt.Chart().mark_circle(size=20).encode(
    alt.X('dt:T', title='Date'),
    alt.Y('LandAverageTemperature:Q', title='Monthly Average Temperature (°C)'),
    color=alt.condition(brush, 'LandAverageTemperature:Q', alt.value('lightgray')),
    size=alt.Size('LandMaxTemperature:Q', scale=alt.Scale(range=[1, 30])),
    tooltip=["LandAverageTemperature", "LandMaxTemperature", "LandMinTemperature"]
)
circle_plot = base_plot.transform_filter(
    brush
).properties(
    width=600,
)
reg = circle_plot.transform_regression(
        "dt",
        "LandAverageTemperature",
    ).mark_line(color="#1f77b4")

loess = circle_plot.transform_loess(
        "dt",
        "LandAverageTemperature",
    ).mark_line(color="#F4D03F")

scatter_plot = circle_plot.interactive()


# Mean land temperature per calendar month; the brush is attached to this chart
bars = alt.Chart().mark_bar(size=20).encode(
    x=alt.X("month:Q", bin=alt.Bin(maxbins=12)),
    y='mean(LandAverageTemperature):Q',
    color=alt.condition(brush, 'mean(LandAverageTemperature):Q', alt.value('lightgray')),
).properties(
    width=720
).add_selection(
    brush
)

# alt.vconcat(bars, scatter_plot, data=filtered_df)

In [355]:
# selection = alt.selection_multi(fields=['month'])
# color = alt.condition(selection,
#                       alt.Color('month:N', legend=None),
#                       alt.value('lightgray'))
# hist = alt.Chart(g_by_month_df).mark_bar(size=30).encode(
#     x = "month",
#     y = "LandAverageTemperature",
#     color = alt.Color("LandAverageTemperature", scale=alt.Scale(scheme = "spectral", reverse=True)),
#     tooltip = ["LandAverageTemperature", "LandAndOceanAverageTemperature"]
# ).interactive()
# hist.add_selection(
#     selection
# )


# reg = base_plot.transform_regression(
#         "dt",
#         "LandAverageTemperature",
#     ).mark_line(color="'#1f77b4'")

# loess = base_plot.transform_loess(
#         "dt",
#         "LandAverageTemperature",
#     ).mark_line(color="#F4D03F")

# scatter_plot_with_regress = (base_plot + loess + reg).interactive()
# alt.vconcat(hist, scatter_plot_with_regress)

In [356]:
simple_hist = alt.Chart(g_by_month_df).mark_bar(size=30).encode(
    x = "month",
    y = "LandAverageTemperature",
    color = alt.Color("LandAverageTemperature", scale=alt.Scale(scheme = "spectral", reverse=True)),
    tooltip = ["LandAverageTemperature", "LandAndOceanAverageTemperature"]
).interactive()

scatter_with_regress_plot = alt.Chart(filtered_df).mark_circle(size=20).encode(
    x=alt.X("dt", scale=alt.Scale(zero=False)),
    y=alt.Y("LandAverageTemperature", scale=alt.Scale(zero=False)),
    tooltip = ["LandAverageTemperature", "LandAndOceanAverageTemperature"]
)
reg = scatter_with_regress_plot.transform_regression(
        "dt",
        "LandAverageTemperature",
    ).mark_line(color="#000000")

loess = scatter_with_regress_plot.transform_loess(
        "dt",
        "LandAverageTemperature",
    ).mark_line(color="#F4D03F")

final_scatter = (scatter_with_regress_plot + loess + reg).interactive()
#.save('chartName.html')
alt.vconcat(bars, scatter_plot, final_scatter, simple_hist, data=filtered_df)

Key elements:
- Tooltips show the average, maximum, and minimum temperature for each data point, so the user can read exact values and compare a point with the one before it.
- The scatter plots can be zoomed to compare shorter time ranges and specific groups of temperatures.
- Selecting months in the bar chart (mean temperature by month) filters the scatter plot to those months, so temperatures can be compared within the same season.
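
The month filter above is interactive, but the same per-season comparison can be approximated outside the chart. The sketch below is an assumption-laden illustration, not the notebook's method: `month_trend` is a hypothetical helper, and the data is synthetic, with a known trend of 0.02 °C per year added on top of a seasonal cycle.

```python
import numpy as np
import pandas as pd

def month_trend(df, month):
    """Return the least-squares slope (°C per year) for one calendar month."""
    subset = df[df["dt"].dt.month == month].dropna(subset=["LandAverageTemperature"])
    years = subset["dt"].dt.year.to_numpy(dtype=float)
    temps = subset["LandAverageTemperature"].to_numpy(dtype=float)
    slope, _intercept = np.polyfit(years, temps, deg=1)
    return slope

# Synthetic monthly series (NOT the Kaggle data): seasonal cycle plus a
# steady rise of 0.02 °C per year
dates = pd.date_range("1900-01-01", "1909-12-01", freq="MS")
months = dates.month.to_numpy()
years = dates.year.to_numpy()
toy = pd.DataFrame({
    "dt": dates,
    "LandAverageTemperature": (
        8.0 + 6.0 * np.sin(2 * np.pi * (months - 4) / 12) + 0.02 * (years - 1900)
    ),
})

july_slope = month_trend(toy, 7)
print(f"July trend: {july_slope:.3f} °C per year")
```

Fitting each month separately removes the seasonal swing from the comparison, which is the same reason the bar chart selection restricts the scatter plot to one part of the year.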
