<div class="alert alert-info">
<u><strong>Authors:</strong></u> <b>Ahmed Mukhtar</b> (ahmed.mukhtar@mail.polimi.it) and <b>Ahmed Yassin</b> (ahmedmohamed1@mail.polimi.it) - 2023 - Politecnico di Milano, Italy <br>
</div>

# Visualization of Netatmo data

In [None]:
import pandas as pd
import geopandas as gpd
import ipywidgets as widgets
import datetime
from IPython.display import display

In [None]:
# Load autoreload before importing so edits to analysis_functions are picked up
%load_ext autoreload
%autoreload 2
import analysis_functions as af

In [None]:
year_w = widgets.Dropdown(
    options = list(range(2014, 2024)),
    value = 2023,
    description = 'Year:',
    disabled = False,
    layout = {'width': 'max-content'},
    style = {'description_width': 'initial'}
)
year_w

In [None]:
year = year_w.value

In [None]:
month_w = widgets.Dropdown(
    options = list(range(1, 13)),
    value = 1,
    description = 'Month:',
    disabled = False,
    layout = {'width': 'max-content'},
    style = {'description_width': 'initial'}
)
month_w

In [None]:
month = month_w.value

In [None]:
netatmo_out_path = './Netatmo_csv_files/'

# <u>Compute and plot **daily** statistics</u>

For each stage of the Netatmo data cleaning, display a map of the daily statistics (*min*, *max*, *mean*, *std*, *median*) and plot the time series of each sensor. The stages are:
1) Netatmo raw data (before cleaning)
2) After removing uncorrelated stations
3) After removing the unrealistic values
4) After removing the biased stations
5) After removing the local outliers
6) After removing the unreliable stations
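
Each stage in the list above corresponds to a dataset label that the cells below pass as the last argument of `af.resample_df_daily`. Collecting them in one mapping can make the sequence easier to follow. The names `CLEANING_STEPS` and `label_for_step` are illustrative and are not part of `analysis_functions`; only the label strings are taken from this notebook.

```python
# Label strings are those used in the calls in this notebook; the dictionary
# and helper are illustrative only and are not defined in analysis_functions.
CLEANING_STEPS = {
    1: ("Raw data (before cleaning)", "clip"),
    2: ("Uncorrelated stations removed", "high_corr"),
    3: ("Unrealistic values removed", "realistic"),
    4: ("Biased stations removed", "unbiased"),
    5: ("Local outliers removed", "clean"),
    6: ("Unreliable stations removed", "filtered"),
}


def label_for_step(step: int) -> str:
    """Return the dataset label for a cleaning step (1 to 6)."""
    try:
        return CLEANING_STEPS[step][1]
    except KeyError:
        raise ValueError(f"Unknown cleaning step: {step}") from None
```

For example, `label_for_step(4)` gives the label used in section 4 of the daily statistics.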

## 1. Before cleaning

Create a daily statistics dataframe with the (*min*, *max*, *mean*, *std*, *median*) temperature values of each sensor:

In [None]:
netatmo_raw_data = af.resample_df_daily(netatmo_out_path, year, month, 'clip')
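
The internals of `af.resample_df_daily` are not shown in this notebook. As a rough sketch of the kind of aggregation described above, assuming a long-format table with hypothetical columns `time`, `sensor_id` and `temperature`, daily statistics per sensor could be computed with pandas as follows. The function name `daily_statistics` and the output column names (chosen to match the `Median` column used later) are assumptions, not the actual implementation.

```python
import pandas as pd


def daily_statistics(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate temperature readings into per-sensor daily statistics.

    Assumes hypothetical columns: 'time' (datetime), 'sensor_id', 'temperature'.
    """
    # Truncate each timestamp to midnight so readings group by calendar day
    day = df["time"].dt.floor("D").rename("date")
    stats = (
        df.groupby(["sensor_id", day])["temperature"]
        .agg(Min="min", Max="max", Mean="mean", Std="std", Median="median")
        .reset_index()
    )
    return stats


# Tiny made-up sample, for illustration only
sample = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2023-01-01 00:00", "2023-01-01 12:00", "2023-01-02 00:00"]
        ),
        "sensor_id": ["a", "a", "a"],
        "temperature": [2.0, 6.0, 4.0],
    }
)
daily = daily_statistics(sample)
```

Note that `Std` is `NaN` for a sensor with a single reading on a given day, because the sample standard deviation needs at least two values.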

Separate the dataframe by day and get the temperature values of that day for each sensor:

In [None]:
#netatmo_raw_data

In [None]:
date_dropdown1, separate_dataframes1 = af.create_date_dropdown(netatmo_raw_data)
date_dropdown1

In [None]:
date_dropdown1.value

In [None]:
before_cleaning = separate_dataframes1.get(date_dropdown1.value)
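
`af.create_date_dropdown` returns a dropdown together with a dictionary that the cell above indexes with the selected date. A minimal sketch of how such a dictionary could be built, assuming the daily table has a hypothetical `date` column; `split_by_day` is an illustrative name and may differ from what `analysis_functions` actually does.

```python
import pandas as pd


def split_by_day(daily: pd.DataFrame, date_col: str = "date") -> dict:
    """Return {date string: rows for that day}, one DataFrame per day."""
    keys = pd.to_datetime(daily[date_col]).dt.strftime("%Y-%m-%d")
    return {key: group.reset_index(drop=True) for key, group in daily.groupby(keys)}


# Made-up values, for illustration only
daily = pd.DataFrame(
    {
        "date": ["2023-01-01", "2023-01-01", "2023-01-02"],
        "sensor_id": ["a", "b", "a"],
        "Median": [3.5, 4.0, 2.5],
    }
)
frames = split_by_day(daily)
selected = frames.get("2023-01-01")  # mirrors separate_dataframes1.get(...)
```

Because `dict.get` returns `None` for a date that is not in the dictionary, a missing key would surface later as an error on `None` rather than as a `KeyError`.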

### Plot histogram 

In [None]:
before_cleaning['Median'].plot.hist(by=None, bins=30)

### Plot map
Display a map with the temperature values of each sensor on the selected day.

In [None]:
af.plot_median_temp_map(before_cleaning)

### Plot stations statistics
Display the temperature statistics for each sensor on the selected day.

In [None]:
af.plot_daily_statistics(before_cleaning, date_dropdown1)

### Plot time series
Plot the time series of the selected station on a monthly basis.

In [None]:
sensor_dropdown1, separated_dataframe1 = af.create_sensors_dropdown(netatmo_raw_data)
sensor_dropdown1

In [None]:
sensor_dropdown1.value

In [None]:
selected_dataframe1 = separated_dataframe1.get(sensor_dropdown1.value)

In [None]:
af.plot_time_series_m(selected_dataframe1, sensor_dropdown1, month, year)

## 2. After removing uncorrelated stations

Create a daily statistics dataframe with the (*min*, *max*, *mean*, *std*, *median*) temperature values of each sensor:

In [None]:
netatmo_data_high_corr = af.resample_df_daily(netatmo_out_path, year, month, 'high_corr')

Separate the dataframe by day and get the temperature values of that day for each sensor:

In [None]:
date_dropdown2, separate_dataframes2 = af.create_date_dropdown(netatmo_data_high_corr)
date_dropdown2

In [None]:
date_dropdown2.value

In [None]:
uncorrelated_stations_removed = separate_dataframes2.get(date_dropdown2.value)

### Plot histogram 

In [None]:
uncorrelated_stations_removed['Median'].plot.hist(by=None, bins=30)

### Plot map
Display a map with the temperature values of each sensor on the selected day.

In [None]:
af.plot_median_temp_map(uncorrelated_stations_removed)

### Plot stations statistics
Display the temperature statistics for each sensor on the selected day.

In [None]:
af.plot_daily_statistics(uncorrelated_stations_removed, date_dropdown2)

## 3. After removing the unrealistic values

Create a daily statistics dataframe with the (*min*, *max*, *mean*, *std*, *median*) temperature values of each sensor:

In [None]:
netatmo_data_realistic_values = af.resample_df_daily(netatmo_out_path, year, month, 'realistic')

Separate the dataframe by day and get the temperature values of that day for each sensor:

In [None]:
date_dropdown3, separate_dataframes3 = af.create_date_dropdown(netatmo_data_realistic_values)
date_dropdown3

In [None]:
date_dropdown3.value

In [None]:
unrealistic_values_removed = separate_dataframes3.get(date_dropdown3.value)

### Plot histogram 

In [None]:
unrealistic_values_removed['Median'].plot.hist(by=None, bins=30)

### Plot map
Display a map with the temperature values of each sensor on the selected day.

In [None]:
af.plot_median_temp_map(unrealistic_values_removed)

### Plot stations statistics
Display the temperature statistics for each sensor on the selected day.

In [None]:
af.plot_daily_statistics(unrealistic_values_removed, date_dropdown3)

## 4. After removing the biased stations

Create a daily statistics dataframe with the (*min*, *max*, *mean*, *std*, *median*) temperature values of each sensor:

In [None]:
netatmo_data_unbiased = af.resample_df_daily(netatmo_out_path, year, month, 'unbiased')

Separate the dataframe by day and get the temperature values of that day for each sensor:

In [None]:
date_dropdown4, separate_dataframes4 = af.create_date_dropdown(netatmo_data_unbiased)
date_dropdown4

In [None]:
date_dropdown4.value

In [None]:
biased_stations_removed = separate_dataframes4.get(date_dropdown4.value)

### Plot histogram 

In [None]:
biased_stations_removed['Median'].plot.hist(by=None, bins=30)

### Plot map
Display a map with the temperature values of each sensor on the selected day.

In [None]:
af.plot_median_temp_map(biased_stations_removed)

### Plot stations statistics
Display the temperature statistics for each sensor on the selected day.

In [None]:
af.plot_daily_statistics(biased_stations_removed, date_dropdown4)

## 5. After removing the local outliers

Create a daily statistics dataframe with the (*min*, *max*, *mean*, *std*, *median*) temperature values of each sensor:

In [None]:
netatmo_data_cleaned = af.resample_df_daily(netatmo_out_path, year, month, 'clean')

In [None]:
#netatmo_data_cleaned

Separate the dataframe by day and get the temperature values of that day for each sensor:

In [None]:
date_dropdown5, separate_dataframes5 = af.create_date_dropdown(netatmo_data_cleaned)
date_dropdown5

In [None]:
date_dropdown5.value

In [None]:
after_cleaning = separate_dataframes5.get(date_dropdown5.value)

### Plot histogram 

In [None]:
after_cleaning['Median'].plot.hist(by=None, bins=20)

### Plot map
Display a map with the temperature values of each sensor on the selected day.

In [None]:
af.plot_median_temp_map(after_cleaning)

### Plot stations statistics
Display the temperature statistics for each sensor on the selected day.

In [None]:
af.plot_daily_statistics(after_cleaning, date_dropdown5)

### Plot time series
Plot the time series of the selected station on a monthly basis.

In [None]:
sensor_dropdown5, separated_dataframe5 = af.create_sensors_dropdown(netatmo_data_cleaned)
sensor_dropdown5

In [None]:
sensor_dropdown5.value

In [None]:
selected_dataframe5 = separated_dataframe5.get(sensor_dropdown5.value)

In [None]:
af.plot_time_series_m(selected_dataframe5, sensor_dropdown5, month, year)

## 6. After removing the unreliable stations

Create a daily statistics dataframe with the (*min*, *max*, *mean*, *std*, *median*) temperature values of each sensor:

In [None]:
netatmo_data_cleaned2 = af.resample_df_daily(netatmo_out_path, year, month, 'filtered')

In [None]:
#netatmo_data_cleaned2

Separate the dataframe by day and get the temperature values of that day for each sensor:

In [None]:
date_dropdown6, separate_dataframes6 = af.create_date_dropdown(netatmo_data_cleaned2)
date_dropdown6

In [None]:
date_dropdown6.value

In [None]:
after_cleaning2 = separate_dataframes6.get(date_dropdown6.value)

### Plot histogram 

In [None]:
after_cleaning2['Median'].plot.hist(by=None, bins=30)

### Plot map
Display a map with the temperature values of each sensor on the selected day.

In [None]:
af.plot_median_temp_map(after_cleaning2)

### Plot stations statistics
Display the temperature statistics for each sensor on the selected day.

In [None]:
af.plot_daily_statistics(after_cleaning2, date_dropdown6)

### Plot time series
Plot the time series of the selected station on a monthly basis.

In [None]:
# Use the full-month table (as in sections 1 and 5), not the single-day selection
sensor_dropdown6, separated_dataframe6 = af.create_sensors_dropdown(netatmo_data_cleaned2)
sensor_dropdown6

In [None]:
sensor_dropdown6.value

In [None]:
selected_dataframe6 = separated_dataframe6.get(sensor_dropdown6.value)

In [None]:
af.plot_time_series_m(selected_dataframe6, sensor_dropdown6, month, year)

# <u>Compute and plot **monthly** statistics</u>
Display a map with the monthly statistics (*min*, *max*, *mean*, *std*, *median*) and plot the statistics of each sensor, before and after cleaning the Netatmo data.

## 1. Before cleaning

Create a monthly statistics dataframe with the (*min*, *max*, *mean*, *std*, *median*) temperature values of each sensor:

In [None]:
Netatmo_original = af.resample_df_montly(netatmo_out_path, year, month, 'clip')

In [None]:
#Netatmo_original

### Plot histogram 

In [None]:
af.plot_histogram_clip(Netatmo_original, month, year)

### Plot map

In [None]:
af.plot_median_temp_map(Netatmo_original)

### Plot temperature statistics
Display the temperature statistics for each sensor in the selected month.

In [None]:
af.plot_montly_statistics(Netatmo_original, month, year)

## 2. After cleaning

In [None]:
Netatmo_cleaned = af.resample_df_montly(netatmo_out_path, year, month, 'filtered')

### Plot histogram

In [None]:
af.plot_histogram_clean(Netatmo_cleaned, month, year)

### Plot map

In [None]:
af.plot_median_temp_map(Netatmo_cleaned)

### Plot temperature statistics
Display the temperature statistics for each sensor in the selected month.

In [None]:
af.plot_montly_statistics(Netatmo_cleaned, month, year)
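
One way to summarise the difference between the two monthly datasets numerically is to compare the station counts and the range of the monthly medians. The sketch below assumes each table has one row per sensor with a `Median` column, as used in the histograms in this notebook; `compare_cleaning` and the sample values are illustrative and not part of `analysis_functions`.

```python
import pandas as pd


def compare_cleaning(before: pd.DataFrame, after: pd.DataFrame) -> pd.DataFrame:
    """Summarise station count and median-temperature range for two tables."""
    rows = {}
    for name, df in {"before": before, "after": after}.items():
        rows[name] = {
            "stations": len(df),
            "median_min": df["Median"].min(),
            "median_max": df["Median"].max(),
        }
    return pd.DataFrame.from_dict(rows, orient="index")


# Made-up values, for illustration only
before = pd.DataFrame({"Median": [3.0, 4.0, 25.0]})
after = pd.DataFrame({"Median": [3.0, 4.0]})
summary = compare_cleaning(before, after)
```

With the real data, the call would be along the lines of `compare_cleaning(Netatmo_original, Netatmo_cleaned)`, provided both tables expose a `Median` column.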

# Compute and plot the **yearly** statistics
Plot the temperature statistics time series of the selected station on an annual basis.

## 1. Before cleaning

In [None]:
Netatmo_raw = af.resample_df_annually(netatmo_out_path, year, 'clip')

In [None]:
#Netatmo_raw

In [None]:
sensor_dropdown1, separate_dataframes1 = af.create_sensors_dropdown(Netatmo_raw)
sensor_dropdown1

In [None]:
sensor_dropdown1.value

In [None]:
selected_dataframe1 = separate_dataframes1.get(sensor_dropdown1.value)

In [None]:
selected_dataframe1['Median'].plot.hist(by=None, bins=20)

In [None]:
af.plot_time_series(selected_dataframe1, sensor_dropdown1, year)

## 2. After cleaning

In [None]:
Netatmo_filtered = af.resample_df_annually(netatmo_out_path, year, 'filtered')

In [None]:
sensor_dropdown2, separate_dataframes2 = af.create_sensors_dropdown(Netatmo_filtered)
sensor_dropdown2

In [None]:
sensor_dropdown2.value

In [None]:
selected_dataframe2 = separate_dataframes2.get(sensor_dropdown2.value)

In [None]:
selected_dataframe2['Median'].plot.hist(by=None, bins=20)

In [None]:
af.plot_time_series(selected_dataframe2, sensor_dropdown2, year)