<div class="alert alert-info">
<u><strong>Authors:</strong></u> <b>Ahmed Mukhtar</b> (ahmed.mukhtar@mail.polimi.it) and <b>Ahmed Yassin</b> (ahmedmohamed1@mail.polimi.it) - 2023 - Politecnico di Milano, Italy <br>
</div>

## Netatmo temperature time series cleaning (part 2)

This notebook computes the reliability index of each station for a selected month, removes the unreliable stations, and displays the reliability index on a map.

In [None]:
import os
import netatmo_cleaning as nc
import pandas as pd
import geopandas as gpd
import folium
import plotly.graph_objects as go
from shapely.geometry import Point
import ipywidgets as widgets

In [None]:
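# Load the extension and enable it so edited modules are reloaded before each cell runs
%load_ext autoreload
%autoreload 2
from analysis_functions import *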

In [None]:
year_w = widgets.Dropdown(
    options = list(range(2014, 2024)),
    value = 2023,
    description = 'Year:',
    disabled = False,
    layout = {'width': 'max-content'},
    style = {'description_width': 'initial'}
)
year_w

In [None]:
year = year_w.value

In [None]:
month_w = widgets.Dropdown(
    options = list(range(1, 13)),
    value = 1,
    description = 'Month:',
    disabled = False,
    layout = {'width': 'max-content'},
    style = {'description_width': 'initial'}
)
month_w

In [None]:
month = month_w.value

In [None]:
netatmo_out_path = 'Netatmo_csv_files/'

Open the original and cleaned Netatmo datasets for the selected year and month:

In [None]:
# File names follow the pattern temp_Net_milan_<year>-<month>_<suffix>.csv
netatmo_original = pd.read_csv(os.path.join(netatmo_out_path, f'temp_Net_milan_{year}-{month}_clip.csv'))
netatmo_cleaned = pd.read_csv(os.path.join(netatmo_out_path, f'temp_Net_milan_{year}-{month}_clean.csv'))

Compute the reliability index and remove unreliable stations:

In [None]:
reliability_index, filtered_stations, removed_stations, netatmo_filtered = nc.remove_unreliable_stations(netatmo_original, netatmo_cleaned, netatmo_out_path, year, month)

**Note: the reliability index is defined as the ratio of the number of measurements kept after cleaning to the original number of measurements. A station is labelled as unreliable if its reliability index is less than 0.5.**
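The definition in the note can be illustrated with a minimal pandas sketch. This is not the implementation of `nc.remove_unreliable_stations`, whose internals are not shown here. The station column name `module_id`, the helper `reliability_sketch`, and the toy data are assumptions made for illustration only. The output column is named `sens_reliability` to match the column plotted at the end of this notebook.

```python
import pandas as pd

# Hypothetical station identifier column; the real Netatmo CSV schema may differ.
STATION_COL = "module_id"
THRESHOLD = 0.5  # stations with an index below this value are labelled unreliable


def reliability_sketch(original: pd.DataFrame, cleaned: pd.DataFrame) -> pd.DataFrame:
    """Return one row per station with measurement counts and reliability index."""
    n_original = original.groupby(STATION_COL).size().rename("n_original")
    n_cleaned = cleaned.groupby(STATION_COL).size().rename("n_cleaned")
    # Left join on the station id; stations that lost every measurement
    # during cleaning have no match and get a count of 0.
    counts = n_original.to_frame().join(n_cleaned).fillna(0)
    counts["sens_reliability"] = counts["n_cleaned"] / counts["n_original"]
    counts["reliable"] = counts["sens_reliability"] >= THRESHOLD
    return counts.reset_index()


# Toy data: station "a" keeps 3 of 4 readings, station "b" keeps 1 of 4.
original = pd.DataFrame({STATION_COL: list("aaaabbbb"), "temperature": range(8)})
cleaned = pd.DataFrame({STATION_COL: list("aaab"), "temperature": range(4)})
result = reliability_sketch(original, cleaned)
```

In this toy case, station "a" has an index of 3/4 and is kept, while station "b" has an index of 1/4 and would be removed.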

In [None]:
reliability_index

<u>Number of remaining stations:</u>

In [None]:
len(filtered_stations)

<u>Number of removed stations:</u>

In [None]:
len(removed_stations)

<u>Percentage of removed stations:</u>

In [None]:
# Multiply by 100 so the value is a percentage, as the heading states
round(100 * len(removed_stations) / len(reliability_index), 1)

In [None]:
plot_stations_reliability_map(reliability_index)

In [None]:
reliability_index['sens_reliability'].plot.hist(by=None, bins=20)