_**DELETE BEFORE PUBLISHING**_

_This is a template also containing the style guide for use cases. The styling uses the use-case css when uploaded to the website, which will not be visible on your local machine._

_Change any text marked with {} and delete any cells marked DELETE_

***

In [1]:
# DELETE BEFORE PUBLISHING
# This is just here so you can preview the styling on your local machine

from IPython.core.display import HTML
HTML("""
<style>
.usecase-title, .usecase-duration, .usecase-section-header {
    padding-left: 15px;
    padding-bottom: 10px;
    padding-top: 10px;
    padding-right: 15px;
    background-color: #0f9295;
    color: #fff;
}

.usecase-title {
    font-size: 1.7em;
    font-weight: bold;
}

.usecase-authors, .usecase-level, .usecase-skill {
    padding-left: 15px;
    padding-bottom: 7px;
    padding-top: 7px;
    background-color: #baeaeb;
    font-size: 1.4em;
    color: #121212;
}

.usecase-level-skill  {
    display: flex;
}

.usecase-level, .usecase-skill {
    width: 50%;
}

.usecase-duration, .usecase-skill {
    text-align: right;
    padding-right: 15px;
    padding-bottom: 8px;
    font-size: 1.4em;
}

.usecase-section-header {
    font-weight: bold;
    font-size: 1.5em;
}

.usecase-subsection-header, .usecase-subsection-blurb {
    font-weight: bold;
    font-size: 1.2em;
    color: #121212;
}

.usecase-subsection-blurb {
    font-size: 1em;
    font-style: italic;
}
</style>
""")

<div class="usecase-title"><b>Weather Condition Classification</b></div>

<div class="usecase-authors"><b>Authored by: </b>Aremu Akintomiwa James</div>

<div class="usecase-duration"><b>Duration:</b> 90 mins</div>

<div class="usecase-level-skill">
    <div class="usecase-level"><b>Level: </b>Intermediate</div>
    <div class="usecase-skill"><b>Pre-requisite Skills: </b>Python, Data analysis, Machine Learning, Basic Meteorology</div>
</div>

<div class="usecase-section-header"><b>Scenario</b></div>

As an urban planner or agricultural manager, I need to accurately classify weather conditions from environmental features so that I can determine the best times for infrastructure projects and agricultural activities. Scheduling operations under favorable weather conditions gives me actionable insights for planning and decision-making.



<div class="usecase-section-header"><b>What this use case will teach you</b></div>

At the end of this use case you will:
- Understand how to preprocess and analyze environmental data.
- Learn how to build and evaluate a machine learning model for classification tasks.
- Gain experience in feature selection and engineering for weather-related datasets.
- Develop skills in using Python libraries such as Pandas, Scikit-learn, and Matplotlib.
- Understand the importance of accurate weather classification for planning and decision-making in various sectors.


<div class="usecase-section-header"><b>Introduction</b></div>

In this use case, we aim to develop a robust machine learning model capable of accurately classifying various weather conditions such as sunny, cloudy, rainy, and stormy using environmental features. These features include ambient air temperature, relative humidity, atmospheric pressure, wind speed and direction, and gust wind speed. Accurate weather classification is important for optimizing the timing of infrastructure projects and agricultural activities, ensuring that operations are conducted under favorable weather conditions. By leveraging machine learning techniques, we can provide actionable insights for planning and decision-making.



<div class="usecase-section-header"><b>Background</b></div>

Weather conditions have a significant impact on various sectors, including agriculture, construction, and transportation. Accurate weather forecasts and classifications can help in planning and executing operations more efficiently. For instance, farmers can optimize planting and harvesting times based on expected weather conditions, while construction projects can be scheduled to avoid adverse weather that could delay progress or compromise safety.

In this project, we will use historical weather data from the City of Melbourne (CoM) Open Data Portal. The datasets include:
- Microclimate sensors data, CoM Open Data Portal (melbourne.vic.gov.au)
- Argyle Square Weather Stations (Historical Data), CoM Open Data Portal (melbourne.vic.gov.au)
- Argyle Square Air Quality, CoM Open Data Portal (melbourne.vic.gov.au)


<div class="usecase-section-header"><b>Dataset Information</b></div>

The dataset for this project includes the following features:
- Ambient air temperature (°C)
- Relative humidity (%)
- Atmospheric pressure (hPa)
- Wind speed (m/s)
- Wind direction (degrees)
- Gust wind speed (m/s)


These features will be used to classify weather conditions into categories such as sunny, cloudy, rainy, and stormy. The dataset will be preprocessed to handle any missing values, outliers, or inconsistencies before being used to train the machine learning model.
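The preprocessing step mentioned above can be sketched as follows. This is a minimal illustration on toy values, assuming hypothetical column names `air_temp` and `humidity`; the IQR rule and median imputation are illustrative choices, not necessarily what the final pipeline will use.

```python
import numpy as np
import pandas as pd

# Toy readings (hypothetical values) with one gap and one implausible spike
toy = pd.DataFrame({
    'air_temp': [12.5, 11.9, np.nan, 12.3, 95.0, 12.9],
    'humidity': [57.2, 62.5, 56.9, 57.7, 60.4, 58.1],
})


def iqr_mask(series, k=1.5):
    """Return True for values inside [Q1 - k*IQR, Q3 + k*IQR]; NaN counts as outside."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return series.between(q1 - k * iqr, q3 + k * iqr)


# Treat out-of-range values as missing, then impute with the column median
cleaned = toy.copy()
for col in cleaned.columns:
    cleaned[col] = cleaned[col].where(iqr_mask(cleaned[col]))
    cleaned[col] = cleaned[col].fillna(cleaned[col].median())

print(cleaned)
```

Median imputation is only one option; for time-ordered sensor data, interpolation between neighbouring readings may be a better fit.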

In [2]:
# Dependencies
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


In [3]:
# **Preferred Method**: Export Endpoint
import requests
from io import StringIO


def API_unlimited(dataset_id):
    """Download a full dataset from the export endpoint and return it as a DataFrame."""
    # Example dataset page:
    # https://data.melbourne.vic.gov.au/explore/dataset/pedestrian-counting-system-monthly-counts-per-hour/information/
    base_url = 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/'
    export_format = 'csv'  # renamed so the built-in format() is not shadowed

    url = f'{base_url}{dataset_id}/exports/{export_format}'
    params = {
        'select': '*',
        'limit': -1,  # all records
        'lang': 'en',
        'timezone': 'UTC',
        # 'api_key': apikey
    }

    # GET request
    response = requests.get(url, params=params)

    if response.status_code != 200:
        # Fail loudly instead of returning None, which would break later cells
        raise RuntimeError(f'Request failed with status code {response.status_code}')

    # The export is semicolon-delimited CSV; read it through StringIO
    url_content = response.content.decode('utf-8')
    df = pd.read_csv(StringIO(url_content), delimiter=';')
    print(df.sample(10, random_state=999))  # Test
    return df

In [4]:
dataset_id_1 = 'microclimate-sensors-data'
dataset_id_2 = 'meshed-sensor-type-1'
dataset_id_3 = 'argyle-square-air-quality'

dataset1 = API_unlimited(dataset_id_1)

                device_id                received_at  \
37167  ICTMicroclimate-07  2024-06-27T18:03:02+00:00   
15595  ICTMicroclimate-07  2024-06-06T22:26:36+00:00   
12685  ICTMicroclimate-01  2024-07-09T15:16:22+00:00   
22535  ICTMicroclimate-07  2024-07-04T15:55:06+00:00   
445    ICTMicroclimate-03  2024-06-09T17:50:16+00:00   
37578  ICTMicroclimate-07  2024-07-24T19:59:30+00:00   
1311   ICTMicroclimate-09  2024-06-02T06:08:36+00:00   
16488  ICTMicroclimate-08  2024-06-05T01:09:43+00:00   
16478  ICTMicroclimate-03  2024-06-05T00:20:12+00:00   
30652  ICTMicroclimate-08  2024-05-31T04:24:08+00:00   

                                          sensorlocation  \
37167  Tram Stop 7C - Melbourne Tennis Centre Precinc...   
15595  Tram Stop 7C - Melbourne Tennis Centre Precinc...   
12685                    Birrarung Marr Park - Pole 1131   
22535  Tram Stop 7C - Melbourne Tennis Centre Precinc...   
445                                          CH1 rooftop   
37578  Tram Stop 7C - M

In [5]:
dataset2 = API_unlimited(dataset_id_2)

              dev_id                       time         rtc  battery  \
100561  atmos41-32fc  2023-12-27T23:25:29+00:00  69709815.0    4.209   
6734    atmos41-32fc  2022-07-24T05:56:35+00:00  24632660.0    4.210   
58511   atmos41-32fc  2022-09-26T07:14:11+00:00  30166883.0    4.196   
37342   atmos41-32fc  2021-02-01T21:18:39+00:00   1288506.0    4.169   
96403   atmos41-32fc  2023-11-11T08:37:11+00:00  65682125.0    4.186   
100120  atmos41-32fc  2023-12-24T05:00:47+00:00  69384334.0    4.208   
115919  atmos41-32fc  2024-06-08T04:50:38+00:00  83812484.0    4.206   
65554   atmos41-32fc  2023-01-08T07:07:53+00:00  39152075.0    4.193   
76613   atmos41-32fc  2023-04-13T23:36:08+00:00  47419352.0    4.206   
78534   atmos41-32fc  2023-03-24T02:16:32+00:00  45614582.0    4.209   

        solarpanel  command  solar  precipitation  strikes  windspeed  \
100561      21.645      0.0  159.0            0.0      0.0       1.34   
6734        22.040      0.0  173.0            0.0      0.0   

In [6]:
dataset3 = API_unlimited(dataset_id_3)

                             time    dev_id           sensor_name  \
33731   2020-06-11T16:21:52+00:00  ems-ce10  Air Quality Sensor 1   
58139   2021-07-10T10:57:51+00:00  ems-ce10  Air Quality Sensor 1   
107044  2022-06-09T20:34:51+00:00  ems-ec8a  Air Quality Sensor 2   
86091   2020-11-29T11:46:53+00:00  ems-ec8a  Air Quality Sensor 2   
142270  2024-06-23T03:53:08+00:00  ems-ec8a  Air Quality Sensor 2   
1659    2021-05-20T03:25:01+00:00  ems-ce10  Air Quality Sensor 1   
137971  2024-04-05T18:50:31+00:00  ems-ec8a  Air Quality Sensor 2   
2464    2020-07-12T19:39:43+00:00  ems-ec8a  Air Quality Sensor 2   
124252  2023-02-10T03:12:05+00:00  ems-ec8a  Air Quality Sensor 2   
117136  2022-11-03T14:55:29+00:00  ems-ec8a  Air Quality Sensor 2   

                       lat_long  averagespl  carbonmonoxide  humidity  ibatt  \
33731   -37.802772, 144.9655513        55.0         -2037.0      89.0  163.0   
58139   -37.802772, 144.9655513        62.0         -3058.0     100.0  138.0   


In [7]:
dataset1.head()

Unnamed: 0,device_id,received_at,sensorlocation,latlong,minimumwinddirection,averagewinddirection,maximumwinddirection,minimumwindspeed,averagewindspeed,gustwindspeed,airtemperature,relativehumidity,atmosphericpressure,pm25,pm10,noise
0,ICTMicroclimate-07,2024-06-12T05:58:33+00:00,Tram Stop 7C - Melbourne Tennis Centre Precinc...,"-37.8222341, 144.9829409",0.0,297.0,358.0,0.0,0.9,3.6,12.5,57.2,1020.9,2.0,2.0,76.2
1,ICTMicroclimate-03,2024-06-12T05:58:16+00:00,CH1 rooftop,"-37.8140348, 144.96728",166.0,199.0,215.0,2.9,3.4,6.4,11.9,62.5,1013.6,2.0,4.0,82.5
2,aws5-0999,2024-06-12T05:36:39+00:00,Royal Park Asset ID: COM2707,"-37.7956167, 144.9519007",0.0,77.0,112.0,0.9,0.2,2.0,11.6,56.9,1016.8,,,
3,ICTMicroclimate-09,2024-06-12T05:55:29+00:00,SkyFarm (Jeff's Shed). Rooftop - Melbourne Con...,"-37.8223306, 144.9521696",0.0,233.0,359.0,0.0,2.2,6.7,12.3,57.7,1017.7,1.0,2.0,66.5
4,ICTMicroclimate-02,2024-06-12T06:04:45+00:00,101 Collins St L11 Rooftop,"-37.814604, 144.9702991",0.0,150.0,314.0,0.0,1.1,2.0,12.9,60.4,1014.6,4.0,7.0,71.4


In [8]:
dataset1.shape

(44631, 16)

In [9]:
dataset1.columns

Index(['device_id', 'received_at', 'sensorlocation', 'latlong',
       'minimumwinddirection', 'averagewinddirection', 'maximumwinddirection',
       'minimumwindspeed', 'averagewindspeed', 'gustwindspeed',
       'airtemperature', 'relativehumidity', 'atmosphericpressure', 'pm25',
       'pm10', 'noise'],
      dtype='object')

In [10]:
dataset2.head()

Unnamed: 0,dev_id,time,rtc,battery,solarpanel,command,solar,precipitation,strikes,windspeed,winddirection,gustspeed,vapourpressure,atmosphericpressure,relativehumidity,airtemp,lat_long,sensor_name
0,atmos41-32fc,2021-05-14T18:11:23+00:00,10090042.0,4.161,0.024,0.0,0.0,0.0,0.0,4.09,253.5,10.74,0.93,100.7,88.0,7.8,"-37.8022141, 144.9656262",Weather Station
1,atmos41-32fc,2022-05-03T10:25:31+00:00,17564045.0,4.181,0.0,0.0,0.0,0.0,0.0,0.54,158.5,1.14,1.25,100.91,68.0,16.2,"-37.8022141, 144.9656262",Weather Station
2,atmos41-32fc,2022-05-03T21:42:23+00:00,17604657.0,4.143,20.67,0.0,24.0,0.0,0.0,0.76,7.7,2.43,1.21,100.83,75.0,14.0,"-37.8022141, 144.9656262",Weather Station
3,atmos41-32fc,2022-05-04T00:42:43+00:00,17615477.0,4.208,20.694,0.0,68.0,0.0,0.0,1.01,9.9,3.22,1.17,100.91,76.0,13.4,"-37.8022141, 144.9656262",Weather Station
4,atmos41-32fc,2021-05-15T07:55:22+00:00,10139481.0,4.197,0.128,0.0,0.0,0.0,0.0,2.66,253.8,8.07,0.78,101.63,56.0,11.7,"-37.8022141, 144.9656262",Weather Station


In [11]:
dataset2.shape

(119036, 18)

In [12]:
dataset2.columns

Index(['dev_id', 'time', 'rtc', 'battery', 'solarpanel', 'command', 'solar',
       'precipitation', 'strikes', 'windspeed', 'winddirection', 'gustspeed',
       'vapourpressure', 'atmosphericpressure', 'relativehumidity', 'airtemp',
       'lat_long', 'sensor_name'],
      dtype='object')

In [13]:
dataset3.head()

Unnamed: 0,time,dev_id,sensor_name,lat_long,averagespl,carbonmonoxide,humidity,ibatt,nitrogendioxide,ozone,particulateserr,particulatesvsn,peakspl,pm1,pm10,pm25,temperature,vbatt,vpanel
0,2020-06-09T09:02:38+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",56.0,-6448.0,65.0,71.0,287.0,137.0,0.0,151.0,69.0,12.0,19.0,17.0,12.3,3.96,0.0
1,2020-06-09T11:17:37+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",55.0,-6916.0,68.0,89.0,325.0,156.0,0.0,151.0,62.0,15.0,24.0,22.0,10.9,3.93,0.0
2,2022-05-03T21:46:34+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",58.0,-6261.0,77.0,169.0,268.0,137.0,0.0,151.0,64.0,0.0,0.0,0.0,15.1,3.76,16.33
3,2020-06-09T11:32:37+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",55.0,-6916.0,69.0,76.0,325.0,156.0,0.0,151.0,68.0,19.0,29.0,24.0,10.5,3.92,0.0
4,2021-05-15T06:04:33+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",56.0,-6261.0,51.0,12.0,258.0,119.0,0.0,151.0,62.0,0.0,0.0,0.0,14.9,4.01,18.33


In [14]:
dataset3.shape

(142507, 19)

In [15]:
dataset3.columns

Index(['time', 'dev_id', 'sensor_name', 'lat_long', 'averagespl',
       'carbonmonoxide', 'humidity', 'ibatt', 'nitrogendioxide', 'ozone',
       'particulateserr', 'particulatesvsn', 'peakspl', 'pm1', 'pm10', 'pm25',
       'temperature', 'vbatt', 'vpanel'],
      dtype='object')
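Before renaming anything, it can help to check which column names the three datasets already share. The sketch below copies the column lists from the `.columns` outputs above into plain Python lists (so it runs without downloading the data) and intersects them.

```python
# Column names copied from the `.columns` outputs above
cols1 = ['device_id', 'received_at', 'sensorlocation', 'latlong',
         'minimumwinddirection', 'averagewinddirection', 'maximumwinddirection',
         'minimumwindspeed', 'averagewindspeed', 'gustwindspeed',
         'airtemperature', 'relativehumidity', 'atmosphericpressure', 'pm25',
         'pm10', 'noise']
cols2 = ['dev_id', 'time', 'rtc', 'battery', 'solarpanel', 'command', 'solar',
         'precipitation', 'strikes', 'windspeed', 'winddirection', 'gustspeed',
         'vapourpressure', 'atmosphericpressure', 'relativehumidity', 'airtemp',
         'lat_long', 'sensor_name']
cols3 = ['time', 'dev_id', 'sensor_name', 'lat_long', 'averagespl',
         'carbonmonoxide', 'humidity', 'ibatt', 'nitrogendioxide', 'ozone',
         'particulateserr', 'particulatesvsn', 'peakspl', 'pm1', 'pm10', 'pm25',
         'temperature', 'vbatt', 'vpanel']

# Pairwise and three-way overlaps in the raw column names
shared_1_2 = set(cols1) & set(cols2)
shared_1_3 = set(cols1) & set(cols3)
shared_2_3 = set(cols2) & set(cols3)
shared_all = set(cols1) & set(cols2) & set(cols3)

print('dataset1 & dataset2:', sorted(shared_1_2))
print('dataset1 & dataset3:', sorted(shared_1_3))
print('dataset2 & dataset3:', sorted(shared_2_3))
print('all three:', sorted(shared_all))
```

No raw column name is common to all three datasets, which is why the column names are standardized before the datasets are combined.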

#### Standardizing column names for dataset1

In [31]:
dataset1 = dataset1.rename(columns={'received_at':'time', 'latlong':'lat_long', 'minimumwinddirection':'min_wind_direction', 'averagewinddirection':'avg_wind_direction', 'maximumwinddirection':'max_wind_direction', 
                        'minimumwindspeed':'min_wind_speed', 'averagewindspeed':'avg_wind_speed', 'gustwindspeed':'gust_wind_speed',
          'airtemperature':'air_temp', 'relativehumidity':'humidity', 'atmosphericpressure':'atm_pressure'})
dataset1.head()

Unnamed: 0,device_id,time,sensorlocation,lat_long,min_wind_direction,avg_wind_direction,max_wind_direction,min_wind_speed,avg_wind_speed,gust_wind_speed,air_temp,humidity,atm_pressure,pm25,pm10,noise
0,ICTMicroclimate-07,2024-06-12T05:58:33+00:00,Tram Stop 7C - Melbourne Tennis Centre Precinc...,"-37.8222341, 144.9829409",0.0,297.0,358.0,0.0,0.9,3.6,12.5,57.2,1020.9,2.0,2.0,76.2
1,ICTMicroclimate-03,2024-06-12T05:58:16+00:00,CH1 rooftop,"-37.8140348, 144.96728",166.0,199.0,215.0,2.9,3.4,6.4,11.9,62.5,1013.6,2.0,4.0,82.5
2,aws5-0999,2024-06-12T05:36:39+00:00,Royal Park Asset ID: COM2707,"-37.7956167, 144.9519007",0.0,77.0,112.0,0.9,0.2,2.0,11.6,56.9,1016.8,,,
3,ICTMicroclimate-09,2024-06-12T05:55:29+00:00,SkyFarm (Jeff's Shed). Rooftop - Melbourne Con...,"-37.8223306, 144.9521696",0.0,233.0,359.0,0.0,2.2,6.7,12.3,57.7,1017.7,1.0,2.0,66.5
4,ICTMicroclimate-02,2024-06-12T06:04:45+00:00,101 Collins St L11 Rooftop,"-37.814604, 144.9702991",0.0,150.0,314.0,0.0,1.1,2.0,12.9,60.4,1014.6,4.0,7.0,71.4


#### Standardizing column names for dataset2

In [25]:
dataset2 = dataset2.rename(columns={'windspeed': 'avg_wind_speed', 'winddirection': 'avg_wind_direction', 'gustspeed': 'gust_wind_speed',
                                    'atmosphericpressure': 'atm_pressure', 'relativehumidity': 'humidity', 'airtemp': 'air_temp'})
dataset2.head()

Unnamed: 0,dev_id,time,rtc,battery,solarpanel,command,solar,precipitation,strikes,avg_wind_speed,avg_wind_direction,gust_wind_speed,vapourpressure,atm_pressure,humidity,air_temp,lat_long,sensor_name
0,atmos41-32fc,2021-05-14T18:11:23+00:00,10090042.0,4.161,0.024,0.0,0.0,0.0,0.0,4.09,253.5,10.74,0.93,100.7,88.0,7.8,"-37.8022141, 144.9656262",Weather Station
1,atmos41-32fc,2022-05-03T10:25:31+00:00,17564045.0,4.181,0.0,0.0,0.0,0.0,0.0,0.54,158.5,1.14,1.25,100.91,68.0,16.2,"-37.8022141, 144.9656262",Weather Station
2,atmos41-32fc,2022-05-03T21:42:23+00:00,17604657.0,4.143,20.67,0.0,24.0,0.0,0.0,0.76,7.7,2.43,1.21,100.83,75.0,14.0,"-37.8022141, 144.9656262",Weather Station
3,atmos41-32fc,2022-05-04T00:42:43+00:00,17615477.0,4.208,20.694,0.0,68.0,0.0,0.0,1.01,9.9,3.22,1.17,100.91,76.0,13.4,"-37.8022141, 144.9656262",Weather Station
4,atmos41-32fc,2021-05-15T07:55:22+00:00,10139481.0,4.197,0.128,0.0,0.0,0.0,0.0,2.66,253.8,8.07,0.78,101.63,56.0,11.7,"-37.8022141, 144.9656262",Weather Station
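One thing the previews suggest is a unit mismatch: `atm_pressure` is around 1013 to 1021 in dataset1 but around 100.7 to 101.6 in dataset2. That would be consistent with hPa in one and kPa in the other, but this is an inference from the sample rows and should be checked against the dataset documentation. Assuming it holds, a conversion before stacking could look like this sketch (values copied from the previews; `KPA_TO_HPA`, `d1`, `d2` are illustrative names).

```python
import pandas as pd

# Sample pressure values copied from the previews above
d1 = pd.DataFrame({'atm_pressure': [1020.9, 1013.6, 1016.8]})  # appears to be hPa
d2 = pd.DataFrame({'atm_pressure': [100.7, 100.91, 100.83]})   # appears to be kPa (assumption)

KPA_TO_HPA = 10.0  # 1 kPa = 10 hPa

# Convert dataset2 to hPa so both sources share one scale before stacking
d2_hpa = d2.assign(atm_pressure=d2['atm_pressure'] * KPA_TO_HPA)
stacked = pd.concat([d1, d2_hpa], ignore_index=True)

print(stacked)
```

Without a step like this, a model would see two clusters of pressure values that differ only because of the unit.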


#### Standardizing column names for dataset3

In [26]:
dataset3 = dataset3.rename(columns={'temperature': 'air_temp'})
dataset3.head()

Unnamed: 0,time,dev_id,sensor_name,lat_long,averagespl,carbonmonoxide,humidity,ibatt,nitrogendioxide,ozone,particulateserr,particulatesvsn,peakspl,pm1,pm10,pm25,air_temp,vbatt,vpanel
0,2020-06-09T09:02:38+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",56.0,-6448.0,65.0,71.0,287.0,137.0,0.0,151.0,69.0,12.0,19.0,17.0,12.3,3.96,0.0
1,2020-06-09T11:17:37+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",55.0,-6916.0,68.0,89.0,325.0,156.0,0.0,151.0,62.0,15.0,24.0,22.0,10.9,3.93,0.0
2,2022-05-03T21:46:34+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",58.0,-6261.0,77.0,169.0,268.0,137.0,0.0,151.0,64.0,0.0,0.0,0.0,15.1,3.76,16.33
3,2020-06-09T11:32:37+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",55.0,-6916.0,69.0,76.0,325.0,156.0,0.0,151.0,68.0,19.0,29.0,24.0,10.5,3.92,0.0
4,2021-05-15T06:04:33+00:00,ems-ec8a,Air Quality Sensor 2,"-37.802772, 144.9655513",56.0,-6261.0,51.0,12.0,258.0,119.0,0.0,151.0,62.0,0.0,0.0,0.0,14.9,4.01,18.33


#### Merging the datasets together

In [72]:
combine_df = pd.concat([dataset1, dataset2], axis=0)
combine_df.head()

Unnamed: 0,device_id,time,sensorlocation,lat_long,min_wind_direction,avg_wind_direction,max_wind_direction,min_wind_speed,avg_wind_speed,gust_wind_speed,...,dev_id,rtc,battery,solarpanel,command,solar,precipitation,strikes,vapourpressure,sensor_name
0,ICTMicroclimate-07,2024-06-12T05:58:33+00:00,Tram Stop 7C - Melbourne Tennis Centre Precinc...,"-37.8222341, 144.9829409",0.0,297.0,358.0,0.0,0.9,3.6,...,,,,,,,,,,
1,ICTMicroclimate-03,2024-06-12T05:58:16+00:00,CH1 rooftop,"-37.8140348, 144.96728",166.0,199.0,215.0,2.9,3.4,6.4,...,,,,,,,,,,
2,aws5-0999,2024-06-12T05:36:39+00:00,Royal Park Asset ID: COM2707,"-37.7956167, 144.9519007",0.0,77.0,112.0,0.9,0.2,2.0,...,,,,,,,,,,
3,ICTMicroclimate-09,2024-06-12T05:55:29+00:00,SkyFarm (Jeff's Shed). Rooftop - Melbourne Con...,"-37.8223306, 144.9521696",0.0,233.0,359.0,0.0,2.2,6.7,...,,,,,,,,,,
4,ICTMicroclimate-02,2024-06-12T06:04:45+00:00,101 Collins St L11 Rooftop,"-37.814604, 144.9702991",0.0,150.0,314.0,0.0,1.1,2.0,...,,,,,,,,,,


In [73]:
combine_df.isna().sum()*100/len(combine_df)

device_id             72.730605
time                   0.000000
sensorlocation        73.221236
lat_long               2.126269
min_wind_direction    76.286606
avg_wind_direction     1.355191
max_wind_direction    76.286606
min_wind_speed        76.286606
avg_wind_speed         1.354580
gust_wind_speed        4.900194
air_temp               1.711402
humidity               1.355191
atm_pressure           1.355191
pm25                  75.211863
pm10                  75.211863
noise                 75.211863
dev_id                27.269395
rtc                   28.905033
battery               28.545156
solarpanel            28.611143
command               28.570207
solar                 28.969798
precipitation         28.970409
strikes               28.970409
vapourpressure        28.970409
sensor_name           28.905033
dtype: float64
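Columns that are missing for most rows (for example those that exist in only one source) are candidates for removal before modelling. The sketch below shows one way to do that on toy data; the 50% cut-off is a hypothetical threshold chosen for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values) with differing amounts of missing data
toy = pd.DataFrame({
    'time': ['t0', 't1', 't2', 't3'],
    'air_temp': [12.5, 11.9, np.nan, 12.3],
    'noise': [76.2, np.nan, np.nan, np.nan],
})

# Same quantity as isna().sum() * 100 / len(df)
missing_pct = toy.isna().mean() * 100

THRESHOLD = 50.0  # hypothetical cut-off, in percent
keep = missing_pct[missing_pct <= THRESHOLD].index
reduced = toy[keep]

print(missing_pct)
print(reduced.columns.tolist())
```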

In [74]:
combine_df.shape

(163667, 26)

In [93]:
final = pd.concat([combine_df, dataset3], axis=1)
final.head()

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
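This error is what pandas raises when `pd.concat(..., axis=1)` has to align frames and one of them has repeated index labels. `combine_df` was built by stacking two frames without resetting the index, so its labels restart from 0. The sketch below reproduces the situation on toy frames (`a`, `b`, `c` are illustrative names) and shows that `ignore_index=True` in the first concat gives a unique index.

```python
import pandas as pd

a = pd.DataFrame({'air_temp': [12.5, 11.9]})
b = pd.DataFrame({'air_temp': [7.8, 16.2]})
c = pd.DataFrame({'ozone': [137.0, 156.0, 137.0, 119.0]})

# Stacking without ignore_index keeps the original labels: 0, 1, 0, 1
stacked_dup = pd.concat([a, b], axis=0)
print(stacked_dup.index.is_unique)

try:
    pd.concat([stacked_dup, c], axis=1)
except Exception as exc:  # recent pandas versions raise InvalidIndexError here
    print(type(exc).__name__, exc)

# Resetting the labels while stacking makes column-wise concat possible
stacked = pd.concat([a, b], axis=0, ignore_index=True)
side_by_side = pd.concat([stacked, c], axis=1)
print(side_by_side.shape)
```

Note that a column-wise concat pairs rows by position only. For readings from different sensors, row `i` of one dataset and row `i` of another are generally not taken at the same time, so joining on the `time` column is usually the more meaningful approach.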

In [88]:
final.isna().sum()*100/len(final)

device_id             99.2
time                   0.0
sensorlocation        99.2
lat_long_x             2.4
min_wind_direction    99.2
avg_wind_direction     3.2
max_wind_direction    99.2
min_wind_speed        99.2
avg_wind_speed         3.2
gust_wind_speed        3.2
air_temp               3.2
humidity_x             3.2
atm_pressure           3.2
pm25_x                99.2
pm10_x                99.2
noise                 99.2
dev_id_x               0.8
rtc                    3.2
battery                3.2
solarpanel             4.0
command                4.0
solar                  4.0
precipitation          4.0
strikes                4.0
vapourpressure         4.0
sensor_name_x          3.2
dev_id_y               0.0
sensor_name_y          0.0
lat_long_y             0.0
averagespl             2.4
carbonmonoxide         2.4
humidity_y             2.4
ibatt                  2.4
nitrogendioxide        2.4
ozone                  2.4
particulateserr        2.4
particulatesvsn        2.4
p

In [91]:
final.shape

(0, 44)
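The `_x` and `_y` suffixes in the output above are the defaults that `pandas.merge` adds to overlapping column names, which suggests that table came from a merge rather than a concat; the cell that produced it is not shown, so this is an assumption. An exact-match join on timestamps from different sensors will often find no matching rows, which is one possible explanation for an empty result. A minimal sketch of a nearest-timestamp join with `pd.merge_asof`, using a few timestamps and readings copied from the previews above and a hypothetical 10-minute tolerance:

```python
import pandas as pd

# Rows copied from the dataset2 (weather) and dataset3 (air quality) previews
weather = pd.DataFrame({
    'time': ['2022-05-03T21:42:23+00:00', '2021-05-15T07:55:22+00:00'],
    'air_temp': [14.0, 11.7],
})
air_quality = pd.DataFrame({
    'time': ['2022-05-03T21:46:34+00:00', '2021-05-15T06:04:33+00:00'],
    'ozone': [137.0, 119.0],
})

# Parse the ISO 8601 strings into timezone-aware datetimes
for df in (weather, air_quality):
    df['time'] = pd.to_datetime(df['time'], utc=True)

# Exact join: no two timestamps are identical, so the result is empty
exact = weather.merge(air_quality, on='time', how='inner')
print(exact.shape)

# Nearest join: match each weather row to the closest air-quality row in time
nearest = pd.merge_asof(
    weather.sort_values('time'),
    air_quality.sort_values('time'),
    on='time',
    direction='nearest',
    tolerance=pd.Timedelta('10min'),  # hypothetical tolerance
)
print(nearest)
```

Rows with no reading inside the tolerance keep `NaN` in the joined columns, so they can be dropped or imputed explicitly afterwards.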

***

_**DELETE BEFORE PUBLISHING**_

## Style guide for use cases

### Headers

For styling within your markdown cells, there are two choices you can use for headers.

1) You can use HTML classes specific to the use case styling:

```<p class="usecase-subsection-header">This is a subsection header.</p>```

<p style="font-weight: bold; font-size: 1.2em;">This is a subsection header.</p>

```<p class="usecase-subsection-blurb">This is a blurb header.</p>```

<p style="font-weight: bold; font-size: 1em; font-style:italic;">This is a blurb header.</p>


2) Or if you like you can use the markdown header styles:

```# for h1```

```## for h2```

```### for h3```

```#### for h4```

```##### for h5```

## Plot colour schemes

General advice:
1. Use the same colour or colour palette throughout your notebook, unless variety is necessary
2. Select a palette based on the type of data being represented
3. Consider accessibility (colourblindness, low vision)

#### 1) If all of your plots only use 1-2 colors use one of the company style colors:

| Light theme | Dark Theme |
|-----|-----|
|<p style="color:#2af598;">#2af598</p>|<p style="color:#08af64;">#08af64</p>|
|<p style="color:#22e4ac;">#22e4ac</p>|<p style="color:#14a38e;">#14a38e</p>|
|<p style="color:#1bd7bb;">#1bd7bb</p>|<p style="color:#0f9295;">#0f9295</p>|
|<p style="color:#14c9cb;">#14c9cb</p>|<p style="color:#056b8a;">#056b8a</p>|
|<p style="color:#0fbed8;">#0fbed8</p>|<p style="color:#121212;">#121212</p>|
|<p style="color:#08b3e5;">#08b3e5</p>||


#### 2) If your plot needs multiple colors, choose an appropriate palette using either of the following tutorials:
- https://seaborn.pydata.org/tutorial/color_palettes.html
- https://matplotlib.org/stable/tutorials/colors/colormaps.html

#### 3) Consider accessibility as well.

For qualitative plotting, Seaborn's 'colorblind' palette is recommended. For maps with sequential or diverging data, one of the Color Brewer schemes is recommended; they can be previewed at https://colorbrewer2.org/.

If you want to design your own colour scheme, it should follow the same principles as Cynthia Brewer's research (with variation not only in hue but also in saturation or luminance).

### References

Be sure to acknowledge your sources and any attributions using links or a reference list.

If you have quite a few references, you might wish to have a dedicated section for references at the end of your document, linked using footnote style numbers.

You can connect your in-text reference by adding the number with a HTML link: ```<a href="#fn-1">[1]</a>```

and add a matching ID in the reference list using the ```<fn>``` tag: ```<fn id="fn-1">[1] Author (Year) _Title_, Publisher, Publication location.</fn>```