# Potsdam PM2.5 Deep Learning Forecasting

* Data collected by station DEBB021 between 2013 and 2023 were used.
* To improve the accuracy of the PM2.5 estimates, NO2, O3, SO2, and PM10 pollutant data accepted by the EEA were added.


In [1]:
#pip install tensorflow==2.15.0

In [2]:
#pip install keras-tuner==1.4.6 

In [3]:
# imports
import sys
import os
sys.path.append(os.path.dirname(os.getcwd()))
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [4]:
import model_base as mb
import deep_learning as dl


# %env TF_ENABLE_ONEDNN_OPTS=0
# print(os.environ["TF_ENABLE_ONEDNN_OPTS"])
# oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.

## Data Exploration

* Load Data


In [5]:
df_hourly, df_daily, df_weekly, df_monthly = mb.read_date_freq()
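The internals of `mb.read_date_freq()` are not shown in this notebook. As a rough sketch of how four frames at different frequencies could be derived from a single hourly frame, the example below uses pandas `resample` with a mean aggregation. The helper name `resample_frequencies`, the column name `PM2.5`, and the choice of the mean are assumptions for illustration, not the project's actual code.

```python
import numpy as np
import pandas as pd


def resample_frequencies(df_hourly: pd.DataFrame):
    """Return (hourly, daily, weekly, monthly) frames from an hourly frame.

    Hypothetical helper: aggregates with the mean, which is an assumption.
    """
    df_daily = df_hourly.resample("D").mean()
    df_weekly = df_hourly.resample("W").mean()
    # "MS" labels each monthly bin by the first day of the month.
    df_monthly = df_hourly.resample("MS").mean()
    return df_hourly, df_daily, df_weekly, df_monthly


# Synthetic stand-in for the station data: 14 days of hourly values.
index = pd.date_range("2013-01-01", periods=14 * 24, freq=pd.Timedelta(hours=1))
demo = pd.DataFrame({"PM2.5": np.arange(len(index), dtype=float)}, index=index)

hourly, daily, weekly, monthly = resample_frequencies(demo)
print(len(hourly), len(daily), len(monthly))
```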

# Artificial Neural Network (ANN)

An Artificial Neural Network (ANN) is a computational model based on the structure and functions of biological neural networks. Information flows through networks of interconnected nodes, or neurons, each processing its input and passing its output to the next layer. These networks can learn complex patterns using algorithms that adjust the connections between neurons based on the input data.

ANNs consist of input, hidden, and output layers. The hidden layers can perform nonlinear transformations on the inputs, allowing ANNs to model complex relationships. They're applied in various fields like image and speech recognition, natural language processing, and predictive analytics.
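To make the layer structure concrete, here is a minimal NumPy sketch of a forward pass through a fully connected network with `tanh` hidden layers. It is not the model built in `deep_learning`. The function names, the random initialisation, and the choice of four input features (one per auxiliary pollutant) are assumptions. The hidden sizes mirror the `units` values reported in the training output later in this notebook.

```python
import numpy as np


def init_mlp(n_inputs, hidden_units, n_outputs, seed=0):
    """Create (weights, bias) pairs for each layer with small random weights."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs, *hidden_units, n_outputs]
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_in, fan_out)),
         np.zeros(fan_out))
        for fan_in, fan_out in zip(sizes[:-1], sizes[1:])
    ]


def mlp_forward(x, params):
    """Apply tanh hidden layers, then a linear output layer (regression)."""
    *hidden, (w_out, b_out) = params
    for w, b in hidden:
        x = np.tanh(x @ w + b)  # nonlinear transformation in each hidden layer
    return x @ w_out + b_out


params = init_mlp(n_inputs=4, hidden_units=(448, 256, 160), n_outputs=1)
batch = np.random.default_rng(1).normal(size=(8, 4))  # 8 samples, 4 features
prediction = mlp_forward(batch, params)
print(prediction.shape)  # (8, 1)
```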

By comparison, a Recurrent Neural Network (RNN) is specialized for processing sequences, capturing temporal dependencies by using loops within the network. However, standard RNNs struggle with long-term dependencies due to issues like vanishing gradients.
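The vanishing-gradient effect can be illustrated numerically. The sketch below runs a small `tanh` RNN on random inputs and tracks the norm of the Jacobian of the current hidden state with respect to the initial one. The recurrent weights are rescaled to a spectral norm of 0.9, an assumption chosen so that the decay is guaranteed. All names and sizes are illustrative.

```python
import numpy as np


def rnn_gradient_norms(n_steps=50, hidden_size=16, seed=0):
    """Return the spectral norm of d h_t / d h_0 for t = 1..n_steps."""
    rng = np.random.default_rng(seed)
    w_h = rng.normal(size=(hidden_size, hidden_size))
    w_h *= 0.9 / np.linalg.norm(w_h, 2)  # rescale to spectral norm 0.9
    w_x = rng.normal(size=hidden_size)

    h = np.zeros(hidden_size)
    jacobian = np.eye(hidden_size)  # d h_0 / d h_0
    norms = []
    for _ in range(n_steps):
        x_t = rng.normal()  # scalar input at this time step
        h = np.tanh(w_h @ h + w_x * x_t)
        # Chain rule: d h_t / d h_{t-1} = diag(1 - h_t^2) @ W_h
        jacobian = (np.diag(1.0 - h**2) @ w_h) @ jacobian
        norms.append(np.linalg.norm(jacobian, 2))
    return norms


norms = rnn_gradient_norms()
print(f"step 1: {norms[0]:.3e}, step 50: {norms[-1]:.3e}")
```

Because each step multiplies by a matrix whose norm is at most 0.9, the influence of the first time step on the last shrinks geometrically. This is the limitation that gated architectures are designed to address.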

Long Short-Term Memory (LSTM) networks are a type of RNN designed to overcome this limitation. They include mechanisms called gates that regulate the flow of information and allow the network to retain or discard data over long sequences, making them more effective for tasks like time series analysis and language modeling.
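The gating mechanism can be written out for a single time step. This is a minimal NumPy sketch of the standard LSTM cell equations, not the Keras layer used by `deep_learning`. The stacking order of the gates in the weight matrices and all sizes are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, w, u, b):
    """One LSTM step.

    w: (4H, D), u: (4H, H), b: (4H,), stacked as
    forget gate, input gate, output gate, candidate values.
    """
    z = w @ x_t + u @ h_prev + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # gates take values in (0, 1)
    g = np.tanh(g)                                # candidate cell update
    c_t = f * c_prev + i * g                      # keep part of old state, add new
    h_t = o * np.tanh(c_t)                        # expose a gated view of the cell
    return h_t, c_t


rng = np.random.default_rng(0)
n_features, hidden_size = 4, 8
w = rng.normal(scale=0.1, size=(4 * hidden_size, n_features))
u = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for x_t in rng.normal(size=(24, n_features)):  # e.g. a 24-step window
    h, c = lstm_step(x_t, h, c, w, u, b)
print(h.shape)  # (8,)
```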

Convolutional Neural Networks (CNNs) are another specialized kind of ANN designed for grid-like data, such as images. CNNs employ filters to perform convolution operations that capture spatial hierarchies and features, making them powerful for image and video recognition tasks.
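For time series, the same idea is typically applied along one dimension. The sketch below slides a filter over a sequence and takes a dot product at each position. This is a cross-correlation without kernel flipping, which is what deep learning libraries usually call convolution. The function name and the two example kernels are illustrative assumptions.

```python
import numpy as np


def conv1d_valid(series, kernel):
    """Slide `kernel` over `series` with no padding and stride 1."""
    series = np.asarray(series, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k = len(kernel)
    return np.array(
        [series[i:i + k] @ kernel for i in range(len(series) - k + 1)]
    )


series = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = conv1d_valid(series, [1 / 3, 1 / 3, 1 / 3])  # moving-average filter
differences = conv1d_valid(series, [-1.0, 1.0])         # first-difference filter
print(smoothed)
print(differences)
```

In a trained CNN the kernel values are learned rather than fixed, so the network can discover filters that respond to patterns such as local trends or spikes.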

Each type of network serves a different purpose: plain ANNs handle general pattern recognition, RNNs and LSTMs handle temporal data, and CNNs handle spatial data.


* Train and evaluate the best model
* Hyperparameter tuning with Keras-Tuner
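Keras Tuner automates the search over model configurations. To show the underlying idea without depending on TensorFlow, the sketch below implements a plain random search in pure Python. This is a swapped-in technique, not the Keras Tuner API. The search ranges, the `toy_objective` placeholder (standing in for a validation loss), and all function names are assumptions. The hyperparameter keys loosely follow those printed in the training output of this notebook.

```python
import math
import random


def sample_hyperparameters(rng):
    """Draw one configuration from an assumed search space."""
    num_layers = rng.randint(1, 5)
    return {
        "learning_rate": 10 ** rng.uniform(-5, -2),  # log-uniform sampling
        "num_layers": num_layers,
        "units": [rng.randrange(32, 513, 32) for _ in range(num_layers)],
        "activation": rng.choice(["relu", "tanh"]),
        "dropout": rng.choice([True, False]),
    }


def random_search(objective, n_trials=20, seed=0):
    """Evaluate random configurations and return (best_score, best_config)."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_trials):
        hp = sample_hyperparameters(rng)
        scored.append((objective(hp), hp))
    return min(scored, key=lambda pair: pair[0])


def toy_objective(hp):
    """Placeholder for a validation loss; a real run would train a model."""
    return abs(math.log10(hp["learning_rate"]) + 4) + 0.01 * hp["num_layers"]


best_score, best_hp = random_search(toy_objective)
print(best_score, best_hp)
```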

In [None]:
# Train and evaluate

# Hourly
dl.ann_train_and_evaluate(df_hourly)

# Daily
dl.ann_train_and_evaluate(df_daily, 'D')

# Weekly
dl.ann_train_and_evaluate(df_weekly, 'W')

# Monthly
dl.ann_train_and_evaluate(df_monthly, 'M')


{'Total Data Points': 87648, 'Training Data Size': 52573, 'Validation Data Size': 17524, 'Testing Data Size': 17526}
{'learning_rate': 0.00014014488528467923, 'num_layers': 3, 'units': [448, 256, 160], 'activations': ['tanh', 'tanh', 'tanh', 'tanh', 'tanh'], 'dropout': True}
Epoch 1/10


2023-12-28 01:10:12.437691: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1 Pro
2023-12-28 01:10:12.437711: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB
2023-12-28 01:10:12.437714: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB
2023-12-28 01:10:12.437741: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-12-28 01:10:12.437754: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)


   1/1643 [..............................] - ETA: 15:22 - loss: 0.3757 - mean_absolute_error: 0.4861

2023-12-28 01:10:12.968902: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.
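The training, validation, and testing sizes printed for the hourly run sum to 87,623 samples, and their proportions are consistent with a 60/20/20 split. The sketch below reproduces those sizes under that assumption. The function name and the truncating rounding rule are guesses, and whether `deep_learning` splits chronologically is not shown in this notebook.

```python
def split_sizes(n_samples, train_frac=0.6, val_frac=0.2):
    """Return (train, validation, test) sizes; the test set takes the remainder."""
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    n_test = n_samples - n_train - n_val
    return n_train, n_val, n_test


n_samples = 52573 + 17524 + 17526  # sizes reported in the output above
sizes = split_sizes(n_samples)
print(sizes)  # (52573, 17524, 17526)
```

The 25-point gap between this total and the 87,648 reported data points may come from windowing the series into input sequences, but that is a guess.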




In [None]:
# Hyperparameter tuning

# Hourly
# hourly_best_model, hourly_best_hp = dl.ann_tune_and_evolve(df_hourly)

In [None]:
# # Daily
# daily_best_model, daily_best_hp = dl.ann_tune_and_evolve(df_daily, 'D')

In [None]:

# # Weekly
# weekly_best_model, weekly_best_hp = dl.ann_tune_and_evolve(df_weekly, 'W')

In [None]:

# Monthly
# monthly_best_model, monthly_best_hp = dl.ann_tune_and_evolve(df_monthly, 'M')

# LSTM

* Train and evaluate the best model
* Hyperparameter tuning with Keras-Tuner

In [None]:
# Train and evaluate

# Hourly
dl.lstm_train_and_evaluate(df_hourly)

# Daily
dl.lstm_train_and_evaluate(df_daily, 'D')

# Weekly
dl.lstm_train_and_evaluate(df_weekly, 'W')

# Monthly
dl.lstm_train_and_evaluate(df_monthly, 'M')


In [None]:
# Hyperparameter tuning

# # Hourly
# hourly_best_model, hourly_best_hp = dl.lstm_tune_and_evolve(df_hourly)

In [None]:

# # Daily
#daily_best_model, daily_best_hp = dl.lstm_tune_and_evolve(df_daily, 'D')

In [None]:

# # Weekly
#weekly_best_model, weekly_best_hp = dl.lstm_tune_and_evolve(df_weekly, 'W')

In [None]:

# # Monthly
# monthly_best_model, monthly_best_hp = dl.lstm_tune_and_evolve(df_monthly, 'M')

# CNN 

* Train and evaluate the best model
* Hyperparameter tuning with Keras-Tuner

In [None]:
# Train and evaluate

# Hourly
dl.cnn_train_and_evaluate(df_hourly)

# Daily
dl.cnn_train_and_evaluate(df_daily, 'D')

# Weekly
dl.cnn_train_and_evaluate(df_weekly, 'W')

# Monthly
dl.cnn_train_and_evaluate(df_monthly, 'M')

In [None]:
# Hyperparameter tuning

# # Hourly
# hourly_best_model, hourly_best_hp = dl.cnn_tune_and_evolve(df_hourly)

In [None]:
# # Daily
#daily_best_model, daily_best_hp = dl.cnn_tune_and_evolve(df_daily, 'D')

In [None]:
# # Weekly
# weekly_best_model, weekly_best_hp = dl.cnn_tune_and_evolve(df_weekly, 'W')

In [None]:
# # Monthly
# monthly_best_model, monthly_best_hp = dl.cnn_tune_and_evolve(df_monthly, 'M')