<a href="https://colab.research.google.com/github/kenzo94/unwetterwarnung/blob/master/Unwetterwarnung.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Severe Weather Warning with Arduino Nano 33 BLE**

**1. Download the raw data from Kaggle**

Download your personal API token (`kaggle.json`) from Kaggle and upload it here (note: a Kaggle account and a Google account are required).

In [None]:
from google.colab import drive
from google.colab import files 

drive.mount('/content/drive/')
files.upload() #upload your kaggle.json file


In [3]:
!mkdir -p ~/.kaggle #create a directory called .kaggle in the home folder (no error if it already exists)
!cp kaggle.json ~/.kaggle/ #copy kaggle.json to this folder
!chmod 600 ~/.kaggle/kaggle.json #restrict the copied file to owner read/write only
!rm kaggle.json #remove the original one
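
The same steps can be written in Python, which makes the permission step explicit. This is a minimal sketch and not part of the original notebook: `install_token` is a hypothetical helper, the token content is a placeholder, and the demo runs in a temporary directory instead of the real home folder.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def install_token(token_path, home_dir):
    """Copy a Kaggle API token into <home_dir>/.kaggle and restrict its permissions."""
    target_dir = Path(home_dir) / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)  # like `mkdir -p`
    target = target_dir / "kaggle.json"
    shutil.copy(token_path, target)
    # 0o600: read/write for the owner only, no access for group or others
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target


# Demo in a throwaway directory with a dummy token (placeholder content).
with tempfile.TemporaryDirectory() as tmp:
    dummy = Path(tmp) / "kaggle.json"
    dummy.write_text('{"username": "example", "key": "placeholder"}')
    installed = install_token(dummy, Path(tmp) / "home")
    mode = stat.S_IMODE(installed.stat().st_mode)
    print(installed.name, oct(mode))
```

The Kaggle client warns when the token is readable by other users, which is why the file is limited to the owner.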



**Download the dataset via the Kaggle API**

In [None]:
!kaggle datasets download -d selfishgene/historical-hourly-weather-data #paste the kaggle API command
!unzip -qq /content/historical-hourly-weather-data.zip #unzip the zip file
!rm historical-hourly-weather-data.zip #remove the zip file


**Check prerequisites**

In [None]:
!pip install pandas #pandas: library for loading, cleaning and wrangling tabular data
!pip install keras
!pip install tensorflow

**2. Data processing** 

Identifying gaps in the data

In [None]:
import pandas as pd

df_temp = pd.read_csv('temperature.csv', usecols = ['datetime','Seattle'])
df_temp['Seattle'] = df_temp['Seattle'] - 273.15 # convert from Kelvin to Celsius
df_humid = pd.read_csv('humidity.csv', usecols = ['datetime','Seattle'])
df_pres = pd.read_csv('pressure.csv', usecols = ['datetime','Seattle'])
df_desc = pd.read_csv('weather_description.csv', usecols = ['datetime','Seattle'])

df_list = [df_temp, df_humid, df_pres, df_desc]

for df in df_list:
  df.drop(df.head(1).index, inplace=True)
  df.reset_index(inplace=True, drop=True)
  print("Count of nan values:", df.isnull().values.sum())
  print("Indexes of nan values:", list(df.loc[pd.isna(df['Seattle']), :].index))
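
Besides the positions of the missing values, the length of each gap is worth knowing, because linear interpolation becomes less trustworthy over long runs. The sketch below measures consecutive NaN runs on a made-up series; the variable names are illustrative and do not appear in the notebook.

```python
import numpy as np
import pandas as pd

# Made-up hourly readings with two gaps (lengths 2 and 1).
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, 6.0])

is_gap = s.isna()
# A new run starts wherever the gap flag changes from the previous row.
run_id = is_gap.astype(int).diff().ne(0).cumsum()
# Keep only the rows inside gaps and count the rows per run.
gap_lengths = is_gap[is_gap].groupby(run_id[is_gap]).size().tolist()

print("Gap lengths:", gap_lengths)
```

Applied to the notebook data, `s` would be the `Seattle` column of one of the frames in `df_list`.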

Fill the gaps by linear interpolation (the default of `interpolate`).

In [None]:
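for df in df_list[:3]: # only the numeric frames; weather descriptions cannot be interpolated
  df['Seattle'] = df['Seattle'].interpolate() # linear by default; the text column 'datetime' is left untouched
  print(df)
  print("Count of nan values:", df.isnull().values.sum())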
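
To see what the linear default does, here is a small example on made-up values: each missing point is placed on the straight line between its two known neighbours.

```python
import numpy as np
import pandas as pd

# Two missing readings between 10.0 and 16.0 (made-up values).
s = pd.Series([10.0, np.nan, np.nan, 16.0])

filled = s.interpolate()  # method='linear' is the default
print(filled.tolist())
```

Here the two gaps are filled with 12.0 and 14.0, which are evenly spaced between the known values.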


Define the classes for the classification.

In [None]:
weather_types = {'keinUnwetter':['few clouds', 'scattered clouds', 'broken clouds', 'overcast clouds',
                                  'sky is clear','mist', 'haze', 'fog', 'smoke'],
                 'regnerisch':['light rain', 'moderate rain','light intensity drizzle','drizzle',
                               'heavy intensity drizzle', 'light intensity shower rain', 'shower rain',
                               'light snow', 'snow','light shower snow'],
                 'Unwetter':['heavy snow','heavy intensity rain', 'proximity thunderstorm',
                                'thunderstorm with rain','thunderstorm with heavy rain', 'thunderstorm with light rain',
                               'very heavy rain','heavy intensity shower rain', 'thunderstorm', 'squalls']
                }

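def replace(content):
  # map a raw weather description to one of the three classes
  for weather_type in weather_types.keys():
    for weather in weather_types[weather_type]:
      if content == weather:
        return weather_type
  # no class matched: print the description so it can be added to weather_types;
  # the function then returns None, which shows up as a missing value after map()
  print(content)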

df_desc['Seattle'] = df_desc['Seattle'].map(replace)  
print(df_desc)

labels_unique = df_desc.Seattle.unique()
print("Wetter in Seattle:", labels_unique)
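
The same mapping can also be done with a flat lookup dictionary, which avoids the nested loop for every row. This is an alternative sketch, not the notebook's method: `sample_types` is a shortened stand-in for `weather_types`, and `'tornado'` is a made-up description used to show what happens to unknown values.

```python
import pandas as pd

# Shortened stand-in for the weather_types dictionary.
sample_types = {
    'keinUnwetter': ['sky is clear', 'mist'],
    'regnerisch': ['light rain'],
    'Unwetter': ['thunderstorm'],
}

# Invert the dictionary: description -> class name.
lookup = {desc: cls for cls, descs in sample_types.items() for desc in descs}

descriptions = pd.Series(['mist', 'thunderstorm', 'light rain', 'tornado'])
mapped = descriptions.map(lookup)  # unknown descriptions become NaN

print(mapped.tolist())
print("Unmapped:", descriptions[mapped.isna()].tolist())
```

Listing the unmapped descriptions serves the same purpose as the `print(content)` call in `replace`.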

Merge the individual attributes into one table.

In [None]:
#drop the first 11 rows and the last one so that the data only contains complete 24 h windows
df_temp = df_temp[11:-1].reset_index(drop=True)
df_temp.rename(columns={'datetime':'timestamp','Seattle': 'temperatur'}, inplace=True)
#print(df_temp)

df_humid = df_humid[11:-1].reset_index(drop=True)
df_humid.rename(columns={'datetime':'timestamp','Seattle': 'feuchtigkeit'}, inplace=True)
#print(df_humid)

df_pres = df_pres[11:-1].reset_index(drop=True)
df_pres.rename(columns={'datetime':'timestamp','Seattle': 'druck'}, inplace=True)
#print(df_pres)

df_desc = df_desc[11:-1].reset_index(drop=True)
df_desc.rename(columns={'datetime':'timestamp','Seattle': 'wetter'}, inplace=True)
#print(df_desc)

#left join on the shared timestamp column
df = (df_temp
      .merge(df_humid, how='left', on=['timestamp'])
      .merge(df_pres, how='left', on=['timestamp'])
      .merge(df_desc, how='left', on=['timestamp']))

print("First 50 rows: \n", df[0:50])
print("Last 50 rows: \n", df[-50:])

#final check for null values
print(df['temperatur'].isnull().sum())
print(df['feuchtigkeit'].isnull().sum())
print(df['druck'].isnull().sum())
print(df['wetter'].isnull().sum())
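
A left join keeps every row of the left frame and fills missing matches with NaN, which is why the null check afterwards matters. The toy frames below illustrate this; the values are made up, and `validate='one_to_one'` is an optional safeguard added here (not used in the notebook) that raises an error if a timestamp occurs more than once.

```python
import pandas as pd

# Toy frames: the humidity frame lacks the last timestamp.
temp = pd.DataFrame({'timestamp': ['t1', 't2', 't3'],
                     'temperatur': [10.0, 11.0, 12.0]})
humid = pd.DataFrame({'timestamp': ['t1', 't2'],
                      'feuchtigkeit': [80.0, 75.0]})

merged = temp.merge(humid, how='left', on='timestamp', validate='one_to_one')

print(merged)
print("Missing humidity values:", merged['feuchtigkeit'].isnull().sum())
```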

Statistics of the training data

In [None]:
import matplotlib.pyplot as plt
top_types = df['wetter'].value_counts()
print(top_types, "\n", df.value_counts()) # call value_counts(); without parentheses only the method object is printed
df['wetter'].value_counts().plot(kind='pie', autopct='%1.0f%%', colors=["yellow", "green", "red"])

**3. Train our model**

Train Test Split

In [11]:
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.utils import to_categorical
import numpy as np
from sklearn.model_selection import train_test_split
pd.options.mode.chained_assignment = None  # default='warn'

df_model = df[['temperatur', 'feuchtigkeit', 'druck', 'wetter']].copy() # work on a copy, not on a view of df
df_model['wetter'] = df_model['wetter'].map({'regnerisch': 0, 'Unwetter': 1, 'keinUnwetter': 2}) # class name -> class ID
df_model['wetter'] = df_model['wetter'].astype(int)

labels = to_categorical(df_model.pop('wetter')) #one-hot encode the class IDs
features = np.array(df_model) #convert the remaining feature columns into a NumPy array for Keras

train_features, test_features, train_labels, test_labels = train_test_split(features, labels, test_size=0.15,shuffle=True)
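
`to_categorical` turns the integer class IDs into one-hot vectors. The NumPy sketch below reproduces that layout for a few made-up labels; it only illustrates the shape of `labels` and is not meant to replace the Keras call.

```python
import numpy as np

# Made-up class IDs: 0 = regnerisch, 1 = Unwetter, 2 = keinUnwetter
class_ids = np.array([0, 1, 2, 1])

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(3, dtype="float32")[class_ids]

print(one_hot)
print(one_hot.shape)
```

Each row contains a single 1 at the position of its class, which matches the three softmax outputs of the network.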


Define the neural network.

In [None]:
#Parameters:
NB_classes = 3 #number of outputs
NB_neurones = 30 #main number of neurones
NB_features = 3 #number of inputs
activation_func = tf.keras.activations.relu #activation function used

NB_hidden = 25 #number of hidden Dense layers

#Densely connected neural network
model = tf.keras.Sequential(
    #the first hidden layer declares the input shape
    [tf.keras.layers.Dense(NB_neurones,activation=activation_func,input_shape=(NB_features,))]
    #the remaining hidden layers are identical
    + [tf.keras.layers.Dense(NB_neurones,activation=activation_func) for _ in range(NB_hidden - 1)]
    + [tf.keras.layers.Dropout(0.4), #avoid overfitting
       #softmax outputs an array with one probability per class;
       #the highest one is the predicted class
       tf.keras.layers.Dense(NB_classes,activation=tf.keras.activations.softmax)]
)

model.compile(optimizer="adam",loss=tf.keras.losses.categorical_crossentropy, metrics=['accuracy']) #compile the model

model.summary() #show the layers and parameter counts of our model

Train the model

In [None]:
model.fit(x=train_features,
          y=train_labels,
          epochs=20,
          validation_data=(test_features,test_labels),
          verbose=1,
          shuffle=True) #Train our model


performance=model.evaluate(test_features,test_labels, batch_size=32, verbose=1, steps=None, )[1] * 100
print('Final accuracy : ', round(performance), '%')
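
Because the last layer is a softmax, each prediction is a vector of three probabilities, and the index of the largest one is the class ID assigned before training. The sketch below decodes such vectors back into class names; the probabilities are made up, whereas in the notebook they would come from `model.predict(test_features)`.

```python
import numpy as np

# Index = class ID used when encoding the 'wetter' column.
class_names = ['regnerisch', 'Unwetter', 'keinUnwetter']

# Made-up softmax outputs for two samples.
probabilities = np.array([[0.2, 0.7, 0.1],
                          [0.1, 0.1, 0.8]])

predicted_ids = probabilities.argmax(axis=1)  # position of the highest probability
predicted_names = [class_names[i] for i in predicted_ids]

print(predicted_names)
```

The same index-to-name table is needed on the Arduino side to turn the model output into a warning.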

**4. Convert the model for deployment on the Arduino**

In [None]:
converter = tf.lite.TFLiteConverter.from_keras_model(model) #create a converter
tflite_model = converter.convert() #convert the model without quantization

#write the TFLite model to a file; the with-block closes the file reliably
with open("/content/tflite_model.tflite","wb") as f:
  f.write(tflite_model)

In [None]:
!apt-get install -qq xxd #installing the tool
!echo "const unsigned char model[] = {" > /content/model.h
!cat /content/tflite_model.tflite | xxd -i >> /content/model.h #create a hexadecimal array containing all our parameters
!echo "};" >> /content/model.h

files.download("/content/model.h") #automatically download your file
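
If `xxd` is not available, the header can also be generated in pure Python. This is a sketch under stated assumptions rather than the notebook's method: `bytes_to_c_header` is a hypothetical helper, the extra `model_len` constant is an addition that the shell version does not produce, and the input bytes are placeholders instead of a real model.

```python
def bytes_to_c_header(data, var_name="model", per_line=12):
    """Render raw bytes as a C array definition, similar to the xxd -i output."""
    lines = [f"const unsigned char {var_name}[] = {{"]
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        # Two-digit lowercase hex per byte; a trailing comma is valid in C initializers.
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines.append("};")
    # Extra constant (not in the shell version): array length in bytes.
    lines.append(f"const unsigned int {var_name}_len = {len(data)};")
    return "\n".join(lines) + "\n"


# Placeholder bytes; in the notebook this would be tflite_model.
header = bytes_to_c_header(bytes([0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33]))
print(header)
```

Writing `header` to `/content/model.h` would then replace the three shell commands above.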