# The Regular Neural Network for the Prediction of Bike Sharing

The goal of this project is to predict cnt (the count of new bike shares) for each hour from the given factors. To achieve this, a regular (fully connected) neural network is built with TensorFlow.

First, the required packages and the training data are imported. The data comes from https://www.kaggle.com/c/cee-498-project1-london-bike-sharing

In [None]:
import numpy as np
import tensorflow as tf
import pandas as pd

In [None]:
df = pd.read_csv("../input/cee-498-project1-london-bike-sharing/train.csv")
df

## Data Preprocessing

During preprocessing, "timestamp" is split into "year", "month", "day" and "hour". In addition, "t1" is dropped because "t1" and "t2" are highly correlated.

In [None]:
# dropna returns a new DataFrame, so the result has to be assigned back
df = df.dropna(axis=0, how='any')
df['timestamp'] = pd.to_datetime(df['timestamp'])
df['year'] = df['timestamp'].dt.year
df['month'] = df['timestamp'].dt.month
df['day'] = df['timestamp'].dt.day
df['hour'] = df['timestamp'].dt.hour
df = df.drop(['timestamp','t1'], axis=1)
df
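The decision to drop "t1" rests on its correlation with "t2". A minimal sketch of how such a check might look is given below. It uses synthetic stand-in values rather than the Kaggle data, and the names `t1_demo` and `t2_demo` are hypothetical. On the real data, the same call could be applied to the two columns before "t1" is dropped.

```python
import numpy as np

# Synthetic stand-ins for the two temperature columns (NOT the Kaggle data).
rng = np.random.default_rng(0)
t1_demo = rng.uniform(-2.0, 34.0, size=1000)                # hypothetical stand-in for "t1"
t2_demo = t1_demo - 1.0 + rng.normal(0.0, 1.5, size=1000)   # hypothetical stand-in for "t2"

# Pearson correlation coefficient between the two columns.
r = np.corrcoef(t1_demo, t2_demo)[0, 1]
print("Pearson r between t1_demo and t2_demo: {:.3f}".format(r))
```

A coefficient close to 1 would suggest that the two columns carry nearly the same information, which is the usual argument for keeping only one of them.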

The columns are converted to a TensorFlow dataset. "cnt" is the target and the remaining columns are the input features.

In [None]:
cnt = df.pop('cnt')
dataset = tf.data.Dataset.from_tensor_slices((df.values, cnt.values))

In [None]:
for feat, targ in dataset.take(5):
  print ('Features: {}, Target: {}'.format(feat, targ))

The training dataset is shuffled and split into mini-batches of 150 rows.

In [None]:
train_dataset = dataset.shuffle(len(df)).batch(batch_size=150).repeat(20)

In [None]:
for feat, targ in train_dataset.take(1):
  print ('Features: {}, Target: {}'.format(feat, targ))
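Because `repeat(20)` is applied after `batch`, one pass over `train_dataset` contains 20 reshuffled passes over the data, so each Keras epoch covers the training set 20 times. A rough sketch of the arithmetic follows, assuming a hypothetical row count `n_rows` (the real value is `len(df)`):

```python
import math

n_rows = 12000        # hypothetical number of training rows; use len(df) in practice
batch_size = 150      # as in the cell above
repeats = 20          # as in the cell above

batches_per_pass = math.ceil(n_rows / batch_size)   # the last batch may be smaller
steps_per_epoch = batches_per_pass * repeats        # batches Keras sees in one epoch

print("batches per pass:", batches_per_pass)
print("steps per epoch:", steps_per_epoch)
```

This matters when reading the epoch count passed to `model.fit`: the number of passes over the raw data is the number of epochs multiplied by the repeat count.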

## The Architecture of the Model

The layers are listed below. The loss is the mean squared error. The learning rate is adjusted from epoch to epoch by a scheduler callback, and training stops early if the loss has not improved for 32 epochs.

In [None]:
model = tf.keras.Sequential(
    [tf.keras.layers.Dense(units=32, input_shape=(11,)),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.Dense(64, activation='relu'),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.Dense(128, activation='relu'),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.Dense(64, activation='relu'),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.Dense(1)
    ])

model.compile(
     optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
     loss='mean_squared_error',
    )

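def scheduler(epoch, lr):
    # Keep the rate fixed for the first 10 epochs, then decay it exponentially.
    if epoch < 10:
        return lr
    else:
        # np.exp keeps the result a plain float, which the callback expects.
        return float(lr * np.exp(-0.1)**(epoch//3-3))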

callback = [tf.keras.callbacks.LearningRateScheduler(scheduler),
            tf.keras.callbacks.EarlyStopping(monitor='loss',
                                             min_delta=0, patience=32,
                                             mode='min', restore_best_weights=True)]

# callback is already a list, so it is passed directly
history = model.fit(train_dataset, epochs=300, callbacks=callback)
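`LearningRateScheduler` passes the optimizer's current learning rate to the schedule function, so the factor returned by `scheduler` compounds from one epoch to the next instead of being applied to the initial rate. A minimal pure-Python sketch traces the first 18 epochs under that behaviour. No training is run, and early stopping is ignored.

```python
import math

def scheduler(epoch, lr):
    # Same rule as the cell above, written with math.exp so it returns a float.
    if epoch < 10:
        return lr
    return lr * math.exp(-0.1) ** (epoch // 3 - 3)

lr = 0.001            # initial learning rate passed to Adam
rates = []
for epoch in range(18):
    lr = scheduler(epoch, lr)   # the output becomes the next epoch's input
    rates.append(lr)

for epoch in (9, 11, 12, 14, 15, 17):
    print("epoch {:2d}: lr = {:.6f}".format(epoch, rates[epoch]))
```

The exponent `epoch//3-3` is 0 for epochs 10 and 11, so the rate only starts to fall at epoch 12, and the per-epoch factor itself shrinks every three epochs.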

## The Test of the Model

The test data is imported and preprocessed in the same way as the training data.

The performance of the neural network is evaluated based on the RMSE of predictions.

In [None]:
df2 = pd.read_csv("../input/cee-498-project1-london-bike-sharing/test.csv")
# dropna returns a new DataFrame, so the result has to be assigned back
df2 = df2.dropna(axis=0, how='any')
df2['timestamp'] = pd.to_datetime(df2['timestamp'])
df2['year'] = df2['timestamp'].dt.year
df2['month'] = df2['timestamp'].dt.month
df2['day'] = df2['timestamp'].dt.day
df2['hour'] = df2['timestamp'].dt.hour
df_test = df2.drop(['timestamp','t1'], axis=1)
df_test
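The RMSE mentioned above is the square root of the mean squared difference between predicted and true counts. A minimal sketch with made-up numbers follows. The helper name `rmse` and both arrays are hypothetical, and in practice the true counts would have to come from labelled data, for example a held-out part of train.csv.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Made-up counts for illustration only (NOT model output).
y_true_demo = [100, 250, 400]
y_pred_demo = [110, 240, 430]
print("RMSE:", rmse(y_true_demo, y_pred_demo))
```

Since the model is trained on mean squared error, the square root of the training loss reported by Keras is on the same scale as this metric.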

In [None]:
test_dataset = tf.data.Dataset.from_tensor_slices(df_test.values).batch(1)
for feat in test_dataset.take(5):
  print ('Features: {}'.format(feat))

In [None]:
prediction = model.predict(test_dataset)

In [None]:
df2['cnt']=prediction
df2

In [None]:
output = df2.drop(['t1','t2','hum','wind_speed','weather_code','is_holiday','is_weekend','season','year','month','day','hour'], axis=1)

The CSV file of results is uploaded to Kaggle for evaluation.

In [None]:
output.to_csv('output_32_64_128_64_1_001xe0.1xepoch3-3_min_loss_32.csv',index=False)

The model is saved in h5 format.

In [None]:
model.save("32_64_128_64_1_001xe0.1xepoch3-3_min_loss_32.h5")