## Predicting the number of customers of Rossmann stores with an Artificial Neural Network in TensorFlow

The structure of this network is the same as that of a neural network that processes the same dataset with Keras (repository: https://github.com/oleksandrkim/Predicitng-sales-and-number-of-customers-of-Rossmann-stores-with-Artificial-Neural-Network-in-Keras). <br>
**It mimics the Keras model but is built in TensorFlow.**

### Information about the dataset

- Number of observations (rows): **1 017 209**
- Number of features: **19**
- Dataset: https://www.kaggle.com/c/rossmann-store-sales
- Data fields description: can be found here https://www.kaggle.com/c/rossmann-store-sales

### Importing main libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import preprocessing
import warnings; warnings.simplefilter('ignore')

**Loading the dataset. The two datasets from the website were merged separately; the merged version is loaded here.**

In [2]:
df_un = pd.read_csv("exported.csv")

**Some data preprocessing**

In [3]:
#convert to datetime
df_un['Date'] = pd.to_datetime(df_un['Date'])

#create a month column from date column
df_un['month'] = df_un['Date'].dt.month

#create seasonal column
conditions = [
    (df_un['month'] == 1) | (df_un['month'] == 2) | (df_un['month'] == 12),
    (df_un['month'] == 3) | (df_un['month'] == 4) | (df_un['month'] == 5),
    (df_un['month'] == 6) | (df_un['month'] == 7) | (df_un['month'] == 8)  
]

choices = ['Winter', 'Spring', 'Summer']
df_un['Season'] = np.select(conditions, choices, default='Autumn')
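
The month-to-season rule above can also be written as a lookup table. The sketch below is only an illustration on a made-up four-row frame; `SEASON_BY_MONTH` and `add_season` are hypothetical names that do not appear in the original notebook.

```python
import pandas as pd

# Hypothetical lookup table: month number (1-12) -> season name used above.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}

def add_season(frame, date_col="Date"):
    """Return a copy of `frame` with `month` and `Season` derived from `date_col`."""
    out = frame.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    out["month"] = out[date_col].dt.month
    out["Season"] = out["month"].map(SEASON_BY_MONTH)
    return out

# Made-up dates, one per season.
toy = pd.DataFrame({"Date": ["2015-01-15", "2015-04-01", "2015-07-31", "2015-10-05"]})
toy_with_season = add_season(toy)
print(toy_with_season)
```

A dictionary keeps all twelve months explicit, so a missing month shows up as `NaN` instead of silently falling into the default season.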

**Removing rows where CompetitionDistance is NA and days when shops were closed**

In [4]:
df_un = df_un[df_un['CompetitionDistance'].notnull()]
df_un = df_un[df_un['Open']!=0]
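
As a small illustration of the two filters, the sketch below applies both conditions as one boolean mask and counts the dropped rows. The frame is made up; only the column names come from the dataset.

```python
import numpy as np
import pandas as pd

# Made-up rows: one with a missing distance, one for a closed shop.
toy = pd.DataFrame({
    "CompetitionDistance": [4610.0, np.nan, 570.0, 1270.0],
    "Open": [1, 1, 0, 1],
})

# Keep rows that have a distance AND belong to an open day.
keep = toy["CompetitionDistance"].notnull() & (toy["Open"] != 0)
filtered = toy[keep]
n_dropped = len(toy) - len(filtered)
print(n_dropped, list(filtered.index))
```

Note that the surviving rows keep their original index labels, so the index has gaps after filtering.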

**A value to predict**

In [5]:
y = df_un.iloc[:, 4] #4 for customers, 3 for sales

**Creating a separate dataframe with categorical variables to apply get_dummies** <br>
Only some columns from the dataset are used: DayOfWeek, Promo, StateHoliday, SchoolHoliday, StoreType, Assortment, month, Season


In [6]:
#indexes of columns with and without categorical variables
col_list = [1,6,7,8,9,10,19,20]
no_cat_var = [11]

#.copy() so that later column assignments do not write to a slice of df_un
df_un_cat = df_un.iloc[:, col_list].copy()
df_un_non_cat = df_un.iloc[:, no_cat_var]

**Convert some variables to "category" so that get_dummies encodes them**

In [7]:
#conversion so get_dummies works
df_un_cat['Promo']= df_un_cat['Promo'].astype('category')
df_un_cat['SchoolHoliday'] = df_un_cat['SchoolHoliday'].astype('category')
df_un_cat['month']= df_un_cat['month'].astype('category')
df_un_cat['DayOfWeek']= df_un_cat['DayOfWeek'].astype('category')

**Applying get_dummies** <br/>
Dropping the first dummy column of each variable avoids perfect collinearity among the dummies, so drop_first is set to True

In [8]:
df = pd.get_dummies(df_un_cat, drop_first=True)
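
To make the effect of the category conversion and of `drop_first` concrete, here is a minimal sketch on a made-up frame. It assumes the default behaviour of `pd.get_dummies`, which encodes object and category columns and passes numeric columns through unchanged.

```python
import pandas as pd

# Made-up frame: one string column, one numeric column converted to category,
# and one numeric column left as plain integers.
toy = pd.DataFrame({
    "StoreType": ["a", "b", "c", "a"],
    "Promo": pd.Series([0, 1, 1, 0]).astype("category"),
    "SchoolHoliday": [1, 0, 0, 1],
})

full = pd.get_dummies(toy)                       # one column per category level
reduced = pd.get_dummies(toy, drop_first=True)   # first level of each variable dropped

print(sorted(full.columns))
print(sorted(reduced.columns))
```

`SchoolHoliday` stays a single integer column because it was not converted, which is why the notebook casts such columns to `category` first.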

In [9]:
pd.options.display.max_columns = None

In [10]:
df.head()

Unnamed: 0,DayOfWeek_2,DayOfWeek_3,DayOfWeek_4,DayOfWeek_5,DayOfWeek_6,DayOfWeek_7,Promo_1,StateHoliday_a,StateHoliday_b,StateHoliday_c,SchoolHoliday_1,StoreType_b,StoreType_c,StoreType_d,Assortment_b,Assortment_c,month_2,month_3,month_4,month_5,month_6,month_7,month_8,month_9,month_10,month_11,month_12,Season_Spring,Season_Summer,Season_Winter
1,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0
2,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1
3,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1
4,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1
5,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1


**Adding continuous variables to the encoded categorical ones**

In [11]:
X = pd.merge(df, df_un_non_cat, left_index=True, right_index=True)
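
The merge above aligns the two frames on their index labels, which matters because the index has gaps after closed days were filtered out. A minimal sketch with made-up values, comparing it with `DataFrame.join`:

```python
import pandas as pd

# Made-up frames that share a non-contiguous index, as after row filtering.
dummies = pd.DataFrame({"Promo_1": [1, 0, 1]}, index=[0, 2, 5])
continuous = pd.DataFrame({"CompetitionDistance": [4610.0, 570.0, 1270.0]},
                          index=[0, 2, 5])

# Index-on-index merge, as in the cell above.
merged = pd.merge(dummies, continuous, left_index=True, right_index=True)

# join() also aligns on the index and gives the same rows here.
joined = dummies.join(continuous)
print(merged)
```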

**Final dataset**

In [12]:
X.head()

Unnamed: 0,DayOfWeek_2,DayOfWeek_3,DayOfWeek_4,DayOfWeek_5,DayOfWeek_6,DayOfWeek_7,Promo_1,StateHoliday_a,StateHoliday_b,StateHoliday_c,SchoolHoliday_1,StoreType_b,StoreType_c,StoreType_d,Assortment_b,Assortment_c,month_2,month_3,month_4,month_5,month_6,month_7,month_8,month_9,month_10,month_11,month_12,Season_Spring,Season_Summer,Season_Winter,CompetitionDistance
1,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,4610.0
2,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4610.0
3,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4610.0
4,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4610.0
5,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4610.0


### Creation of a Neural Network

**Train-test split**

In [13]:
from sklearn.model_selection import train_test_split
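#NOTE: `df` (30 dummy columns) is split here, not `X`, so CompetitionDistance is left out;
#this matches the 30-feature placeholder defined later
X_train, X_test, y_train, y_test = train_test_split(df, y, test_size = 0.2)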


from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
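
The scaler is fitted on the training split only and then reused on the test split. A rough NumPy sketch of that computation is shown below on made-up numbers; `fit_standardizer` and `apply_standardizer` are hypothetical helper names, not scikit-learn functions.

```python
import numpy as np

def fit_standardizer(train):
    """Compute per-column mean and population std (ddof=0) from the training data."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0.0] = 1.0  # leave constant columns unscaled instead of dividing by zero
    return mean, std

def apply_standardizer(data, mean, std):
    """Scale `data` with statistics that were computed on the training data."""
    return (data - mean) / std

# Made-up training and test matrices with two features.
train = np.array([[0.0, 10.0], [1.0, 20.0], [1.0, 30.0], [0.0, 40.0]])
test = np.array([[1.0, 25.0]])

mean, std = fit_standardizer(train)
train_scaled = apply_standardizer(train, mean, std)
test_scaled = apply_standardizer(test, mean, std)  # test statistics are never used
print(test_scaled)
```

Reusing the training statistics keeps information about the test split out of the preprocessing step.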

**Reshaping the target is needed to feed the data into TensorFlow**

In [14]:
#column vector of floats, shape (n_samples, 1), to match the y placeholder
y_train = y_train.values.reshape(-1, 1).astype(float)

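**Create placeholders for x and y, and the layers** <br>
Biases are initialized with zeros <br>
Kernels are initialized with the Glorot uniform initializer <br>
4 hidden layers with 64 neurons each (this is the only difference from the Keras model), preceded by a 30-unit dense layer <br>
Dropout with rate 0.1 follows each hidden layer to reduce overfitting (10% of units are dropped; note that `tf.layers.dropout` only drops units when called with `training=True`, which the code below does not set) <br>

The cost is calculated with MAE (Mean Absolute Error)<br>
The optimizer is Adam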
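
For readers unfamiliar with these building blocks, the NumPy sketch below shows roughly what each one computes: the Glorot uniform range, a dense ReLU layer, inverted dropout, and MAE. It is an illustration under simplified assumptions, not the TensorFlow implementation, and all function names are hypothetical.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Sample a kernel from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def dense_relu(inputs, kernel, bias):
    """Fully connected layer followed by ReLU."""
    return np.maximum(inputs @ kernel + bias, 0.0)

def dropout(inputs, rate, rng, training):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest.
    Returns the inputs unchanged when `training` is False."""
    if not training:
        return inputs
    keep_mask = rng.random(inputs.shape) >= rate
    return inputs * keep_mask / (1.0 - rate)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

rng = np.random.default_rng(0)
kernel = glorot_uniform(30, 64, rng)   # 30 input features -> 64 units
bias = np.zeros(64)                    # zero-initialized biases
batch = rng.normal(size=(5, 30))       # made-up batch of 5 rows

hidden = dense_relu(batch, kernel, bias)
hidden_train = dropout(hidden, rate=0.1, rng=rng, training=True)
hidden_eval = dropout(hidden, rate=0.1, rng=rng, training=False)

# Made-up targets and predictions: errors of 50 and 100 average to 75.
example_mae = mae(np.array([[500.0], [700.0]]), np.array([[450.0], [800.0]]))
print(hidden.shape, example_mae)
```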

In [15]:
import tensorflow as tf
import numpy as np
import uuid

x = tf.placeholder(shape=[None, 30], dtype=tf.float32) #number of features
y = tf.placeholder(shape=[None, 1], dtype=tf.float32)


dense = tf.layers.dense(x, 30, activation = tf.nn.relu,
                        bias_initializer = tf.zeros_initializer(),
                        kernel_initializer = tf.glorot_uniform_initializer())
dropout = tf.layers.dropout(inputs = dense, rate = 0.1)
dense = tf.layers.dense(dropout, 64, activation = tf.nn.relu,
                        bias_initializer = tf.zeros_initializer(),
                        kernel_initializer = tf.glorot_uniform_initializer())
dropout = tf.layers.dropout(inputs = dense, rate = 0.1)
dense = tf.layers.dense(dropout, 64, activation = tf.nn.relu,
                        bias_initializer = tf.zeros_initializer(),
                        kernel_initializer = tf.glorot_uniform_initializer())
dropout = tf.layers.dropout(inputs = dense, rate = 0.1)
dense = tf.layers.dense(dropout, 64, activation = tf.nn.relu,
                        bias_initializer = tf.zeros_initializer(),
                        kernel_initializer = tf.glorot_uniform_initializer())
dropout = tf.layers.dropout(inputs = dense, rate = 0.1)
dense = tf.layers.dense(dropout, 64, activation = tf.nn.relu,
                        bias_initializer = tf.zeros_initializer(),
                        kernel_initializer = tf.glorot_uniform_initializer())
dropout = tf.layers.dropout(inputs = dense, rate = 0.1)
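#NOTE: sigmoid bounds the prediction to (0, 1), while the customer counts are far larger;
#a linear output (activation = None) is the usual choice for regression
output = tf.layers.dense(dropout, 1, activation = tf.nn.sigmoid)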

cost = tf.losses.absolute_difference(y, output) #mae
optimizer = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(cost)
init = tf.global_variables_initializer()

tf.summary.scalar("cost", cost)
merged_summary_op = tf.summary.merge_all()

with tf.Session() as sess:
    sess.run(init)
    uniq_id = "/tmp/tensorboard-layers-api/" + str(uuid.uuid1())[:6]
    summary_writer = tf.summary.FileWriter(uniq_id, graph=tf.get_default_graph())
    x_vals = X_train
    y_vals = y_train
    for step in range(100):
        _, val, summary = sess.run([optimizer, cost, merged_summary_op],
                                   feed_dict={x: x_vals, y: y_vals})
        if step % 20 == 0:
            print("step: {}, value: {}".format(step, val))
            summary_writer.add_summary(summary, step)

step: 0, value: 762.8004760742188
step: 20, value: 762.5020751953125
step: 40, value: 762.4652099609375
step: 60, value: 762.3936157226562
step: 80, value: 762.3175659179688


Results are similar to the first steps of training in Keras
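
One plausible reading of the nearly flat cost is that the sigmoid output layer keeps every prediction between 0 and 1, so the MAE can never fall much below the mean of the targets. The sketch below illustrates this with made-up customer counts. It is an assumption about the cause, not a diagnosis confirmed by the notebook.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up customer counts, all far above 1.
y_true = np.array([500.0, 700.0, 900.0, 1100.0])

# Whatever the pre-activation values are, a sigmoid output stays at or below 1.
logits = np.array([-5.0, 0.0, 5.0, 50.0])
bounded_pred = sigmoid(logits)
mae_bounded = np.mean(np.abs(y_true - bounded_pred))

# Lower bound on the MAE for any prediction that cannot exceed 1.
mae_floor = y_true.mean() - 1.0

# An unbounded (linear) output can get close to the targets.
linear_pred = np.array([510.0, 690.0, 905.0, 1090.0])
mae_linear = np.mean(np.abs(y_true - linear_pred))

print(mae_bounded, mae_floor, mae_linear)
```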

In [None]:
#TODO: batching
#input_func = tf.estimator.inputs.numpy_input_fn({'x':x_train},y_train,batch_size=4,num_epochs=None,shuffle=True) (???)
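
As one possible way to approach the batching TODO, the sketch below shuffles the rows once per epoch and yields fixed-size slices of NumPy arrays. `iterate_minibatches` is a hypothetical helper, the arrays are made up, and this is not the `tf.estimator` input function mentioned in the comment.

```python
import numpy as np

def iterate_minibatches(features, targets, batch_size, rng):
    """Yield (features, targets) batches covering every row once, in random order."""
    order = rng.permutation(len(features))
    for start in range(0, len(features), batch_size):
        idx = order[start:start + batch_size]
        yield features[idx], targets[idx]

rng = np.random.default_rng(0)
features = np.arange(20, dtype=float).reshape(10, 2)  # made-up 10 x 2 feature matrix
targets = np.arange(10, dtype=float).reshape(10, 1)   # made-up 10 x 1 target column

batches = list(iterate_minibatches(features, targets, batch_size=4, rng=rng))
print([len(xb) for xb, _ in batches])
```

Inside the session loop, each `(xb, yb)` pair could then be passed as `feed_dict={x: xb, y: yb}` instead of feeding the full training set at every step.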