# Fully Connected Network

This is the second part of our overall deep learning planning framework.

The data in these experiments is generated by the RDDL simulator ([GitHub](https://github.com/ssanner/rddlsim)), written by Prof. Scott Sanner at the University of Toronto.

In this section, we need a fully connected network to compute the reward of each (STATE, ACTION, STATE') tuple, that is, (STATE, ACTION, STATE') -> Reward. Since this mapping is deterministic, a fully connected network is capable of learning it.

Problem list:
1. Data normalization strongly affects network performance, so we need to normalize the input. However, the input of this section is the output of a VAE, which is unnormalized. Since everything runs inside TensorFlow, we need to build the normalizer inside the TensorFlow graph.
2. In a nondeterministic domain, the reward function R(s,a) is stochastic, but we need a deterministic function. Therefore, we rewrite the reward function as R(s,a,s'). This requires us to concatenate s, a, and s' into a single input matrix (also inside TensorFlow).
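
The two points above can be illustrated outside the graph. The following is a minimal NumPy sketch, not the notebook's TensorFlow code: the toy arrays, their shapes, and the variable names are assumptions made for illustration. It concatenates s, a, and s' column-wise and then applies the same formula that `tf.nn.batch_normalization` computes when no scale or offset is given, namely (x - mean) / sqrt(var + epsilon).

```python
import numpy as np

# Toy batch of 3 transitions; shapes and values are illustrative assumptions
s = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])      # state
a = np.array([[0.5, 0.5], [0.5, -0.5], [-0.5, 0.5]])    # action
s_next = s + a                                           # next state

# Concatenate s, a, s' column-wise into a single input matrix
sas = np.concatenate([s, a, s_next], axis=1)

# Per-feature statistics, computed once and reused for later batches
mean = sas.mean(axis=0)
var = sas.var(axis=0)
epsilon = 1e-3  # plays the same role as the epsilon used in the graph

# Batch normalization without scale or offset
normed = (sas - mean) / np.sqrt(var + epsilon)
print(sas.shape)
print(normed.mean(axis=0))
```

Storing `mean` and `var` as non-trainable values, as the network class later in this notebook does, means the same statistics are applied at training and prediction time.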

### Import Packages
We do not provide pip installation commands; please look up each package and install it with pip install. Upgrade pip before installing, since an old pip can cause errors.

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from sklearn.metrics import confusion_matrix
import time
from datetime import timedelta
import math
import os
import pandas as pd
#Functional coding
import functools
from functools import partial
import re

### Load Data
We load data from CSV files. The following code shows how to load data through pandas and NumPy. The result of this process can be fed into TensorFlow through the "feed_dict" argument.

In [2]:
#Data Path..
Datapath="DATA/Navigation/Nav_RDDL_Data.txt"
Labelpath="DATA/Navigation/Nav_RDDL_Label.txt"
Rewardpath="DATA/Navigation/Nav_RDDL_Reward.txt"

In [3]:
#Given local path, find full path
def PathFinder(path):
    #python 2
    #script_dir = os.path.dirname('__file__')
    #fullpath = os.path.join(script_dir,path)
    #python 3
    fullpath=os.path.abspath(path)
    print(fullpath)
    return fullpath

#Read Data for Deep Learning
def ReadData(path):
    fullpath=PathFinder(path)
    return pd.read_csv(fullpath, sep=',', header=0)

#Won't use this one to normalize
#Input Normalization
def Normalize(features, mean=None, std=None):
    # Compute per-column statistics unless the caller supplies both
    if mean is None or std is None:
        mean = np.mean(features, axis = 0)
        std = np.std(features, axis = 0)
    # Broadcasting applies each column's mean and std to every row
    new_feature = (features - mean) / std
    # Columns with zero variance produce 0/0 = NaN; map those entries to 0
    new_feature[np.isnan(new_feature)]=0
    return new_feature, mean, std

In [4]:
x_pd = ReadData(Datapath)
y_pd = ReadData(Labelpath)

/home/wuga/Documents/Notebook/VAE-PLANNING/DATA/Navigation/Nav_RDDL_Data.txt
/home/wuga/Documents/Notebook/VAE-PLANNING/DATA/Navigation/Nav_RDDL_Label.txt


In [5]:
#Get Numpy Arrays
# .values returns the underlying NumPy array (as_matrix() is gone from newer pandas)
x_matrix=x_pd.values
x_train = x_matrix[:50000]
x_valid = x_matrix[50000:]
y_matrix=y_pd.values
y_train = y_matrix[:50000]
y_valid = y_matrix[50000:]

In [6]:
len(x_matrix)

60000

In [7]:
data_size=len(x_matrix)
# Uppercase for constants
INPUT_S_A_SIZE = 4
OUTPUT_SIZE = 2

### Support Functions
The following functions let us chain multiple functions without explicitly assigning intermediate outputs to variables.

In [8]:
def compose(f,g):
    return lambda x:g(f(x))
    
def composeAll(*args):
    """
    composeAll([f,g,h])(x): h(g(f(x))), i.e. functions are applied left to right
    """
    return partial(functools.reduce, compose)(*args)
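
As a quick check of the application order, here is a small self-contained sketch. It repeats the two helpers so it runs on its own; the three toy functions `add_one`, `double`, and `square` are hypothetical names introduced only for this example.

```python
import functools
from functools import partial

def compose(f, g):
    # Apply f first, then g
    return lambda x: g(f(x))

def composeAll(*args):
    # Called with a single list of functions, as in this notebook
    return partial(functools.reduce, compose)(*args)

add_one = lambda v: v + 1
double = lambda v: v * 2
square = lambda v: v ** 2

pipeline = composeAll([add_one, double, square])
result = pipeline(3)  # square(double(add_one(3))) = (4 * 2) ** 2
print(result)         # 64
```

Because the first function in the list runs first, a list of layers ordered from input to output can be handed to `composeAll` directly.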

## Tensorflow
### Input tensor placeholders

In [9]:
# Input features s,a
x = tf.placeholder(tf.float32,[None, INPUT_S_A_SIZE],name="Features_S_A")

# Input label
y = tf.placeholder(tf.float32, [None, OUTPUT_SIZE],name="Labels")

### Variable Generating functions

In [10]:
#Weight constructing function
def weight_variable(shape):
    initial = tf.truncated_normal(shape,stddev=0.001)
    return tf.Variable(initial,name="Matrix")

#Bias constructing function
def bias_variable(shape):
    initial = tf.constant(0.,shape=shape)
    return tf.Variable(initial,name="Bias")

### Fully Connected Layer Definition

In [11]:
class Dense():
    """Fully Connected Layer"""
    def __init__(self, scope="fully_connected_layer", output_dim =None, dropout=1.0, activation=tf.identity):
        assert output_dim, "Missing output dimension specification!"
        self.scope = scope
        self.output_dim = output_dim
        self.dropout = dropout
        self.activation = activation
        
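    def __call__(self,x):
        with tf.name_scope(self.scope):
            # Create the parameters lazily on the first call, once the input width is known
            if not hasattr(self, 'w'):
                self.w_var = weight_variable([x.get_shape()[1].value, self.output_dim])
                # Dropout is applied to the weight matrix itself (keep probability self.dropout)
                self.w = tf.nn.dropout(self.w_var,self.dropout)
                self.b = bias_variable([self.output_dim])
            return self.activation(tf.matmul(x,self.w)+self.b)
    
    def set_parameters(self, weight, bias):
        # assign() only builds an op, so run it in the default session
        # (the InteractiveSession opened by DeepNet). self.w is the dropout
        # output tensor, so the assignment targets the underlying variable.
        sess = tf.get_default_session()
        sess.run([self.w_var.assign(weight), self.b.assign(bias)])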
        
    def get_l2_loss(self):
        return tf.nn.l2_loss(self.w)

### Full Deep Network Class
The following class defines a complete deep network, which includes:
1. Network structure specification
2. Loss function specification
3. Prediction specification
4. Optimization method specification
5. Training function
6. Saving function (not TensorFlow variable saving, but NumPy weight dumping!)
7. Loading function (not TensorFlow variable loading, but NumPy weight assignment!)
8. Mini-batch generation function
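
Item 8 can be understood independently of TensorFlow. The sketch below is one possible way to write mini-batch generation, not the class's own method: the function name `get_batches` and the toy lists are assumptions. It relies on the fact that slicing past the end of a sequence is safe in Python, so the final, shorter batch keeps every remaining sample.

```python
def get_batches(x_rows, r_rows, batch_size):
    """Split two aligned sequences into [x_batch, r_batch] pairs."""
    batches = []
    for start in range(0, len(x_rows), batch_size):
        stop = start + batch_size  # may exceed len(x_rows); slicing truncates
        batches.append([x_rows[start:stop], r_rows[start:stop]])
    return batches

xs = list(range(10))
rs = [v * 10 for v in xs]
batches = get_batches(xs, rs, batch_size=4)
sizes = [len(b[0]) for b in batches]
print(sizes)  # [4, 4, 2]
```

The same slicing works unchanged on NumPy arrays, which is the form of the training data in this notebook.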

In [12]:
class DeepNet(object):
    
    def __init__(self, 
                 x, #Input Features for S,A
                 y, #Output Label for R
                 num_hidden_layers, #number of hidden layers
                 num_hidden_nodes, #number of nodes in each hidden layer
                 activation, #nonlinear activation function
                 learning_rate=0.01, #Learning rate
                 batch_size=100, #Batch size
                 dropout = 1, #Keep probability passed to tf.nn.dropout (1 disables dropout)
                 l2_lambda = 1E-4): #L2 regularization weight
        self.mean = tf.Variable(tf.zeros([x.get_shape()[1]]),trainable=False,name="NORM_MEAN")
        self.var = tf.Variable(tf.ones([x.get_shape()[1]]),trainable=False,name="NORM_VAR")
        self.f = self._p_normalize(x)
        self.y = y
        self.num_hidden_layers = num_hidden_layers
        self.num_hidden_nodes = num_hidden_nodes
        self.activation = activation
        self.learning_rate = learning_rate
        self.batch_size = batch_size
        self.l2_lambda = l2_lambda
        self.dropout = dropout
        self._p_create_dnn_graph()
        self._p_create_loss()
        self.sess = tf.InteractiveSession()
        self.sess.run(tf.global_variables_initializer())
        
    def _p_create_dnn_graph(self):

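        # Hidden layers followed by an output layer sized to the label.
        # Note: self.activation is not passed to Dense here, so every layer
        # uses Dense's default activation, tf.identity.
        layers = []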
        for i in range(self.num_hidden_layers):
            layers.append(Dense("Layer"+str(i),self.num_hidden_nodes,self.dropout))
        layers.append(Dense("Layer"+str(self.num_hidden_layers),self.y.get_shape()[1].value,self.dropout))
        self.y_pred = composeAll(layers)(self.f)
        self.layers = layers 
    
    def _p_normalize(self, unnormed):
        epsilon = 1e-3
        normed = tf.nn.batch_normalization(unnormed,self.mean,self.var,None,None,epsilon)
        return normed
        
    def _p_create_loss(self): #lambda for l2 regularization

        #L2 regularization loss
        l2_loss = tf.constant(0.0)
        for layer in self.layers:
            l2_loss += layer.get_l2_loss()

        #Mean Squared Error
        mse_r = tf.reduce_mean(tf.square(self.y - self.y_pred), reduction_indices=1)

        #loss
        self.loss = tf.reduce_mean(mse_r)+self.l2_lambda*l2_loss
        self.optimizer = tf.train.RMSPropOptimizer(self.learning_rate).minimize(self.loss)   
        
    def update_normalization(self,x_matrix):
        tf_mean,tf_var = tf.nn.moments(x, axes = [0])
        feed_dict = {x:x_matrix}
        np_mean,np_var = self.sess.run([tf_mean,tf_var],feed_dict=feed_dict)
        print(np_mean)
        self.sess.run(self.mean.assign(np_mean))
        self.sess.run(self.var.assign(np_var))
        print(self.mean.eval())
        
    
    def train_model(self,train_s_a,train_r,test_s_a,test_r,epoch=100):
        
        # Fit the in-graph normalizer on the training inputs
        self.update_normalization(train_s_a)
        
        batches = self._p_get_batches(train_s_a,train_r,self.batch_size)
        
        summary_writer = tf.summary.FileWriter('experiment', graph=self.sess.graph)
        feed_test={x:test_s_a,y:test_r}

        #Training
        for epoch_index in range(epoch):
            for step in range(len(batches)):
                feed_dict = {x: batches[step][0],y: batches[step][1]}
                self.sess.run([self.optimizer], feed_dict=feed_dict)
            # Train loss is measured on the last mini-batch of the epoch
            train_loss = self.sess.run([self.loss],feed_dict=feed_dict)
            test_loss = self.sess.run([self.loss],feed_dict=feed_test)
            print('Train loss in epoch {0}: {1}, Test loss: {2}'.format(epoch_index, train_loss, test_loss))    
            
    def _p_get_batches(self,x_matrix,r_matrix,batch_size):
        remaining_size = len(x_matrix)
        batch_index=0
        batches = []
        while(remaining_size>0):
            batch = []
            if remaining_size<batch_size:
                # Slice to the end so the final sample is not dropped
                batch.append(x_matrix[batch_index*batch_size:])
                batch.append(r_matrix[batch_index*batch_size:])
            else:
                batch.append(x_matrix[batch_index*batch_size:(batch_index+1)*batch_size]) 
                batch.append(r_matrix[batch_index*batch_size:(batch_index+1)*batch_size]) 
            batch_index+=1
            remaining_size-=batch_size
            batches.append(batch)
        return batches
    
    def _p_extract_weights(self):
        # a hashmap maps from layer name to weights and biases
        mp_layer_weights = {}

        #iteratively save values
        for dense in self.layers:
            values = {'weights':dense.w, 'biases':dense.b}
            mp_layer_weights[dense.scope] = values
        
        norms = {'mean':self.mean, 'var':self.var}
        mp_layer_weights['normalizations'] = norms

        return mp_layer_weights
    
    def save_weights(self,path):
        #extract weights from trained model
        layer_weights = self.sess.run(self._p_extract_weights())
        print('Whole layer weights: {0}'.format(layer_weights))
        np.save(path,layer_weights)
    
    def load_weights(self,path):
        # np.save stores the dict inside a 0-d object array; .item() unwraps it
        layer_weights = np.load(path, allow_pickle=True).item()
        for dense in self.layers:
            print('Scope:{0}'.format(dense.scope))
            values = layer_weights.get(dense.scope)
            weights = values.get('weights')
            biases = values.get('biases')
            dense.set_parameters(weights,biases)
        print('Done!')
        
    def save_variables_for_rnn(self,path,prefix="RNN/FullNetworkCell/Transition/"):
        variables = tf.trainable_variables()
        var_dict = {}
        for v in variables:
            if "/read" in v.name:
                name = prefix+re.sub("/read", "", v.name)
                name = re.sub(":0", "", name)
                var_dict[name] = v
            else:
                name = prefix+v.name
                name = re.sub(":0", "", name)
                var_dict[name] = v
        for k,v in var_dict.items():
            print(k)
            print(v)
        saver = tf.train.Saver(var_dict)
        saver.save(self.sess, PathFinder(path))     
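
The NumPy-based saving and loading mentioned in the feature list can be tried in isolation. The following is a minimal sketch under stated assumptions: the dictionary contents and the temporary file path are made up for illustration, and only the nested layout (layer name mapped to weights and biases, plus a `normalizations` entry) mirrors the class above.

```python
import os
import tempfile
import numpy as np

# Hypothetical weights with the same nested layout as _p_extract_weights
layer_weights = {
    'Layer0': {'weights': np.ones((4, 20)), 'biases': np.zeros(20)},
    'normalizations': {'mean': np.zeros(4), 'var': np.ones(4)},
}

path = os.path.join(tempfile.mkdtemp(), 'weights.npy')
np.save(path, layer_weights)  # the dict is pickled inside a 0-d object array

# allow_pickle is needed to read object arrays; .item() returns the dict
restored = np.load(path, allow_pickle=True).item()
print(sorted(restored.keys()))
print(restored['Layer0']['weights'].shape)
```

Note that `np.save` appends `.npy` to a path that lacks that extension, so the path passed to `np.load` should include it.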

In [13]:
#Instantiate a network
dnn_inst = DeepNet(x,y,2,20,tf.nn.sigmoid)

### Tensorflow graph visualization function

In [14]:
from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                # tensor_content is a bytes field, so encode the placeholder text
                tensor.tensor_content = ("<stripped %d bytes>"%size).encode("utf-8")
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:960px;height:600px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

In [15]:
show_graph(tf.get_default_graph().as_graph_def())

### Training

In [16]:
dnn_inst.train_model(x_train,y_train,x_valid,y_valid,100)

[ 0.3121216   0.31117877  1.83066809  1.82496965]
[ 0.3121216   0.31117877  1.83066809  1.82496965]
Train loss in epoch 0: [0.062466368], Test loss: [0.034349203]
Train loss in epoch 1: [0.033031475], Test loss: [0.018952383]
Train loss in epoch 2: [0.028611565], Test loss: [0.01649496]
Train loss in epoch 3: [0.02273812], Test loss: [0.013875462]
Train loss in epoch 4: [0.026982909], Test loss: [0.017640404]
Train loss in epoch 5: [0.023936484], Test loss: [0.016481556]
Train loss in epoch 6: [0.020464502], Test loss: [0.014591033]
Train loss in epoch 7: [0.021019015], Test loss: [0.015299686]
Train loss in epoch 8: [0.020014178], Test loss: [0.014765291]
Train loss in epoch 9: [0.019201435], Test loss: [0.014758096]
Train loss in epoch 10: [0.018675691], Test loss: [0.014668151]
Train loss in epoch 11: [0.018675022], Test loss: [0.014364643]
Train loss in epoch 12: [0.018756781], Test loss: [0.014489091]
Train loss in epoch 13: [0.01901284], Test loss: [0.014716228]
Train loss in epo

In [17]:
#Saving function checking..
dnn_inst.layers[0].w.name

'Layer0/Matrix/read:0'

In [18]:
tf.trainable_variables()

[<tensorflow.python.ops.variables.Variable at 0x7f4a8cc5f908>,
 <tensorflow.python.ops.variables.Variable at 0x7f4a8cc5fa90>,
 <tensorflow.python.ops.variables.Variable at 0x7f4a8cc1fe10>,
 <tensorflow.python.ops.variables.Variable at 0x7f4a8cc1fc50>,
 <tensorflow.python.ops.variables.Variable at 0x7f4a8cbf7860>,
 <tensorflow.python.ops.variables.Variable at 0x7f4a8cc2bf98>]

In [19]:
dnn_inst.save_variables_for_rnn("WEIGHTS_FOLDER/TRANSITION_NET.chkp")

RNN/FullNetworkCell/Transition/Layer0/Bias
Tensor("Layer0/Bias/read:0", shape=(20,), dtype=float32)
RNN/FullNetworkCell/Transition/Layer2/Bias
Tensor("Layer2/Bias/read:0", shape=(2,), dtype=float32)
RNN/FullNetworkCell/Transition/Layer0/Matrix
Tensor("Layer0/Matrix/read:0", shape=(4, 20), dtype=float32)
RNN/FullNetworkCell/Transition/Layer1/Matrix
Tensor("Layer1/Matrix/read:0", shape=(20, 20), dtype=float32)
RNN/FullNetworkCell/Transition/Layer1/Bias
Tensor("Layer1/Bias/read:0", shape=(20,), dtype=float32)
RNN/FullNetworkCell/Transition/Layer2/Matrix
Tensor("Layer2/Matrix/read:0", shape=(20, 2), dtype=float32)
/home/wuga/Documents/Notebook/VAE-PLANNING/WEIGHTS_FOLDER/TRANSITION_NET.chkp
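
The checkpoint keys printed above come from a simple renaming rule: strip any `/read` suffix and the trailing `:0`, then prepend a prefix. This is presumably so that another graph using those scoped names can restore the weights, although the notebook does not say so explicitly. The sketch below isolates that rule in plain Python; the helper name `checkpoint_name` and the sample input names are assumptions for illustration.

```python
import re

def checkpoint_name(variable_name, prefix="RNN/FullNetworkCell/Transition/"):
    """Map a graph variable name to the key used in the checkpoint."""
    name = re.sub("/read", "", variable_name)  # drop the read-op suffix if present
    name = re.sub(":0", "", name)              # drop the output index
    return prefix + name

names = ["Layer0/Matrix:0", "Layer0/Bias/read:0", "Layer2/Matrix:0"]
renamed = [checkpoint_name(n) for n in names]
for key in renamed:
    print(key)
```

With these inputs the first key is `RNN/FullNetworkCell/Transition/Layer0/Matrix`, matching the format of the keys listed in the output above.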
