API - Layers

To keep TensorLayer simple, we minimize the number of layer classes as much as we can, so we encourage you to use TensorFlow's own functions. For example, although we provide a layer for local response normalization, we still suggest applying tf.nn.lrn on network.outputs directly. More functions can be found in the TensorFlow API.

Understand Basic layer

All TensorLayer layers have a number of properties in common:

  • layer.outputs : a Tensor, the outputs of the current layer.
  • layer.all_params : a list of Tensor, all network variables in order.
  • layer.all_layers : a list of Tensor, all network outputs in order.
  • layer.all_drop : a dictionary of {placeholder : float}, the keeping probabilities of all noise layers.

All TensorLayer layers have a number of methods in common:

  • layer.print_params() : print the network variable information in order (run after tl.layers.initialize_global_variables(sess)). Alternatively, print all variables with tl.layers.print_all_variables().
  • layer.print_layers() : print the network layer information in order.
  • layer.count_params() : print the number of parameters in the network.

A network starts from an input layer, onto which further layers are stacked; the resulting network is itself a Layer instance. The most important properties of a network are network.all_params, network.all_layers and network.all_drop. all_params is a list that stores the pointers to all network parameters in order. For the 3-layer network defined by the script below, it is:

all_params = [W1, b1, W2, b2, W_out, b_out]

To get specific variables, you can use network.all_params[2:3] or get_variables_with_name(). Similarly, all_layers is a list that stores the pointers to the outputs of all layers in order; for the same network it is:

all_layers = [drop(?,784), relu(?,800), drop(?,800), relu(?,800), drop(?,800), identity(?,10)]

where ? stands for any batch size. You can print the layer and parameter information with network.print_layers() and network.print_params(), and count the number of parameters with network.count_params().

import numpy as np
import tensorflow as tf
import tensorlayer as tl

sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')

network = tl.layers.InputLayer(x, name='input_layer')
network = tl.layers.DropoutLayer(network, keep=0.8, name='drop1')
network = tl.layers.DenseLayer(network, n_units=800,
                                act = tf.nn.relu, name='relu1')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop2')
network = tl.layers.DenseLayer(network, n_units=800,
                                act = tf.nn.relu, name='relu2')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop3')
network = tl.layers.DenseLayer(network, n_units=10,
                                act = tl.activation.identity,
                                name='output_layer')

y = network.outputs
y_op = tf.argmax(tf.nn.softmax(y), 1)

cost = tl.cost.cross_entropy(y, y_)

train_params = network.all_params

learning_rate = 0.0001    # illustrative value; set it to suit your task
train_op = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999,
                            epsilon=1e-08, use_locking=False).minimize(cost, var_list=train_params)

tl.layers.initialize_global_variables(sess)

network.print_params()
network.print_layers()

In addition, network.all_drop is a dictionary that stores the keeping probabilities of all noise layers. In the above network, these are the keeping probabilities of the dropout layers.

For training, enable all dropout layers as follows.

feed_dict = {x: X_train_a, y_: y_train_a}
feed_dict.update( network.all_drop )    # enable noise (dropout) layers
loss, _ = sess.run([cost, train_op], feed_dict=feed_dict)

For evaluation and testing, disable all dropout layers as follows.

dp_dict = tl.utils.dict_to_one( network.all_drop )    # disable noise (dropout) layers
feed_dict = {x: X_val, y_: y_val}
feed_dict.update(dp_dict)
print("   val loss: %f" % sess.run(cost, feed_dict=feed_dict))
print("   val acc: %f" % np.mean(y_val ==
                        sess.run(y_op, feed_dict=feed_dict)))

For more details, please read the MNIST examples on GitHub.

Understand Dense layer

Before creating your own TensorLayer layer, let's have a look at the Dense layer. It creates a weight matrix and a bias vector if they do not yet exist, then implements the output expression. Finally, as a layer with parameters, it also needs to append its parameters to all_params.

class MyDenseLayer(Layer):
  def __init__(
      self,
      layer = None,
      n_units = 100,
      act = tf.nn.relu,
      name ='simple_dense',
  ):
      # check layer name (fixed)
      Layer.__init__(self, name=name)

      # the input of this layer is the output of previous layer (fixed)
      self.inputs = layer.outputs

      # print out info (customized)
      print("  MyDenseLayer %s: %d, %s" % (self.name, n_units, act))

      # operation (customized)
      n_in = int(self.inputs.get_shape()[-1])  # number of input units
      with tf.variable_scope(name) as vs:
          # create new parameters
          W = tf.get_variable(name='W', shape=(n_in, n_units))
          b = tf.get_variable(name='b', shape=(n_units))
          # tensor operation
          self.outputs = act(tf.matmul(self.inputs, W) + b)

      # get stuff from previous layer (fixed)
      self.all_layers = list(layer.all_layers)
      self.all_params = list(layer.all_params)
      self.all_drop = dict(layer.all_drop)

      # update layer (customized)
      self.all_layers.extend( [self.outputs] )
      self.all_params.extend( [W, b] )
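
The custom class can then be stacked like any built-in layer. Below is a minimal usage sketch, assuming the MNIST-style placeholder from the earlier example; the layer name 'my_dense1' is illustrative.

import tensorflow as tf
import tensorlayer as tl

x = tf.placeholder(tf.float32, shape=[None, 784], name='x')

network = tl.layers.InputLayer(x, name='input_layer')
network = MyDenseLayer(network, n_units=100, act=tf.nn.relu, name='my_dense1')

print(network.all_layers)   # now ends with the output tensor of my_dense1
print(network.all_params)   # now also contains the W and b of my_dense1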

Your layer

A simple layer

To implement a custom layer in TensorLayer, you have to write a Python class that subclasses Layer and implements the output expression.

The following is an example implementation of a layer that multiplies its input by 2:

class DoubleLayer(Layer):
    def __init__(
        self,
        layer = None,
        name ='double_layer',
    ):
        # check layer name (fixed)
        Layer.__init__(self, name=name)

        # the input of this layer is the output of previous layer (fixed)
        self.inputs = layer.outputs

        # operation (customized)
        self.outputs = self.inputs * 2

        # get stuff from previous layer (fixed)
        self.all_layers = list(layer.all_layers)
        self.all_params = list(layer.all_params)
        self.all_drop = dict(layer.all_drop)

        # update layer (customized)
        self.all_layers.extend( [self.outputs] )
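
Since this layer has no parameters, stacking it only extends all_layers. A quick usage sketch:

x = tf.placeholder(tf.float32, shape=[None, 784], name='x')

network = tl.layers.InputLayer(x, name='input_layer')
network = DoubleLayer(network, name='double1')
# network.outputs is now x * 2; network.all_params is unchanged
# because the layer adds no variables.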

Modifying Pre-train Behaviour

Greedy layer-wise pretraining is an important tool for initializing deep neural networks, and there are many kinds of pre-training methods for different network architectures and applications.

For example, pre-training a Vanilla Sparse Autoencoder can be implemented with a KL divergence penalty (for sigmoid activations) as in the code below, whereas for a Deep Rectifier Network, sparsity can be enforced with an L1 regularization of the activation output (see the sketch after the snippet).

# Vanilla Sparse Autoencoder: KL divergence sparsity penalty
beta = 4
rho = 0.15
p_hat = tf.reduce_mean(activation_out, axis=0)    # mean activation of each hidden unit
KLD = beta * tf.reduce_sum( rho * tf.log(tf.divide(rho, p_hat))
        + (1 - rho) * tf.log((1 - rho) / tf.subtract(1.0, p_hat)) )
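
For the rectifier case, the sparsity term can instead be an L1 penalty on the activation output. A minimal sketch, where the weighting factor lambda_l1 is an illustrative choice rather than a value from the original:

# Deep Rectifier Network: L1 penalty on the (non-negative) activations,
# summed per sample and averaged over the batch.
lambda_l1 = 0.001    # illustrative sparsity weight
L1_a = lambda_l1 * tf.reduce_mean(
        tf.reduce_sum(tf.abs(activation_out), axis=1))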

There are many pre-training methods, so TensorLayer provides a simple way to modify or design your own. For autoencoders, TensorLayer uses ReconLayer.__init__() to define the reconstruction layer and the cost function; to define your own cost function, simply modify self.cost in ReconLayer.__init__(). To create your own cost expression, please read the TensorFlow Math documentation. By default, ReconLayer only updates the weights and biases of the previous layer by using self.train_params = self.all_params[-4:], where the 4 parameters are [W_encoder, b_encoder, W_decoder, b_decoder]: W_encoder and b_encoder belong to the previous DenseLayer, while W_decoder and b_decoder belong to this ReconLayer. If you want to update the parameters of the previous 2 layers at the same time, simply change [-4:] to [-6:].

ReconLayer.__init__(...):
    ...
    self.train_params = self.all_params[-4:]
    ...
    self.cost = mse + L1_a + L2_w

Layer list

tensorlayer.layers

get_variables_with_name get_layers_with_name set_name_reuse print_all_variables initialize_global_variables

Layer

InputLayer OneHotInputLayer Word2vecEmbeddingInputlayer EmbeddingInputlayer

DenseLayer ReconLayer DropoutLayer GaussianNoiseLayer DropconnectDenseLayer

Conv1dLayer Conv2dLayer DeConv2dLayer Conv3dLayer DeConv3dLayer PoolLayer PadLayer UpSampling2dLayer DownSampling2dLayer AtrousConv1dLayer AtrousConv2dLayer SeparableConv2dLayer

Conv1d Conv2d DeConv2d

MaxPool1d MeanPool1d MaxPool2d MeanPool2d MaxPool3d MeanPool3d

SubpixelConv2d

SpatialTransformer2dAffineLayer transformer batch_transformer

BatchNormLayer LocalResponseNormLayer

TimeDistributedLayer

RNNLayer BiRNNLayer advanced_indexing_op retrieve_seq_length_op retrieve_seq_length_op2 DynamicRNNLayer BiDynamicRNNLayer

Seq2Seq PeekySeq2Seq AttentionSeq2Seq

FlattenLayer ReshapeLayer LambdaLayer

ConcatLayer ElementwiseLayer

ExpandDimsLayer TileLayer

EstimatorLayer SlimNetsLayer KerasLayer

PReluLayer

MultiplexerLayer

EmbeddingAttentionSeq2seqWrapper

flatten_reshape clear_layers_name initialize_rnn_state list_remove_repeat

Name Scope and Sharing Parameters

These functions help you reuse parameters across different inference graphs and get a list of parameters by a given name. For background on parameter sharing, see the TensorFlow documentation on sharing variables.
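
For example, a hedged sketch of fine-tuning only the variables of the dense layer named relu1 from the network above; the argument names follow get_variables_with_name as listed below, and the learning rate is illustrative.

# Collect the trainable variables whose names contain 'relu1',
# i.e. the W and b of the first dense layer.
relu1_vars = tl.layers.get_variables_with_name('relu1', train_only=True, printable=True)

# Fine-tune only those variables.
train_op = tf.train.AdamOptimizer(0.0001).minimize(cost, var_list=relu1_vars)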

Get variables with name

get_variables_with_name

Get layers with name

get_layers_with_name

Enable layer name reuse

set_name_reuse

Print variables

print_all_variables

Initialize variables

initialize_global_variables

Basic layer

Layer

Input layer

InputLayer

One-hot layer

OneHotInputLayer

Word Embedding Input layer

Word2vec layer for training

Word2vecEmbeddingInputlayer

Embedding Input layer

EmbeddingInputlayer

Dense layer

Dense layer

DenseLayer

Reconstruction layer for Autoencoder

ReconLayer

Noise layer

Dropout layer

DropoutLayer

Gaussian noise layer

GaussianNoiseLayer

Dropconnect + Dense layer

DropconnectDenseLayer

Convolutional layer (Pro)

1D Convolutional layer

Conv1dLayer

2D Convolutional layer

Conv2dLayer

2D Deconvolutional layer

DeConv2dLayer

3D Convolutional layer

Conv3dLayer

3D Deconvolutional layer

DeConv3dLayer

2D UpSampling layer

UpSampling2dLayer

2D DownSampling layer

DownSampling2dLayer

1D Atrous convolutional layer

AtrousConv1dLayer

2D Atrous convolutional layer

AtrousConv2dLayer

2D Separable convolutional layer

SeparableConv2dLayer

Convolutional layer (Simplified)

For users who are not yet familiar with TensorFlow, the following simplified functions may be easier to use. We will provide more simplified functions later, but if you are comfortable with TensorFlow, the professional APIs may suit you better.
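
As a hedged sketch of the difference, the two snippets below are intended to build the same 2D convolution, first with the professional API and then with the simplified one; the argument names follow the classes listed in this section, and the exact defaults should be treated as assumptions.

# Professional API: the full filter shape
# [filter_height, filter_width, in_channels, out_channels]
# and the 4-D strides are spelled out explicitly.
network = tl.layers.Conv2dLayer(network,
                act=tf.nn.relu,
                shape=[5, 5, 1, 32],
                strides=[1, 1, 1, 1],
                padding='SAME',
                name='cnn1')

# Simplified API: the number of input channels is inferred
# from the previous layer.
network = tl.layers.Conv2d(network,
                n_filter=32,
                filter_size=(5, 5),
                strides=(1, 1),
                act=tf.nn.relu,
                padding='SAME',
                name='cnn2')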

1D Convolutional layer

Conv1d

2D Convolutional layer

Conv2d

2D Deconvolutional layer

DeConv2d

1D Max pooling layer

MaxPool1d

1D Mean pooling layer

MeanPool1d

2D Max pooling layer

MaxPool2d

2D Mean pooling layer

MeanPool2d

3D Max pooling layer

MaxPool3d

3D Mean pooling layer

MeanPool3d

Super-resolution layer

SubpixelConv2d

Spatial Transformer

2D Affine Transformation layer

SpatialTransformer2dAffineLayer

2D Affine Transformation function

transformer

Batch 2D Affine Transformation function

batch_transformer

Pooling layer

Pooling layer for any dimension and any pooling function; see the sketch below.

PoolLayer
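
A hedged sketch of selecting the pooling function through the pool argument; the ksize and strides values are illustrative.

# 2D max pooling over 2x2 windows; passing pool=tf.nn.avg_pool
# instead would turn the same layer into mean pooling.
network = tl.layers.PoolLayer(network,
                ksize=[1, 2, 2, 1],
                strides=[1, 2, 2, 1],
                padding='SAME',
                pool=tf.nn.max_pool,
                name='pool1')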

Padding layer

Padding layer for any padding mode.

PadLayer

Normalization layer

Since local response normalization has no weights to learn, you can also apply tf.nn.lrn directly on network.outputs; a hedged sketch follows.
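
One way to keep the operation inside the layer stack is a LambdaLayer; the fn_args values below are illustrative.

# Wrap tf.nn.lrn in a LambdaLayer so the normalization is still
# recorded in network.all_layers.
network = tl.layers.LambdaLayer(network,
                fn=tf.nn.lrn,
                fn_args={'depth_radius': 5, 'bias': 1.0,
                         'alpha': 0.0001, 'beta': 0.75},
                name='lrn1')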

Batch Normalization

BatchNormLayer

Local Response Normalization

LocalResponseNormLayer

Time distributed layer

TimeDistributedLayer

Fixed Length Recurrent layer

All recurrent layers can work with any type of RNN cell by passing a different cell function (LSTM, GRU, etc.); a hedged sketch follows.
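
For example, swapping the cell type only means changing cell_fn. The cell classes below live in tf.contrib.rnn for TensorFlow 1.x, and the n_hidden / n_steps values are illustrative.

# LSTM recurrent layer
network = tl.layers.RNNLayer(network,
                cell_fn=tf.contrib.rnn.BasicLSTMCell,
                n_hidden=200,
                n_steps=20,
                return_last=False,
                name='lstm1')

# GRU version: only the cell function changes.
network = tl.layers.RNNLayer(network,
                cell_fn=tf.contrib.rnn.GRUCell,
                n_hidden=200,
                n_steps=20,
                return_last=True,
                name='gru1')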

RNN layer

RNNLayer

Bidirectional layer

BiRNNLayer

Advanced Ops for Dynamic RNN

These operations are usually used inside the dynamic RNN layers: they compute the sequence lengths for different input formats and retrieve the last RNN output by indexing. A hedged usage sketch follows.
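
The typical pattern computes the lengths of zero-padded sequences and passes them to a DynamicRNNLayer. In the sketch below the vocabulary and embedding sizes are illustrative, and the parameter names follow the classes and functions listed in this section.

# Zero-padded integer id sequences of shape [batch_size, max_length].
input_seqs = tf.placeholder(tf.int64, shape=[None, None], name='input_seqs')

network = tl.layers.EmbeddingInputlayer(inputs=input_seqs,
                vocabulary_size=10000,
                embedding_size=200,
                name='embedding')
network = tl.layers.DynamicRNNLayer(network,
                cell_fn=tf.contrib.rnn.BasicLSTMCell,
                n_hidden=200,
                # compute the true length of each zero-padded sequence
                sequence_length=tl.layers.retrieve_seq_length_op2(input_seqs),
                return_last=True,    # index out the last valid output of each sequence
                name='dynamic_rnn')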

Output indexing

advanced_indexing_op

Compute Sequence length 1

retrieve_seq_length_op

Compute Sequence length 2

retrieve_seq_length_op2

Dynamic RNN layer

RNN layer

DynamicRNNLayer

Bidirectional layer

BiDynamicRNNLayer

Sequence to Sequence

Simple Seq2Seq

Seq2Seq

PeekySeq2Seq

PeekySeq2Seq

AttentionSeq2Seq

AttentionSeq2Seq

Shape layer

Flatten layer

FlattenLayer

Reshape layer

ReshapeLayer

Lambda layer

LambdaLayer

Merge layer

Concat layer

ConcatLayer

Element-wise layer

ElementwiseLayer

Extend layer

Expand dims layer

ExpandDimsLayer

Tile layer

TileLayer

Estimator layer

EstimatorLayer

Connect TF-Slim

Yes! TF-Slim models can be connected into TensorLayer, so all of Google's pre-trained models can be used easily; see Slim-model.

SlimNetsLayer

Connect Keras

Yes! Keras models can be connected into TensorLayer as well; see tutorial_keras.py.

KerasLayer

Parametric activation layer

PReluLayer

Flow control layer

MultiplexerLayer

Wrapper

Embedding + Attention + Seq2seq

EmbeddingAttentionSeq2seqWrapper

Helper functions

Flatten tensor

flatten_reshape

Permanently clear existing layer names

clear_layers_name

Initialize RNN state

initialize_rnn_state

Remove repeated items in a list

list_remove_repeat