How to use DynamicRNNLayer ? #18

Closed
cobnut opened this issue Nov 9, 2016 · 15 comments

Comments

@cobnut

cobnut commented Nov 9, 2016

Thanks for the fix, but...

  1. Your new code is
    "max_length = tf.shape(self.outputs)[1]
    self.outputs = tf.reshape(tf.concat(1, outputs), [-1, max_length, n_hidden])"
    but it is still wrong: self.outputs does not exist yet when "max_length = tf.shape(self.outputs)[1]" runs. Maybe it should be
    "max_length = tf.shape(outputs)[1]"

  2. I can't find a full DynamicRNNLayer example, only this one:

    input_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="input_seqs")
    network = tl.layers.EmbeddingInputlayer(
                 inputs = input_seqs,
                 vocabulary_size = vocab_size,
                 embedding_size = embedding_size,
                 name = 'seq_embedding')
    network = tl.layers.DynamicRNNLayer(network,
                 cell_fn = tf.nn.rnn_cell.BasicLSTMCell,
                 n_hidden = embedding_size,
                 dropout = 0.7,
                 sequence_length = tl.layers.retrieve_seq_length_op2(input_seqs),
                 return_seq_2d = True,     # stack denselayer or compute cost after it
                 name = 'dynamic_rnn',)
    network = tl.layers.DenseLayer(network, n_units=vocab_size,
                 act=tf.identity, name="output")
    

    I don't understand "shape=[batch_size, None]". Do I literally write None, or n_step (the max length)?
    Suppose my data looks like this (sentiment analysis):

    x = [[2, 1, 8, 9, 2],
         [2, 4],
         [1, 1, 3, 5],
         [6, 3, 2]]
    y = [1, 0, 1, 1]

    I pad it with zeros:

    x = [[2, 1, 8, 9, 2],
         [2, 4, 0, 0, 0],
         [1, 1, 3, 5, 0],
         [6, 3, 2, 0, 0]]

    How should I use DynamicRNNLayer with this data?

@wagamamaz
Collaborator

@narrator-wong Does max_length = tf.shape(outputs)[1] work for you?

shape=[batch_size, None] means each batch holds batch_size sentences, and the padded length (max_length) can be any number, so it may differ from batch to batch.
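To make the None dimension concrete, here is a minimal NumPy sketch (not TensorLayer code; pad_batch is a hypothetical helper) that pads each batch only to its own longest sentence. Both resulting arrays would fit a placeholder declared as shape=[4, None].

```python
import numpy as np

def pad_batch(seqs, pad_id=0):
    """Zero-pad a list of integer sequences to the longest one in this batch."""
    max_len = max(len(s) for s in seqs)
    batch = np.full((len(seqs), max_len), pad_id, dtype=np.int64)
    for i, s in enumerate(seqs):
        batch[i, :len(s)] = s
    return batch

# Two batches of 4 sentences each, with different maximum lengths.
batch_a = pad_batch([[2, 1, 8, 9, 2], [2, 4], [1, 1, 3, 5], [6, 3, 2]])
batch_b = pad_batch([[7, 7], [3], [4, 4], [9]])

print(batch_a.shape)  # (4, 5)
print(batch_b.shape)  # (4, 2)
```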

If your number of steps is fixed, RNNLayer is easier than DynamicRNNLayer, so in your case you can define your placeholder as shape=[4, 5].

For dynamic RNN in your case, since your batch_size is 4, you can define your placeholder as follows.

input_seqs = tf.placeholder(dtype=tf.int64, shape=[4, None], name="input_seqs")
network = tl.layers.EmbeddingInputlayer(
             inputs = input_seqs,
             vocabulary_size = vocab_size,
             embedding_size = embedding_size,
             name = 'seq_embedding')
network = tl.layers.DynamicRNNLayer(network,
             cell_fn = tf.nn.rnn_cell.BasicLSTMCell,
             n_hidden = embedding_size,
             dropout = 0.7,
             sequence_length = tl.layers.retrieve_seq_length_op2(input_seqs),
             return_seq_2d = True,     # stack denselayer or compute cost after it
             name = 'dynamic_rnn',)
network = tl.layers.DenseLayer(network, n_units=vocab_size,
             act=tf.identity, name="output")

tl.layers.retrieve_seq_length_op2(input_seqs) computes the length of every sentence from the zero-padded input. In your case the lengths are [5, 2, 4, 3], so your padding is correct.
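As a rough illustration of the idea (a NumPy sketch, not the TensorLayer implementation; seq_lengths is a hypothetical helper), the lengths can be recovered by counting the non-padding entries in each row. This assumes 0 is reserved for padding and never used as a real word ID.

```python
import numpy as np

def seq_lengths(padded, pad_id=0):
    """Count the non-padding tokens in each row of a padded batch."""
    return np.sum(padded != pad_id, axis=1)

x = np.array([[2, 1, 8, 9, 2],
              [2, 4, 0, 0, 0],
              [1, 1, 3, 5, 0],
              [6, 3, 2, 0, 0]])

lengths = seq_lengths(x)
print(lengths)  # [5 2 4 3]
```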

But in your case, it seems that every sentence has a single label (you don't have a target sequence), so I think you should use the last output for classification, i.e. return_last=True. This blog may help you better understand whether to use the last output or all the outputs.
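"Last output" here means the output at the last real (non-padded) step of each sentence, not the output at the final padded step. A minimal NumPy sketch of that selection, using random numbers as a stand-in for the RNN outputs (the array names are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, max_len, n_hidden = 4, 5, 3

# Stand-in for RNN outputs of shape [batch_size, max_len, n_hidden].
outputs = rng.normal(size=(batch_size, max_len, n_hidden))
lengths = np.array([5, 2, 4, 3])  # true sentence lengths

# For row i, pick the output at time step lengths[i] - 1.
last = outputs[np.arange(batch_size), lengths - 1]
print(last.shape)  # (4, 3)
```

The resulting [batch_size, n_hidden] array is what a classifier layer would then consume, one vector per sentence.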

If you use all the outputs (you have an input sequence and a target sequence), you will need a mask to define the cost function. Here is an example from Google im2txt.

def batch_with_dynamic_pad(images_and_captions,
                           batch_size,
                           queue_capacity,
                           add_summaries=True):
  """Batches input images and captions, returns the images, input sequence and
  output sequence.

  This function splits the caption into an input sequence and a target sequence,
  where the target sequence is the input sequence right-shifted by 1. Input and
  target sequences are batched and padded up to the maximum length of sequences
  in the batch. A mask is created to distinguish real words from padding words.

  Example 1
  -----------

    Actual captions in the batch ('-' denotes padded character):
    |      [
    |        [ 1 2 3 4 5 ],
    |        [ 1 2 3 4 - ],
    |        [ 1 2 3 - - ],
    |      ]
    |
    |    input_seqs:
    |      [
    |        [ 1 2 3 4 ],
    |        [ 1 2 3 - ],
    |        [ 1 2 - - ],
    |      ]
    |
    |    target_seqs:
    |      [
    |        [ 2 3 4 5 ],
    |        [ 2 3 4 - ],
    |        [ 2 3 - - ],
    |      ]
    |
    |    mask:
    |      [
    |        [ 1 1 1 1 ],
    |        [ 1 1 1 0 ],
    |        [ 1 1 0 0 ],
    |      ]

  Example 2
  -----------
  - input_seqs - <S> a figurine with a plastic witches head is standing in front of a computer keyboard . a
  - target_seqs - a figurine with a plastic witches head is standing in front of a computer keyboard . </S> a
  - input_mask - [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0]

  Parameters
  -----------
    images_and_captions : A list of pairs [image, caption], where image is a
      Tensor of shape [height, width, channels] and caption is a 1-D Tensor of
      any length. Each pair will be processed and added to the queue in a
      separate thread.
    batch_size : Batch size.
    queue_capacity : Queue capacity.
    add_summaries : If true, add caption length summaries.

  Returns
  --------
    images : A Tensor of shape [batch_size, height, width, channels].
    input_seqs : An int32 Tensor of shape [batch_size, padded_length].
    target_seqs : An int32 Tensor of shape [batch_size, padded_length].
    mask : An int32 0/1 Tensor of shape [batch_size, padded_length].
  """
  enqueue_list = []
  for image, caption in images_and_captions:
    caption_length = tf.shape(caption)[0]
    input_length = tf.expand_dims(tf.sub(caption_length, 1), 0)

    input_seq = tf.slice(caption, [0], input_length)
    target_seq = tf.slice(caption, [1], input_length)
    indicator = tf.ones(input_length, dtype=tf.int32)
    enqueue_list.append([image, input_seq, target_seq, indicator])

  images, input_seqs, target_seqs, mask = tf.train.batch_join(
      enqueue_list,
      batch_size=batch_size,
      capacity=queue_capacity,
      dynamic_pad=True,
      name="batch_and_pad")

  if add_summaries:
    lengths = tf.add(tf.reduce_sum(mask, 1), 1)
    tf.scalar_summary("caption_length/batch_min", tf.reduce_min(lengths))
    tf.scalar_summary("caption_length/batch_max", tf.reduce_max(lengths))
    tf.scalar_summary("caption_length/batch_mean", tf.reduce_mean(lengths))

  return images, input_seqs, target_seqs, mask

queue_capacity = (2 * num_preprocess_threads * batch_size)
images, input_seqs, target_seqs, input_mask = (
              batch_with_dynamic_pad(images_and_captions,
                                               batch_size=batch_size,
                                               queue_capacity=queue_capacity))

batch_loss, losses, weights, _ = tl.cost.cross_entropy_seq_with_mask(logits, target_seqs, input_mask, return_details=True)

If you find the code from Google difficult, you can define the mask yourself and use tl.cost.cross_entropy_seq_with_mask() to compute the cost.
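For readers who want to build the mask by hand, here is a plain NumPy sketch of the same idea as Example 1 in the docstring above. make_batch and masked_cross_entropy are hypothetical helpers written for illustration; they are not TensorLayer functions, and the averaging (sum of masked losses divided by the number of real tokens) is an assumption about what a masked sequence loss should do.

```python
import numpy as np

def make_batch(captions, pad_id=0):
    """Split captions into input/target sequences and build a 0/1 mask."""
    max_len = max(len(c) for c in captions) - 1
    n = len(captions)
    inputs = np.full((n, max_len), pad_id, dtype=np.int64)
    targets = np.full((n, max_len), pad_id, dtype=np.int64)
    mask = np.zeros((n, max_len), dtype=np.int64)
    for i, c in enumerate(captions):
        k = len(c) - 1
        inputs[i, :k] = c[:-1]   # every word except the last
        targets[i, :k] = c[1:]   # every word except the first
        mask[i, :k] = 1          # 1 marks a real word, 0 marks padding
    return inputs, targets, mask

def masked_cross_entropy(logits, targets, mask):
    """Mean cross-entropy over the real (mask == 1) positions only."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    rows, cols = np.indices(targets.shape)
    token_loss = -log_probs[rows, cols, targets]
    return (token_loss * mask).sum() / mask.sum()

captions = [[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3]]
inputs, targets, mask = make_batch(captions)
print(mask)

# Uniform logits over a 6-word vocabulary give a loss of log(6) per real token.
vocab_size = 6
logits = np.zeros(mask.shape + (vocab_size,))
loss = masked_cross_entropy(logits, targets, mask)
print(loss)
```

Padded positions contribute nothing to the loss, so the amount of padding in a batch does not change the result.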

More details about dynamic_rnn ops, rnn vs dynamic_rnn

Feel free to let us know if you run into any problems.

@cobnut
Author

cobnut commented Nov 10, 2016

@wagamamaz max_length = tf.shape(outputs)[1] is not my code; I just found the problem and reported it to @zsdonghao. I saw that the new code is right, and I appreciate your speed! I want to write an example for DynamicRNN using tl.layers.DynamicRNNLayer. Thanks for your answer.

@zsdonghao
Member

@narrator-wong Hi, I have an Image Captioning example for TensorLayer, hope it helps: https://github.com/zsdonghao/Image-Captioning

@cobnut
Author

cobnut commented Nov 10, 2016

@zsdonghao How do I write a stacked DynamicRNN?

  1. DynamicRNNLayer : n_layer = 2 ?
            network = tl.layers.EmbeddingInputlayer(
                        inputs = inputs,
                        vocabulary_size = vocab_size,
                        embedding_size = hidden_size,
                        E_init = tf.random_uniform_initializer(-init_scale, init_scale),
                        name ='embedding_layer')

            if is_training:
                network = tl.layers.DropoutLayer(network, keep=keep_prob, name='drop1')

            network = tl.layers.DynamicRNNLayer(network,
                        cell_fn=tf.nn.rnn_cell.BasicLSTMCell,
                        cell_init_args={'forget_bias': 0.0, 'state_is_tuple': True},
                        n_hidden=hidden_size,
                        initializer=tf.random_uniform_initializer(-init_scale, init_scale),
                        sequence_length = tl.layers.retrieve_seq_length_op2(inputs),
                        return_last=False,
                        name='dynamic_lstm_layer1')
            lstm1 = network

            if is_training:
                network = tl.layers.DropoutLayer(network, keep=keep_prob, name='drop2')

            network = tl.layers.DynamicRNNLayer(network,
                        cell_fn=tf.nn.rnn_cell.BasicLSTMCell,
                        cell_init_args={'forget_bias': 0.0, 'state_is_tuple': True},
                        n_hidden=hidden_size,
                        initializer=tf.random_uniform_initializer(-init_scale, init_scale),
                        sequence_length = tl.layers.retrieve_seq_length_op2(inputs),
                        return_last=True,
                        name='dynamic_lstm_layer2')
            lstm2 = network

            if is_training:
                network = tl.layers.DropoutLayer(network, keep=keep_prob, name='drop3')

            network = tl.layers.DenseLayer(network,
                        n_units=2,
                        W_init=tf.random_uniform_initializer(-init_scale, init_scale),
                        b_init=tf.random_uniform_initializer(-init_scale, init_scale),
                        act = tf.identity,
                        name='output_layer')

@zsdonghao
Member

@narrator-wong You don't need to stack DynamicRNNLayer like that. You can use n_layer to set the number of RNN layers you want, and use dropout to set the input and output keep probabilities of all RNN layers.

Dropout is implemented with tf.nn.rnn_cell.DropoutWrapper, and stacking is implemented with tf.nn.rnn_cell.MultiRNNCell.
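To show conceptually what "n_layer stacked layers with input and output dropout" computes, here is a small NumPy sketch. It uses a plain tanh RNN cell instead of an LSTM, ignores sequence_length, and all function names are made up for illustration; it is not how TensorLayer or TensorFlow implement it.

```python
import numpy as np

def rnn_layer(x, W_x, W_h, b):
    """Run a plain tanh RNN over x of shape [batch, time, features]."""
    batch, steps, _ = x.shape
    h = np.zeros((batch, W_h.shape[0]))
    outputs = []
    for t in range(steps):
        h = np.tanh(x[:, t] @ W_x + h @ W_h + b)
        outputs.append(h)
    return np.stack(outputs, axis=1)

def dropout(x, keep, rng):
    """Inverted dropout: zero units with probability 1 - keep, rescale the rest."""
    return x * (rng.random(x.shape) < keep) / keep

def stacked_rnn(x, n_layer, n_hidden, keep, rng):
    """Feed each layer's output sequence into the next layer."""
    for _ in range(n_layer):
        x = dropout(x, keep, rng)  # dropout on the layer input
        W_x = rng.normal(scale=0.1, size=(x.shape[-1], n_hidden))
        W_h = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        x = rnn_layer(x, W_x, W_h, np.zeros(n_hidden))
        x = dropout(x, keep, rng)  # dropout on the layer output
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5, 8))  # [batch, time, embedding]
out = stacked_rnn(x, n_layer=2, n_hidden=16, keep=0.7, rng=rng)
print(out.shape)  # (4, 5, 16)
```

The point of the single-layer-object approach is that the whole stack shares one sequence_length and one set of dropout settings, rather than repeating them for every layer.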

@cobnut
Author

cobnut commented Nov 10, 2016

@zsdonghao Thanks, i think i got it...

@jcqu

jcqu commented Dec 5, 2016

Hello, I have the same problem as you, but I use RNNLayer:

def define_layers(self, is_training=False):
    print '\n building model:'
    network = tl.layers.EmbeddingInputlayer(
        inputs=self.x,
        vocabulary_size=self.embs.n_words,
        embedding_size=self.embs.d_vector,
        E_init=self.embs.embs,
        name='embedding_layer',
        istrain=False)
    network = tl.layers.RNNLayer(network,
        cell_fn=tf.nn.rnn_cell.BasicLSTMCell,
        cell_init_args={'forget_bias': 0.0, 'state_is_tuple': True},
        n_hidden=self.embs.d_vector,
        initializer=tf.random_uniform_initializer(-0.001, 0.001),
        n_steps=5,
        return_last=True,
        name='basic_lstm')
    # define NN network
    for i in range(len(self.nodes)-1):
        network = tl.layers.DenseLayer(network, n_units=self.nodes[i], act=tf.nn.relu,
            name='hidden' + str(i))
    network = tl.layers.DenseLayer(network, n_units=self.nodes[-1], act=tf.identity, name='output')
    return network

def fit(self, x_data, y_data, print_model=True):
    network = self.network
    x, y = self.x, self.y
    pred = network.outputs
    cost = tf.reduce_mean(tf.pow(pred - y, 2))
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # define optimizer
    train_params = network.all_params
    train_op = tf.train.AdamOptimizer(learning_rate=self.lr, epsilon=1e-08, use_locking=False).minimize(cost, var_list=train_params)
    # initialize_all_variables
    self.sess.run(tf.initialize_all_variables())
    # model training
    tl.utils.fit(self.sess, network, train_op, cost, x_data, y_data, x, y, acc=acc,
        batch_size=self.batch, n_epoch=self.training_epochs, print_freq=self.display_step)
    tl.utils.test(self.sess, network, acc=acc, cost=cost, X_test=x_data, y_test=y_data, x=x, y_=y, batch_size=self.batch)

My input is also zero-padded, like this:
x = [[5, 4, 8, 9, 2],
     [0, 0, 0, 2, 4],
     [0, 0, 3, 5, 7],
     [0, 0, 2, 3, 3]]
y = [[1],[0],[0],[1]]
But when I try to run this session, I always get the following error:

File "main/lstm_units.py", line 90, in fit
tl.utils.test(self.sess, network, acc=acc,cost=cost, X_test=x_data, y_test=y_data, x=x, y_=y, batch_size=self.batch)
File "main/tensorln/utils.py", line 149, in test
err, ac = sess.run([cost, acc], feed_dict=feed_dict)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 710, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 908, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 958, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 978, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: <exception str() failed>

So I want to know where the problem is. Thank you very much.

@zsdonghao
Member

@qjc937044867 Why do you have a self in the function?

@jcqu

jcqu commented Dec 5, 2016

@zsdonghao Because I packaged the network and fit/predict into a class, so I can use them in the 'sklearn' way, as follows:

class OPR_LSTM():
    def __init__(self, d_hidden, nodes, embs, active_function='relu', loss='mean_squared_error', lr=.0001, training_epochs=100, display_step=10, batch=10, regular='L2', optimizer='sgd', keep_prob = 0.9):
        self.d_hidden, self.nodes = d_hidden, nodes
        self.embs, self.active_function, self.lr, self.keep_prob = embs, active_function, lr, keep_prob
        self.training_epochs,self.batch,self.display_step = training_epochs,batch,display_step
        self.regular = regular
        self.sess = tf.InteractiveSession()
        self.x, self.y = self._define_train_inputs()
        self.network = self._define_layers()

    def _projection(self, *inputs):
        #'''
        xa, xp, x1, x2, x3, xd, xm = inputs
        xa = tl.utils.pad_sequences(xa, maxlen=5,truncating='post')
        xp = tl.utils.pad_sequences(xp, maxlen=5,truncating='post')
        x1 = tl.utils.pad_sequences(x1, maxlen=5,truncating='post')
        x2 = tl.utils.pad_sequences(x2, maxlen=5,truncating='post')
        x3 = tl.utils.pad_sequences(x3, maxlen=10,truncating='post')
        xt = np.hstack([xa, xp, x1, x2, x3])
        return xt

    def _define_train_inputs(self):
        # Define placeholder
        x = tf.placeholder(tf.int32, shape=[self.batch, 30], name='x')
        y = tf.placeholder(tf.float32, shape=[self.batch, 1], name='y')
        return x,y

    def _define_layers(self,is_training=False):
        print '\n  building model:'
        network = tl.layers.EmbeddingInputlayer(
            inputs=self.x,
            vocabulary_size=self.embs.n_words,
            embedding_size=self.embs.d_vector,
            E_init=self.embs.embs,
            name='embedding_layer',
            istrain=False)
        '''
        network = tl.layers.DynamicRNNLayer(network,
            cell_fn = tf.nn.rnn_cell.BasicLSTMCell,
            n_hidden = self.embs.d_vector,
            dropout = 0.8,
            sequence_length = tl.layers.retrieve_seq_length_op2(self.x),
            return_last=True,
            return_seq_2d = False,  # stack denselayer or compute cost after it
            name = 'dynamic_rnn')
        network = tl.layers.DenseLayer(network, n_units=self.nodes[-1], act=tf.sigmoid, name='output')
        '''
        return network

    def fit(self,x_data,y_data,print_model=False):
        y_data = np.array(y_data)
        network = self.network
        x,y = self.x,self.y

        pred = network.outputs
        cost = tf.reduce_mean(tf.pow(pred - y, 2)/self.batch)
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
        acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        # define optimizer
        #train_params = network.all_params
        #train_op = tf.train.AdamOptimizer(learning_rate=self.lr, epsilon=1e-08,use_locking=False).minimize(cost, var_list=train_params)
        tvars = tf.trainable_variables()
        grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars),5)
        optimizer = tf.train.GradientDescentOptimizer(self.lr)
        train_op = optimizer.apply_gradients(zip(grads, tvars))
        # initialize_all_variables
        self.sess.run(tf.initialize_all_variables())
        # print model information
        if print_model:
            network.print_params()
            network.print_layers()
        # model training
        tl.utils.fit(self.sess, network, train_op, cost, x_data, y_data, x, y, acc=acc,
                            batch_size=self.batch, n_epoch=self.training_epochs, print_freq=self.display_step)
        #tl.utils.test(self.sess, network, acc=acc,cost=cost, X_test=x_data, y_test=y_data, x=x, y_=y, batch_size=self.batch)

    def predict(self,x_test):
        # Predict surviving chances (class 1 results)
        y_op = self.network.outputs
        return tl.utils.predict(self.sess, self.network, x_test, self.x, y_op)

    def save_model(self,name):
        # save model as '.npz' file
        tl.files.save_npz(self.network.all_params, name=name)

    def load_model(self,name):
        # load model by name
        tl.files.load_npz(name=name)

I have also found another problem in EmbeddingInputlayer: how can I use embeddings that have been pre-trained on wiki? Currently I use it like this, where self.embs.embs holds the embedding weights:

network = tl.layers.EmbeddingInputlayer(
        inputs=self.x,
        vocabulary_size=self.embs.n_words,
        embedding_size=self.embs.d_vector,
        E_init=self.embs.embs,
        name='embedding_layer',
        istrain=False)

But I get the following error, and I don't know why:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: <exception str() failed>

Thanks for your attention; I look forward to your reply.

@zsdonghao
Member

@qjc937044867 You cannot use the utils functions when you use an RNN. Please see the PTB example.

@jcqu

jcqu commented Dec 5, 2016

OK, I will try. I have also found another problem in EmbeddingInputlayer: how can I use embeddings that have been pre-trained on wiki? Currently I use it like this, where self.embs.embs holds the embedding weights:

network = tl.layers.EmbeddingInputlayer(
        inputs=self.x,
        vocabulary_size=self.embs.n_words,
        embedding_size=self.embs.d_vector,
        E_init=tf.constant_initializer(self.embs.embs),
        name='embedding_layer',
        istrain=False)

But I get the following error, and I don't know why:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: <exception str() failed>

Thank you very much!

@zsdonghao
Member

You can find this example here as well.

reuse embedding matrix
ptb

@jcqu

jcqu commented Dec 5, 2016

@zsdonghao Sorry, I haven't found the right way to use pre-trained embeddings in those files. And when I tried to use RNNLayer the PTB way, I got another problem. My code is as follows:

class OPR_LSTM():
    def __init__(self, d_hidden, nodes, embs, active_function='sigmoid', loss='mean_squared_error',
                 training_epochs=100, display_step=10, batch=10, regular='L2', decay=0.8, optimizer='sgd', keep_prob = 0.9):
        self.d_hidden, self.nodes = d_hidden, nodes
        self.embs, self.active_function, self.keep_prob = embs, active_function, keep_prob
        self.training_epochs,self.batch,self.display_step = training_epochs,batch,display_step
        self.regular,self.lr_decay, self.lr = regular, decay, 1.0
        self.max_epoch = 14
        self.max_max_epoch = 55
        self.num_steps = 30
        #self.sess = tf.InteractiveSession()
        self.x, self.y = self._define_train_inputs()

    def _projection(self, *inputs):
        #'''
        xa, xp, x1, x2, x3, xd, xm = inputs
        xa = tl.utils.pad_sequences(xa, maxlen=5,truncating='post')
        xp = tl.utils.pad_sequences(xp, maxlen=5,truncating='post')
        x1 = tl.utils.pad_sequences(x1, maxlen=5,truncating='post')
        x2 = tl.utils.pad_sequences(x2, maxlen=5,truncating='post')
        x3 = tl.utils.pad_sequences(x3, maxlen=10,truncating='post')
        xt = np.hstack([xa, xp, x1, x2, x3])
        #return [x,np.array(xd).astype(dtype='float32'),np.array(xm).astype(dtype='float32')]
        return xt

    def _define_train_inputs(self):
        # Define placeholder
        '''
        x1 = tf.placeholder(tf.int32, shape=[None, 5], name='x1')
        x2 = tf.placeholder(tf.int32, shape=[None, 5], name='x2')
        x3 = tf.placeholder(tf.float32, shape=[None, 2], name='x3')
        '''
        x = tf.placeholder(tf.int32, shape=[None, self.num_steps], name='x')
        y = tf.placeholder(tf.float32, shape=[None, 1], name='y')
        return x,y

    def _define_layers(self,is_training=False,reuse=None):
        print '\n  building model:'
        with tf.variable_scope("model", reuse=reuse):
            tl.layers.set_name_reuse(reuse)
            network = tl.layers.EmbeddingInputlayer(
                inputs=self.x,
                vocabulary_size=self.embs.n_words,
                embedding_size=self.embs.d_vector,
                E_init=tf.constant_initializer(self.embs.embs),
                name='embedding_layer',
                istrain=False)
            network = tl.layers.RNNLayer(network,
                cell_fn=tf.nn.rnn_cell.BasicLSTMCell,
                cell_init_args={'forget_bias': 0.0, 'state_is_tuple': True},
                n_hidden=self.embs.d_vector,
                initializer=tf.random_uniform_initializer(-0.001, 0.001),
                n_steps=self.num_steps,
                return_last=False,
                return_seq_2d=True,
                name='basic_lstm')
            lstm = network
            network = tl.layers.DenseLayer(network, n_units=self.nodes[-1], act=tf.sigmoid, name='output')
        return network,lstm

    def fit(self,x_data,y_data,print_model=False):
        sess = tf.InteractiveSession()
        network, lstm = self._define_layers()
        x,y = self.x,self.y
        with tf.variable_scope('learning_rate'):
            lr = tf.Variable(0.0, trainable=False)
        pred = network.outputs
        cost = tf.reduce_mean(tf.pow(pred - y, 2)/self.batch)
        # define optimizer
        tvars = tf.trainable_variables()
        grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars),5)
        optimizer = tf.train.GradientDescentOptimizer(self.lr)
        train_op = optimizer.apply_gradients(zip(grads, tvars))
        # initialize_all_variables
        sess.run(tf.initialize_all_variables())
        # print model information
        if print_model:
            network.print_params()
            network.print_layers()
        # model training
        for i in range(self.max_max_epoch):
            # decreases the initial learning rate after several
            # epochs (defined by ``max_epoch``), by multiplying a ``lr_decay``.
            new_lr_decay = self.lr_decay ** max(i - self.max_epoch, 0.0)
            sess.run(tf.assign(lr, self.lr * new_lr_decay))

            # Training
            print("Epoch: %d/%d Learning rate: %.3f" % (i + 1, self.max_max_epoch, sess.run(lr)))
            epoch_size = ((len(x_data) // self.batch) - 1) // self.num_steps
            start_time = time.time()
            costs,iters = 0.0, 0

            # reset all states at the beginning of every epoch
            state = tl.layers.initialize_rnn_state(lstm.initial_state) # ERROR

            for step, (x_, y_) in enumerate(tl.iterate.minibatches(x_data, self.batch, self.num_steps)):
                feed_dict = {x: x_, y: y_,
                             lstm.initial_state: state,
                             }
                # For training, enable dropout
                feed_dict.update(network.all_drop)
                _cost, state, _ = sess.run([cost,lstm.final_state,train_op],feed_dict=feed_dict)
                costs += _cost
                iters += self.num_steps

                if step % (epoch_size // 10) == 10:
                    print("%.3f perplexity: %.3f speed: %.0f wps" %
                          (step * 1.0 / epoch_size, np.exp(costs / iters),
                           iters * self.batch / (time.time() - start_time)))
            train_perplexity = np.exp(costs / iters)
            print("Epoch: %d/%d Train Perplexity: %.3f" % (i + 1, self.max_max_epoch,
                                                           train_perplexity))

The error report is:
Traceback (most recent call last):
File "/home/qjc/文档/zero_anaphora/main/overt_pronoun_resolution1.py", line 54, in <module>
md.train(n_epoch=40,evaluation=True,save=True)
File "/home/qjc/文档/zero_anaphora/main/libs/models.py", line 188, in train
self.clf.fit(x,y)
File "/home/qjc/文档/zero_anaphora/main/lstm_units.py", line 120, in fit
state = tl.layers.initialize_rnn_state(lstm.initial_state)
File "/home/qjc/文档/zero_anaphora/main/tensorln/layers.py", line 59, in initialize_rnn_state
c = state.c.eval(session=session)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 575, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3633, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: <exception str() failed>

I know I should try to solve this problem myself, but I have to finish it in two days, so please help me. Thank you very much!

@zsdonghao
Member

@qjc937044867 I suggest you look for help in the QQ group or on waffle.

@jcqu

jcqu commented Dec 5, 2016

@zsdonghao ok,thank you.

zsdonghao changed the title from "class DynamicRNNLayer(Layer)" to "How to use DynamicRNNLayer ?" on Dec 27, 2016
zsdonghao pushed a commit that referenced this issue May 4, 2019
tl.cost refactored, tl.initializers refactored tested doced
4 participants