
remember initial state #3

Merged
merged 4 commits on Dec 23, 2016

Conversation

vivanov879

No description provided.

@vivanov879

Trying out the change -- had an error right at the end of training.

@vivanov879

/usr/bin/python3.5 /home/vivanov/PycharmProjects/PLSTM/simplePhasedLSTM.py
Compiling RNN...
DONE!
Compiling cost functions...
DONE!
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Calculating gradients...
DONE!
Initializing variables...
DONE!
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1021, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1003, in _run_fn
status, run_metadata)
File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
next(self.gen)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [320,1] vs. shape[1] = [32,100]
[[Node: 0/RNN/while/PhasedLSTMCell/concat = Concat[N=2, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](0/RNN/while/PhasedLSTMCell/concat/concat_dim, 0/RNN/while/PhasedLSTMCell/Slice_1, 0/RNN/while/Identity_3)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/vivanov/PycharmProjects/PLSTM/simplePhasedLSTM.py", line 276, in <module>
if __name__ == "__main__":
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/vivanov/PycharmProjects/PLSTM/simplePhasedLSTM.py", line 265, in main
y: test_ys,
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [320,1] vs. shape[1] = [32,100]
[[Node: 0/RNN/while/PhasedLSTMCell/concat = Concat[N=2, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](0/RNN/while/PhasedLSTMCell/concat/concat_dim, 0/RNN/while/PhasedLSTMCell/Slice_1, 0/RNN/while/Identity_3)]]

Caused by op '0/RNN/while/PhasedLSTMCell/concat', defined at:
File "/home/vivanov/PycharmProjects/PLSTM/simplePhasedLSTM.py", line 276, in <module>
if __name__ == "__main__":
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/vivanov/PycharmProjects/PLSTM/simplePhasedLSTM.py", line 190, in main
print ("Compiling RNN...",)
File "/home/vivanov/PycharmProjects/PLSTM/simplePhasedLSTM.py", line 150, in RNN
initial_states = [tf.nn.rnn_cell.LSTMStateTuple(tf.zeros([FLAGS.batch_size, FLAGS.n_hidden], tf.float32), tf.zeros([FLAGS.batch_size, FLAGS.n_hidden], tf.float32)) for _ in range(FLAGS.n_layers)]
File "/home/vivanov/PycharmProjects/PLSTM/PhasedLSTMCell.py", line 360, in multiPLSTM
outputs, initial_states[k] = tf.nn.dynamic_rnn(cell, newX, dtype=tf.float32, sequence_length=lens, initial_state=initial_states[k])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py", line 845, in dynamic_rnn
dtype=dtype)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py", line 1012, in _dynamic_rnn_loop
swap_memory=swap_memory)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2636, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2469, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2419, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py", line 995, in _time_step
skip_conditionals=True)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py", line 403, in _rnn_step
new_output, new_state = call_cell()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py", line 983, in <lambda>
call_cell = lambda: cell(input_t, state)
File "/home/vivanov/PycharmProjects/PLSTM/PhasedLSTMCell.py", line 292, in __call__
cell_inputs = array_ops.concat(1, [filtered_inputs, m_prev])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py", line 1005, in concat
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 438, in _concat
values=values, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [320,1] vs. shape[1] = [32,100]
[[Node: 0/RNN/while/PhasedLSTMCell/concat = Concat[N=2, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](0/RNN/while/PhasedLSTMCell/concat/concat_dim, 0/RNN/while/PhasedLSTMCell/Slice_1, 0/RNN/while/Identity_3)]]

@vivanov879

Now it remembers states between runs:


# before
outputs = multiPLSTM(_X, lens, FLAGS.n_layers, FLAGS.n_hidden, n_input)

# after
initial_states = [tf.nn.rnn_cell.LSTMStateTuple(tf.zeros([FLAGS.batch_size, FLAGS.n_hidden], tf.float32), tf.zeros([FLAGS.batch_size, FLAGS.n_hidden], tf.float32)) for _ in range(FLAGS.n_layers)]
outputs, initial_states = multiPLSTM(_X, lens, FLAGS.n_layers, FLAGS.n_hidden, n_input, initial_states)
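The pattern can be sketched in plain Python (a toy cell update with illustrative names, not the PR's actual TensorFlow code): each run starts from the final state of the previous run instead of from zeros.

```python
# Sketch of carrying RNN state across runs (toy cell, illustrative names).
# Each "run" is seeded with the final state of the previous run.

def zero_states(n_layers, n_hidden):
    """One (c, h) pair per layer, initialised to zeros."""
    return [([0.0] * n_hidden, [0.0] * n_hidden) for _ in range(n_layers)]

def run_step(x, states):
    """Stand-in for one multiPLSTM call: returns outputs and updated states."""
    new_states = []
    for c, h in states:
        c = [ci + x for ci in c]       # toy cell-state update
        h = [hi + 2 * x for hi in h]   # toy hidden-state update
        new_states.append((c, h))
    outputs = new_states[-1][1]        # top layer's hidden state
    return outputs, new_states

states = zero_states(n_layers=2, n_hidden=3)
for x in [1.0, 1.0]:                       # two consecutive runs
    outputs, states = run_step(x, states)  # feed the final state back in
```

Because `states` is threaded through the loop, the second run sees the state the first run ended with, which is exactly what the `initial_states` argument above enables.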


Any way to make it work with tf.nn.rnn_cell.MultiRNNCell? It's more intuitive and can use cell.zero_state to generate the initial_states.

@vivanov879

I think it's possible, but it won't make initial_state any simpler -- same amount of code, same meaning.

@vivanov879

I tracked down the bug in my code -- wrong shapes during testing -- will fix it now.

@vivanov879

The batch size was different during testing -- fixed that -- now testing.
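That matches the ConcatOp failure in the traceback: concatenating along axis 1 requires all other axes to agree, so a [320, 1] tensor can't be joined with a [32, 100] one. A minimal re-creation of the shape rule in plain Python (illustrative only, not TensorFlow's implementation):

```python
# Minimal re-creation of the concat shape rule that tripped the test run:
# joining along one axis requires every other axis to match exactly.

def concat_shape(shapes, axis):
    """Result shape of concatenating tensors of the given shapes,
    or ValueError on mismatch, mimicking TensorFlow's ConcatOp check."""
    base = list(shapes[0])
    for s in shapes[1:]:
        for dim, (a, b) in enumerate(zip(base, s)):
            if dim != axis and a != b:
                raise ValueError(
                    "Dimensions of inputs should match: "
                    "shape[0] = %r vs. shape[1] = %r" % (tuple(base), tuple(s)))
        base[axis] += s[axis]
    return tuple(base)

# Matching batch sizes concatenate fine ...
assert concat_shape([(32, 1), (32, 100)], axis=1) == (32, 101)

# ... but a different batch size at test time does not:
try:
    concat_shape([(320, 1), (32, 100)], axis=1)
except ValueError:
    pass  # same mismatch as in the traceback: [320,1] vs [32,100]
```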

@vivanov879

vivanov879 commented Dec 23, 2016

I want it to remember states like an ordinary multilayer LSTM would. I keep each layer's state in a list. Now the PLSTM uses the previous run's end state as the initial state of the current run.

@vivanov879

/usr/bin/python3.5 /home/vivanov/PycharmProjects/PLSTM/simplePhasedLSTM.py
Compiling RNN...
DONE!
Compiling cost functions...
DONE!
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Calculating gradients...
DONE!
Initializing variables...
DONE!
+-----------+----------+------------+
| Epoch=0 | Cost | Accuracy |
+===========+==========+============+
| Train | 0.354964 | 0.832031 |
+-----------+----------+------------+
| Test | 0.137772 | 0.96875 |
+-----------+----------+------------+

@vivanov879

Here's a full run -- it works.

@vivanov879

I wonder how to keep track of initial_states in a TensorFlow summary -- can you please help me with that? That would confirm it works correctly.

@vivanov879

+-----------+----------+------------+
| Epoch=0 | Cost | Accuracy |
+===========+==========+============+
| Train | 0.354964 | 0.832031 |
+-----------+----------+------------+
| Test | 0.137772 | 0.96875 |
+-----------+----------+------------+
+-----------+-----------+------------+
| Epoch=1 | Cost | Accuracy |
+===========+===========+============+
| Train | 0.137392 | 0.956641 |
+-----------+-----------+------------+
| Test | 0.0635801 | 0.96875 |
+-----------+-----------+------------+

@Enny1991

Hey! Very nice job.
Can we make the initial state argument optional? I'd like to keep it so that people can also choose not to specify the initial state.
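One way to do that (a plain-Python sketch with a hypothetical signature, not the merged code) is to default the argument to None and build zero states inside the function:

```python
# Sketch of an optional initial-state argument (hypothetical names, not the
# actual multiPLSTM code): callers may omit initial_states and get zero
# states, or pass states carried over from a previous run.

def make_zero_states(n_layers, n_hidden):
    return [([0.0] * n_hidden, [0.0] * n_hidden) for _ in range(n_layers)]

def multi_plstm(x, n_layers, n_hidden, initial_states=None):
    if initial_states is None:  # optional: default to zero states
        initial_states = make_zero_states(n_layers, n_hidden)
    # ... run the stacked cells starting from initial_states (omitted) ...
    return x, initial_states

# Both call styles work:
_, states = multi_plstm([1.0], n_layers=2, n_hidden=4)  # defaults to zeros
_, states = multi_plstm([1.0], n_layers=2, n_hidden=4,
                        initial_states=states)          # carried-over states
```

In the TensorFlow version, the None branch would build the same `LSTMStateTuple` zeros that the caller currently constructs by hand.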

@vivanov879

vivanov879 commented Dec 23, 2016 via email

@Enny1991

Yeah actually I can do it now! I'll merge, update with this flag and add the summary for the initial state.

@Enny1991 Enny1991 merged commit 7540014 into Enny1991:master Dec 23, 2016
@vivanov879

That's great -- looking forward to trying it out.
