not working anymore on Tensorflow 1.0 version #40

Open
bingoko opened this issue Apr 1, 2017 · 2 comments

Comments

bingoko commented Apr 1, 2017

The first error is that tf.nn has no attribute rnn_cell, raised at models/dual_encoder.py, line 45.
I fixed it by changing tf.nn.rnn_cell to tf.contrib.rnn.core_rnn_cell.
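
For code that has to run on more than one TensorFlow version, the module can be looked up at runtime instead of hard-coding one path. The following is only a sketch: `resolve_attr` is a hypothetical helper, not a TensorFlow API, and `fake_tf` is a stand-in object so the snippet runs without TensorFlow installed. With a real install you would pass the imported `tf` module instead.

```python
from types import SimpleNamespace


def resolve_attr(root, dotted_paths):
    """Return the first object reachable from `root` via one of the dotted paths."""
    for path in dotted_paths:
        obj = root
        try:
            for part in path.split("."):
                obj = getattr(obj, part)
        except AttributeError:
            continue  # this path does not exist on this install, try the next one
        return obj
    raise AttributeError("none of these paths exist: %r" % (dotted_paths,))


# Candidate locations, old path first, then the path used as the fix above.
RNN_CELL_PATHS = ["nn.rnn_cell", "contrib.rnn.core_rnn_cell"]

# Stand-in (assumption) mimicking an install where tf.nn has no rnn_cell attribute.
fake_tf = SimpleNamespace(
    nn=SimpleNamespace(),
    contrib=SimpleNamespace(rnn=SimpleNamespace(core_rnn_cell=SimpleNamespace())),
)

rnn_cell = resolve_attr(fake_tf, RNN_CELL_PATHS)
print(rnn_cell is fake_tf.contrib.rnn.core_rnn_cell)
```

The lookup order matters: the first path that resolves wins, so list the preferred location first.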

The second error is: TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

It looks like some argument types no longer match in TensorFlow 1.0.
Can anyone fix this?

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcudnn.so.5. LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:
I tensorflow/stream_executor/cuda/cuda_dnn.cc:3517] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
INFO:tensorflow:Using config: {'_tf_random_seed': None, '_task_id': 0, '_save_summary_steps': 100, '_keep_checkpoint_max': 5, '_save_checkpoints_secs': 600, '_master': '', '_environment': 'local', '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0aa666a358>, '_evaluation_master': '', '_task_type': None, '_num_ps_replicas': 0, '_keep_checkpoint_every_n_hours': 10000, '_tf_config': gpu_options {
per_process_gpu_memory_fraction: 1
}
, '_save_checkpoints_steps': None, '_is_chief': True}
WARNING:tensorflow:From /home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitors.py:267: BaseMonitor.__init__ (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
Traceback (most recent call last):
File "udc_train.py", line 64, in <module>
tf.app.run()
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "udc_train.py", line 61, in main
estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
return func(*args, **kwargs)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 426, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 934, in _train_model
model_fn_ops = self._call_legacy_get_train_ops(features, labels)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1003, in _call_legacy_get_train_ops
train_ops = self._get_train_ops(features, labels)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1162, in _get_train_ops
return self._call_model_fn(features, labels, model_fn_lib.ModeKeys.TRAIN)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1133, in _call_model_fn
model_fn_results = self._model_fn(features, labels, **kwargs)
File "/home/ucl/chatbot-retrieval/udc_model.py", line 39, in model_fn
targets)
File "/home/ucl/chatbot-retrieval/models/dual_encoder.py", line 54, in dual_encoder_model
tf.concat(0, [context_embedded, utterance_embedded]),
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1029, in concat
dtype=dtypes.int32).get_shape(
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 637, in convert_to_tensor
as_ref=False)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 702, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 110, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 99, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

KristenMoore commented Apr 3, 2017

TensorFlow 1.0 changed the argument order of tf.concat and tf.split, so you need to fix every call to them as follows.
For tf.concat, e.g. in dual_encoder.py, line 54, change:
tf.concat(0, [context_embedded, utterance_embedded]),
to this:
tf.concat([context_embedded, utterance_embedded], 0),
(i.e. switch the order of the arguments, so the axis comes last)

The same goes for all the tf.split calls, e.g. line 57 in the same file, change:
encoding_context, encoding_utterance = tf.split(0, 2, rnn_states.h)
to this:
encoding_context, encoding_utterance = tf.split(rnn_states.h, 2, 0)
by switching the first and last arguments.
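
To find and preview these changes across a whole file rather than by hand, the swap can be sketched with Python's standard `ast` module. This is only an illustration under stated assumptions: `AxisArgFixer` and `fix_axis_args` are hypothetical names, the module is assumed to be imported as `tf`, and a call counts as "old style" only when it is positional-only and its first argument is an integer literal. It needs Python 3.9+ for `ast.unparse`.

```python
import ast


class AxisArgFixer(ast.NodeTransformer):
    """Swap the first and last positional arguments of old-style tf.concat / tf.split calls."""

    TARGETS = {"concat", "split"}

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        func = node.func
        is_target = (
            isinstance(func, ast.Attribute)
            and func.attr in self.TARGETS
            and isinstance(func.value, ast.Name)
            and func.value.id == "tf"
        )
        # Heuristic: positional-only call whose first argument is an int literal (the axis).
        old_style = (
            is_target
            and not node.keywords
            and len(node.args) >= 2
            and isinstance(node.args[0], ast.Constant)
            and type(node.args[0].value) is int
        )
        if old_style:
            node.args[0], node.args[-1] = node.args[-1], node.args[0]
        return node


def fix_axis_args(source):
    """Return `source` with old-style tf.concat / tf.split calls reordered."""
    tree = AxisArgFixer().visit(ast.parse(source))
    return ast.unparse(tree)


print(fix_axis_args("tf.concat(0, [context_embedded, utterance_embedded])"))
print(fix_axis_args("encoding_context, encoding_utterance = tf.split(0, 2, rnn_states.h)"))
```

Calls already in the new form are left alone because their first argument is not an integer literal. Note that `ast.unparse` discards comments and original formatting, so this is better suited to listing the lines that need editing than to rewriting files in place.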

There were a few more things I had to change to get training running, too.

Author

bingoko commented Apr 21, 2017

Thanks, I have updated all of these, as well as
tf.histogram_summary -> tf.summary.histogram
and
tf.scalar_summary -> tf.summary.scalar
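
Pure renames like these (unlike the argument swaps above) can be applied mechanically. A minimal sketch using only the standard `re` module follows; `RENAMES` and `apply_renames` are hypothetical names, the table holds only the renames mentioned in this thread, and the sample lines are made up for illustration.

```python
import re

# Old name -> new name, limited to the renames reported in this thread.
RENAMES = {
    "tf.histogram_summary": "tf.summary.histogram",
    "tf.scalar_summary": "tf.summary.scalar",
    "tf.nn.rnn_cell": "tf.contrib.rnn.core_rnn_cell",
}

# Match an old name only as a whole dotted identifier: not preceded by a word
# character or dot, and not followed by more identifier characters.
_PATTERN = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(old) for old in RENAMES) + r")\b"
)


def apply_renames(source):
    """Return `source` with every old name replaced by its new name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


# Made-up sample input, for illustration only.
sample = 'tf.scalar_summary("loss", loss)\ntf.histogram_summary("weights", weights)'
print(apply_renames(sample))
```

This only rewrites names. It does not check or adjust the arguments of the renamed calls, so the result still needs a test run.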

However, there is a new error:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla M60, pci bus id: 88f8:00:00.0)
W tensorflow/core/framework/op_kernel.cc:993] Out of range: Reached limit of 1
[[Node: read_batch_features_eval/file_name_queue/limit_epochs/CountUpTo = CountUpToT=DT_INT64, _class=["loc:@read_batch_features_eval/file_name_queue/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Any idea?
