Problem with AdamOptimizer? #11

Closed
gangeshwark opened this issue Mar 22, 2017 · 5 comments

Comments

@gangeshwark

Hi,
You have a great tutorial. Thanks!
While running your tutorial 2 code, I ran into this error:

InvalidArgumentError Traceback (most recent call last)
in ()
5 for batch in range(max_batches):
6 fd = next_feed()
----> 7 _, l = sess.run([train_op, loss], fd)
8 loss_track.append(l)
9

InvalidArgumentError: Cannot assign a device to node 'Adam/update_Variable/sub_3': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices:
NoOp: GPU CPU
AssignSub: GPU CPU
ScatterAdd: GPU CPU
StridedSlice: GPU CPU
Shape: GPU CPU
Unique: CPU
Sub: GPU CPU
Const: GPU CPU
VariableV2: GPU CPU
UnsortedSegmentSum: GPU CPU
Identity: GPU CPU
Gather: GPU CPU
Mul: GPU CPU
RealDiv: GPU CPU
Assign: GPU CPU
Sqrt: GPU CPU
Enter: GPU CPU
Add: GPU CPU
Switch: GPU CPU
[[Node: Adam/update_Variable/sub_3 = Sub[T=DT_FLOAT, _class=["loc:@variable"]](Adam/update_Variable/sub_3/x, Adam/beta2)]]

Caused by op u'Adam/update_Variable/sub_3', defined at:
File "/home/gangeshwark/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/gangeshwark/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/ipykernel/__main__.py", line 3, in <module>
app.launch_new_instance()
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/ipykernel/kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tornado/ioloop.py", line 887, in start
handler_func(fd_obj, events)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 390, in execute_request
user_expressions, allow_stdin)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/ipykernel/zmqshell.py", line 501, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2717, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2821, in run_ast_nodes
if self.run_code(code, result):
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 7, in
train_op = tf.train.AdamOptimizer(lr).minimize(loss)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 289, in minimize
name=name)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 413, in apply_gradients
update_ops.append(processor.update_op(self, grad))
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 66, in update_op
return optimizer._apply_sparse_duplicate_indices(g, self._v)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 557, in _apply_sparse_duplicate_indices
return self._apply_sparse(gradient_no_duplicate_indices, var)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/adam.py", line 156, in _apply_sparse
v_scaled_g_values = (grad.values * grad.values) * (1 - beta2_t)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 808, in r_binary_op_wrapper
return func(x, y, name=name)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2775, in _sub
result = _op_def_lib.apply_op("Sub", x=x, y=y, name=name)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/gangeshwark/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'Adam/update_Variable/sub_3': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices:
NoOp: GPU CPU
AssignSub: GPU CPU
ScatterAdd: GPU CPU
StridedSlice: GPU CPU
Shape: GPU CPU
Unique: CPU
Sub: GPU CPU
Const: GPU CPU
VariableV2: GPU CPU
UnsortedSegmentSum: GPU CPU
Identity: GPU CPU
Gather: GPU CPU
Mul: GPU CPU
RealDiv: GPU CPU
Assign: GPU CPU
Sqrt: GPU CPU
Enter: GPU CPU
Add: GPU CPU
Switch: GPU CPU
[[Node: Adam/update_Variable/sub_3 = Sub[T=DT_FLOAT, _class=["loc:@variable"]](Adam/update_Variable/sub_3/x, Adam/beta2)]]

I think it has something to do with AdamOptimizer(), since the same code works with GradientDescentOptimizer.
Any idea how to solve this?

Thanks in advance!
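
For readers following along: the colocation debug info above lists, for each op type in the group, the devices that have a kernel for it. The sketch below assumes that line format (`OpType: GPU CPU`) and picks out the op types with no GPU kernel. The helper name `cpu_only_op_types` is hypothetical and not part of TensorFlow, and the sample is a shortened copy of the listing above.

```python
import re

# Matches lines such as "Unique: CPU" or "Sub: GPU CPU" (assumed format).
_LINE = re.compile(r"^\s*(\w+):\s*((?:GPU|CPU)(?:\s+(?:GPU|CPU))*)\s*$")


def cpu_only_op_types(debug_info):
    """Return the op types whose listed devices do not include GPU."""
    cpu_only = []
    for line in debug_info.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # skip headers and anything that is not an op/device line
        op_type, devices = match.group(1), match.group(2).split()
        if "GPU" not in devices:
            cpu_only.append(op_type)
    return cpu_only


sample = """\
Colocation group had the following types and devices:
NoOp: GPU CPU
ScatterAdd: GPU CPU
Unique: CPU
Sub: GPU CPU
"""

print(cpu_only_op_types(sample))  # prints ['Unique']
```

In this listing only `Unique` lacks a GPU kernel, which is consistent with the whole colocation group being impossible to place on `GPU:0`.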

@ematvey
Owner

ematvey commented Mar 22, 2017

Hi! The traceback you've shown indicates that you're running ops on the GPU. I guess you've changed the device manually (unless TF has some environment override for the default device that I'm not aware of). Could you post the code you modified?

Also, do you by any chance run the optimizer like tf.train.AdamOptimizer().minimize(loss, colocate_gradients_with_ops=True)?

@gangeshwark
Author

Hi,
I did not modify your code at all. A plain run of the unmodified tutorial gave me that error, so I don't think it was caused by any changes to the code.
Also, no, I didn't run the optimizer the way you mentioned.

@ematvey
Owner

ematvey commented Mar 22, 2017

This is extremely strange. The trace clearly shows you are running the tutorial on the GPU, while nothing in it changes the device (and the device doesn't change for me).

Please try running with soft device placement: sess = tf.InteractiveSession(config=tf.ConfigProto(allow_soft_placement=True))
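
A slightly fuller sketch of the suggestion above, assuming a TensorFlow build that exposes the 1.x session API either directly or under `tf.compat.v1`. `log_device_placement` is an extra `ConfigProto` option not mentioned in this thread; it logs the device each op is assigned to, which may help confirm where the Adam update ops end up.

```python
import tensorflow as tf

# Assumption: use the 1.x API directly, or through tf.compat.v1 on newer builds.
tf1 = getattr(tf.compat, "v1", tf)

config = tf1.ConfigProto(
    allow_soft_placement=True,  # fall back to another device when placement fails
    log_device_placement=True,  # log the device chosen for each op
)
sess = tf1.InteractiveSession(config=config)
```

With soft placement enabled, TensorFlow is allowed to choose a different device instead of raising the colocation error, so this works around the symptom rather than pinning any particular variable.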

@ematvey
Owner

ematvey commented Mar 25, 2017

I assume the issue was resolved. Closing it now.

@ematvey closed this as completed Mar 25, 2017

@ravikg

ravikg commented Jul 14, 2017

I had a similar issue, and after looking into it, the problem seems to be with the embedding. It seems that not every op involved in updating the embedding variable has a GPU implementation (the colocation debug info above lists Unique as CPU-only). If you declare the embedding variable on the CPU, it works fine, e.g.:

with tf.device("/cpu:0"):
    embeddings = tf.Variable(tf.random_uniform([vocab_size, input_embedding_size], -1.0, 1.0), dtype=tf.float32)

Ref: tensorflow/tensorflow#5117
