
examples/openai_gym_async.py broken #253

Closed
sdab opened this issue Dec 4, 2017 · 7 comments

Comments


sdab commented Dec 4, 2017

When I run the example as stated in the file's documentation:
python examples/openai_gym_async.py Pong-ram-v0 -a examples/configs/vpg.json -n examples/configs/mlp2_network.json -e 50000 -m 2000 -W 3

the workers fail with:
TypeError: Input 'value' of 'Assign' Op has type int32 that does not match type float32 of argument 'ref'.

This is at current HEAD with tensorflow version 1.4.0.


sdab commented Dec 4, 2017

It looks like global_variables and local_variables do not match up in the following section [1]. The error above comes from local_init_op: a global variable of one dtype is assigned to a local variable of a different dtype (a minimal sketch of this failure mode follows the reference below).

  1. https://github.com/reinforceio/tensorforce/blob/35e35253cea26e869a7ef75907b558c5672b824d/tensorforce/models/model.py#L297-L302
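For context, here is a minimal, self-contained sketch of the failure mode (plain TensorFlow 1.x, illustrative variable names only, not actual tensorforce code): when the two collections are zipped in different orders, an int32 value can end up being assigned to a float32 variable, which raises exactly the TypeError above at graph-construction time.

import tensorflow as tf

# "Global" variables and their "local" copies, deliberately listed in
# different orders to mimic non-deterministic collection ordering.
global_step = tf.Variable(0, dtype=tf.int32, name='global/step')
global_weights = tf.Variable(tf.zeros([2, 2]), dtype=tf.float32, name='global/weights')
global_variables = [global_step, global_weights]

local_step = tf.Variable(0, dtype=tf.int32, name='local/step')
local_weights = tf.Variable(tf.zeros([2, 2]), dtype=tf.float32, name='local/weights')
local_variables = [local_weights, local_step]

# Mirrors the local_init_op construction referenced in [1]. Because the orders
# differ, zip pairs local/weights (float32) with global/step (int32), and
# TensorFlow raises:
#   TypeError: Input 'value' of 'Assign' Op has type int32 that does not
#   match type float32 of argument 'ref'.
local_init_op = tf.group(*(
    local_var.assign(value=global_var)
    for local_var, global_var in zip(local_variables, global_variables)
))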


michaelschaarschmidt commented Dec 4, 2017

Thanks for raising this, we will look into it this week. We just moved a lot of code from Python to TensorFlow, so there are a few hiccups. 0.3.2 should be stable for A3C (hopefully).

@michaelschaarschmidt

Could you post a full stacktrace?


sdab commented Dec 5, 2017

Sure, the script opens up a server with several workers. The server seems fine; it's the workers that get the following stack trace:
CUDA_VISIBLE_DEVICES= /usr/bin/python /home/ubuntu/git/tensorforce/examples/openai_gym_async.py Pong-ram-v0 --agent-config /home/ubuntu/git/tensorforce/examples/configs/vpg.json --network-spec /home/ubuntu/git/tensorforce/examples/configs/mlp2_network.json --num-workers 3 --child --task-index 0 --episodes 50000 --max-episode-timesteps 2000
Traceback (most recent call last):
  File "/home/ubuntu/git/tensorforce/examples/openai_gym_async.py", line 233, in <module>
    main()
  File "/home/ubuntu/git/tensorforce/examples/openai_gym_async.py", line 191, in main
    network_spec=network_spec
  File "/home/ubuntu/git/tensorforce/tensorforce/agents/agent.py", line 250, in from_spec
    kwargs=kwargs
  File "/home/ubuntu/git/tensorforce/tensorforce/util.py", line 173, in get_object
    return obj(*args, **kwargs)
  File "/home/ubuntu/git/tensorforce/tensorforce/agents/vpg_agent.py", line 144, in __init__
    keep_last_timestep=keep_last_timestep
  File "/home/ubuntu/git/tensorforce/tensorforce/agents/batch_agent.py", line 61, in __init__
    batched_observe=batched_observe
  File "/home/ubuntu/git/tensorforce/tensorforce/agents/agent.py", line 97, in __init__
    self.model = self.initialize_model()
  File "/home/ubuntu/git/tensorforce/tensorforce/agents/vpg_agent.py", line 169, in initialize_model
    gae_lambda=self.gae_lambda
  File "/home/ubuntu/git/tensorforce/tensorforce/models/pg_model.py", line 86, in __init__
    entropy_regularization=entropy_regularization,
  File "/home/ubuntu/git/tensorforce/tensorforce/models/distribution_model.py", line 74, in __init__
    reward_preprocessing_spec=reward_preprocessing_spec
  File "/home/ubuntu/git/tensorforce/tensorforce/models/model.py", line 119, in __init__
    self.setup()
  File "/home/ubuntu/git/tensorforce/tensorforce/models/model.py", line 302, in setup
    local_init_op = tf.group(*(local_var.assign(value=global_var) for local_var, global_var in zip(local_variables, global_variables)))
  File "/home/ubuntu/git/tensorforce/tensorforce/models/model.py", line 302, in <genexpr>
    local_init_op = tf.group(*(local_var.assign(value=global_var) for local_var, global_var in zip(local_variables, global_variables)))
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 573, in assign
    return state_ops.assign(self._variable, value, use_locking=use_locking)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
    validate_shape=validate_shape)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 57, in assign
    use_locking=use_locking, name=name)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 546, in _apply_op_helper
    inferred_from[input_arg.type_attr]))
TypeError: Input 'value' of 'Assign' Op has type int32 that does not match type float32 of argument 'ref'.

@michaelschaarschmidt

Thanks, will have a look soon


slundell commented Dec 6, 2017

This is the same issue I mentioned on Gitter.

@michaelschaarschmidt

Variables were not being sorted in the optimizer, which caused non-deterministic assignments (a sketch of the sorting idea follows the log below). Now running for me with the latest commit:

[2017-12-08 20:41:47,332] Making new env: CartPole-v0
2017-12-08 20:41:51.882975: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-08 20:41:51.883023: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-12-08 20:41:51.883041: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-08 20:41:51.883055: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-12-08 20:41:51.894743: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> 127.0.0.1:12222}
2017-12-08 20:41:51.894799: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> localhost:12223, 1 -> 127.0.0.1:12224}
2017-12-08 20:41:51.895239: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:12223
2017-12-08 20:41:52.690747: I tensorflow/core/distributed_runtime/master_session.cc:998] Start master session 0bc448dfc7fb6aa8 with config:
[2017-12-08 20:41:52,776] Starting distributed agent for OpenAI Gym 'CartPole-v0'
[2017-12-08 20:41:52,776] Config:
[2017-12-08 20:41:52,776] {u'optimizer': {u'learning_rate': 0.01, u'type': u'adam'}, u'baseline': None, u'entropy_regularization': None, u'batch_size': 4000, u'gae_lambda': None, u'discount': 0.99, 'distributed_spec': {'device': '/job:worker/task:0', 'parameter_server': False, 'task_index': 0, 'cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x1101c9ed0>}, u'baseline_optimizer': None, u'baseline_mode': None, u'type': u'vpg_agent'}
[2017-12-08 20:41:52,823] Finished episode 1 after overall 13 timesteps. Steps Per Second 275.237636607
[2017-12-08 20:41:52,823] Episode reward: 13.0
[2017-12-08 20:41:52,823] Average of last 500 rewards: 13.0
[2017-12-08 20:41:52,823] Average of last 100 rewards: 13.0
[2017-12-08 20:41:52,876] Finished episode 2 after overall 31 timesteps. Steps Per Second 310.739675742
[2017-12-08 20:41:52,876] Episode reward: 18.0
[2017-12-08 20:41:52,876] Average of last 500 rewards: 15.5
[2017-12-08 20:41:52,877] Average of last 100 rewards: 15.5
[2017-12-08 20:41:52,959] Finished episode 3 after overall 55 timesteps. Steps Per Second 301.204784039
[2017-12-08 20:41:52,959] Episode reward: 24.0
[2017-12-08 20:41:52,959] Average of last 500 rewards: 18.3333333333
[2017-12-08 20:41:52,959] Average of last 100 rewards: 18.3333333333
[2017-12-08 20:41:53,054] Finished episode 4 after overall 83 timesteps. Steps Per Second 298.711660685
[2017-12-08 20:41:53,054] Episode reward: 28.0
[2017-12-08 20:41:53,054] Average of last 500 rewards: 20.75
[2017-12-08 20:41:53,054] Average of last 100 rewards: 20.75
[2017-12-08 20:41:53,101] Finished episode 5 after overall 108 timesteps. Steps Per Second 331.898111927
[2017-12-08 20:41:53,101] Episode reward: 25.0
[2017-12-08 20:41:53,102] Average of last 500 rewards: 21.6
[2017-12-08 20:41:53,102] Average of last 100 rewards: 21.6
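For reference, a minimal sketch of the sorting idea described above (illustrative only, assuming TensorFlow 1.x and a hypothetical helper, not the actual tensorforce commit): sorting both collections by variable name before zipping makes the pairing deterministic, so each local variable is initialized from the global variable it actually corresponds to.

import tensorflow as tf

def build_local_init_op(local_variables, global_variables):
    # Hypothetical helper: sort each collection by variable name so that the
    # zip below pairs corresponding variables regardless of creation order.
    local_sorted = sorted(local_variables, key=lambda v: v.name)
    global_sorted = sorted(global_variables, key=lambda v: v.name)
    assert len(local_sorted) == len(global_sorted)
    return tf.group(*(
        local_var.assign(value=global_var)
        for local_var, global_var in zip(local_sorted, global_sorted)
    ))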
