DQN loss calculation error when using Dict Action space #297

JaCoderX · 2020-02-01T11:07:55Z

Follow-up issue to #276.

I'm trying to convert a custom gym project (called BTgym) to work as a TF-Agents environment.

As I mentioned in the previous issue, the action space is of type gym.spaces.Dict.

Action Spec:
OrderedDict([('default_asset', BoundedTensorSpec(shape=(), dtype=tf.int64, name='action/default_asset', minimum=array(0), maximum=array(3)))])

Following the DQN tutorial, I reached the point where the agent calculates the loss,
but I get an error saying the action spec is missing the shape attribute.
Tracing the code back to gym_wrapper.py, it seems that a Dict space is converted to an OrderedDict of specs, which has no shape attribute:

...
elif isinstance(space, gym.spaces.Dict):
   return collections.OrderedDict([
       (key, nested_spec(s, key)) for key, s in space.spaces.items()])
...
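
To illustrate why the lookup fails, here is a minimal sketch that needs neither TensorFlow nor TF-Agents. `FakeSpec` is a hypothetical stand-in for `BoundedTensorSpec` and models only the fields that matter here; the point is that the container returned for a Dict space is an ordinary `OrderedDict`, so only its leaves carry a `shape`.

```python
import collections

# Hypothetical stand-in for a TF-Agents BoundedTensorSpec; only the fields
# needed for this illustration are modeled.
FakeSpec = collections.namedtuple("FakeSpec", ["shape", "dtype", "name"])

# Mirrors the structure of the action spec printed above.
action_spec = collections.OrderedDict([
    ("default_asset",
     FakeSpec(shape=(), dtype="int64", name="action/default_asset")),
])

# The container itself has no `shape` attribute; only its leaves do.
container_has_shape = hasattr(action_spec, "shape")
leaf_shapes = [spec.shape for spec in action_spec.values()]
print(container_has_shape, leaf_shapes)
```

Any code that reads `action_spec.shape` directly therefore raises an AttributeError as soon as the spec is a dict rather than a single spec.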

This is the original error:

Traceback (most recent call last):
  File "home/Experimental RL/ResearchTF-Agents/Env/envTest.py", line 260, in <module>
    train_loss = agent.train(experience).loss
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 457, in __call__
    result = self._call(*args, **kwds)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 503, in _call
    self._initialize(args, kwds, add_initializers_to=initializer_map)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 408, in _initialize
    *args, **kwds))
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1848, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2150, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2041, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 915, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 358, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/TF-Agents/tf_agents/agents/tf_agent.py", line 219, in train
    loss_info = self._train_fn(experience=experience, weights=weights)
  File "/homeTF-Agents/tf_agents/utils/common.py", line 131, in with_check_resource_vars
    return fn(*fn_args, **fn_kwargs)
  File "/home/TF-Agents/tf_agents/agents/dqn/dqn_agent.py", line 354, in _train
    training=True)
  File "/home//TF-Agents/tf_agents/agents/dqn/dqn_agent.py", line 427, in _loss
    q_values = self._compute_q_values(time_steps, actions, training=training)
  File "/home/TF-Agents/tf_agents/agents/dqn/dqn_agent.py", line 519, in _compute_q_values
    multi_dim_actions = self._action_spec.shape.rank > 0
AttributeError: 'collections.OrderedDict' object has no attribute 'shape'

How can I resolve this?


kbanoop · 2020-02-05T01:38:22Z

Thanks for raising this. I think it is a bug.

agents/tf_agents/agents/dqn/dqn_agent.py

Line 519 in 9057dd6

multi_dim_actions = self._action_spec.shape.rank > 0

has to be changed to something like:

agents/tf_agents/agents/dqn/dqn_agent.py

Line 552 in 9057dd6

multi_dim_actions = tf.nest.flatten(self._action_spec)[0].shape.rank > 0

Would you like to submit a PR?
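
As a rough sketch of the suggested change, the snippet below flattens a dict-shaped action spec with `tf.nest.flatten` and reads the rank from the first leaf. It assumes TensorFlow 2.x and uses a plain `tf.TensorSpec` as a stand-in for TF-Agents' `BoundedTensorSpec`; that substitution is an assumption made to keep the example small, not what the agent itself constructs.

```python
import collections

import tensorflow as tf

# Stand-in for the action spec from the issue. A plain tf.TensorSpec is used
# here instead of TF-Agents' BoundedTensorSpec (an assumption for brevity).
action_spec = collections.OrderedDict([
    ("default_asset",
     tf.TensorSpec(shape=(), dtype=tf.int64, name="action/default_asset")),
])

# Flatten the nest into a list of leaf specs and inspect the first one.
flat_specs = tf.nest.flatten(action_spec)
multi_dim_actions = flat_specs[0].shape.rank > 0
print(len(flat_specs), multi_dim_actions)
```

Taking element `[0]` only looks at the first leaf, so this approach presumably relies on the action spec containing a single entry, as it does in this issue.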

JaCoderX · 2020-02-05T08:32:19Z

@kbanoop, I applied your suggested fix and it seems to work fine.
But when I run the code, it crashes on the following lines when trying to perform the cast operation, again probably because of the dict action space.

agents/tf_agents/agents/dqn/dqn_agent.py

Lines 520 to 523 in 9057dd6

    
    return common.index_with_actions(
        q_values,
        tf.cast(actions, dtype=tf.int32),
        multi_dim_actions=multi_dim_actions)

These are the actions to be cast:
<class 'dict'>: {'default_asset': <tf.Tensor 'Squeeze_4:0' shape=(64,) dtype=int64>}

This is what I get now:

Traceback (most recent call last):
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/tensor_util.py", line 324, in _AssertCompatible
    fn(values)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/tensor_util.py", line 276, in _check_not_tensor
    _ = [_check_failed(v) for v in nest.flatten(values)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/tensor_util.py", line 277, in <listcomp>
    if isinstance(v, ops.Tensor)]
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/tensor_util.py", line 248, in _check_failed
    raise ValueError(v)
ValueError: Tensor("Squeeze_4:0", shape=(64,), dtype=int64)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 503, in _call
    self._initialize(args, kwds, add_initializers_to=initializer_map)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 408, in _initialize
    *args, **kwds))
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1848, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2150, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2041, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 915, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 358, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/jack/TF-Agents/tf_agents/agents/tf_agent.py", line 219, in train
    loss_info = self._train_fn(experience=experience, weights=weights)
  File "/home/jack/TF-Agents/tf_agents/utils/common.py", line 131, in with_check_resource_vars
    return fn(*fn_args, **fn_kwargs)
  File "/home/jack/TF-Agents/tf_agents/agents/dqn/dqn_agent.py", line 354, in _train
    training=True)
  File "/home/jack/TF-Agents/tf_agents/agents/dqn/dqn_agent.py", line 427, in _loss
    q_values = self._compute_q_values(time_steps, actions, training=training)
  File "/home/jack/TF-Agents/tf_agents/agents/dqn/dqn_agent.py", line 522, in _compute_q_values
    tf.cast(actions, dtype=tf.int32),
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py", line 702, in cast
    x = ops.convert_to_tensor(x, name="x")
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1184, in convert_to_tensor
    return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1242, in convert_to_tensor_v2
    as_ref=False)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1296, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/constant_op.py", line 286, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/constant_op.py", line 227, in constant
    allow_broadcast=True)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/constant_op.py", line 265, in _constant_impl
    allow_broadcast=allow_broadcast))
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/tensor_util.py", line 449, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/home/jack/anaconda3/envs/deep/lib/python3.6/site-packages/tensorflow_core/python/framework/tensor_util.py", line 328, in _AssertCompatible
    raise TypeError("List of Tensors when single Tensor expected")
TypeError: List of Tensors when single Tensor expected

kbanoop · 2020-02-05T22:14:33Z

Yes that sounds like the same issue. Can you try adding actions = tf.nest.flatten(actions)[0], perhaps at the beginning of the _compute_q_values function?
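
A minimal sketch of that suggestion, assuming TensorFlow 2.x in eager mode: the dict of actions is flattened to its single tensor before the cast. The Q-values and actions are made-up illustration data, and `tf.gather` with `batch_dims=1` is used only as a stand-in for `common.index_with_actions`; it is not the library's implementation.

```python
import tensorflow as tf

# Made-up Q-values for a batch of 3 time steps and 4 discrete actions.
q_values = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

# Actions arrive as a dict keyed like the action spec in the issue.
actions = {"default_asset": tf.constant([0, 3, 1], dtype=tf.int64)}

# Unwrap the single tensor from the nest before casting; passing the dict
# itself to tf.cast is what fails in the traceback above.
flat_actions = tf.nest.flatten(actions)[0]
action_indices = tf.cast(flat_actions, dtype=tf.int32)

# Stand-in for common.index_with_actions: pick one Q-value per batch row.
chosen_q = tf.gather(q_values, action_indices, batch_dims=1)
print(chosen_q.numpy())
```

As with the spec, indexing `[0]` assumes the action dict has exactly one entry; a dict with several entries would need a different treatment.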

JaCoderX · 2020-02-06T13:29:42Z

@kbanoop,
I have tested the solution and it works well.
I made a PR for this issue and #276, as they both address the problem of the unsupported Dict action space.

tfboyd added the type:support label Feb 4, 2020
