DQN Agent Issue With Custom Environment #258
Comments
FWIW, is there a specific reason why you are using …
It is an array spec in the provided examples, but as far as I understand, the spec doesn't matter much in this case as long as the observations I pass conform to it, no? I experimented with both `TensorSpec` and `BoundedArraySpec`, and with different shapes for those specs; all yield the same error. And I'm not certain whether it's a conversion issue or not.
I would like to see a full trace of your code, but I believe the problem is that you're performing training (correct me if I'm wrong), which requires getting batches of data from your environment. So for example, if the observation spec is …
For a better answer, we'd need to look at your copy of the repo so we can understand the diff. Also ensure that you can run the original example before you made any changes, and that it doesn't lead to an error.
The error occurs whenever I request an action from the DQN agent's collect_policy, passing one observation as you stated. As I understood from 1_dqn_tutorial on Colab, I have to pass only one observation, the current one, from the environment to get an action. I haven't yet gotten to training the agent. This isn't a copy of this repo; it's my attempt to adapt the solution to an entirely different environment. Attached below is the link to the repo; the code is in "Traffic Program ML.py". Please note: I'm handling the "step" function of the environment manually, in terms of getting the next TimeStep and saving it in memory, since I'm attempting to run multiple agents at the same time in one custom environment, and since in my case it wouldn't be possible to get the next state immediately after requesting an action from the agent. I'll try to ensure that I can run the original example and will get back to you with results. Thank you for reaching out and helping me resolve the issue.
Have you tried ensuring that your observations are properly batched? If right now you're using a single py environment wrapped in a `TFPyEnvironment`, try adding a batch dimension by first wrapping the py environment in a batched env: `py_env = BatchedPyEnvironment([py_env])`.
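The batch-dimension idea can be sketched with plain NumPy, independently of TF-Agents (the observation shape `(16,)` here is an illustrative assumption, not taken from the repo):

```python
import numpy as np

# A single observation from a hypothetical environment whose
# observation spec has shape (16,).
obs = np.arange(16, dtype=np.float32)

# TF policies generally expect a leading batch dimension, so a single
# observation is passed as a batch of one: shape (1, 16).
batched_obs = obs[np.newaxis, ...]  # equivalently np.expand_dims(obs, 0)

print(obs.shape)          # (16,)
print(batched_obs.shape)  # (1, 16)
```

A `BatchedPyEnvironment` wrapping a single inner environment performs this kind of wrapping for you, so the time steps it emits already carry the batch dimension.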
Batching my observations? So for instance, if my batch size is 32, the shape of my observations would become (32, 16), correct? Can't I just pass a single observation to the agent when requesting an action and collecting data? I think I wasn't very clear about what I'm doing in the previous comment. In the repo above, what I'm attempting to do is pass a single observation to each DQN agent, each with its own `TFPyEnvironment` to keep track of its variables, but all of which interact with a single simulation run. A good analogy would be multiple players (the agents) playing a single co-op game run, each taking in their own set of observations every X milliseconds of gameplay time. I may have to reconsider my architecture if batched observations are a requirement.
For collect or evaluation, in general your …
Does that resolve your issue? |
Oh, I apologize. Unfortunately, after messing around with it for a bit, I still couldn't get it to work. I ended up migrating to Tensorforce. |
Have you been able to solve the issue? And if so, how?
I unfortunately wasn't able to. I ended up using a different reinforcement learning library with a different DQN implementation. Though looking back at it now, I probably wasn't shaping my observations correctly at the time. It'd help if you posted your configuration, showing what your observations look like and the specific error message you're getting, as that might shed more light on the issue you're having. |
In my case, this is the shape of the …
and this is my custom agent class: `class singleAgent(DqnAgent):` …
and in the end, I am calling them like this: …
where the …
The exact error message is the following: …
So basically, it is failing in the …
I haven't dealt much with TF-Agents, or TF in general, as I use a different library, but I think the issue lies in your observation spec. Your observations are a tensor of rank 2 (two dimensions), while the network expects a three-dimensional shape, probably because it uses a 2D convolutional layer, which takes input tensors of rank 3 (rank 4 if you count the batch dimension). If your observations are indeed of shape (10, 10), try specifying your observation_spec as (1, 10, 10) so that it is three-dimensional, and of course make sure your input is actually of that exact shape.
@IbraheemNofal, thank you very much for the response; there is no convolutional layer, and the input shape is indeed …
@onurcanbektas Try feeding (10, 10, 1) as the input shape. Images are represented as (height, width, num_channels); for example, an RGB image of 512x512 is (512, 512, 3), since RGB has 3 channels.
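Assuming the observations really are (10, 10) arrays, adding the trailing channel axis can be sketched with NumPy (the variable names and zero-filled data are illustrative):

```python
import numpy as np

# A hypothetical (10, 10) observation, e.g. a grid of state values.
obs = np.zeros((10, 10), dtype=np.float32)

# Add a channel axis so the shape matches an image-style layout of
# (height, width, num_channels) = (10, 10, 1).
obs_hwc = np.expand_dims(obs, axis=-1)

print(obs.shape)      # (10, 10)
print(obs_hwc.shape)  # (10, 10, 1)
```

The same reshape must be reflected in the observation spec, so that the spec and the actual observations agree element for element.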
So I've been following the DQN agent example/tutorial and set it up as in the example; the only difference is that I built my own custom Python environment, which I then wrapped in TensorFlow. However, no matter how I shape my observation and action specs, I can't seem to get it to work whenever I give it an observation and request an action. Here's the error that I get:
Here's how I'm setting up my agent:
And here's how I'm setting up my environment:
`class SumoEnvironment(py_environment.PyEnvironment):` …
And here is what my input / observations look like:
I've tried experimenting with the shape and depth of my Q-network, and it seems to me that the [10] in the error is related to its shape. Setting its layer parameters to (4,) yields an error of: …