Description
Hello all, and thank you for this beautiful library.
I have found a serious bug that prevents any RL algorithm from working. The bug is in select_softmax, used e.g. in breakout_stdp.py.
During the first iteration (time=0), pipeline.spike_record[output] has a shape of [100, 4], the spikes are tensor([0., 0., 0., 0.]), so the softmax is tensor([0.25, 0.25, 0.25, 0.25]) and all is good.
However, at every other timestep pipeline.spike_record[output] has a shape of [100, 1, 4] and the spikes are something like tensor([[17., 17., 17., 17.]]), which means that torch.softmax(spikes, dim=0) computes the softmax over the wrong dimension. The result is always probabilities = tensor([[1., 1., 1., 1.]]), so the agent always performs a random action.
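For reference, here is a minimal sketch of the kind of fix I have in mind (the name select_softmax_fixed and its standalone signature are just for illustration, not the library's actual API): sum the spike record over time, flatten it to a 1-D vector, and only then take the softmax over the action dimension.

```python
import torch

def select_softmax_fixed(spike_record: torch.Tensor) -> int:
    # spike_record is [time, n_actions] at t=0 but [time, 1, n_actions] afterwards,
    # as described above. Summing over time and flattening guarantees a 1-D
    # vector of length n_actions, so the softmax is always taken over actions.
    spikes = spike_record.sum(dim=0).flatten()        # -> [n_actions]
    probabilities = torch.softmax(spikes, dim=0)      # proper distribution over actions
    return torch.multinomial(probabilities, num_samples=1).item()

# Reproducing the shapes from the report:
rec = torch.full((100, 1, 4), 0.17)                   # shape [100, 1, 4] after t=0
spikes = rec.sum(dim=0)                               # shape [1, 4], ~[[17., 17., 17., 17.]]
print(torch.softmax(spikes, dim=0))                   # tensor([[1., 1., 1., 1.]]) -- the bug
print(torch.softmax(spikes.flatten(), dim=0))         # tensor([0.25, 0.25, 0.25, 0.25]) -- fixed
```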
I am happy to fix this. In the meantime, I think it is worth sharing, as some people might get confused.
This issue does not affect other examples such as eth_mnist.py, which is why that network can still learn in the supervised setting.