Thanks for sharing the code and just wanted to say I've enjoyed your paper. I was reading your code and noticed that there might be a subtle bug in the grid-env dag script. I might also have read it wrong...
On line 316, we zip two things: zip([e for d, e in zip(done, self.envs) if not d], acts)
Here done is a vector of bools of length batch-size, self.envs is a list of GridEnv of length n-envs or buffer-size, and acts is a vector of ints of length (n-envs or buffer-size,).
By default, all the lengths of the above objects should be 16.
While reading through the code, I noticed that if any element of done is True, line 316 filters out the corresponding env with if not d. If self.envs[0] were "done", we would have a list of 15 envs, namely self.envs[1:]. When we then zip the actions with this shorter list of envs, the actions are misaligned: self.envs[1:] ends up paired with acts[:-1]. As a result, step has length 15, and on the next line we again pair the 16 misaligned actions with our step list of length 15.
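A toy illustration of this concern, using made-up placeholder values (strings and ints) rather than the repository's GridEnv objects, and assuming acts is not filtered:

```python
# Hypothetical stand-ins: strings instead of GridEnv objects, ints for actions.
envs = ["env0", "env1", "env2", "env3"]
done = [True, False, False, False]   # env0 has finished
acts = [10, 11, 12, 13]              # one action per env, NOT filtered

# Same filtering pattern as the expression quoted from line 316.
active = [e for d, e in zip(done, envs) if not d]

# zip stops at the shorter input, so the last action is silently dropped
# and every remaining action lands on the wrong environment.
misaligned = list(zip(active, acts))

# Filtering the actions with the same mask restores the intended pairing.
acts_active = [a for d, a in zip(done, acts) if not d]
aligned = list(zip(active, acts_active))

print(misaligned)
print(aligned)
```

Whether this actually happens depends on whether acts has already been restricted to the active environments before the zip.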
Perhaps we need to filter acts based on the done vector? E.g., acts = acts[done] after line 316?
Maybe I've got this wrong, so apologies for the noise if so, but I thought I'd leave a note in case the problem is real.
All the best!
Hi! Sorry, this is certainly not the clearest code I've ever written.
acts in this scenario will be of length 15, because it is a function of s, which is itself filtered down to include only the active environments. Note line 324, which updates s.
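As a rough sketch of that pattern (not the code from the repository: ToyEnv, toy_policy, and the horizons are hypothetical stand-ins), the state list is re-filtered at the end of every step, so a policy that returns one action per state always yields exactly one action per active environment:

```python
def toy_policy(states):
    # Hypothetical policy: returns one action per state it is given.
    return [s % 2 for s in states]


class ToyEnv:
    """Hypothetical stand-in for GridEnv: finishes after `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def step(self, action):
        self.t += 1
        return self.t, self.t >= self.horizon  # (next_state, done)


envs = [ToyEnv(h) for h in (1, 3, 2)]
done = [False] * len(envs)
s = [0] * len(envs)  # states of the active environments only
lengths = []         # number of actions produced at each step

while not all(done):
    acts = toy_policy(s)  # one action per active environment
    active = [i for i, d in enumerate(done) if not d]
    assert len(acts) == len(active)
    results = [envs[i].step(a) for i, a in zip(active, acts)]
    lengths.append(len(acts))
    for i, (_, d) in zip(active, results):
        done[i] = d
    # Keep only the states of environments that are still running.
    s = [sp for sp, d in results if not d]

print(lengths)
```

Because s shrinks together with the set of active environments, the filtered env list and acts stay the same length and the zip pairs them correctly.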
Code referenced in the original report (lines 316-320): https://github.com/bengioe/gflownet/blob/dddfbc522255faa5d6a76249633c94a54962cbcb/grid/toy_grid_dag.py#L316-L320