[Bug]: Callback returns aren't handled properly by either OffPolicyAlgorithm or OnPolicyAlgorithm #1706
Comments
Just FYI for anyone with this issue, wrap your callbacks return value with |
I agree that
In this respect, the example you give is a misuse:
>>> isinstance(np.bool_(0), bool)
False
Hello, |
The full story:
|
If we allow that change, there will be some weird behavior allowed, like returning |
Ah okay, my bad. So it was intended to be ONLY |
what about something like |
Well, technically an empty sequence is supposed to be falsy, so other than it being an incorrect type, it's not technically incorrect.
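The trade-off raised in the comments above can be illustrated with plain Python values. This is only a sketch: the two helper functions are hypothetical names invented for the illustration, not Stable-Baselines3 code, and they assume the convention from the issue that a `False` return from a callback means "stop training".

```python
# Hypothetical helpers for illustration only; not part of Stable-Baselines3.

def stops_with_identity_check(value):
    """Stop only when the callback returned the exact singleton False."""
    return value is False


def stops_with_truthiness_check(value):
    """Stop whenever the callback returned anything falsy."""
    return not value


# Values a callback might return, deliberately or by mistake.
candidates = {"False": False, "None": None, "[]": [], "0": 0, "True": True}

for label, value in candidates.items():
    print(
        f"{label:>5}: identity={stops_with_identity_check(value)}, "
        f"truthiness={stops_with_truthiness_check(value)}"
    )
```

Under the identity check only the real `False` stops training. Under the truthiness check, `None` (for example, from a callback that forgets its `return` statement), an empty list, and `0` would all stop it as well, which is the kind of looser behavior the comments above are weighing.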
🐛 Bug
`OffPolicyAlgorithm` and `OnPolicyAlgorithm` evaluate the callback's return value incorrectly using `is False`, which only works if the returned value is the singleton boolean `False`. This was fine for the default environments, which use the singleton `False`, but it breaks down if the returned value is the result of a NumPy or PyTorch operation such as `np.any(np.ones(10)<0)`.
To Reproduce
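The original reproduction script is not preserved here. The following is a minimal stand-alone sketch of the behavior described above, using only NumPy. The two `should_stop_*` functions are hypothetical stand-ins for the check inside the training loop, not the library's actual code.

```python
import numpy as np


def should_stop_identity(callback_result):
    """Mimics the check described in the issue: stop only if the result `is False`."""
    return callback_result is False


def should_stop_truthiness(callback_result):
    """Alternative check: stop whenever the result is falsy."""
    return not callback_result


# A callback-style result computed with NumPy, as in the issue's example.
# The value is a NumPy boolean scalar, not Python's built-in False singleton.
result = np.any(np.ones(10) < 0)

print(type(result))
print("identity check stops training:  ", should_stop_identity(result))
print("truthiness check stops training:", should_stop_truthiness(result))

# Converting to a built-in bool before returning makes the identity check work.
print("identity check after bool():    ", should_stop_identity(bool(result)))
```

The identity check misses the NumPy scalar even though it compares equal to `False`, so training would continue. Converting the value with `bool()` in the callback is one way to satisfy a check that requires a built-in `bool`.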
Relevant log output / Error message
System Info
Checklist