How should we revamp the policy representation to be more generic? One idea is to make it more like a numpy array where you "index" things by state or action.
I definitely think making the most of Python's syntax to make policy access simpler makes a lot of sense. I think `__getitem__` (`policy[s]`) is a cool solution. `__call__` is another option, but it seems less ideal in a way that I'm struggling to articulate.
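To make the `policy[s]` idea concrete, here is a rough sketch of what indexing by state or by (state, action) could look like. The `Policy` class, its constructor arguments, and the dict-based storage are all hypothetical and only for illustration; nothing here reflects the project's existing code, and it assumes discrete, hashable states and actions.

```python
class Policy:
    """Hypothetical policy mapping each state to a distribution over actions."""

    def __init__(self, states, actions):
        self.actions = list(actions)
        # Assumption: start from a uniform distribution in every state.
        uniform = 1.0 / len(self.actions)
        self._probs = {s: {a: uniform for a in self.actions} for s in states}

    def __getitem__(self, key):
        # policy[s, a] -> probability of taking action a in state s
        if isinstance(key, tuple):
            state, action = key
            return self._probs[state][action]
        # policy[s] -> copy of the action distribution for state s
        return dict(self._probs[key])

    def __setitem__(self, state, dist):
        # Reject distributions that do not sum to 1.
        if abs(sum(dist.values()) - 1.0) > 1e-9:
            raise ValueError("action probabilities must sum to 1")
        self._probs[state] = dict(dist)


policy = Policy(states=["s0", "s1"], actions=["left", "right"])
policy["s0"] = {"left": 0.9, "right": 0.1}

dist = policy["s0"]                 # distribution for one state
greedy = max(dist, key=dist.get)    # most likely action in "s0"
p_left = policy["s0", "left"]       # single probability
```

One caveat with this sketch: treating a tuple key as (state, action) breaks down if states are themselves tuples (e.g. grid coordinates), so a real design would need a different way to tell the two cases apart.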