Action spaces where actions are tuples #172
Comments
The best way is to convert the tuple to a dictionary, say you have the action tuple |
Thanks a lot for your quick response! I was thinking of a possibly easier way to hack around this: I keep an index mapping between my real tuple actions and the actions I give to the policy/neural net. I need the real tuple action for reward calculation in my env, so my custom env would receive an int action and simply look up the corresponding real action in the list. I think that should work. What do you think? |
Could you give an example code snippet? Your description is not very intuitive. |
Okay, I think I get your point. Your solution is fine, but my proposal also generalizes to continuous (infinite) action spaces, for example using A2C to compute multiple actions for a tuple-valued continuous action space. |
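For readers following along, the index-based workaround described above can be sketched roughly as follows. This is only an illustration, not Tianshou API: the class `TupleActionLookup` and the example `(direction, speed)` actions are hypothetical names made up for this sketch. The policy only ever sees an integer in `range(lookup.n)`, and the env decodes it back to the real tuple before computing the reward.

```python
from itertools import product

# Hypothetical tuple actions for illustration: (direction, speed) pairs.
DIRECTIONS = ("left", "right")
SPEEDS = (1, 2, 3)
TUPLE_ACTIONS = list(product(DIRECTIONS, SPEEDS))


class TupleActionLookup:
    """Maps integer indices (what the policy outputs) to tuple actions."""

    def __init__(self, tuple_actions):
        self.tuple_actions = list(tuple_actions)
        # Reverse map, handy for logging or for replaying recorded actions.
        self.index_of = {a: i for i, a in enumerate(self.tuple_actions)}

    @property
    def n(self):
        # Size of the discrete action space exposed to the policy.
        return len(self.tuple_actions)

    def decode(self, index):
        # int() also accepts NumPy integer scalars coming from a policy.
        return self.tuple_actions[int(index)]

    def encode(self, action):
        return self.index_of[tuple(action)]


lookup = TupleActionLookup(TUPLE_ACTIONS)
print(lookup.n, lookup.decode(4), lookup.encode(("left", 3)))
```

Inside a custom env, `step(action)` would call `lookup.decode(action)` first and then use the resulting tuple for the transition and reward. This only works for finite action sets, which is the limitation pointed out in the comment above.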
Oh, sorry, yes of course. In my custom env I define all the possible actions using this function:

    def process_actions(self):
        M = 5
        N = 5
        itemNUM = (M + 1) + N - 1
        pointer = 0
        seq = np.arange(itemNUM)
        actions = []
        for c in combinations(seq, M):
            action = np.zeros(M+1)
            for i in range(len(c)-1):
                action[i+1] = c[i+1] - c[i] - 1
            action[0] = c[0]
            action[M] = itemNUM - c[M-1] - 1
            for i in range(M+1):
                action[i] = action[i]/N
            actions.append(action)
        self.actions = np.array(actions)

And I set action_shape equal to (len(actions)). Tell me if I'm clear enough, otherwise I could provide more code snippets. |
What is |
Sorry, I forgot: from itertools import combinations |
Yes, it should definitely work. |
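As a standalone illustration of what the function above appears to compute: each combination of M "bar" positions among M + N slots is turned into M + 1 gap sizes that sum to N, and dividing by N gives fractions that sum to 1. The sketch below is a rewrite under that reading, not the poster's exact code; the name `enumerate_allocations` is hypothetical, and it uses plain tuples instead of NumPy arrays to stay dependency-free.

```python
from itertools import combinations
from math import comb


def enumerate_allocations(m=5, n=5):
    """List every (m + 1)-tuple of fractions k/n that sums to 1."""
    item_num = m + n  # same value as (M + 1) + N - 1 in the thread
    actions = []
    for bars in combinations(range(item_num), m):
        # Number of free slots before the first bar.
        counts = [bars[0]]
        # Free slots between consecutive bars.
        counts += [bars[i + 1] - bars[i] - 1 for i in range(m - 1)]
        # Free slots after the last bar.
        counts.append(item_num - bars[-1] - 1)
        actions.append(tuple(c / n for c in counts))
    return actions


actions = enumerate_allocations()
print(len(actions), actions[0])
```

With M = N = 5 this yields `comb(10, 5)` actions, so the discrete action space handed to the policy would have that many entries, and the integer chosen by the policy indexes into `actions`.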
Hi there, does Tianshou support action spaces where actions are tuples? I have some hacks in mind that could do the trick, but I'd first like to know whether there is already a smooth integration (for a finite action space).