Taking account of selected action in next action space #455
Hi all, my code is as follows. I have 4 elements in my state space: the first two represent the position of a robot on an x-y axis, and the latter two represent an internal state that is based on whichever action was selected. I think of it like steering your car to the left being the action, with the internal state variables here (isx and isy) representing the position of your steering wheel.
Unfortunately, this doesn't seem to work. To test it, I choose s[3] to be 1 by default, as you see, so the first if statement is true. The second if statement should also become true when the position of the robot on the x-axis (s[1]) reaches 5. This does not happen; instead, the code carries on returning [1,0] as the action space. In other words, after the first iteration the action space no longer updates (it is not returned by the second snippet) to take account of the changed state. So my question is simply whether anyone sees something wrong with my approach here. If anyone does have a suggestion (which would be hugely appreciated, of course), please give as much explicit detail as you can. Sadly, if there is a mistake to be made, I am quite capable of finding it!
Replies: 1 comment 11 replies
Hi @DonalOCois, I think you have the right idea: you should augment the state space to include the previous action if the current action space depends on the previous action (the steering wheel you mention is the right idea).
However, it doesn't look like you are implementing it quite right. It might be easier to debug, and the syntax might be easier to understand, if you break the function out separately from the `QuickMDP` constructor like this:
Then you can debug the `my_actions` function more easily. Note that the function should only take one argume…
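For anyone reading along, here is a minimal sketch of what a standalone, single-argument action function could look like. It is not the code from the original posts. It assumes the state layout from the question, `s = [x, y, isx, isy]`, and the actions returned in each branch, the `x >= 5` threshold check, and the helper `next_state` are all hypothetical placeholders. The helper is there to illustrate the state-augmentation idea: the transition has to write the selected action into the last two state entries, otherwise the action function never sees a change.

```julia
# Hypothetical state layout, following the question: s = [x, y, isx, isy].
# Takes exactly one argument (the state) and returns the actions allowed there.
function my_actions(s)
    x, y, isx, isy = s
    if isx == 1 && x >= 5
        return [[0, 1]]            # placeholder: only a y-move once x has reached 5
    elseif isx == 1
        return [[1, 0]]            # placeholder: keep moving in x
    else
        return [[1, 0], [0, 1]]    # placeholder: both moves allowed
    end
end

# Hypothetical deterministic transition: move the robot by the action and
# store that action as the new internal state (the "steering wheel").
next_state(s, a) = [s[1] + a[1], s[2] + a[2], a[1], a[2]]

# Debug the action function on its own, outside any MDP constructor.
function rollout(s, nsteps)
    for _ in 1:nsteps
        a = first(my_actions(s))
        s = next_state(s, a)
    end
    return s
end

s_final = rollout([0, 0, 1, 0], 5)
println("state: ", s_final, "  actions: ", my_actions(s_final))
```

Once a function like this behaves as expected when called directly on a few hand-written states, it can presumably be passed to the constructor through the `actions` keyword instead of being defined inline.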