schorm/RL-Markov_Decision_Process

RL-Markov_Decision_Process🐰


In this gridworld, in every state (grid cell) the agent chooses each of the four actions (north, south, east, west) with probability ¼, and it then moves in the chosen direction with probability 1. If the move would take the agent into a wall, it stays where it is and receives a reward of -1; if it moves to another cell in the grid, the reward is 0. If it reaches cell A (1,2), it moves to cell A’ (5,2) and receives a reward of +10; if it reaches cell B (1,4), it moves to cell B’ (3,4) and receives a reward of +5.
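The dynamics described above can be sketched as a small step function. This is only an illustration under stated assumptions: the 5x5 grid size, the 1-indexed (row, column) coordinates with row 1 at the top, and the reading that any action taken from A or B triggers the jump are not spelled out in the text, and the names `N`, `ACTIONS`, `SPECIAL`, and `step` are hypothetical.

```python
# Assumptions (not stated in the text): a 5x5 grid, 1-indexed (row, column)
# coordinates with row 1 at the top, and any action taken from A or B
# triggering the jump to A' or B'.
N = 5
ACTIONS = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}
SPECIAL = {(1, 2): ((5, 2), 10.0), (1, 4): ((3, 4), 5.0)}  # A -> A', B -> B'


def step(state, action):
    """Return (next_state, reward) for one deterministic move."""
    if state in SPECIAL:
        # Special cells: jump to the paired cell and collect the bonus.
        return SPECIAL[state]
    d_row, d_col = ACTIONS[action]
    row, col = state[0] + d_row, state[1] + d_col
    if not (1 <= row <= N and 1 <= col <= N):
        # Hitting a wall: stay in place with a reward of -1.
        return state, -1.0
    # Ordinary move inside the grid: reward 0.
    return (row, col), 0.0
```

For example, `step((1, 1), "north")` leaves the agent in `(1, 1)` with reward -1, while `step((1, 2), "south")` sends it to `(5, 2)` with reward +10.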

The problem is solved using two different methods:

* A linear solver to solve the system Ax=b.

* A dynamic programming approach (policy iteration or value iteration).
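As a rough sketch of how the two methods might look for the random policy, the code below builds the transition matrix and expected rewards, solves the Bellman equation as a linear system, and then repeats the calculation by iterating the Bellman update. Several things here are assumptions rather than the repository's code: the 5x5 grid, the discount factor `GAMMA = 0.9` (no discount is given in the text), 0-indexed coordinates, and all function names. The iterative routine is iterative policy evaluation, which is the evaluation half of policy iteration and is used here because the policy is fixed; it is not full policy iteration or value iteration.

```python
import numpy as np

N, GAMMA = 5, 0.9  # assumed grid size and discount factor (not given in the text)
MOVES = [(-1, 0), (1, 0), (0, 1), (0, -1)]  # north, south, east, west
# 0-indexed versions of A (1,2) -> A' (5,2) and B (1,4) -> B' (3,4)
SPECIAL = {(0, 1): ((4, 1), 10.0), (0, 3): ((2, 3), 5.0)}


def step(r, c, dr, dc):
    """Deterministic move: return ((next_row, next_col), reward)."""
    if (r, c) in SPECIAL:
        return SPECIAL[(r, c)]
    nr, nc = r + dr, c + dc
    if 0 <= nr < N and 0 <= nc < N:
        return (nr, nc), 0.0
    return (r, c), -1.0  # wall: stay in place


def build_model():
    """Transition matrix P and expected reward vector R under the uniform random policy."""
    P = np.zeros((N * N, N * N))
    R = np.zeros(N * N)
    for r in range(N):
        for c in range(N):
            s = r * N + c
            for dr, dc in MOVES:
                (nr, nc), reward = step(r, c, dr, dc)
                P[s, nr * N + nc] += 0.25  # each action has probability 1/4
                R[s] += 0.25 * reward
    return P, R


def solve_linear(P, R):
    """Method 1: v = R + gamma P v  <=>  (I - gamma P) v = R, i.e. A x = b."""
    return np.linalg.solve(np.eye(len(R)) - GAMMA * P, R)


def evaluate_iteratively(P, R, tol=1e-10):
    """Method 2: repeat the Bellman update until the values stop changing."""
    v = np.zeros(len(R))
    while True:
        v_new = R + GAMMA * P @ v
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new


P, R = build_model()
v_lin = solve_linear(P, R)
v_dp = evaluate_iteratively(P, R)
print(np.round(v_lin.reshape(N, N), 1))  # state values laid out as the grid
```

Both routines should agree up to the chosen tolerance, which gives a simple cross-check between the linear solver and the dynamic programming result.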
