RL-Markov_Decision_Process🐰

In this grid world, the agent follows an equiprobable random policy: in every state (cell of the grid), each of the four actions (north, south, east, west) is chosen with probability ¼, and the agent then moves in the chosen direction with probability 1. If the move would hit a wall, the agent stays where it is and receives a reward of -1; if it moves to another cell in the grid, the reward is 0. If it reaches cell A (1,2), it moves to cell A’ (5,2) and receives a reward of +10; if it reaches cell B (1,4), it moves to cell B’ (3,4) and receives a reward of +5.
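The dynamics described above can be sketched as a single deterministic transition function. This is only an illustration under stated assumptions, not the code from this repository: the 5x5 grid size, the reading of coordinates as 1-indexed (row, column) with row 1 at the north edge, and the names `step`, `ACTIONS`, and `SPECIAL` are all hypothetical. It also assumes that "reaches cell A" means any action taken from A moves the agent to A’ (and likewise for B).

```python
# Assumptions (not stated in the README): a 5x5 grid, coordinates read as
# 1-indexed (row, column), with row 1 at the north edge.
N = 5
ACTIONS = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}
# Special cells: any action taken from A or B jumps to A' or B' (assumed reading).
SPECIAL = {(1, 2): ((5, 2), 10.0), (1, 4): ((3, 4), 5.0)}


def step(state, action):
    """Return (next_state, reward) for one deterministic move."""
    if state in SPECIAL:
        return SPECIAL[state]
    d_row, d_col = ACTIONS[action]
    row, col = state[0] + d_row, state[1] + d_col
    if not (1 <= row <= N and 1 <= col <= N):
        return state, -1.0  # hit a wall: stay in place, reward -1
    return (row, col), 0.0  # ordinary move inside the grid, reward 0
```

For example, `step((1, 1), "north")` leaves the agent in `(1, 1)` with reward -1, while `step((1, 2), "south")` sends it to `(5, 2)` with reward +10.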

The problem is solved using two different methods:

* A linear solver to solve the system Ax = b.

* A dynamic programming approach (policy iteration or value iteration).
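A minimal sketch of how the two methods might look side by side, assuming NumPy is available. It is not the contents of `linear.py` or `iteration.py`. The 5x5 grid size and the discount factor `GAMMA = 0.9` are assumptions (the README states neither), the code uses 0-indexed (row, column) coordinates, and all function names are hypothetical. For the dynamic programming side, the sketch uses iterative policy evaluation (the evaluation step of policy iteration) for the equiprobable policy, so that both methods compute the same state values and can be compared directly.

```python
import numpy as np

N = 5        # assumed grid size (not stated in the README)
GAMMA = 0.9  # assumed discount factor (not stated in the README)
MOVES = [(-1, 0), (1, 0), (0, 1), (0, -1)]  # north, south, east, west
# 0-indexed (row, col): A -> A' with +10, B -> B' with +5.
SPECIAL = {(0, 1): ((4, 1), 10.0), (0, 3): ((2, 3), 5.0)}


def step(state, move):
    """Deterministic transition: return (next_state, reward)."""
    if state in SPECIAL:
        return SPECIAL[state]
    row, col = state[0] + move[0], state[1] + move[1]
    if 0 <= row < N and 0 <= col < N:
        return (row, col), 0.0
    return state, -1.0  # wall: stay in place


def build_model():
    """Transition matrix P and expected reward r under the equiprobable policy."""
    P = np.zeros((N * N, N * N))
    r = np.zeros(N * N)
    for row in range(N):
        for col in range(N):
            s = row * N + col
            for move in MOVES:
                (n_row, n_col), reward = step((row, col), move)
                P[s, n_row * N + n_col] += 0.25
                r[s] += 0.25 * reward
    return P, r


def solve_linear(P, r):
    # Bellman equation v = r + gamma * P v, rewritten as (I - gamma * P) v = r,
    # which has the form Ax = b.
    A = np.eye(N * N) - GAMMA * P
    return np.linalg.solve(A, r)


def evaluate_iteratively(P, r, tol=1e-10):
    # Repeatedly apply the Bellman expectation backup until the values settle.
    v = np.zeros(N * N)
    while True:
        v_new = r + GAMMA * (P @ v)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new


P, r = build_model()
v_linear = solve_linear(P, r).reshape(N, N)
v_iter = evaluate_iteratively(P, r).reshape(N, N)
print(np.round(v_linear, 1))
```

Under these assumptions the two value grids should agree up to the iteration tolerance. Value iteration would instead replace the average over actions with a maximum and yield the optimal values rather than those of the random policy.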
