Skip to content

Project 1: Markov Processes and Dynamic Programming Applied to Motion Planning in a Maze

License

Notifications You must be signed in to change notification settings

roumenguha/PR1_SP20_MDP

Repository files navigation

PR1_SP20_MDP

Project 1: Markov Processes and Dynamic Programming/Label-Correcting Applied to Motion Planning in a Maze. For ECE 276B at UCSD.

Consult my Project Report in the code folder for more background.

We did not solve the problem of the shortest path from START to GOAL, but instead solved two simpler sub-problems: START to KEY, and KEY to GOAL. The logic here is is that once the agent has the key, the door becomes traversable, and hence once we are at the KEY we simply need to focus on getting to the GOAL. This leads to sub-optimal solutions, but in most situations not much worse (except for the 'doorkey-8x8-normal' environment).

One of the issues with splitting the problem into two sub-problems is that our agent is forced to enter the KEY's cell after picking up the key. We were able to generate an action sequence from the shortest sequence of nodes that avoided this problem, and for the provided example environment generated an identical action sequence as the example action sequence. However the agent did not move in the same path that the label-correcting algorithm dictated it should for some other environments, and in the case of the environment 'doorkey-6x6-normal', the agent failed completely to reach the DOOR, much less the GOAL. We included these in the zip file 'gifs-incorrect.7z' in case it is of interest to the reader. Here is an example of the difference between the true optimal path and what the optimal path I chose might look like:

Provided action sequence Generated action sequence
8x8-example 8x8-mine

The action sequence we include in this report and generate in our implementation is therefore less optimal than it could have been, but it is guaranteed to reach the GOAL eventually, and in most cases, it is able to do this with $\epsilon$-optimality. We suspect we would be able to resolve this issue if we had slightly more time or slightly more brainpower, but nothing in life is ever perfect.

Maze Traversal Value Iteration (Start-to-Key) Value Iteration (Key-to-Goal)
5x5-normal 5x5-normal-s-k 5x5-normal-k-g
:-------------------------: :-------------------------: :-------------------------:
6x6-direct 6x6-direct-s-k 6x6-direct-k-g
:-------------------------: :-------------------------: :-------------------------:
6x6-normal 6x6-normal-s-k 6x6-normal-k-g
:-------------------------: :-------------------------: :-------------------------:
6x6-shortcut 6x6-shortcut-s-k 6x6-shortcut-k-g
:-------------------------: :-------------------------: :-------------------------:
8x8-direct 8x8-direct-s-k 8x8-direct-k-g
:-------------------------: :-------------------------: :-------------------------:
8x8-normal 8x8-normal-s-k 8x8-normal-k-g
:-------------------------: :-------------------------: :-------------------------:
8x8-shortcut 8x8-shortcut-s-k 8x8-shortcut-k-g

About

Project 1: Markov Processes and Dynamic Programming Applied to Motion Planning in a Maze

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages