Markov decision processes (MDPs) are among the most popular methods in robotics for indoor robot navigation problems. They allow computing optimal policies that achieve a given goal while accounting for actuator uncertainty. In this work, an MDP has been implemented for two homogeneous robots in an indoor environment, whose goal is to take two victims out of a damaged building autonomously, without colliding with obstacles or with one another. Three different policies are derived for this purpose. The robots' trajectories for each policy, plots of the value functions, and the simulation results are presented.
Figures: value function for the first goal; value function for the second goal; trajectories to reach the second goal from different starting points.
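To make the idea of an optimal policy under actuator uncertainty concrete, the following is a minimal sketch of value iteration for a single robot on a small grid. It is not the implementation used in this work: the grid layout, the 0.8/0.1/0.1 slip model, the discount factor, and the step reward are all illustrative assumptions, and the sketch does not cover the two-robot collision avoidance or the three policies derived here.

```python
# Illustrative single-robot grid MDP solved by value iteration.
# All constants below are assumptions chosen for the example.

GRID = [            # hypothetical map: '#' obstacle, 'G' goal, '.' free cell
    ".....",
    ".##..",
    "....G",
    ".#...",
]
ACTIONS = {"N": (-1, 0), "S": (1, 0), "W": (0, -1), "E": (0, 1)}
# Perpendicular directions the robot may slip into for each commanded action.
SLIPS = {"N": ("W", "E"), "S": ("E", "W"), "W": ("S", "N"), "E": ("N", "S")}
P_INTENDED = 0.8    # assumed probability that the commanded move is executed
P_SLIP = 0.1        # assumed probability of each perpendicular slip
GAMMA = 0.95        # assumed discount factor
STEP_REWARD = -1.0  # assumed cost per move; the goal is absorbing with value 0


def move(grid, state, action):
    """Deterministic result of one move; blocked moves leave the robot in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != "#":
        return (nr, nc)
    return state


def transitions(grid, state, action):
    """List of (probability, next_state) pairs under the slip model."""
    left, right = SLIPS[action]
    return [
        (P_INTENDED, move(grid, state, action)),
        (P_SLIP, move(grid, state, left)),
        (P_SLIP, move(grid, state, right)),
    ]


def q_value(grid, values, state, action):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (STEP_REWARD + GAMMA * values[nxt])
               for p, nxt in transitions(grid, state, action))


def value_iteration(grid, tol=1e-6, max_iters=10000):
    """Return (values, policy, goal) for the grid MDP."""
    states = [(r, c) for r, row in enumerate(grid)
              for c, ch in enumerate(row) if ch != "#"]
    goal = next(s for s in states if grid[s[0]][s[1]] == "G")
    values = {s: 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # absorbing goal keeps value 0
            best = max(q_value(grid, values, s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {s: max(ACTIONS, key=lambda a: q_value(grid, values, s, a))
              for s in states if s != goal}
    return values, policy, goal


def greedy_trajectory(grid, policy, start, goal, max_steps=50):
    """Follow the policy assuming every commanded move succeeds (no slip)."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        path.append(move(grid, path[-1], policy[path[-1]]))
    return path


if __name__ == "__main__":
    values, policy, goal = value_iteration(GRID)
    for start in [(0, 0), (3, 0), (3, 2)]:
        print(start, "->", greedy_trajectory(GRID, policy, start, goal))
```

The value dictionary plays the role of a value-function plot for one goal, and `greedy_trajectory` gives a nominal path from a chosen starting cell. Repeating the computation with a different goal cell would yield a second value function in the same way.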