# Simple Room Environment
We start with a simple room that has no obstacles and is fully observable. The agent's target is the stairs, but it also receives a reward for eating apples.

In [None]:
from simulator import *

simple_room = """
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
"""

simple_room_env = create_env(make_map(simple_room, 5, premapped=True, start=(0, 0)), apple_reward=0.75,
                             penalty_time=-0.1)

# Lava Room Environment
We then move to the lava room, where the agent has to avoid lava tiles while trying to reach the stairs and collect apples.
This is a more complex environment, where the agent has to account for obstacles in its way.
For this environment, we also use a BFS-based distance, since the Manhattan distance cannot account for the lava tiles.
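
To illustrate why the Manhattan distance is misleading here, the following is a minimal sketch of a BFS-based distance on a character grid. The names `bfs_distance` and `manhattan`, the `(x, y)` coordinate convention, and the set of blocking characters are assumptions made for this example and are not taken from the `simulator` or `algorithms` modules.

```python
from collections import deque


def manhattan(a, b):
    """Manhattan distance between two (x, y) cells; ignores obstacles."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def bfs_distance(grid, start, goal, blocked="L|-"):
    """Number of 4-connected steps from start to goal, or None if unreachable.

    grid is a list of strings; cells are addressed as (x, y) = (column, row).
    Any character listed in `blocked` (lava and walls here) cannot be entered.
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (x, y), dist = queue.popleft()
        if (x, y) == goal:
            return dist
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[ny])
            if inside and grid[ny][nx] not in blocked and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append(((nx, ny), dist + 1))
    return None


# A lava column separates the two top corners, so the agent has to walk around it.
demo = [
    ".L.",
    ".L.",
    "...",
]
```

On `demo`, the two top corners are two cells apart by Manhattan distance, but the only lava-free route goes down and around the column, which is what the BFS count reflects.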

In [None]:
lava_maze = """
-----------------
|..L....L....L..|
|..L..LLL..L.LL.|
|..L..L.......L.|
|.....L.....L...|
|..L....L....L..|
|..LLL..L.LLLL..|
|..L.L..L..L....|
|..L....L....L..|
|.....L.....L...|
-----------------
"""

lava_room_env = create_env(make_map(lava_maze, 5, premapped=True, start=(0, 0)), apple_reward=0.75, penalty_time=-0.1)

# Benchmarking Offline Pathfinding Algorithms
## A*-based Algorithms
We tried two approaches based on the A* algorithm:
* Forcing the agent to path from apple to apple using A*, then to the stairs.
    * This is a greedy approach: the agent always heads for the nearest apple first, until all apples are collected.
* Finding a path to the stairs with a Weighted A* algorithm modified to give more weight to tiles with or near apples.
    * This is a more exploratory approach: the agent looks for a path to the stairs, but also tries to collect apples along the way without being greedy.
    * It uses two parameters, `weight` and `apple_bonus`. `weight` is the weight of the heuristic, while `apple_bonus` is the bonus given to tiles with or near apples.
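
The second approach can be sketched roughly as follows. This is not the `a_star_apple` function from `algorithms`; it is a minimal standalone illustration under one assumed design: `weight` scales the Manhattan heuristic, and `apple_bonus` lowers the cost of stepping onto an apple tile (the "near apples" part is left out for brevity). The function name `weighted_a_star` and the grid representation are hypothetical.

```python
import heapq


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def weighted_a_star(grid, start, goal, apples, weight=1.5, apple_bonus=3.0):
    """Return a list of (x, y) cells from start to goal, or None if unreachable.

    Priority is g + weight * h. Entering an apple tile costs 1 / (1 + apple_bonus)
    instead of 1 (an assumption of this sketch), so costs stay positive.
    """
    def step_cost(cell):
        return 1.0 / (1.0 + apple_bonus) if cell in apples else 1.0

    frontier = [(weight * manhattan(start, goal), 0.0, start)]
    came_from = {start: None}
    best_g = {start: 0.0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale queue entry, a cheaper route was found later
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if not (0 <= ny < len(grid) and 0 <= nx < len(grid[ny])):
                continue
            if grid[ny][nx] != ".":
                continue  # lava or wall
            new_g = g + step_cost(nxt)
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                came_from[nxt] = cell
                priority = new_g + weight * manhattan(nxt, goal)
                heapq.heappush(frontier, (priority, new_g, nxt))
    return None


open_room = ["...", "...", "..."]
```

Because apple tiles are cheaper than one step, the Manhattan heuristic is no longer admissible, and any `weight` above 1 makes the search greedier still. The returned path is therefore not guaranteed to be the cheapest one; the two parameters trade path length against apple collection.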

In [None]:
from algorithms import *

env = simple_room_env
# A* with a bonus for tiles that have apples nearby
simulate_with_heuristic(env, a_star_apple, heuristic=manhattan_distance, apple_bonus=3)

## Monte Carlo Tree Search (MCTS)

Without relying on a heuristic, we can use MCTS to explore the environment and find a path to the stairs while collecting apples. MCTS simulates multiple paths and chooses the one that maximizes the reward.
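
As a rough illustration, one MCTS decision step could look like the sketch below. It is a plain UCT variant with random rollouts, written independently of the `simulator` module: the grid representation, the `stairs_reward` value, and every function name here are assumptions. Only `apple_reward=0.75` and `penalty_time=-0.1` mirror the environment settings above. The sketch also assumes every reachable cell has at least one legal move.

```python
import math
import random

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def legal_moves(grid, pos):
    x, y = pos
    return [(dx, dy) for dx, dy in MOVES
            if 0 <= y + dy < len(grid) and 0 <= x + dx < len(grid[y + dy])
            and grid[y + dy][x + dx] == "."]


def step(state, move, stairs, apple_reward=0.75, penalty_time=-0.1, stairs_reward=1.0):
    """Apply a move to (pos, apples); return (next_state, reward, done)."""
    pos, apples = state
    nxt = (pos[0] + move[0], pos[1] + move[1])
    reward = penalty_time
    if nxt in apples:
        reward += apple_reward
        apples = apples - {nxt}
    done = nxt == stairs
    if done:
        reward += stairs_reward  # assumed value, not taken from the simulator
    return (nxt, apples), reward, done


class Node:
    def __init__(self, state, done=False):
        self.state, self.done = state, done
        self.children = {}  # move -> (child node, immediate reward)
        self.visits, self.total = 0, 0.0


def rollout(grid, state, stairs, depth, rng):
    """Play uniformly random moves and return the accumulated reward."""
    total = 0.0
    for _ in range(depth):
        state, reward, done = step(state, rng.choice(legal_moves(grid, state[0])), stairs)
        total += reward
        if done:
            break
    return total


def simulate(node, grid, stairs, depth, rng, c=1.4):
    """Run one iteration below node and return the sampled return from node."""
    if node.done or depth == 0:
        value = 0.0
    else:
        moves = legal_moves(grid, node.state[0])
        untried = [m for m in moves if m not in node.children]
        if untried:  # expansion followed by a random rollout
            move = rng.choice(untried)
            nxt, reward, done = step(node.state, move, stairs)
            child = Node(nxt, done)
            node.children[move] = (child, reward)
            tail = 0.0 if done else rollout(grid, nxt, stairs, depth - 1, rng)
            child.visits, child.total = 1, tail
            value = reward + tail
        else:  # selection with the UCB1 rule
            def ucb(m):
                child, reward = node.children[m]
                explore = c * math.sqrt(math.log(node.visits) / child.visits)
                return reward + child.total / child.visits + explore
            move = max(moves, key=ucb)
            child, reward = node.children[move]
            value = reward + simulate(child, grid, stairs, depth - 1, rng, c)
    node.visits += 1
    node.total += value
    return value


def mcts(grid, start, apples, stairs, iterations=500, depth=20, seed=0):
    """Return the most visited move at the root after `iterations` simulations."""
    rng = random.Random(seed)
    root = Node((start, frozenset(apples)))
    for _ in range(iterations):
        simulate(root, grid, stairs, depth, rng)
    return max(root.children, key=lambda m: root.children[m][0].visits)
```

In practice the agent would call `mcts` once per step, apply the returned move, and search again from the new state. The rollout depth and iteration count bound how far ahead apples and stairs can influence the decision.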



## Greedy Best First Search


## Potential Fields
Mostly used in robotics, potential fields can be used to guide the agent: targets such as the stairs and apples attract it, while obstacles such as lava repel it.
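
A minimal sketch of the idea on a grid, assuming an attractive term toward the stairs and the remaining apples and a repulsive term around lava tiles. The function names, the gain constants `k_att`, `k_apple`, and `k_rep`, and the exact shape of each term are illustrative choices, not part of the `algorithms` module.

```python
def potential(cell, goal, apples, lava, k_att=1.0, k_apple=2.0, k_rep=4.0):
    """Potential at a cell; lower values are more attractive."""
    def dist(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    value = k_att * dist(cell, goal)                                 # pull toward the stairs
    value -= sum(k_apple / (1 + dist(cell, a)) for a in apples)      # pull toward apples
    value += sum(k_rep / dist(cell, hot) ** 2 for hot in lava if hot != cell)  # push from lava
    return value


def follow_field(width, height, start, goal, apples=(), lava=(), max_steps=200):
    """Repeatedly step to the neighbouring cell with the lowest potential."""
    lava = set(lava)
    remaining = set(apples)
    pos, path = start, [start]
    for _ in range(max_steps):
        if pos == goal:
            break
        x, y = pos
        options = [(nx, ny)
                   for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
                   if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in lava]
        if not options:
            break
        pos = min(options, key=lambda c: potential(c, goal, remaining, lava))
        remaining.discard(pos)  # an eaten apple stops attracting the agent
        path.append(pos)
    return path
```

A known weakness of this approach is that the agent can get stuck oscillating in a local minimum of the field, for example behind a lava wall, so `max_steps` acts as a safety stop rather than a guarantee that the stairs are reached.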

## Beam Search

Beam search is a heuristic search algorithm that explores a graph by expanding only the most promising nodes. It is similar to breadth-first search, but at each level it keeps only a fixed number of the best nodes (the beam width).
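
The description above can be sketched as follows, assuming a character grid where `.` is walkable and using the Manhattan distance to the goal to rank candidates. The function name `beam_search` and the parameter `beam_width` are hypothetical and not taken from the `algorithms` module.

```python
def beam_search(grid, start, goal, beam_width=3):
    """Level-by-level search that keeps only the `beam_width` best partial paths.

    Returns a list of (x, y) cells from start to goal, or None if the beam
    runs empty before the goal is reached.
    """
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    if start == goal:
        return [start]
    beam = [[start]]
    visited = {start}
    while beam:
        candidates = []
        for path in beam:
            x, y = path[-1]
            for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                nx, ny = nxt
                inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[ny])
                if not inside or grid[ny][nx] != "." or nxt in visited:
                    continue
                if nxt == goal:
                    return path + [nxt]
                visited.add(nxt)
                candidates.append(path + [nxt])
        # Keep only the most promising partial paths for the next level.
        candidates.sort(key=lambda p: h(p[-1]))
        beam = candidates[:beam_width]
    return None


lava_column = [
    ".L.",
    ".L.",
    "...",
]
```

Unlike breadth-first search, this is not complete: with a narrow beam, the only route around an obstacle can be pruned because its cells look worse under the heuristic, in which case the function returns `None` even though a path exists.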