# Deterministic Approach to the Problem

In [1]:
%pip -q install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.





In [13]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


I. Explanation 

This is a deterministic approach to the problem. Our solution combines four key elements:

* Memory mapping: Each agent builds its own map of the environment as it explores
* Collaborative intelligence: Agents share their observations with other agents within communication range (only information from the current time step t, and only while they are in range)
* A* pathfinding: A classic search algorithm that guarantees optimal paths
* Fallback strategies

**MAPPING**

Each agent maintains its own memory map of the environment, which starts mostly unknown (-1) and gets updated as the agent explores:

* -1: Unknown areas
* 0: Free space
* 1: Walls
* 2: Dynamic obstacles
* 3: Other agents
* 4: Goal location

The agents progressively build this map through:

* Their own LIDAR observations
* Information shared by other agents within communication range
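The map update described above can be sketched as follows. This is a minimal illustration, not the code in `astaragent`: the helper names `new_memory` and `merge_observation` are hypothetical, and observations are assumed to arrive as a dictionary from `(row, col)` to a cell code. The same merge is applied to the agent's own LIDAR readings and to observations received from agents in range.

```python
# Cell codes from the list above.
UNKNOWN, FREE, WALL, DYNAMIC, AGENT, GOAL = -1, 0, 1, 2, 3, 4


def new_memory(rows, cols):
    """Create a memory map in which every cell starts as unknown (-1)."""
    return [[UNKNOWN] * cols for _ in range(rows)]


def merge_observation(memory, observation):
    """Copy every known cell of `observation` into `memory`, in place.

    `observation` maps (row, col) -> cell code (assumed format).
    Cells reported as unknown never overwrite existing knowledge.
    """
    for (r, c), value in observation.items():
        if value != UNKNOWN:
            memory[r][c] = value
    return memory


memory = new_memory(3, 4)
own_lidar = {(0, 0): FREE, (0, 1): WALL}        # the agent's own readings
shared = {(2, 3): GOAL, (1, 1): UNKNOWN}        # received from a neighbour in range
merge_observation(memory, own_lidar)
merge_observation(memory, shared)
```

After both merges, the agent knows about the wall it saw itself and about the goal that only its neighbour observed.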

The system also tracks "danger zones" around dynamic obstacles. When a dynamic obstacle is detected, the surrounding cells are marked as dangerous, and the pathfinding algorithms avoid these cells when possible. This creates a safety buffer around moving obstacles.
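One way to compute such a buffer is sketched below. The function name `danger_zone`, the square shape of the zone, and the default `radius=1` are assumptions made for illustration; the actual agent may use a different shape or size.

```python
def danger_zone(obstacles, rows, cols, radius=1):
    """Return the set of in-bounds cells within `radius` of any obstacle.

    Distance is measured per axis, so radius=1 yields the 3x3 block
    centred on each obstacle (assumed shape).
    """
    danger = set()
    for r, c in obstacles:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    danger.add((nr, nc))
    return danger


# A dynamic obstacle in the top-left corner of a 3x3 grid.
corner_zone = danger_zone([(0, 0)], rows=3, cols=3)
```

Cells outside the grid are skipped, so an obstacle in a corner marks only the four cells that actually exist around it.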

**What is A* (A-Star)?**

A* is like a smart GPS for grid environments. Here's how it works in simple terms:

* Imagine you're in a maze trying to find the exit
* At each intersection, A* evaluates all possible paths based on:

    * How far you've already traveled from your starting point
    * How far you estimate you are from the goal, using the Manhattan distance


* A* always explores the most promising path first
* It maintains a list of intersections to visit, always prioritizing the most promising one
* Once the goal is reached, it traces back the optimal path

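A* is guaranteed to find the shortest possible path when one exists, provided the heuristic never overestimates the remaining distance (the Manhattan distance satisfies this on a grid with 4-directional, unit-cost moves). In general it explores fewer cells than uninformed approaches such as breadth-first search.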
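The search described above can be written compactly with a priority queue. The sketch below is a generic textbook A* on a 4-connected grid, not the implementation in `astaragent`; the function names and the convention that `1` marks a wall are assumptions chosen to match the cell codes listed earlier.

```python
import heapq


def manhattan(a, b):
    """Manhattan distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None.

    `grid` is a list of rows; cells equal to 1 are walls (assumed encoding).
    """
    rows, cols = len(grid), len(grid[0])
    # Heap entries are (f = g + h, g, cell); the smallest f is expanded first.
    open_heap = [(manhattan(start, goal), 0, start)]
    g_cost = {start: 0}
    came_from = {}
    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            # Trace back from the goal to the start.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if g > g_cost[current]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            new_g = g + 1
            if new_g < g_cost.get((nr, nc), float("inf")):
                g_cost[(nr, nc)] = new_g
                came_from[(nr, nc)] = current
                f = new_g + manhattan((nr, nc), goal)
                heapq.heappush(open_heap, (f, new_g, (nr, nc)))
    return None  # the goal is unreachable


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = a_star(grid, (0, 0), (2, 0))
```

In this example the wall in the middle row forces the path around the right-hand side of the grid, so the result visits seven cells even though the start and goal are only two rows apart.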

When the standard A* can't find a path (due to unknown areas or obstacles), we implemented 2 fallback strategies:

* a_star_allow_unknown: A modified A* that applies penalties to unknown or dangerous cells rather than avoiding them completely
* greedy_path: A simple greedy approach that always moves toward the goal, used as a last resort
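The two fallbacks can be illustrated as follows. This is a sketch under stated assumptions rather than the actual `a_star_allow_unknown` and `greedy_path` code: the names `a_star_penalized` and `greedy_step`, the penalty values, and the fact that the greedy step ignores walls are all choices made here for brevity.

```python
import heapq

UNKNOWN, FREE, WALL = -1, 0, 1


def a_star_penalized(grid, start, goal, danger=frozenset(),
                     unknown_penalty=5, danger_penalty=10):
    """A* in which unknown or dangerous cells cost extra instead of being blocked.

    Penalty values are illustrative. Every step still costs at least 1,
    so the Manhattan heuristic remains admissible.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    heap = [(h(start), 0, start)]
    best = {start: 0}
    parent = {}
    while heap:
        _, g, cur = heapq.heappop(heap)
        if cur == goal:
            path = [cur]
            while cur in parent:
                cur = parent[cur]
                path.append(cur)
            return path[::-1]
        if g > best[cur]:
            continue  # stale heap entry
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == WALL:
                continue
            cost = 1
            if grid[nr][nc] == UNKNOWN:
                cost += unknown_penalty
            if nxt in danger:
                cost += danger_penalty
            new_g = g + cost
            if new_g < best.get(nxt, float("inf")):
                best[nxt] = new_g
                parent[nxt] = cur
                heapq.heappush(heap, (new_g + h(nxt), new_g, nxt))
    return None


def greedy_step(pos, goal):
    """Last resort: move one cell along the axis with the larger remaining gap.

    This simplified version does not check for walls.
    """
    dr, dc = goal[0] - pos[0], goal[1] - pos[1]
    if abs(dr) >= abs(dc) and dr != 0:
        return (pos[0] + (1 if dr > 0 else -1), pos[1])
    if dc != 0:
        return (pos[0], pos[1] + (1 if dc > 0 else -1))
    return pos  # already at the goal


grid = [
    [FREE, UNKNOWN, FREE],
    [FREE, FREE,    FREE],
]
detour = a_star_penalized(grid, (0, 0), (0, 2))
forced = a_star_penalized([[FREE, UNKNOWN, FREE]], (0, 0), (0, 2))
```

With these penalties the planner prefers the longer known route (`detour`) when one exists, but still crosses the unknown cell (`forced`) when it is the only way through.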

Finally, several mechanisms prevent agents from getting stuck:

* stuck_counter: Tracks how long an agent stays in the same position
* action_repeat_counter: Tracks how many times the same action is repeated
* Random actions are taken if stagnation is detected
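A minimal sketch of these counters is shown below. The class name `StagnationGuard`, the thresholds, and the assumption that actions are integers in `range(n_actions)` are hypothetical; only the counter names `stuck_counter` and `action_repeat_counter` come from the description above.

```python
import random


class StagnationGuard:
    """Replace the planned action with a random one when the agent stagnates."""

    def __init__(self, stuck_limit=3, repeat_limit=5, n_actions=5, seed=None):
        # Thresholds and the number of actions are illustrative assumptions.
        self.stuck_limit = stuck_limit
        self.repeat_limit = repeat_limit
        self.n_actions = n_actions
        self.stuck_counter = 0
        self.action_repeat_counter = 0
        self._last_position = None
        self._last_action = None
        self._rng = random.Random(seed)

    def filter(self, position, action):
        """Return the action to execute for this step."""
        # Count consecutive steps spent on the same cell.
        self.stuck_counter = (
            self.stuck_counter + 1 if position == self._last_position else 0
        )
        # Count consecutive repetitions of the same planned action.
        self.action_repeat_counter = (
            self.action_repeat_counter + 1 if action == self._last_action else 0
        )
        self._last_position, self._last_action = position, action
        if (self.stuck_counter >= self.stuck_limit
                or self.action_repeat_counter >= self.repeat_limit):
            # Stagnation detected: reset the counters and act randomly.
            self.stuck_counter = 0
            self.action_repeat_counter = 0
            self._last_action = None
            return self._rng.randrange(self.n_actions)
        return action


guard = StagnationGuard(stuck_limit=2, repeat_limit=100, seed=0)
first = guard.filter((1, 1), 1)    # nothing to compare against yet
second = guard.filter((1, 1), 1)   # same cell once: below the limit
third = guard.filter((1, 1), 1)    # same cell twice: random action
```

The first two calls pass the planned action through unchanged; the third triggers the random fallback and resets both counters.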

**Process:**

For each step, the agent:

1. Updates its memory based on current observations and shared information
2. Checks if it has reached the goal
3. Calculates a path to the goal using A* (with fallbacks)
4. Determines the next action based on:
    * The next position in the calculated path
    * The current orientation (may need to rotate first)
5. Applies anti-stagnation checks to avoid getting stuck
6. Executes the selected action
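Step 4 above, turning the next path cell and the current orientation into an action, can be sketched as follows. The orientation encoding (0 = up, 1 = right, 2 = down, 3 = left) and the action names are assumptions for illustration; the environment's real action set may differ.

```python
# Heading needed to move by each (d_row, d_col) offset (assumed encoding:
# 0 = up, 1 = right, 2 = down, 3 = left, increasing clockwise).
HEADINGS = {(-1, 0): 0, (0, 1): 1, (1, 0): 2, (0, -1): 3}


def next_action(position, orientation, next_cell):
    """Return 'forward' if already facing `next_cell`, otherwise a rotation.

    `next_cell` must be one of the four neighbours of `position`;
    any other cell raises KeyError.
    """
    delta = (next_cell[0] - position[0], next_cell[1] - position[1])
    target = HEADINGS[delta]
    turn = (target - orientation) % 4  # number of clockwise quarter turns needed
    if turn == 0:
        return "forward"
    if turn == 3:
        return "rotate_left"   # one counter-clockwise turn is shorter
    return "rotate_right"      # one or two clockwise turns needed


facing_up = 0
ahead = next_action((2, 2), facing_up, (1, 2))
to_the_right = next_action((2, 2), facing_up, (2, 3))
to_the_left = next_action((2, 2), facing_up, (2, 1))
```

Because a rotation consumes a step without moving the agent, a path with many turns takes more steps than its length in cells suggests.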





II. Results:

In [17]:
import simulate
from astaragent import MyAgent

eval_config_paths = [f"./EVAL/config_{i}.json" for i in range(1, 11)]
trained_agent = MyAgent()
all_results = simulate.evaluate(eval_config_paths, trained_agent)


--- Evaluating Configuration: ./EVAL/config_1.json ---
Episode 1/10, Step 29, Reward: 39.91, Evacuated: 4, Deactivated: 0
Episode 2/10, Step 32, Reward: 39.88, Evacuated: 4, Deactivated: 0
Episode 3/10, Step 31, Reward: 39.93, Evacuated: 4, Deactivated: 0
Episode 4/10, Step 24, Reward: 40.10, Evacuated: 4, Deactivated: 0
Episode 5/10, Step 29, Reward: 39.90, Evacuated: 4, Deactivated: 0
Episode 6/10, Step 26, Reward: 40.00, Evacuated: 4, Deactivated: 0
Episode 7/10, Step 29, Reward: 39.86, Evacuated: 4, Deactivated: 0
Episode 8/10, Step 25, Reward: 40.00, Evacuated: 4, Deactivated: 0
Episode 9/10, Step 24, Reward: 40.06, Evacuated: 4, Deactivated: 0
Episode 10/10, Step 30, Reward: -220.19, Evacuated: 2, Deactivated: 2

--- Evaluating Configuration: ./EVAL/config_2.json ---
Episode 1/10, Step 35, Reward: 39.84, Evacuated: 4, Deactivated: 0
Episode 2/10, Step 28, Reward: 39.84, Evacuated: 4, Deactivated: 0
Episode 3/10, Step 28, Reward: -110.12, Evacuated: 3, Deactivated: 1
Episode 4/10

In [18]:
display(all_results)

Unnamed: 0,config_path,episode,steps,reward,evacuated,deactivated
0,./EVAL/config_1.json,1,29,39.9074,4,0
1,./EVAL/config_1.json,2,32,39.8832,4,0
2,./EVAL/config_1.json,3,31,39.9302,4,0
3,./EVAL/config_1.json,4,24,40.0952,4,0
4,./EVAL/config_1.json,5,29,39.9014,4,0
...,...,...,...,...,...,...
95,./EVAL/config_10.json,6,48,-993.3761,0,4
96,./EVAL/config_10.json,7,17,-331.5811,0,4
97,./EVAL/config_10.json,8,110,-3143.4576,1,3
98,./EVAL/config_10.json,9,86,-1875.4780,0,4


In [21]:
# Calculate averages for each configuration
averages = all_results.groupby('config_path').mean().reset_index().drop(columns=['episode'])
averages = averages.rename(columns={
    'steps': 'avg_steps',
    'reward': 'avg_reward',
    'evacuated': 'avg_evacuated',
    'deactivated': 'avg_deactivated'})

display(averages)
averages.to_csv('averages.csv', index=False)

Unnamed: 0,config_path,avg_steps,avg_reward,avg_evacuated,avg_deactivated
0,./EVAL/config_1.json,27.9,13.94438,3.8,0.2
1,./EVAL/config_10.json,60.5,-1336.47709,0.2,3.8
2,./EVAL/config_2.json,29.8,9.8887,3.8,0.2
3,./EVAL/config_3.json,49.1,-494.97197,2.4,1.6
4,./EVAL/config_4.json,50.3,-355.19786,2.5,1.5
5,./EVAL/config_5.json,58.2,-465.18696,2.2,1.8
6,./EVAL/config_6.json,68.4,-662.5643,2.3,1.7
7,./EVAL/config_7.json,63.8,-692.01293,1.0,3.0
8,./EVAL/config_8.json,64.7,-538.61068,2.0,2.0
9,./EVAL/config_9.json,63.7,-1298.49511,0.5,3.5


