Multi-Agent Path Planning with Curriculum Learning

A reinforcement learning-based multi-agent path planning system using Soft Actor-Critic (SAC) with curriculum learning and attention mechanisms.

πŸ“‹ Table of Contents

  • Features
  • Project Structure
  • Installation
  • Quick Start
  • Usage
  • Configuration
  • Results
  • Key Algorithms
  • Notes
  • Troubleshooting
  • License
  • Contributing
  • Contact

✨ Features

  • Multi-Agent Coordination: Leader-follower formation control with dynamic agent count adaptation
  • Curriculum Learning: Progressive task difficulty with automatic knowledge transfer
  • Attention Mechanism: Structured attention networks for agent communication
  • Multiple Algorithms:
    • SAC (Soft Actor-Critic)
    • MASAC (Multi-Agent SAC with attention)
    • H-CRRT (Hierarchical RRT* baseline)
  • Flexible Environment: Customizable obstacles, goals, and agent configurations
  • Visualization: Real-time rendering and trajectory analysis

πŸ“ Project Structure

path planning2/
β”œβ”€β”€ main_SAC.py                    # Basic SAC implementation
β”œβ”€β”€ main_SAC_curriculum.py         # Curriculum learning with MASAC
β”œβ”€β”€ requirements.txt               # Python dependencies
β”‚
β”œβ”€β”€ rl_env/                        # Reinforcement learning environment
β”‚   β”œβ”€β”€ path_env.py               # Main environment interface
β”‚   └── components/               # Entity management, rewards, rendering
β”‚
β”œβ”€β”€ masac_adapter/                # Multi-agent SAC with attention
β”‚   β”œβ”€β”€ actor_networks.py         # Leader and follower actor networks
β”‚   β”œβ”€β”€ critic_networks.py        # Structured attention critic
β”‚   β”œβ”€β”€ smer_memory.py            # Experience replay buffer
β”‚   └── masac_controller.py       # MASAC controller
β”‚
β”œβ”€β”€ curriculum/                    # Curriculum learning framework
β”‚   β”œβ”€β”€ curriculum_manager.py     # Task progression manager
β”‚   β”œβ”€β”€ task_generator.py         # Task difficulty generator
β”‚   β”œβ”€β”€ task_sequencer.py         # Task ordering
β”‚   └── knowledge_transfer.py     # Policy transfer between tasks
β”‚
β”œβ”€β”€ H_CRRT/                       # Baseline RRT* planner
β”‚   β”œβ”€β”€ rrtstar.py               # RRT* path planning
β”‚   β”œβ”€β”€ formation.py             # Formation control
β”‚   └── tracking.py              # Path tracking controller
β”‚
└── masac_no_curriculum/          # Ablation study versions
    └── masac_no_attention/

πŸ”§ Installation

Prerequisites

  • Python 3.7+
  • CUDA-capable GPU (recommended for training)

Install Dependencies

pip install -r requirements.txt

Key Dependencies

  • PyTorch >= 1.8.0
  • NumPy
  • Pygame (for visualization)
  • Matplotlib (for plotting)
  • TensorBoard (for training logs)

πŸš€ Quick Start

1. Basic Training

Train a basic SAC agent:

python main_SAC.py --mode train

2. Curriculum Learning Training

Train with progressive curriculum:

python main_SAC_curriculum.py --use_curriculum

3. Test Trained Model

python main_SAC_curriculum.py --test --model_path Path_SAC_curriculum_step4_ep80

πŸ“– Usage

Basic SAC Training

Train without rendering

python main_SAC.py --mode train

Train with visualization

python main_SAC.py --mode train --render

Test trained model

python main_SAC.py --mode test

Test with custom model path

python main_SAC.py --mode test --model_path "D:/pa/path planning2/Path_SAC_actor_L1.pth"

Curriculum Learning

Train with curriculum

python main_SAC_curriculum.py --use_curriculum

Train with visualization

python main_SAC_curriculum.py --use_curriculum --render

Test mode

python main_SAC_curriculum.py --test

Adjust logging level

python main_SAC_curriculum.py --log_level warning

Test with custom configuration

python main_SAC_curriculum.py --test \
    --model_path "D:/pa/path planning2/Path_SAC_curriculum_step4_ep80" \
    --test_episodes 100 \
    --hero_count 1 \
    --enemy_count 3 \
    --obstacle_count 2

Available Arguments

Argument            Type    Default     Description
--mode              str     train       Mode: train or test
--render            flag    False       Enable visualization
--use_curriculum    flag    False       Enable curriculum learning
--test              flag    False       Run in test mode
--test_episodes     int     100         Number of test episodes
--hero_count        int     1           Number of leader agents
--enemy_count       int     4           Number of follower agents
--obstacle_count    int     1           Number of obstacles
--model_path        str     -           Path to saved model
--log_level         str     info        Logging level: debug, info, warning, error
--test_speed        float   1.0         Test simulation speed
--analyze           flag    False       Generate analysis plots
--result_path       str     results/    Path to save results
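
For orientation, here is a minimal argparse sketch showing how a subset of these flags could be declared. It mirrors the names and defaults from the table above, but it is an illustrative sketch, not the repository's actual parser.

# Illustrative sketch only: mirrors a subset of the flags documented above,
# not the actual argument parser in main_SAC_curriculum.py.
import argparse

parser = argparse.ArgumentParser(description="MASAC curriculum training / testing")
parser.add_argument("--mode", type=str, default="train", choices=["train", "test"])
parser.add_argument("--render", action="store_true", help="Enable visualization")
parser.add_argument("--use_curriculum", action="store_true", help="Enable curriculum learning")
parser.add_argument("--test", action="store_true", help="Run in test mode")
parser.add_argument("--test_episodes", type=int, default=100, help="Number of test episodes")
parser.add_argument("--hero_count", type=int, default=1, help="Number of leader agents")
parser.add_argument("--enemy_count", type=int, default=4, help="Number of follower agents")
parser.add_argument("--obstacle_count", type=int, default=1, help="Number of obstacles")
parser.add_argument("--model_path", type=str, default=None, help="Path to saved model")
parser.add_argument("--log_level", type=str, default="info",
                    choices=["debug", "info", "warning", "error"])
args = parser.parse_args()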

Baseline Methods

Run H-CRRT baseline

cd H_CRRT
python run_hcrrt.py --hero_count 1 --enemy_count 3 --obstacle_count 3 --test_episodes 20

βš™οΈ Configuration

Environment Parameters

Edit configuration in rl_env/components/entities.py:

SCREEN_W = 800          # Environment width
SCREEN_H = 600          # Environment height
AREA_X = 100            # Valid area left boundary
AREA_Y = 100            # Valid area top boundary
AREA_WITH = 600         # Valid area width
AREA_HEIGHT = 500       # Valid area height
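
As a quick illustration, the sketch below (assuming these are plain module-level constants in rl_env/components/entities.py) derives the valid planning rectangle from the values above and checks whether a point lies inside it.

# Sketch only: assumes the constants above are importable module-level values.
from rl_env.components.entities import AREA_X, AREA_Y, AREA_WITH, AREA_HEIGHT

def in_valid_area(x: float, y: float) -> bool:
    """Return True if (x, y) lies inside the valid planning area."""
    return (AREA_X <= x <= AREA_X + AREA_WITH
            and AREA_Y <= y <= AREA_Y + AREA_HEIGHT)

print(in_valid_area(400, 300))  # True with the default values above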

Curriculum Settings

Modify curriculum parameters in curriculum/utils/config.py:

curriculum_manager:
  max_curriculum_steps: 20
  max_episodes_per_task: 200
  evaluation_window: 15
  success_rate_threshold: 0.9
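
To see how these settings interact, here is a hypothetical progression check (not the actual curriculum_manager.py logic): advance to the next task once the success rate over the last evaluation_window episodes reaches success_rate_threshold, or after max_episodes_per_task regardless.

# Hypothetical illustration of the settings above; the real progression logic
# lives in curriculum/curriculum_manager.py and may differ.
from collections import deque

def should_advance(recent_successes, episodes_on_task,
                   evaluation_window=15,
                   success_rate_threshold=0.9,
                   max_episodes_per_task=200):
    if episodes_on_task >= max_episodes_per_task:
        return True                      # give up on mastery and move on
    if len(recent_successes) < evaluation_window:
        return False                     # not enough evidence yet
    success_rate = sum(recent_successes) / len(recent_successes)
    return success_rate >= success_rate_threshold

# Usage: keep a rolling window of 0/1 episode outcomes
window = deque([1, 1, 1, 0, 1] * 3, maxlen=15)       # 15 episodes, 80% success
print(should_advance(window, episodes_on_task=60))   # False: below the 0.9 threshold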

Training Hyperparameters

Adjust in main_SAC_curriculum.py:

batch_size = 256        # Samples per gradient update
gamma = 0.99            # Discount factor
tau = 0.01              # Soft (Polyak) target update rate
value_lr = 3e-4         # Value/critic learning rate
policy_lr = 1e-4        # Actor learning rate
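
For context, gamma is the discount factor in the Bellman target and tau is the Polyak coefficient used to track the target critic. The snippet below is a generic SAC-style sketch of how these two values are typically used, not this project's exact update code.

# Generic SAC-style usage of gamma and tau; a sketch, not the repo's exact code.
import torch

def soft_update(target_net, net, tau=0.01):
    """Polyak-average online parameters into the target network."""
    with torch.no_grad():
        for tgt, src in zip(target_net.parameters(), net.parameters()):
            tgt.data.mul_(1.0 - tau).add_(tau * src.data)

# Critic target (alpha is the entropy temperature):
#   y = r + gamma * (1 - done) * (min(Q1', Q2')(s', a') - alpha * log_pi(a' | s'))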

πŸ“Š Results

Training results are saved in:

  • results/: Test results and analysis
  • models/: Saved model checkpoints
  • TensorBoard logs for training curves

Visualize Training

tensorboard --logdir=runs

🎯 Key Algorithms

1. MASAC (Multi-Agent SAC)

  • Role-specific actor networks (leader/follower)
  • Structured attention critic for agent coordination (see the sketch after this list)
  • Shared replay buffer with experience prioritization
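
Below is a rough sketch of what an attention-based centralized critic can look like; the actual structured attention critic in masac_adapter/critic_networks.py may differ in architecture and inputs.

# Hedged sketch of an attention-based centralized critic; illustrative only.
import torch
import torch.nn as nn

class AttentionCritic(nn.Module):
    def __init__(self, obs_dim, act_dim, embed_dim=64, n_heads=4):
        super().__init__()
        self.encode = nn.Linear(obs_dim + act_dim, embed_dim)      # per-agent embedding
        self.attn = nn.MultiheadAttention(embed_dim, n_heads)
        self.q_head = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, obs, act):
        # obs: (batch, n_agents, obs_dim), act: (batch, n_agents, act_dim)
        x = torch.relu(self.encode(torch.cat([obs, act], dim=-1)))
        x = x.transpose(0, 1)              # MultiheadAttention expects (agents, batch, embed)
        x, _ = self.attn(x, x, x)          # each agent attends to every agent
        x = x.transpose(0, 1)
        return self.q_head(x)              # per-agent Q-values: (batch, n_agents, 1)

critic = AttentionCritic(obs_dim=10, act_dim=2)
q = critic(torch.randn(8, 3, 10), torch.randn(8, 3, 2))
print(q.shape)  # torch.Size([8, 3, 1])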

2. Curriculum Learning

  • Fixed task progression with increasing difficulty
  • Knowledge transfer via policy parameter reuse (see the sketch after this list)
  • Adaptive agent count scaling
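
Below is a minimal sketch of policy parameter reuse between consecutive tasks, assuming transferred layers keep their names and shapes; the actual mechanism lives in curriculum/knowledge_transfer.py and may differ.

# Hedged sketch: copy matching layers from the previous task's actor into the new one.
import torch

def transfer_policy(prev_actor, new_actor):
    """Reuse all parameters whose names and shapes match; leave the rest freshly initialized."""
    prev_state = prev_actor.state_dict()
    new_state = new_actor.state_dict()
    matched = {k: v for k, v in prev_state.items()
               if k in new_state and v.shape == new_state[k].shape}
    new_state.update(matched)
    new_actor.load_state_dict(new_state)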

3. H-CRRT Baseline

  • Hierarchical RRT* for global path planning
  • Distributed formation tracking control
  • Pure pursuit and speed synchronization
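
For reference, the sketch below shows the standard pure-pursuit curvature computation; the controller in H_CRRT/tracking.py may use a different formulation.

# Standard pure-pursuit steering sketch, not the exact H_CRRT tracking controller.
import math

def pure_pursuit_curvature(x, y, yaw, goal_x, goal_y):
    """Curvature that steers an agent at pose (x, y, yaw) toward a lookahead point."""
    dx, dy = goal_x - x, goal_y - y
    local_y = -math.sin(yaw) * dx + math.cos(yaw) * dy   # lookahead point in the agent frame
    lookahead_sq = dx * dx + dy * dy
    if lookahead_sq < 1e-9:
        return 0.0
    return 2.0 * local_y / lookahead_sq                  # kappa = 2 * y_local / L^2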

πŸ“ Notes

  • First training run will be slower due to environment initialization
  • GPU is highly recommended for training (10-100x speedup)
  • Rendering significantly slows down training
  • Saved models include both actor and critic networks

πŸ› Troubleshooting

CUDA Out of Memory

Reduce batch size or replay buffer size:

python main_SAC_curriculum.py --batch_size 128

Pygame Display Issues

Run without rendering:

python main_SAC_curriculum.py --use_curriculum

Slow Training

  • Disable rendering during training
  • Reduce number of training episodes
  • Use GPU acceleration

πŸ“„ License

This project is available for academic and research purposes.

🀝 Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

πŸ“§ Contact

For questions or collaboration, please open an issue on the Wanhao-Liu/AC-MASAC GitHub repository.


Last Updated: February 2026
