MAX is a highly experimental JAX RL library focusing on online adaptation and information-gathering. It supports model-based control with parameter beliefs, model-free methods, and has first-class multi-agent support for rapid research.

fernandopalafox/max

⚠️ MAX: A JAX-based Research Library for Online RL ⚠️

MAX is a highly experimental and rapidly evolving modular reinforcement learning library built on JAX. It prioritizes online adaptation algorithms and information-gathering strategies, supports both model-based and model-free control, and treats multi-agent systems as first-class citizens.

Features

  • Pure JAX Implementation: Leverages JIT compilation, automatic differentiation, and GPU/TPU acceleration for fast iteration.
  • Emphasis on Online Adaptation: The core design centers on algorithms and components for efficient adaptation to changing or uncertain dynamics.
  • Model-Based Algorithms with Parameter Beliefs: Supports model-based control in which the dynamics components maintain a distribution (belief) over uncertain parameters, e.g. in a Bayesian setting.
  • Multi-Agent RL: Built-in support for IPPO (Independent PPO) and multi-agent environments.
  • Modular Design: Mix and match components (environments, policies, trainers, normalizers) for rapid prototyping of novel online algorithms.
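To make the parameter-belief idea concrete, here is a generic Bayesian sketch (plain NumPy, illustrative names only, not MAX's actual API): a Gaussian belief over an unknown scalar dynamics parameter, updated from observed transitions.

```python
import numpy as np

# Illustrative only: a Gaussian belief over the unknown parameter theta in
# x_next = theta * x + w,  w ~ N(0, sigma_w^2).  This is generic Bayesian
# linear regression, not the library's interface.

def update_belief(mu, var, x, x_next, sigma_w=0.1):
    """Conjugate update of the belief N(mu, var) over theta
    after observing one transition (x, x_next)."""
    precision = 1.0 / var + x**2 / sigma_w**2   # posterior precision
    var_new = 1.0 / precision
    mu_new = var_new * (mu / var + x * x_next / sigma_w**2)
    return mu_new, var_new

rng = np.random.default_rng(0)
true_theta = 0.8
mu, var = 0.0, 1.0  # broad prior belief
for _ in range(200):
    x = rng.normal()
    x_next = true_theta * x + 0.1 * rng.normal()
    mu, var = update_belief(mu, var, x, x_next)
# After many transitions, the belief mean concentrates near true_theta
# and its variance shrinks — the signal an information-gathering policy
# can exploit when deciding which states to visit.
```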

Installation

From source

git clone <repository-url>
cd max
pip install -e .

Library Structure

Core Modules

  • environments: Multi-agent tracking and pursuit-evasion environments
  • dynamics: Learned dynamics models (MLP-based, analytical models)
  • policies: Actor-critic policies and model-based planners
  • policy_trainers: PPO and IPPO training algorithms
  • trainers: Dynamics model training (gradient descent, EKF, PETS)
  • normalizers: State/action/reward normalization utilities
  • buffers: JAX-based replay buffers for efficient data storage
  • planners: Model-based planning algorithms (CEM, iCEM)
  • policy_evaluators: Policy evaluation and rollout utilities
  • evaluation: Dynamics model evaluation metrics

Auxiliary Modules

  • estimators: Extended Kalman Filter for online Bayesian parameter estimation
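For intuition, a scalar EKF treating an unknown dynamics parameter as the estimated state looks like the following (a generic textbook sketch in NumPy, not the estimators module's interface):

```python
import numpy as np

# Generic scalar EKF: the "state" being filtered is the unknown parameter
# theta of the nonlinear model x_next = sin(theta * x) + w.  Since theta is
# assumed constant, the predict step is trivial and only the update remains.

def ekf_step(mu, P, x, x_next, sigma_w=0.05):
    """One EKF measurement update of the belief N(mu, P) over theta."""
    H = x * np.cos(mu * x)           # Jacobian of sin(theta*x) wrt theta at mu
    S = H * P * H + sigma_w**2       # innovation covariance
    K = P * H / S                    # Kalman gain
    mu = mu + K * (x_next - np.sin(mu * x))   # correct with the innovation
    P = (1.0 - K * H) * P            # shrink the parameter covariance
    return mu, P

rng = np.random.default_rng(1)
true_theta = 0.5
mu, P = 0.0, 1.0  # broad prior over theta
for _ in range(300):
    x = rng.uniform(-1.0, 1.0)
    x_next = np.sin(true_theta * x) + 0.05 * rng.normal()
    mu, P = ekf_step(mu, P, x, x_next)
```

The same linearize-and-correct pattern extends to vector-valued parameters, where `H` becomes a Jacobian matrix and `S` an innovation covariance matrix.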

Examples

Pursuit-Evasion

Figure 1: Multi-agent pursuit-evasion policy visualization

  • scripts/ippo_pe.py: Train IPPO agents on pursuit-evasion task

  • scripts/visualize_pe.py: Visualize trained policies

Multi-Agent Goal Tracking

Figure 2: Multi-agent goal tracking with dynamic target switching

  • scripts/ippo_tracking.py: Train IPPO agents for goal tracking
  • scripts/visualize_tracking.py: Visualize trained tracking policies

Architecture Highlights

Functional Design

All components follow JAX's functional programming paradigm:

  • Immutable state containers (NamedTuples, PyTreeNodes)
  • Pure functions for transformations
  • JIT-compiled operations for performance
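A minimal illustration of this pattern (hypothetical names, not taken from the library; plain Python/NumPy stands in for JAX here):

```python
from typing import NamedTuple
import numpy as np

class TrainState(NamedTuple):
    """Immutable state container: updates never mutate, they return
    a fresh instance — the property that makes state jit/vmap-friendly."""
    params: np.ndarray
    step: int

def sgd_update(state: TrainState, grad: np.ndarray, lr: float = 0.1) -> TrainState:
    """Pure function of (state, grad) -> new state, with no side effects.
    Under JAX this is exactly the shape jax.jit expects to compile."""
    return TrainState(params=state.params - lr * grad, step=state.step + 1)

s0 = TrainState(params=np.array([1.0, -2.0]), step=0)
s1 = sgd_update(s0, grad=np.array([0.5, -0.5]))
# s0 is untouched; s1 carries the updated parameters and step counter.
```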

Multi-Agent Support

The library is designed with multi-agent systems as first-class citizens:

  • Independent parameter sets per agent
  • Shared or separate training
  • Flexible observation/action spaces
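The per-agent parameter idea can be sketched as stacking each agent's parameters along a leading axis and applying one update rule across that axis — which is what `jax.vmap` does over a single-agent function; NumPy broadcasting stands in for it in this illustrative snippet (names are hypothetical):

```python
import numpy as np

n_agents, param_dim = 3, 4
# One independent parameter vector per agent, stacked on a leading axis.
agent_params = np.zeros((n_agents, param_dim))
agent_grads = np.arange(n_agents * param_dim, dtype=float).reshape(n_agents, param_dim)

def update_all(params, grads, lr=0.01):
    """The same gradient step applied independently to every agent's row;
    with JAX one would jax.vmap a single-agent update over axis 0."""
    return params - lr * grads

agent_params = update_all(agent_params, agent_grads)
# Each row changes independently: agent 0's update never touches agent 1's
# parameters, which is exactly the "independent parameter sets" guarantee.
```

Sharing parameters instead is the degenerate case where every row references the same underlying vector.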

Composability

Mix and match components easily:

# Use model-based planner as policy
policy = create_planner_policy(planner, dynamics_model)

# Or use model-free actor-critic
policy = create_actor_critic_policy(config)

# Same trainer interface for both!
trainer = init_policy_trainer(config, policy)

License

MIT License
