NoakLiu/Awesome-Distributed-RL

🧠🔥 Awesome Distributed Reinforcement Learning

A curated list of 🔥 Distributed Reinforcement Learning papers and systems, covering scalable algorithms, system frameworks, multi-agent setups, large models, communication strategies, and RLHF. It collects 140+ papers, repos, and websites and is maintained by Dong Liu and Xuqing Yang.


🗂️ Table of Contents


📌 Related Surveys and Overviews


🚀 Algorithms & Theoretical Advances

📈 Policy Gradient Methods

| Title | Year | Link |
| --- | --- | --- |
| IMPALA: Scalable Distributed Deep-RL | 2018 | https://arxiv.org/abs/1802.01561 |
| Distributed Proximal Policy Optimization | 2017 | https://arxiv.org/abs/1707.02286 |
| Asynchronous Methods for Deep Reinforcement Learning (A3C) | 2016 | https://arxiv.org/abs/1602.01783 |
| Asynchronous Federated RL with PG Updates | 2024 | https://arxiv.org/abs/2402.08372 |
| Distributed Policy Gradient with Variance Reduction | 2023 | https://arxiv.org/abs/2305.07180 |
| Federated PG for Multi-Agent RL | 2023 | https://arxiv.org/abs/2304.12151 |
| Scalable Distributed Policy Optimization via Consensus | 2024 | https://arxiv.org/abs/2401.09876 |
| Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach | 2019 | https://arxiv.org/abs/1912.09135 |

🎯 Value-Based Methods

| Title | Year | Link |
| --- | --- | --- |
| Ape-X: Distributed Prioritized Replay | 2018 | https://arxiv.org/abs/1803.00933 |
| Distributed Deep Q-Learning | 2015 | https://arxiv.org/abs/1508.04186 |
| Distributed Rainbow | 2023 | https://arxiv.org/abs/2302.14592 |
| Federated Q-Learning with Differential Privacy | 2024 | https://arxiv.org/abs/2403.15789 |
| Asynchronous Distributed Value Function Approximation | 2023 | https://arxiv.org/abs/2309.12456 |

🎭 Actor-Critic Methods

| Title | Year | Link |
| --- | --- | --- |
| SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference | 2019 | https://arxiv.org/abs/1910.06591 |
| Muesli | 2021 | https://arxiv.org/abs/2104.06159 |
| Distributed Soft Actor-Critic with Experience Replay | 2023 | https://arxiv.org/abs/2308.14567 |
| Federated Actor-Critic for Continuous Control | 2023 | https://arxiv.org/abs/2311.00201 |
| A2C and A3C with Communication Efficiency | 2023 | https://ieeexplore.ieee.org/abstract/document/9289269 |

⚡ Asynchronous & Parallel Methods

| Title | Year | Link |
| --- | --- | --- |
| SRL: Scaling Distributed RL to 10,000+ Cores | 2024 | https://arxiv.org/abs/2306.15108 |
| DistRL: An Async Distributed RL Framework | 2024 | https://arxiv.org/abs/2401.12453 |
| Loss- and Reward-Weighting for Efficient DRL | 2024 | https://arxiv.org/abs/2311.01354 |
| Accelerated Methods for Distributed RL | 2022 | https://arxiv.org/abs/2203.09511 |
| Parallel A2C with Shared Experience | 2023 | https://arxiv.org/abs/2307.19876 |
| Distributed Async Temporal Difference Learning | 2024 | https://arxiv.org/abs/2402.17890 |
| DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents | 2024 | https://arxiv.org/abs/2410.14803 |
| SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores | 2023 | https://arxiv.org/abs/2306.16688 |
| MSRL: Distributed Reinforcement Learning with Dataflow Fragments | 2022 | https://arxiv.org/abs/2210.00882 |
| Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform | 2023 | https://arxiv.org/abs/2310.00036 |
| The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems | 2020 | https://arxiv.org/abs/2012.04210 |

🔄 Gradient Aggregation & Optimization

| Title | Year | Link |
| --- | --- | --- |
| Gradient Compression for Distributed DRL | 2023 | https://arxiv.org/abs/2305.14321 |
| Byzantine-Robust Distributed RL | 2024 | https://arxiv.org/abs/2403.08765 |
| Variance-Reduced Distributed PG | 2023 | https://arxiv.org/abs/2309.18765 |
| Momentum-Based Distributed RL | 2024 | https://arxiv.org/abs/2401.23456 |

🧮 Theoretical Foundations

| Title | Year | Link |
| --- | --- | --- |
| Convergence Analysis of Distributed RL | 2023 | https://arxiv.org/abs/2304.09876 |
| Sample Complexity of Federated RL | 2024 | https://arxiv.org/abs/2402.12345 |
| Communication Complexity in Distributed RL | 2023 | https://arxiv.org/abs/2308.56789 |
| Regret Bounds for Distributed Bandits | 2024 | https://arxiv.org/abs/2405.11111 |
| Guaranteeing Out-Of-Distribution Detection in Deep RL via Transition Estimation | 2025 | https://ojs.aaai.org/index.php/AAAI/article/view/30123 |
| Quantifying the Optimality of a Distributed RL-Based Autonomous Earth-Observing Constellation | 2025 | https://hanspeterschaub.info/Papers/Stephenson2025.pdf |

🎲 Exploration & Sampling

| Title | Year | Link |
| --- | --- | --- |
| Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning | 2022 | https://arxiv.org/abs/2202.02693 |
| Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo | 2023 | https://arxiv.org/abs/2305.18246 |
| More Efficient Randomized Exploration for Reinforcement Learning | 2024 | https://arxiv.org/abs/2406.12241 |
| LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models | 2025 | https://arxiv.org/abs/2505.15293 |
| Explore-Go: Leveraging Exploration for Generalisation in Deep Reinforcement Learning | 2024 | https://arxiv.org/abs/2406.08069 |

🧠 Latent Thought Language Models (LTM) for RL

| Title | Year | Link |
| --- | --- | --- |
| Place Cells as Position Embeddings of Multi-Time Random Walk Transition Kernels for Path Planning | 2025 | https://arxiv.org/abs/2505.14806 |
| Latent Adaptive Planner for Dynamic Manipulation | 2025 | https://arxiv.org/abs/2505.03077 |
| Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space | 2025 | https://arxiv.org/abs/2505.13308 |
| Scalable Language Models with Posterior Inference of Latent Thought Vectors | 2025 | https://arxiv.org/abs/2502.01567 |
| Training Chain-of-Thought via Latent-Variable Inference | 2024 | https://openreview.net/forum?id=a147pIS2Co |

🛰️ Federated Reinforcement Learning (FedRL)

| Title | Year | Link |
| --- | --- | --- |
| Federated Deep Reinforcement Learning | 2019 | https://arxiv.org/abs/1901.08277 |
| Federated Reinforcement Learning: Techniques, Applications, and Open Challenges | 2021 | https://arxiv.org/abs/2108.11887 |
| Federated Reinforcement Learning with Constraint Heterogeneity | 2024 | https://arxiv.org/abs/2405.03236 |
| Federated Reinforcement Learning for Robot Motion Planning with Zero-Shot Generalization | 2024 | https://arxiv.org/abs/2403.13245 |
| Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments | 2024 | https://arxiv.org/abs/2405.19499 |
| Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis | 2024 | https://arxiv.org/abs/2404.08003 |
| FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation | 2025 | https://arxiv.org/abs/2502.00870 |
| Federated Reinforcement Learning with Environment Heterogeneity | 2022 | https://arxiv.org/abs/2204.02634 |
| Federated Ensemble-Directed Offline Reinforcement Learning | 2023 | https://arxiv.org/abs/2305.03097 |
| FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF | 2024 | https://arxiv.org/abs/2412.15538 |

🧱 System Frameworks

| Title | Year | Link |
| --- | --- | --- |
| Ray RLlib | 2021 | https://docs.ray.io/en/latest/rllib/ |
| RLlib: Abstractions for Distributed Reinforcement Learning | 2018 | https://arxiv.org/abs/1712.09381 |
| Acme | 2020 | https://github.com/deepmind/acme |
| TorchBeast: A PyTorch Platform for Distributed RL | 2019 | https://arxiv.org/abs/1910.03552 |
| TorchRL | 2023 | https://pytorch.org/rl/ |
| CleanRL + SLURM | 2022 | https://github.com/vwxyzjn/cleanrl |
| Cleanba | 2023 | https://github.com/vwxyzjn/cleanba |
| FedHQL | 2023 | https://arxiv.org/abs/2301.11135 |
| DistRL Framework | 2024 | https://arxiv.org/abs/2401.15803 |
| AReaL | 2025 | https://github.com/inclusionAI/AReaL |
| SandGraphX | 2025 | https://github.com/NoakLiu/SandGraphX |

📡 Communication Efficiency

| Title | Year | Link |
| --- | --- | --- |
| Deep Gradient Compression | 2017 | https://arxiv.org/abs/1712.01887 |
| Gradient Surgery | 2020 | https://arxiv.org/abs/2001.06782 |
| DD-PPO | 2020 | https://arxiv.org/abs/2007.04938 |
| Gradient Dropping | 2018 | https://arxiv.org/abs/1806.08768 |
| Bandwidth-Aware RL | 2023 | https://arxiv.org/abs/2303.08127 |
| QDDP: Quantized DRL | 2021 | https://arxiv.org/abs/2102.09352 |
| Communication-Aware Distributed RL | 2024 | https://arxiv.org/abs/2402.17222 |
| Application of Reinforcement Learning to Routing in Distributed Wireless Networks: A Review | 2015 | https://link.springer.com/article/10.1007/s10462-012-9383-6 |

👥 Multi-Agent Distributed RL

| Title | Year | Link |
| --- | --- | --- |
| MADDPG | 2017 | https://arxiv.org/abs/1706.02275 |
| MAVEN | 2019 | https://arxiv.org/abs/1910.07483 |
| R-MADDPG | 2022 | https://arxiv.org/abs/2202.03428 |
| Tesseract | 2022 | https://arxiv.org/abs/2211.03537 |
| FMRL-LA | 2023 | https://arxiv.org/abs/2310.11572 |
| CAESAR | 2024 | https://arxiv.org/abs/2402.07426 |
| FAgents: Federated MARL | 2023 | https://arxiv.org/abs/2312.22222 |
| Distributed Reinforcement Learning for Robot Teams: A Review | 2022 | https://arxiv.org/abs/2204.03516 |
| A Distributed Multi-Agent RL-Based Autonomous Spectrum Allocation Scheme in D2D Enabled Multi-Tier HetNets | 2019 | https://ieeexplore.ieee.org/document/8598795 |

🦾 RLHF & Distributed Human Feedback

| Title | Year | Link |
| --- | --- | --- |
| InstructGPT | 2022 | https://arxiv.org/abs/2203.02155 |
| Self-Instruct | 2022 | https://arxiv.org/abs/2212.10560 |
| DPO | 2023 | https://arxiv.org/abs/2305.18290 |
| RAFT | 2024 | https://arxiv.org/abs/2402.03620 |
| Distributed PPO (HF-TRL) | 2023 | https://huggingface.co/docs/trl/main/en/ |
| Decentralized RLHF | 2023 | https://arxiv.org/abs/2310.11883 |
| FedRLHF | 2024 | https://arxiv.org/abs/2412.15538 |

🧠 Large-Scale Models in RL

| Title | Year | Link |
| --- | --- | --- |
| Gato | 2022 | https://arxiv.org/abs/2205.06175 |
| PaLM-E | 2023 | https://arxiv.org/abs/2303.03378 |
| Decision Transformer | 2021 | https://arxiv.org/abs/2106.01345 |
| V-D4RL | 2022 | https://arxiv.org/abs/2202.02349 |
| TAMER-GPT | 2023 | https://arxiv.org/abs/2305.11521 |
| Gorilla | 2023 | https://arxiv.org/abs/2305.01569 |
| Open X-Embodiment | 2023 | https://arxiv.org/abs/2306.03367 |

💻 Codebases & Benchmarks


📚 Resources & Tutorials


Contributing

We welcome contributions! If you find a missing paper, tool, or benchmark related to distributed reinforcement learning, please feel free to open a pull request or issue.


📜 License

MIT License. See LICENSE for details.


📎 Citation

@online{liu2025awesome,
  author       = {Dong Liu and Xuqing Yang and Xuhong Wang and Ying Nian Wu and {contributors}},
  title        = {Awesome Distributed Reinforcement Learning},
  year         = {2025},
  url          = {https://github.com/NoakLiu/Awesome-Distributed-RL},
  note         = {GitHub Repository}
}
