celarex/RLcases

RLcases (Timelines)

1.Platform for reinforcement learning research

Objectives

  • v1.0
      • Modular design (independent environment, agent, training arena, learning algorithm, function approximator, and auxiliary modules)
      • Open-closed principle (implemented through wrapper design)
      • Multi-platform (supports PyTorch and TensorFlow for convenient performance comparison with other open-source projects)
      • Multi-algorithm and multi-model (implements A2C, ACKTR, PPO, ... algorithms and CNN, RNN, ... models)
      • Multi-CPU sampling and multi-GPU computation
      • Modules for custom environments (pygame for 2D cases and Panda3D for 3D cases)
      • Automatic hyperparameter search (implemented with Optuna)
  • v2.0
      • Modules for POMDPs (inference in vision)
      • Modules for multi-agent problems (training arena for multi-player, multi-type, multi-agent settings without explicit communication)
  • v3.0
      • Distributional calculation
      • Model-based algorithms
      • Explicit communication
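
The wrapper design mentioned under v1.0 can be illustrated with a minimal sketch. Everything below is hypothetical and is not taken from this repository: `CountdownEnv`, `Wrapper`, `ClipReward`, and `StepLimit` are illustrative names, and the `reset`/`step` interface is an assumption. The point is only that new behavior is added by wrapping an environment (open for extension) rather than by editing it (closed for modification).

```python
class CountdownEnv:
    """Toy environment (hypothetical): the state counts down from `start` to 0."""

    def __init__(self, start=3):
        self.start = start
        self.state = start

    def reset(self):
        self.state = self.start
        return self.state

    def step(self, action):
        # The action is ignored; every step yields a raw reward of 10.
        self.state -= 1
        done = self.state <= 0
        return self.state, 10.0, done


class Wrapper:
    """Base wrapper: forwards reset and step to the wrapped environment."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)


class ClipReward(Wrapper):
    """Clips rewards into [low, high] without touching the wrapped class."""

    def __init__(self, env, low=-1.0, high=1.0):
        super().__init__(env)
        self.low, self.high = low, high

    def step(self, action):
        obs, reward, done = self.env.step(action)
        return obs, max(self.low, min(self.high, reward)), done


class StepLimit(Wrapper):
    """Ends an episode after at most `max_steps` steps."""

    def __init__(self, env, max_steps):
        super().__init__(env)
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.steps += 1
        return obs, reward, done or self.steps >= self.max_steps


# Wrappers compose: the base environment is never modified.
env = StepLimit(ClipReward(CountdownEnv(start=5)), max_steps=2)
```

Because each wrapper exposes the same interface as the environment it wraps, wrappers can be stacked in any order and swapped per experiment.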

Results

  • v1.0
      • Matches or surpasses OpenAI Baselines on Atari and MuJoCo cases
  • v2.0
      • Performance comparable to state-of-the-art algorithms on flickering Atari cases (POMDP cases)
      • Solved StarCraft-minigame-like micromanagement problems (multi-agent cases)
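
For readers unfamiliar with the flickering setup: a fully observable game is commonly turned into a POMDP by blanking each frame with some probability, so the agent must integrate information over time. The sketch below assumes that setup. `FlickerObservation`, `ConstantFrameEnv`, the `p_blank` parameter, and the `reset`/`step` interface are hypothetical names for illustration, not this repository's API.

```python
import random


class ConstantFrameEnv:
    """Hypothetical stand-in environment that always returns a 2x2 frame of ones."""

    def reset(self):
        return [[1, 1], [1, 1]]

    def step(self, action):
        return [[1, 1], [1, 1]], 0.0, False


class FlickerObservation:
    """Replaces each observed frame with zeros with probability `p_blank`."""

    def __init__(self, env, p_blank=0.5, seed=0):
        self.env = env
        self.p_blank = p_blank
        self.rng = random.Random(seed)  # private RNG keeps runs reproducible

    def _maybe_blank(self, frame):
        # random() is in [0, 1), so p_blank=1.0 always blanks and 0.0 never does.
        if self.rng.random() < self.p_blank:
            return [[0 for _ in row] for row in frame]
        return frame

    def reset(self):
        return self._maybe_blank(self.env.reset())

    def step(self, action):
        frame, reward, done = self.env.step(action)
        return self._maybe_blank(frame), reward, done


always_blank = FlickerObservation(ConstantFrameEnv(), p_blank=1.0)
never_blank = FlickerObservation(ConstantFrameEnv(), p_blank=0.0)
```

Rewards and termination signals pass through unchanged; only the observation is degraded.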

Animations

Left: BeamRider's feature maps, Middle: Breakout's feature maps, Right: Visualizations of filters for MNIST

2.Unmanned underwater and airborne monitoring

Objectives

  • Optimize the control of multiple kinds of unmanned vehicles
  • Capture targets efficiently while avoiding collisions with obstacles
  • Adapt to POMDP environments

Results

Sketches

Task descriptions by stages

3.Power network running

Objectives

  • Train topology controllers to manage electricity transmission in power grids while keeping people and equipment safe from irregular wave motion or natural disasters.

Results

  • Ranked 4th in the Learning to Run a Power Network (L2RPN) challenge: https://l2rpn.chalearn.org/
  • Invited talk at the Annual Meeting of the IEEJ (The Institute of Electrical Engineers of Japan), 2020

Animations

A tiny example case of power grids

4.Maze Escaping

Objectives

  • General agent to escape from different mazes

Results

  • Solved

Animations

5.Starcraft 2

Objectives

  • General agents to win in very complex environments

Results

  • Became familiar with the pysc2 APIs

6.Micromanagement in Real-Time-Strategy Game

Objectives

  • StarCraft-minigame-like setting for learning optimal tactics

Results

  • Long-range units learned a hit-and-run tactic; short-range units learned a surround tactic.
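
To make the hit-and-run behavior concrete, here is a hand-written heuristic that produces the same pattern. It is only a reference sketch for comparison: the agents in this project learned their tactics, and the function name `kite_action` and its parameters (`distance`, `attack_range`, `cooldown_remaining`) are assumptions introduced for illustration.

```python
def kite_action(distance, attack_range, cooldown_remaining):
    """Pick an action for a long-range unit facing a short-range enemy.

    Hypothetical rule-based baseline, not the learned policy:
    - weapon cooling down -> retreat to stay out of the enemy's reach
    - weapon ready and enemy in range -> attack
    - weapon ready but enemy out of range -> approach
    """
    if cooldown_remaining > 0:
        return "retreat"
    if distance <= attack_range:
        return "attack"
    return "approach"
```

Alternating between "attack" when the weapon is ready and "retreat" during the cooldown is what yields the hit-and-run pattern.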

Animations

Left: Hit and run, Right: Surround

7.Controllable Cellular Automaton

Objectives

  • Control the controllable cells to guide a controllable cellular automaton to a specific target state
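
The general shape of such a task can be sketched with a toy one-dimensional automaton. All specifics here are assumptions made for illustration and are not this project's definition: the update rule (elementary rule 90 with wraparound), the control mechanism (flipping one cell before each update), and the names `step_automaton` and `distance_to_target`.

```python
def step_automaton(cells, control_index=None):
    """Optionally flip one controllable cell, then apply one rule-90 update.

    Rule 90 (assumed here): each new cell is the XOR of its two neighbors,
    with the row treated as a ring.
    """
    cells = list(cells)  # copy so the caller's state is not mutated
    if control_index is not None:
        cells[control_index] ^= 1  # the agent's action: flip one cell
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]


def distance_to_target(cells, target):
    """Number of cells that differ from the target state (0 means solved)."""
    return sum(a != b for a, b in zip(cells, target))


state = step_automaton([0, 0, 0, 0, 0], control_index=2)
```

A reinforcement learning agent would choose `control_index` at each step and could be rewarded for reducing `distance_to_target`.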

Results

  • Foundational cases are solved.

Animations

Left: Small case, Middle: Large case, Right: Loop tree case

8.Halite 4

Objectives

  • Kaggle competition: a resource management game in which an agent builds and controls a small armada of ships to collect more halite, a luminous energy source. https://www.kaggle.com/c/halite

Results

  • The purely machine-learned agent was an excellent collector with high collection efficiency, but not aggressive enough to beat the top agents (with some hand-crafted features).

Animations

Custom rendering for agent development

9.Scan robots

Objectives

  • Control multiple robots of multiple types to scan an area as quickly as possible while avoiding static and dynamic obstacles.

Results

  • Well learned.

Animations

