Database of RL algorithm results
*(Figure: plots of Atari Space Invaders scores and MuJoCo Walker2d scores)*
You can use `rldb.find_all({})` to retrieve all existing entries in rldb:
```python
import rldb

all_entries = rldb.find_all({})
```
You can also filter entries by specifying key-value pairs that the entry must match:
```python
import rldb

# Entries that used the DQN algorithm
dqn_entries = rldb.find_all({'algo-nickname': 'DQN'})

# Entries evaluated on Breakout with the no-op start condition
breakout_noop_entries = rldb.find_all({
    'env-title': 'atari-breakout',
    'env-variant': 'No-op start',
})
```
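Because `find_all()` returns the matching entries as plain dicts, you can post-process the results with ordinary Python. As a minimal sketch (assuming every matching entry carries a numeric `score` key, as in the sample output below), this finds the best reported Breakout score:

```python
import rldb

breakout_entries = rldb.find_all({
    'env-title': 'atari-breakout',
    'env-variant': 'No-op start',
})

# Assumes each entry is a plain dict with a numeric 'score' key
best = max(breakout_entries, key=lambda entry: entry['score'])
print(best['algo-nickname'], best['score'])
```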
You can also use `rldb.find_one(filter_dict)` to find one entry that matches the key-value pairs specified in `filter_dict`:
```python
import rldb
import pprint

entry = rldb.find_one({
    'env-title': 'atari-pong',
    'algo-title': 'Human',
})
pprint.pprint(entry)
```
Output
```
{'algo-nickname': 'Human',
 'algo-title': 'Human',
 'env-title': 'atari-pong',
 'env-variant': 'No-op start',
 'score': 14.6,
 'source-arxiv-id': '1511.06581',
 'source-arxiv-version': 3,
 'source-authors': ['Ziyu Wang',
                    'Tom Schaul',
                    'Matteo Hessel',
                    'Hado van Hasselt',
                    'Marc Lanctot',
                    'Nando de Freitas'],
 'source-bibtex': '@article{DBLP:journals/corr/WangFL15,\n'
                  '  author    = {Ziyu Wang and\n'
                  '               Nando de Freitas and\n'
                  '               Marc Lanctot},\n'
                  '  title     = {Dueling Network Architectures for Deep '
                  'Reinforcement Learning},\n'
                  '  journal   = {CoRR},\n'
                  '  volume    = {abs/1511.06581},\n'
                  '  year      = {2015},\n'
                  '  url       = {http://arxiv.org/abs/1511.06581},\n'
                  '  archivePrefix = {arXiv},\n'
                  '  eprint    = {1511.06581},\n'
                  '  timestamp = {Mon, 13 Aug 2018 16:48:17 +0200},\n'
                  '  biburl    = {https://dblp.org/rec/bib/journals/corr/WangFL15},\n'
                  '  bibsource = {dblp computer science bibliography, '
                  'https://dblp.org}\n'
                  '}',
 'source-nickname': 'DuDQN',
 'source-title': 'Dueling Network Architectures for Deep Reinforcement '
                 'Learning'}
```
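Since the returned entry is a plain dict, individual fields can be read directly. For example, continuing from the `entry` above:

```python
print(entry['score'])            # 14.6
print(entry['source-nickname'])  # 'DuDQN'
```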
Here is the format of every entry:
```python
{
    # BASICS
    "source-title": "",
    "source-nickname": "",
    "source-authors": [],

    # MISC.
    "source-bibtex": "",

    # ALGORITHM
    "algo-title": "",
    "algo-nickname": "",
    "algo-source-title": "",

    # SCORE
    "env-title": "",
    "score": 0,
}
```
- `source-title` is the full title of the source of the score: it can be the title of a paper or of a GitHub repository.
- `source-nickname` is a popular nickname or acronym for that title if one exists; otherwise it is the same as `source-title`.
- `source-authors` is a list of authors or contributors.
- `source-bibtex` is a BibTeX-format citation.
- `algo-title` is the full title of the algorithm used.
- `algo-nickname` is the nickname or acronym for that algorithm if one exists; otherwise it is the same as `algo-title`.
- `algo-source-title` is the title of the source of the algorithm. It can be, and often is, different from `source-title`, since papers frequently report scores for baseline algorithms introduced elsewhere (see the sketch after this list).
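To see the `source-title` / `algo-source-title` distinction in practice, here is a minimal sketch (assuming the filter keys shown above) that lists every source reporting an A3C score:

```python
import rldb

# A3C scores appear both in its original paper and in later papers that use
# it as a baseline: 'source-nickname' identifies the reporting paper, while
# 'algo-source-title' always names the paper that introduced A3C.
for entry in rldb.find_all({'algo-nickname': 'A3C'}):
    print(entry['source-nickname'], entry['env-title'], entry['score'])
```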
For example, the Space Invaders score of the Asynchronous Advantage Actor Critic (A3C) algorithm reported in the Noisy Networks for Exploration (NoisyNet) paper is represented by the following entry:
```python
{
    # BASICS
    "source-title": "Noisy Networks for Exploration",
    "source-nickname": "NoisyNet",
    "source-authors": [
        "Meire Fortunato",
        "Mohammad Gheshlaghi Azar",
        "Bilal Piot",
        "Jacob Menick",
        "Ian Osband",
        "Alex Graves",
        "Vlad Mnih",
        "Remi Munos",
        "Demis Hassabis",
        "Olivier Pietquin",
        "Charles Blundell",
        "Shane Legg",
    ],

    # ARXIV
    "source-arxiv-id": "1706.10295",
    "source-arxiv-version": 2,

    # MISC.
    "source-bibtex": """
@article{DBLP:journals/corr/FortunatoAPMOGM17,
  author    = {Meire Fortunato and
               Mohammad Gheshlaghi Azar and
               Bilal Piot and
               Jacob Menick and
               Ian Osband and
               Alex Graves and
               Vlad Mnih and
               R{\'{e}}mi Munos and
               Demis Hassabis and
               Olivier Pietquin and
               Charles Blundell and
               Shane Legg},
  title     = {Noisy Networks for Exploration},
  journal   = {CoRR},
  volume    = {abs/1706.10295},
  year      = {2017},
  url       = {http://arxiv.org/abs/1706.10295},
  archivePrefix = {arXiv},
  eprint    = {1706.10295},
  timestamp = {Mon, 13 Aug 2018 16:46:11 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/FortunatoAPMOGM17},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}""",

    # ALGORITHM
    "algo-title": "Asynchronous Advantage Actor Critic",
    "algo-nickname": "A3C",
    "algo-source-title": "Asynchronous Methods for Deep Reinforcement Learning",

    # HYPERPARAMETERS
    "algo-frames": 320 * 1000 * 1000,  # Number of frames

    # SCORE
    "env-title": "atari-space-invaders",
    "env-variant": "No-op start",
    "score": 1034,
    "stddev": 49,
}
```
Note that, as shown here, an entry can contain additional fields beyond the base format, such as `source-arxiv-id`, `algo-frames`, and `stddev`.
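Because such fields are optional, reading them with `dict.get()` and a default is safest. A minimal sketch, assuming the NoisyNet A3C entry above can be located with the keys it contains:

```python
import rldb

entry = rldb.find_one({
    'algo-nickname': 'A3C',
    'env-title': 'atari-space-invaders',
    'source-nickname': 'NoisyNet',
})

# Optional fields may be absent on other entries, so use .get() with a default
print(entry['score'])                   # Required field: 1034
print(entry.get('stddev', 'n/a'))       # Optional field: 49
print(entry.get('algo-frames', 'n/a'))  # Optional field: 320000000
```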
Entries in rldb are sourced from the following papers:

- Playing Atari with Deep Reinforcement Learning (Mnih et al., 2013)
- Human-level control through deep reinforcement learning (Mnih et al., 2015)
- Deep Recurrent Q-Learning for Partially Observable MDPs (Hausknecht and Stone, 2015)
- Massively Parallel Methods for Deep Reinforcement Learning (Nair et al., 2015)
- Deep Reinforcement Learning with Double Q-learning (van Hasselt et al., 2015)
- Prioritized Experience Replay (Schaul et al., 2015)
- Dueling Network Architectures for Deep Reinforcement Learning (Wang et al., 2015)
- Noisy Networks for Exploration (Fortunato et al., 2017)
- A Distributional Perspective on Reinforcement Learning (Bellemare et al., 2017)
- Rainbow: Combining Improvements in Deep Reinforcement Learning (Hessel et al., 2017)
- Distributional Reinforcement Learning with Quantile Regression (Dabney et al., 2017)
- Implicit Quantile Networks for Distributional Reinforcement Learning (Dabney et al., 2018)
- Distributed Prioritized Experience Replay (Horgan et al., 2018)
- Asynchronous Methods for Deep Reinforcement Learning (Mnih et al., 2016)
- Trust Region Policy Optimization (Schulman et al., 2015)
- Proximal Policy Optimization Algorithms (Schulman et al., 2017)
- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (Wu et al., 2017)
- Addressing Function Approximation Error in Actor-Critic Methods (Fujimoto et al., 2018)
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures (Espeholt et al., 2018)
- The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning (Gruslys et al., 2017)