Database of RL algorithm results
*(Figure: plots of Atari Space Invaders scores and MuJoCo Walker2d scores)*
You can use `rldb.find_all({})` to retrieve all existing entries in rldb:
```python
import rldb

all_entries = rldb.find_all({})
```
You can also filter entries by specifying key-value pairs that the entry must match:
```python
import rldb

# Entries that used the DQN algorithm
dqn_entries = rldb.find_all({'algo-nickname': 'DQN'})

# Entries evaluated on Breakout with the no-op start condition
breakout_noop_entries = rldb.find_all({
    'env-title': 'atari-breakout',
    'env-variant': 'No-op start',
})
```
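Because `find_all()` returns the matching entries as plain dicts, you can post-process the results with ordinary Python. As a minimal sketch (assuming every matching entry carries a numeric `score` key, as in the sample output below), this finds the best reported Breakout score:

```python
import rldb

breakout_entries = rldb.find_all({
    'env-title': 'atari-breakout',
    'env-variant': 'No-op start',
})

# Assumes each entry is a plain dict with a numeric 'score' key
best = max(breakout_entries, key=lambda entry: entry['score'])
print(best['algo-nickname'], best['score'])
```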
You can also use `rldb.find_one(filter_dict)` to find one entry that matches the key-value pairs specified in `filter_dict`:
```python
import rldb
import pprint

entry = rldb.find_one({
    'env-title': 'atari-pong',
    'algo-title': 'Human',
})
pprint.pprint(entry)
```
Output
```
{'algo-nickname': 'Human',
 'algo-title': 'Human',
 'env-title': 'atari-pong',
 'env-variant': 'No-op start',
 'score': 14.6,
 'source-arxiv-id': '1511.06581',
 'source-arxiv-version': 3,
 'source-authors': ['Ziyu Wang',
                    'Tom Schaul',
                    'Matteo Hessel',
                    'Hado van Hasselt',
                    'Marc Lanctot',
                    'Nando de Freitas'],
 'source-bibtex': '@article{DBLP:journals/corr/WangFL15,\n'
                  '  author    = {Ziyu Wang and\n'
                  '               Nando de Freitas and\n'
                  '               Marc Lanctot},\n'
                  '  title     = {Dueling Network Architectures for Deep '
                  'Reinforcement Learning},\n'
                  '  journal   = {CoRR},\n'
                  '  volume    = {abs/1511.06581},\n'
                  '  year      = {2015},\n'
                  '  url       = {http://arxiv.org/abs/1511.06581},\n'
                  '  archivePrefix = {arXiv},\n'
                  '  eprint    = {1511.06581},\n'
                  '  timestamp = {Mon, 13 Aug 2018 16:48:17 +0200},\n'
                  '  biburl    = {https://dblp.org/rec/bib/journals/corr/WangFL15},\n'
                  '  bibsource = {dblp computer science bibliography, '
                  'https://dblp.org}\n'
                  '}',
 'source-nickname': 'DuDQN',
 'source-title': 'Dueling Network Architectures for Deep Reinforcement '
                 'Learning'}
```
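Since the returned entry is a plain dict, individual fields can be read directly. For example, continuing from the `entry` above:

```python
print(entry['score'])            # 14.6
print(entry['source-nickname'])  # 'DuDQN'
```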
Here is the format of every entry:
```python
{
    # BASICS
    "source-title": "",
    "source-nickname": "",
    "source-authors": [],

    # MISC.
    "source-bibtex": "",

    # ALGORITHM
    "algo-title": "",
    "algo-nickname": "",
    "algo-source-title": "",

    # SCORE
    "env-title": "",
    "score": 0,
}
```
- `source-title` is the full title of the source of the score: it can be the title of a paper or of a GitHub repository.
- `source-nickname` is a popular nickname or acronym for that title if one exists; otherwise it is the same as `source-title`.
- `source-authors` is a list of authors or contributors.
- `source-bibtex` is a BibTeX-format citation.
- `algo-title` is the full title of the algorithm used.
- `algo-nickname` is the nickname or acronym for that algorithm if one exists; otherwise it is the same as `algo-title`.
- `algo-source-title` is the title of the source of the algorithm. It can be, and often is, different from `source-title`, since papers frequently report scores for baseline algorithms introduced elsewhere (see the sketch after this list).
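To see the `source-title` / `algo-source-title` distinction in practice, here is a minimal sketch (assuming the filter keys shown above) that lists every source reporting an A3C score:

```python
import rldb

# A3C scores appear both in its original paper and in later papers that use
# it as a baseline: 'source-nickname' identifies the reporting paper, while
# 'algo-source-title' always names the paper that introduced A3C.
for entry in rldb.find_all({'algo-nickname': 'A3C'}):
    print(entry['source-nickname'], entry['env-title'], entry['score'])
```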
For example, the Space Invaders score of the Asynchronous Advantage Actor Critic (A3C) algorithm reported in the Noisy Networks for Exploration (NoisyNet) paper is represented by the following entry:
```python
{
    # BASICS
    "source-title": "Noisy Networks for Exploration",
    "source-nickname": "NoisyNet",
    "source-authors": [
        "Meire Fortunato",
        "Mohammad Gheshlaghi Azar",
        "Bilal Piot",
        "Jacob Menick",
        "Ian Osband",
        "Alex Graves",
        "Vlad Mnih",
        "Remi Munos",
        "Demis Hassabis",
        "Olivier Pietquin",
        "Charles Blundell",
        "Shane Legg",
    ],

    # ARXIV
    "source-arxiv-id": "1706.10295",
    "source-arxiv-version": 2,

    # MISC.
    "source-bibtex": """
@article{DBLP:journals/corr/FortunatoAPMOGM17,
  author    = {Meire Fortunato and
               Mohammad Gheshlaghi Azar and
               Bilal Piot and
               Jacob Menick and
               Ian Osband and
               Alex Graves and
               Vlad Mnih and
               R{\'{e}}mi Munos and
               Demis Hassabis and
               Olivier Pietquin and
               Charles Blundell and
               Shane Legg},
  title     = {Noisy Networks for Exploration},
  journal   = {CoRR},
  volume    = {abs/1706.10295},
  year      = {2017},
  url       = {http://arxiv.org/abs/1706.10295},
  archivePrefix = {arXiv},
  eprint    = {1706.10295},
  timestamp = {Mon, 13 Aug 2018 16:46:11 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/FortunatoAPMOGM17},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}""",

    # ALGORITHM
    "algo-title": "Asynchronous Advantage Actor Critic",
    "algo-nickname": "A3C",
    "algo-source-title": "Asynchronous Methods for Deep Reinforcement Learning",

    # HYPERPARAMETERS
    "algo-frames": 320 * 1000 * 1000,  # Number of frames

    # SCORE
    "env-title": "atari-space-invaders",
    "env-variant": "No-op start",
    "score": 1034,
    "stddev": 49,
}
```
Note that, as shown here, an entry can contain additional fields beyond the base format, such as `source-arxiv-id`, `algo-frames`, and `stddev`.
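Because such fields are optional, reading them with `dict.get()` and a default is safest. A minimal sketch, assuming the NoisyNet A3C entry above can be located with the keys it contains:

```python
import rldb

entry = rldb.find_one({
    'algo-nickname': 'A3C',
    'env-title': 'atari-space-invaders',
    'source-nickname': 'NoisyNet',
})

# Optional fields may be absent on other entries, so use .get() with a default
print(entry['score'])                   # Required field: 1034
print(entry.get('stddev', 'n/a'))       # Optional field: 49
print(entry.get('algo-frames', 'n/a'))  # Optional field: 320000000
```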
Entries in rldb are sourced from the following papers:

- Playing Atari with Deep Reinforcement Learning (Mnih et al., 2013)
- Human-level control through deep reinforcement learning (Mnih et al., 2015)
- Deep Recurrent Q-Learning for Partially Observable MDPs (Hausknecht and Stone, 2015)
- Massively Parallel Methods for Deep Reinforcement Learning (Nair et al., 2015)
- Deep Reinforcement Learning with Double Q-learning (van Hasselt et al., 2015)
- Prioritized Experience Replay (Schaul et al., 2015)
- Dueling Network Architectures for Deep Reinforcement Learning (Wang et al., 2015)
- Noisy Networks for Exploration (Fortunato et al., 2017)
- A Distributional Perspective on Reinforcement Learning (Bellemare et al., 2017)
- Rainbow: Combining Improvements in Deep Reinforcement Learning (Hessel et al., 2017)
- Distributional Reinforcement Learning with Quantile Regression (Dabney et al., 2017)
- Implicit Quantile Networks for Distributional Reinforcement Learning (Dabney et al., 2018)
- Distributed Prioritized Experience Replay (Horgan et al., 2018)
- Asynchronous Methods for Deep Reinforcement Learning (Mnih et al., 2016)
- Trust Region Policy Optimization (Schulman et al., 2015)
- Proximal Policy Optimization Algorithms (Schulman et al., 2017)
- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (Wu et al., 2017)
- Addressing Function Approximation Error in Actor-Critic Methods (Fujimoto et al., 2018)
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures (Espeholt et al., 2018)
- The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning (Gruslys et al., 2017)