AgentNet

A lightweight library to build and train neural networks for reinforcement learning using Theano+Lasagne

Warning

The library is in active development. We maintain a set of runnable examples and a fixed interface, but the interface may still change from time to time.

Linux and Mac OS Installation

So far the installation has only been tested on Ubuntu, but an experienced user is unlikely to have problems installing it on another Linux or Mac OS machine. Currently the minimal dependencies are bleeding-edge Theano and Lasagne. You can find a guide to installing them here.

If you have both of them, you can install AgentNet with these commands:

 git clone https://github.com/justheuristic/AgentNet
 cd AgentNet
 python setup.py install
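
To verify the installation, you can try importing the library from Python. This one-liner assumes the package is importable as agentnet (matching the repository name); if it runs without errors, Theano, Lasagne and AgentNet are all wired up correctly:

 python -c "import theano, lasagne, agentnet"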

Windows installation

Technically, if you have managed to get Lasagne working on Windows, you can follow the Linux instructions. However, we cannot guarantee that this will work consistently.

Demos

If you wish to get acquainted with the current state of the library, take a look at some of the ./examples.

If you wish to join the development, we would be eager to accept your help. Current development priorities are maintained at the bottom of this readme.

If you wish to contribute your own architecture or experiment, please contact me via GitHub or at justheuristic@gmail.com. In fact, please contact me if you have any questions or ideas; I'd be eager to hear them.

What?

The final framework is planned to be built on top of, and be fully compatible with, the awesome Lasagne [6], with some helper functions to facilitate learning.

The main objectives are:

  • an easy way of tinkering with reinforcement learning architectures
  • equally simple prototyping of attention and long-term memory architectures
  • easy experiment setup and reproducibility
  • full integration with Lasagne and Theano (illustrated below)
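
To give a feel for the Lasagne integration, here is a minimal sketch of a Q-value network defined in plain Lasagne. Everything below is standard Lasagne/Theano API; the network itself (layer sizes, a 4-action output) is a made-up toy example, not something shipped with AgentNet:

 import theano
 import lasagne
 from lasagne.layers import InputLayer, DenseLayer, get_output

 # A toy observation -> Q-values network, defined entirely in Lasagne.
 # AgentNet is meant to wrap networks like this, so any Lasagne layer
 # can serve as an agent component.
 l_observation = InputLayer((None, 8))  # batch of 8-dimensional observations
 l_hidden = DenseLayer(l_observation, num_units=64,
                       nonlinearity=lasagne.nonlinearities.rectify)
 l_qvalues = DenseLayer(l_hidden, num_units=4,  # one Q-value per action
                        nonlinearity=None)

 # Compile a function that maps observations to Q-values.
 get_qvalues = theano.function([l_observation.input_var], get_output(l_qvalues))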

Why?

[long story short: create a platform to play with *QN, attentive and LTM architectures without spending months reading code]

[short story long:

The last several years have marked the rediscovery of neural networks applied to the reinforcement learning domain. The idea was first introduced in the early '90s [0] or even earlier, but was mostly forgotten soon afterwards.

Years later, these methods were reborn under the Deep Learning sauce and popularized by DeepMind [1, 2]. Several other researchers have already jumped into the domain with their own architectures [3, 4] and even dedicated playgrounds [5] to play with them.

The problem is that all these models exist in their own problem-setup and implementation bubbles. Simply comparing your new architecture to the ones you know requires:

  • 10% implementing architecture
  • 20% implementing experiment setup
  • 70% reimplementing all the other network architectures

This process is not only inefficient, but also fragile, since a single mistake while reimplementing someone else's architecture can lead to incorrect results.

So here we are, attempting to build yet another bridge between eager researchers [primarily ourselves so far] and deep reinforcement learning.

The key objective is to make it easy to build new architectures and test them against others on a number of problems. The easier it is to reproduce an experiment setup, the simpler it is to architect something new and wonderful, and the quicker we get to solutions directly applicable to real-world problems.

]

Current state & priorities

The library is currently in active development and there is still much to be done.

Legend: [priority] component; no priority tag means "done".

  • Core components

  • Environment

  • Objective

  • Agent architecture

    • MDP (RL) agent
    • Generator
    • Fully customizable agent
  • Experiment platform

    • [high] Experiment setup zoo
    • [medium] Pre-trained model zoo
    • [medium] quick experiment running (an experiment is defined as a tuple of environment, objective function, NN architecture and training algorithm)
  • Layers

  • Memory

    • Simple RNN implemented via Lasagne.layers.DenseLayer
    • One-step GRU memory
    • [half-done, medium] Custom LSTM-like constructor
    • Stack Augmentation
    • [low] List augmentation
    • [low] Neural Turing Machine controller
  • Resolvers

    • Greedy resolver (as BaseResolver)
    • Epsilon-greedy resolver
    • Probabilistic resolver
  • Learning algorithms and objectives

    • Q-learning
    • SARSA
    • k-step learning
    • k-step Advantage Actor-critic methods
    • Can use arbitrary Theano/Lasagne expressions for loss, gradients and updates (see the sketch after this list)
    • Experience replay pool
  • Experiment setups

    • boolean reasoning - basic "tutorial" experiment about learning to exploit variable dependencies
    • Wikicat - guessing a person's traits based on Wikipedia biographies
    • [half-done] 2048 in the browser - playing 2048 using Selenium only
    • [high] OpenAI Gym training/evaluation API and demos
    • [medium] KSfinder - detecting particle decays in the Large Hadron Collider beauty (LHCb) experiment
  • Visualization tools

    • basic monitoring tools
    • [medium] generic tunable session visualizer
  • Explanatory material

  • [medium] readthedocs pages

  • [global] MOAR sensible examples

  • [medium] report on basic research (optimizer comparison, training algorithm comparison, layers, etc.)
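
Since the learning objectives above boil down to Theano expressions (see the "Can use arbitrary Theano/Lasagne expressions" item), here is a sketch of what a hand-written one-step Q-learning loss can look like. All symbolic variables below are illustrative placeholders standing in for whatever your session produces; none of the names come from the AgentNet API:

 import theano
 import theano.tensor as T

 # Placeholder symbolic inputs (illustrative, not AgentNet API).
 qvalues = T.matrix("qvalues")            # [batch, n_actions] Q(s, a)
 next_qvalues = T.matrix("next_qvalues")  # [batch, n_actions] Q(s', a')
 actions = T.ivector("actions")           # [batch] ids of the actions taken
 rewards = T.vector("rewards")            # [batch] observed rewards
 is_done = T.vector("is_done")            # [batch] 1.0 if the episode ended
 gamma = 0.99                             # discount factor

 # Q-learning target: r + gamma * max_a' Q(s', a'), zeroed at terminal states.
 target = rewards + gamma * T.max(next_qvalues, axis=1) * (1.0 - is_done)

 # Q-values of the actions actually taken, via fancy indexing.
 qvalues_taken = qvalues[T.arange(actions.shape[0]), actions]

 # Mean squared TD error; gradients must not flow through the target.
 loss = T.mean((qvalues_taken - theano.gradient.disconnected_grad(target)) ** 2)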
