<a href="https://colab.research.google.com/github/gyyang/neurogym/blob/master/neuroGym_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NeuroGym

NeuroGym is a toolkit for training network models on many established neuroscience tasks with reinforcement learning. It includes working memory tasks, value-based decision tasks, and context-dependent perceptual categorization tasks.
Below is an example in which the A3C algorithm ([Mnih et al. 2016](https://arxiv.org/abs/1602.01783)) is trained on the Random Dots Motion task. The A3C implementation is based on the one explained in [this post](https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-8-asynchronous-actor-critic-agents-a3c-c88f72a5e9f2).
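
NeuroGym tasks are created with `gym.make` and expose `action_space` and `observation_space`, so an agent interacts with them through the usual Gym loop of `reset()` followed by repeated `step(action)` calls. The sketch below illustrates that loop with a hypothetical stand-in task (`ToyTwoChoiceTask` and `run_random_agent` are made up for illustration and are not part of NeuroGym), so it runs without any dependencies. It assumes the classic Gym convention in which `step` returns `(observation, reward, done, info)`.

```python
import random


class ToyTwoChoiceTask:
    """Hypothetical stand-in for a Gym-style two-choice task (not NeuroGym code)."""

    def __init__(self, trial_steps=5, seed=0):
        self.trial_steps = trial_steps
        self.rng = random.Random(seed)
        self.t = 0
        self.correct = 0

    def reset(self):
        # Start a new trial with a randomly chosen correct action (0 or 1).
        self.t = 0
        self.correct = self.rng.randrange(2)
        return self._observation()

    def _observation(self):
        # Noisy evidence: a positive mean favors action 1, a negative mean action 0.
        mean = 1.0 if self.correct == 1 else -1.0
        return [mean + self.rng.gauss(0.0, 1.0)]

    def step(self, action):
        # A reward is given only on the last step of the trial.
        self.t += 1
        done = self.t >= self.trial_steps
        reward = 1.0 if (done and action == self.correct) else 0.0
        return self._observation(), reward, done, {}


def run_random_agent(env, num_trials=1000, seed=1):
    """Run an agent that acts at random and return its mean reward per trial."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_trials):
        env.reset()
        done = False
        while not done:
            _, reward, done, _ = env.step(rng.randrange(2))
            total += reward
    return total / num_trials


performance = run_random_agent(ToyTwoChoiceTask())
print(f"average performance of a random agent: {performance:.3f}")
```

On this toy task an agent that ignores its observations scores close to 0.5, the chance level for two alternatives; a trained agent would be expected to do better by using the evidence.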

# Install Gym and NeuroGym

In [1]:
! pip install gym



In [None]:
cd

In [None]:
! rm -rf neurogym/

In [0]:
! git clone https://github.com/gyyang/neurogym.git

fatal: destination path 'neurogym' already exists and is not an empty directory.


In [None]:
cd neurogym

In [None]:
pip install -e .

# Example

In [0]:
import threading
import multiprocessing
import os
import gym
import neurogym
import examples
import tensorflow as tf

def example(train=True, gamma=.8, learning_rate=1e-3, num_units=32, dt=100,
            task='RDM-v0'):
    # Initialize the task environment with time step dt
    env = gym.make(task, dt=dt)
    
    a_size = env.action_space.n  # number of actions
    state_size = env.observation_space.shape[0]  # number of inputs
    
    tf.reset_default_graph()
    with tf.device("/cpu:0"):
        global_episodes = tf.Variable(0, dtype=tf.int32,
                                      name='global_episodes',
                                      trainable=False)
        trainer = tf.train.AdamOptimizer(learning_rate=learning_rate)
        examples.agent.AC_Network(a_size, state_size, 'global',
                             None, num_units)  # Generate global net
        # Set workers to number of available CPU threads
        num_workers = multiprocessing.cpu_count()
        workers = []
        # Create worker classes
        for i in range(num_workers):
            workers.append(examples.agent.Worker(env, i, a_size, state_size,
                            trainer, global_episodes,
                            num_units))
        saver = tf.train.Saver(max_to_keep=5)

    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        sess.run(tf.global_variables_initializer())

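        worker_threads = []
        for worker in workers:
            # Pass the worker's bound method and its arguments directly, so each
            # thread is tied to its own worker rather than to the loop variable.
            thread = threading.Thread(
                target=worker.work, args=(gamma, sess, coord, saver, train))
            thread.start()
            worker_threads.append(thread)
        # Block until all worker threads have finished
        coord.join(worker_threads)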
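
The `gamma` argument is handed to each worker's `work` method. Assuming it plays the usual role of the discount factor in A3C (the worker code itself is not shown in this notebook), the sketch below shows how the rewards of one trial would be turned into discounted returns. `discounted_returns` is a hypothetical helper written for illustration, not a function from `examples.agent`.

```python
def discounted_returns(rewards, gamma=0.8):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each return reuses the one computed for the next step.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# A trial that is rewarded only on its final step.
print(discounted_returns([0.0, 0.0, 0.0, 1.0], gamma=0.8))
```

With `gamma=.8`, a reward that arrives three steps later is worth roughly half as much (0.8³ ≈ 0.51), so under this assumption a smaller `gamma` makes the agent weigh immediate rewards more heavily.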


# Run it!

In [0]:
example(train=True, gamma=.8, learning_rate=1e-3, num_units=32, dt=100, task='RDM-v0')

------------------
RDM
time step: 100
------------------
mean trial duration: 1330 (max num. steps: 13.3)
Starting worker 0
Starting worker 1
0
average performance: 0.4904904904904905
0
average performance: 0.4844844844844845
1
average performance: 0.47347347347347346
1
average performance: 0.5035035035035035
2
average performance: 0.4924924924924925
2
average performance: 0.47347347347347346
3
average performance: 0.4904904904904905
3
average performance: 0.4804804804804805
4
average performance: 0.5185185185185185
4
average performance: 0.4904904904904905
5
average performance: 0.4804804804804805
5
average performance: 0.5105105105105106
6
average performance: 0.5195195195195195
6
average performance: 0.4924924924924925
7
average performance: 0.48348348348348347
7
average performance: 0.48148148148148145
8
average performance: 0.48348348348348347
8
average performance: 0.4984984984984985
9
average performance: 0.47347347347347346
9
average performance: 0.4964964964964965
10
average p