To implement a new agent and/or model, you should only need to create a new file, as outlined below, with a couple of functions implemented. Infrastructure such as the general training loop and the collection of statistics during training is handled by agent_runner.py and agent.py, so you can focus on the model- and agent-specific pieces and not have to worry about the infrastructure.
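For orientation, the layout implied by the files referenced in this README looks roughly like the sketch below. This is an assumption about the repository structure based only on the names mentioned here, not an exact listing:

```
main.py              # entry point; parses the training/testing arguments shown below
agent_runner.py      # general training loop and statistics collection
agent.py             # shared agent functionality (e.g., the buffer mentioned below)
agents/
    sample_agent.py  # reference DQN-style agent to copy from
models/
    sample_model.py  # reference torch model
archive/             # created automatically; one subdirectory per run
```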
- Benchmark numerous agent types across discrete and continuous games
- Models considered:
  - DQN Models:
    - "Vanilla" DQN
    - Dueling DQN
    - Dueling Double DQN
  - Policy Gradient Models:
    - MC (REINFORCE)
    - PPO
    - "Vanilla"
Tasks:
- Create a new agent file in the agents directory (a minimal sketch of such a file appears below, after these steps):
  - Create your optimizer and assign it to self.optimizer.
  - Create your loss function and assign it to self.criterion.
  - Perform any other agent-specific setup needed.
    - EX: In DQN you would create your target model, as seen in sample_agent.py.
  - Note that epsilon and test are passed as arguments here.
  - The buffer already exists for your agent, so you can just copy the line from sample_agent.py unless you have specific functionality to perform.
  - Again, you will likely just need to copy the method from sample_agent.py.
  - can_train is called at each step within an episode:
    - Return True if the agent is able to be trained.
    - Return False if the agent is not able to be trained.
    - EX: For DQN, you would return False until you have enough entries in your replay buffer, as seen in sample_agent.py.
  - train() is called at each step of the training loop if can_train is True. Pseudocode from agent_runner.py:

        for episode:
            for step:
                train()  # <-- this method

  - The current tuple for the step is provided, and any training should be performed here.
  - This is where you calculate the loss and perform gradient descent.
    - EX: In sample_agent.py you will see that this is where the replay buffer is used.
  - NOTE: The loss should be returned at the end of this function.
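To make the steps above concrete, here is a minimal sketch of what a new agent file might look like. It assumes a DQN-style agent; the class name, constructor signature, buffer handling, and discount factor are placeholders rather than the repo's actual API, so defer to agent.py and sample_agent.py for the real interface.

```python
# agents/my_agent.py -- illustrative sketch only, not the repo's actual API
import copy

import torch
import torch.nn as nn
import torch.optim as optim


class MyAgent:  # in the real repo this would follow the base agent in agent.py
    def __init__(self, model, batch_size=32, lr=1e-3, gamma=0.99):
        self.model = model
        self.batch_size = batch_size
        self.gamma = gamma

        # Checklist items: create the optimizer and the loss function.
        self.optimizer = optim.Adam(self.model.parameters(), lr=lr)
        self.criterion = nn.MSELoss()

        # Other agent-specific setup, e.g. a DQN target network.
        self.target_model = copy.deepcopy(self.model)

        # Placeholder; the README notes a buffer already exists for your agent.
        self.buffer = []

    def can_train(self):
        # Called at each step within an episode.
        # For DQN: not trainable until the buffer has enough entries.
        return len(self.buffer) >= self.batch_size

    def train(self, step_tuple):
        # Called at each step of the training loop when can_train() is True.
        # step_tuple is the current transition; sample_agent.py samples a batch
        # from its replay buffer here instead, but a single-transition update
        # keeps this sketch short.
        state, action, reward, next_state, done = step_tuple

        q_value = self.model(state.unsqueeze(0))[0, action]
        with torch.no_grad():
            next_q = self.target_model(next_state.unsqueeze(0)).max().item()
        target = reward + self.gamma * next_q * (1.0 - float(done))
        loss = self.criterion(q_value, torch.tensor(target))

        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()
        return loss  # NOTE: the loss must be returned at the end
```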
- Create a new model in the models directory. This really just involves creating a normal torch model.
  - See sample_model.py for an example.
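Since the model is just a normal torch model, something along these lines is usually sufficient. The class name, constructor arguments, and layer sizes below are placeholders; mirror sample_model.py for the interface the runner actually expects.

```python
# models/my_model.py -- illustrative sketch; see sample_model.py for the real interface
import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self, input_size=4, num_actions=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_size, 128),
            nn.ReLU(),
            nn.Linear(128, num_actions),
        )

    def forward(self, x):
        # Returns one value (e.g., a Q-value or logit) per action.
        return self.net(x)
```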
    python main.py --list
Training remains largely unchanged from Project 3, though there are now some required arguments and many more optional arguments:
    python main.py --train_dqn --agent sample_agent.SampleAgent --model sample_model.SampleModel --run_name training_run
After a run is executed, an "archive" directory is created containing data from each run performed. Each run's directory name is built from whatever is passed for run_name, with the date appended to the end.
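For example, a run launched with --run_name training_run would produce a directory along the lines of ./archive/training_run_<date>/ (the exact date format is whatever agent_runner.py appends); the saved model and optimizer paths passed to the test command below point into this archive.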
Testing uses the test_dqn argument, along with arguments indicating which saved model and optimizer to use:
    python main.py --test_dqn --agent sample_agent.SampleAgent --model sample_model.SampleModel --run_name test_run --model_path ./archive/training_run/my_model.pth --optimizer_path ./archive/training_run/my_optimizer.pth