
# jeju-dl-camp-2018

Authors: Valliappa Chockalingam and Rishab Gargeya

## Code Overview

  • Trainer

    • train.py: the entry point; execute this file to run the entire project.
      • Flags:
        • Configuration Script: path to the configuration file.
    • Manager class:
      • Member variables: the Environment and the Agent.
      • Run method: trains over multiple episodes and visualizes the learned distributions (saving a graph) with distagent.
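The Manager's run loop, in rough sketch form (these class and method names beyond Manager/run are illustrative; the actual train.py wiring may differ):

```python
# Sketch of the Manager training loop: hold an env and an agent,
# then train for a number of episodes (interfaces are assumed).
class Manager:
    def __init__(self, env, agent):
        self.env = env
        self.agent = agent

    def run(self, num_episodes):
        returns = []
        for _ in range(num_episodes):
            obs, done, total = self.env.reset(), False, 0.0
            while not done:
                action = self.agent.act(obs)
                obs, reward, done = self.env.step(action)
                self.agent.learn()  # update from replayed experience
                total += reward
            returns.append(total)
        return returns
```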
  • Configuration

    • Holds all configuration variables: Network, Environment, Agent, Optimizer, Experience Replay.
    • The configuration file is propagated through trainer.py into each submodule folder (Agent, Optimizer, Memory, Environment).
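For illustration, a params_*.json-style configuration might look like the following once loaded (every key here is hypothetical; the real schema is defined by the configuration module):

```python
# Illustrative configuration layout covering the five groups named above.
# Keys and values are assumptions, not the repo's actual schema.
config = {
    "environment": {"name": "CartPole-v0"},
    "network":     {"type": "IQN", "hidden_units": [256, 256]},
    "agent":       {"type": "quantile_regression", "gamma": 0.99},
    "optimizer":   {"name": "adam", "lr": 1e-4, "clip_norm": 10.0},
    "memory":      {"capacity": 100000, "batch_size": 32},
}
```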
  • Agent

    • The Agent creates the Network, Optimizer, and Memory, and explores the Environment.
    • Types: Agent, Categorical Agent, Quantile Regression, VAE
    • Methods: Act, Greedy Act, Distribution, Learn
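For the distributional agents, the greedy action maximizes the mean of the predicted return distribution; a minimal sketch (function name and interface assumed):

```python
# Greedy action selection from per-action quantile estimates, as used by
# quantile-based distributional agents: pick the action whose quantiles
# have the highest mean (i.e. the highest expected return).
def greedy_act(quantiles):
    """quantiles: one list of quantile values per action."""
    means = [sum(q) / len(q) for q in quantiles]
    return max(range(len(means)), key=means.__getitem__)
```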
  • Action Policy

    • Implements action-selection policies.
    • Methods: Policy, Epsilon Greedy
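An epsilon-greedy policy of the kind listed above can be sketched as (function name assumed):

```python
import random

# Epsilon-greedy: with probability epsilon take a uniformly random action,
# otherwise take the action with the highest value estimate.
def epsilon_greedy(q_values, epsilon, rng=random):
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```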
  • Optimizer

    • Implements optimizers for the network.
    • Methods: Adam
    • Features: Gradient Clipping
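Gradient clipping by global norm, sketched in plain Python (the repo presumably does this inside TensorFlow, e.g. via tf.clip_by_global_norm; this only shows the idea):

```python
import math

# Clip a list of (scalar) gradients so their global L2 norm does not
# exceed clip_norm; gradients already within the bound pass through.
def clip_by_global_norm(grads, clip_norm):
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= clip_norm or norm == 0.0:
        return grads
    scale = clip_norm / norm
    return [g * scale for g in grads]
```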
  • Memory

    • Implements Experience Replay
    • Methods: Add, Sample
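The Add / Sample interface amounts to a bounded buffer with uniform sampling; a minimal sketch (class name and capacity assumed):

```python
import random
from collections import deque

# Minimal uniform experience replay: a bounded FIFO buffer of transitions
# with uniform random minibatch sampling.
class ReplayMemory:
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted first

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(list(self.buffer), batch_size)
```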
  • Environment

    • Wrapper for OpenAI Gym
    • Methods: Step, Reset, Render, Num_Actions, Observation_Dims
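A wrapper exposing those methods, assuming the classic Gym API in which step returns (obs, reward, done, info):

```python
# Thin wrapper over a Gym environment exposing the interface listed above
# (class and method names are illustrative, not the repo's exact ones).
class GymEnvironment:
    def __init__(self, gym_env):
        self.env = gym_env

    def step(self, action):
        return self.env.step(action)  # (obs, reward, done, info)

    def reset(self):
        return self.env.reset()

    def render(self):
        return self.env.render()

    def num_actions(self):
        return self.env.action_space.n  # discrete action count

    def observation_dims(self):
        return self.env.observation_space.shape
```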
  • Util

    • Util Functions
    • Methods: TF Copy OP
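Conceptually, a TF copy op assigns each online-network variable to its target-network counterpart; here is the idea with plain dicts standing in for variable collections (names and the tau parameter are illustrative):

```python
# Conceptual sketch of a network-copy op: assign online parameters into the
# target parameters. tau=1.0 is a hard copy; tau<1 gives a soft
# (Polyak-averaged) update.
def copy_params(online, target, tau=1.0):
    for name, value in online.items():
        target[name] = tau * value + (1.0 - tau) * target[name]
    return target
```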
  • GCP Config

    • Config files for running experiments on GCP.
    • setup.py, config.yml, submit.sh: run ./submit.sh to submit a job to GCP Cloud ML Engine (CMLE).
  • Function Approximator

    • Networks that learn the value/return relationships; the Agent creates the network.
    • Head: action layer for the policy or value output.
    • Network: the algorithmic function approximator; a generic MLP / CNN implementation.
    • Types: Categorical DQN, Quantile Regression DQN, Implicit Quantile Networks (IQN)
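The quantile-based approximators (QR-DQN, IQN) are trained with the quantile Huber loss; a scalar sketch for one predicted quantile theta at fraction tau (some formulations additionally divide the Huber term by kappa):

```python
# Quantile Huber loss for a single predicted quantile value theta at
# quantile fraction tau, against a single target sample; kappa is the
# Huber threshold. Scalar sketch of the QR-DQN / IQN training loss.
def quantile_huber_loss(theta, target, tau, kappa=1.0):
    u = target - theta  # TD-style residual
    huber = 0.5 * u * u if abs(u) <= kappa else kappa * (abs(u) - 0.5 * kappa)
    # Asymmetric weight: over- and under-estimates are penalized by
    # |tau - 1| and |tau| respectively.
    return abs(tau - (1.0 if u < 0 else 0.0)) * huber
```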
  • environment.yml: Conda dependencies list.

  • Run on GCP: `nohup xvfb-run -s "-screen 0 1400x900x24" python -m trainer.train params_*.json &`

This work was supported by Deep Learning Camp Jeju 2018, organized by the TensorFlow Korea User Group. We also thank our mentors Yu-Han Liu and Taehoon Kim.
