
Collaborative Evolutionary Reinforcement Learning for NIPS AI for Prosthetics Challenge 2018


ShawK91/nips_prosthetics_18


nips_prosthetics_18

Collaborative Evolutionary Reinforcement Learning for NIPS AI for Prosthetics Challenge 2018

################################# Code labels #################################

main.py: Neuroevolution learner that writes the data it generates to data storage and bootstraps off policies from policy storage

full_pg.py: Off-policy policy gradient trainer using (Advantage) DDPG/TD3 with or without Trust Region constraints and HER

expert_q.py: Expert Agent that uses an ensemble of critics and actors to take expert decisions (similar to a simple MPC but using critics instead of a model)
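As a rough illustration of the idea behind expert_q.py, the sketch below lets every actor propose an action and keeps the proposal that the critics rate highest. It is not taken from the repository: the function name `select_expert_action`, the callable-based actor and critic types, and the choice to average the critic scores are all assumptions made for this example.

```python
from statistics import mean
from typing import Callable, Sequence

State = Sequence[float]
Action = Sequence[float]
Actor = Callable[[State], Action]           # hypothetical: maps a state to an action
Critic = Callable[[State, Action], float]   # hypothetical: scores a (state, action) pair


def select_expert_action(
    state: State, actors: Sequence[Actor], critics: Sequence[Critic]
) -> Action:
    """Return the actor proposal with the highest mean critic score."""
    # Every actor in the ensemble proposes one candidate action.
    candidates = [actor(state) for actor in actors]
    # Averaging over critics is an assumption; min or median would be
    # more pessimistic alternatives.
    scores = [mean(critic(state, a) for critic in critics) for a in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]


if __name__ == "__main__":
    toy_actors = [lambda s: [0.0], lambda s: [1.0]]
    toy_critics = [lambda s, a: -abs(a[0] - 1.0), lambda s, a: -abs(a[0] - 0.8)]
    print(select_expert_action([0.0], toy_actors, toy_critics))
```

No forward model is rolled out here; the critics stand in for the model's value estimate, which is the contrast with MPC that the description draws.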

train_vae.py: Trains a Variational Autoencoder Forward Model using data from data storage

core/runner.py: Rollout worker

core/env_wrapper.py: Wrapper around the Prosthetics OpenSim env exposing a cleaner interface for RL algorithms

core/her.py: Implements Hindsight Experience Replay
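For readers unfamiliar with Hindsight Experience Replay, a minimal sketch of goal relabeling follows. It uses the "final" strategy from the HER paper (replace the desired goal with the goal actually reached at the end of the episode); which strategy core/her.py uses is not stated here, and the `Transition` fields, `relabel_final`, and the sparse reward function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence, Tuple

Goal = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    state: Tuple[float, ...]
    action: Tuple[float, ...]
    achieved_goal: Goal  # goal actually reached after taking the action
    desired_goal: Goal   # goal the policy was conditioned on
    reward: float


def relabel_final(
    episode: Sequence[Transition], reward_fn: Callable[[Goal, Goal], float]
) -> List[Transition]:
    """Relabel every transition with the goal achieved at the end of the episode."""
    if not episode:
        return []
    new_goal = episode[-1].achieved_goal
    # Rewards are recomputed against the substituted goal, so a failed
    # episode still yields at least one successful transition.
    return [
        replace(t, desired_goal=new_goal, reward=reward_fn(t.achieved_goal, new_goal))
        for t in episode
    ]


def sparse_reward(achieved: Goal, desired: Goal) -> float:
    """Assumed sparse reward: 0 on reaching the goal, -1 otherwise."""
    return 0.0 if achieved == desired else -1.0
```

The relabeled transitions would typically be added to the replay buffer alongside the originals rather than replacing them.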

core/models.py: Actor model

core/neuroevolution.py: Implements Sub-Structured Based Neuroevolution (SSNE) with a dynamic population

core/off_policy_gradient.py: Implements the off_policy_gradient learner (TD3/DDPG) with or without Advantage functions, Trust Regions and HER
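To make the TD3 part of this entry concrete, here is a small sketch of the clipped double-Q target that distinguishes TD3 from DDPG: the bootstrap value is the minimum of two target critics. It is a simplified illustration, not the repository's code; the function name `td3_targets`, the list-based inputs, and the default `gamma` are assumptions, and target policy smoothing, advantage functions, and trust regions are left out.

```python
from typing import List, Sequence


def td3_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    next_q1: Sequence[float],
    next_q2: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not the repository's setting
) -> List[float]:
    """Compute y = r + gamma * (1 - done) * min(Q1', Q2') for each transition."""
    return [
        # Taking the minimum of the two target critics curbs overestimation;
        # terminal transitions do not bootstrap.
        r + gamma * (0.0 if d else 1.0) * min(q1, q2)
        for r, d, q1, q2 in zip(rewards, dones, next_q1, next_q2)
    ]
```

With a single critic the `min` disappears and the expression reduces to the DDPG target.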

core/reward_shaping.py: Implements Behavioral Reward shaping spanning dynamic/static trajectory-wide shaping functions with and without relaxations

core/ounoise.py: Implements Ornstein–Uhlenbeck process for generating temporally correlated noise
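A minimal sketch of such a noise process is shown below, using the Euler-Maruyama discretization x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1). The class layout and the defaults (`theta=0.15`, `sigma=0.2`, `dt=1.0`) are common choices assumed for illustration and are not read from core/ounoise.py.

```python
import math
import random
from typing import List, Optional


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (illustrative sketch)."""

    def __init__(
        self,
        dim: int,
        mu: float = 0.0,
        theta: float = 0.15,  # assumed mean-reversion rate
        sigma: float = 0.2,   # assumed noise scale
        dt: float = 1.0,
        seed: Optional[int] = None,
    ) -> None:
        self.dim, self.mu, self.theta, self.sigma, self.dt = dim, mu, theta, sigma, dt
        self._rng = random.Random(seed)
        self.reset()

    def reset(self) -> None:
        """Restart the process at its long-run mean, e.g. at episode start."""
        self.state = [self.mu] * self.dim

    def sample(self) -> List[float]:
        """Advance one step and return a copy of the new noise vector."""
        self.state = [
            # Drift pulls each component back toward mu; the Gaussian term diffuses it.
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * math.sqrt(self.dt) * self._rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)
```

Because each sample depends on the previous state, consecutive samples are correlated, unlike independent Gaussian noise added to every action.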

core/vae_fm.py: Variational Autoencoder based Forward model with Inverse kinematic losses

core/mod_utils.py: Helper functions

###################################### Auxiliary scripts: ######################################

viz.py: Visualize policies (rendering)

score_net.py: Benchmark policies without rendering

submission.py: Submit a policy to the AI for Prosthetics server for scoring

q_test.py: Test the q-values encoded by the critic

action_cluster.py: Cluster the actions generated by the actors and analyze them (WIP)
