Skip to content
No description, website, or topics provided.
Python
Branch: master
Clone or download
Latest commit f078711 Dec 18, 2018
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
README.md Update README.md Dec 18, 2018
requirements.txt Updated requirements and training script Dec 17, 2018
train.py Updated requirements and training script Dec 17, 2018

README.md

OpenAI-final-project

Introducing the GoalGridWorld Environment project that was inspired by the grounded language learning (here is the reference paper https://arxiv.org/abs/1706.06551). This project was executed within the OpenAI scholarship program. The GoalGridWorld environment is a matrix where an agent navigates according to commands [go, avoid] to hit or avoid target cells that are represented as randomly displaced pairs of colored objects [triangle, square, circle/red, green, blue]. Basically, there are 18 possible commands but only 8 actual "words". Thanks to an NLP-like approach in this case, an agent can handle more complex/combinatorial commands.

Feel free to use the code to run your experiments.

I want to thank Alec Radford for his mentorship and introducing me to NLP and Joshua Achiam for his valuable advice on reinforcement learning and creating the Spinning Up (https://spinningup.openai.com/en/latest/index.html) that was extremely helpful for my project.

Setup

  • conda install --yes --file requirements.txt
  • Source activate spinningup
  • Python3 train.py
You can’t perform that action at this time.