To install, create a conda environment:
$ conda create -n rhucrl python=3.7
$ conda activate rhucrl
$ pip install -e .[test,logging,experiments]

For MuJoCo (license required), run:
$ pip install -e .[mujoco]

On clusters, run:
$ sudo apt-get install -y --no-install-recommends --quiet build-essential libopenblas-dev python-opengl xvfb xauth

$ python exps/run.py $ENVIRONMENT $AGENT
For help, see:
$ python exps/run.py --help

Install pre-commit with:
$ pip install pre-commit
$ pre-commit install

Run pre-commit with:
$ pre-commit run --all-files

To run CircleCI locally, run:
$ circleci config process .circleci/config.yml > process.yml
$ circleci local execute -c process.yml --job test

Environment goals are passed to the agent through agent.set_goal(goal). If a goal moves during an episode, then include it in the observation space of the environment. If a goal is to follow a trajectory, it might be a good idea to encode it in the reward model.
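
As a rough illustration of including a moving goal in the observation, the sketch below uses a hypothetical one-dimensional environment. The class name MovingGoalEnv, its drift parameter, and the reward are assumptions made for this example and are not part of this repository.

```python
class MovingGoalEnv:
    """Hypothetical 1-D environment whose goal drifts at every step."""

    def __init__(self, goal=1.0, drift=0.1):
        self.initial_goal = goal
        self.drift = drift  # assumed: goal displacement per step
        self.state = 0.0
        self.goal = goal

    def _observation(self):
        # The goal is appended to the state so the policy can see it move.
        return [self.state, self.goal]

    def reset(self):
        self.state = 0.0
        self.goal = self.initial_goal
        return self._observation()

    def step(self, action):
        self.state += action
        self.goal += self.drift
        reward = -abs(self.state - self.goal)  # assumed distance-based reward
        return self._observation(), reward, False, {}
```

Because the goal is part of every observation here, the agent does not need to be told about each change separately.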
Continuous policies are bounded to [-1, 1] via a tanh transform unless otherwise defined. For environments whose action spaces have different bounds, rescale the action to those bounds after sampling it.
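
One way to rescale a sampled action is an affine map from [-1, 1] to the environment bounds. The helper names squash and scale_action below are hypothetical and only sketch the idea; they are not functions provided by this repository.

```python
import math


def squash(raw_action):
    """Bound each unbounded component to [-1, 1] with tanh."""
    return [math.tanh(x) for x in raw_action]


def scale_action(action, low, high):
    """Map each component from [-1, 1] to [low, high] affinely."""
    return [
        lo + 0.5 * (a + 1.0) * (hi - lo)
        for a, lo, hi in zip(action, low, high)
    ]


# Example: a 2-D action for an environment with bounds [0, 10] and [-2, 2].
sampled = squash([0.3, -1.2])
env_action = scale_action(sampled, low=[0.0, -2.0], high=[10.0, 2.0])
```

With this map, -1 lands on the lower bound, 1 on the upper bound, and 0 on the midpoint.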