NoRML: No-Reward Meta Learning

This repository contains code released for the paper NoRML: No-Reward Meta Learning.

First, install all dependencies by running:

pip install -r norml/requirements.txt

The HalfCheetah environment requires MuJoCo, so make sure you have also followed the installation instructions for MuJoCo and mujoco-py.
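
For example, if MuJoCo itself is already installed under ~/.mujoco (the location mujoco-py searches by default), the Python bindings can usually be installed with pip. This is only a sketch; the mujoco-py version must match your MuJoCo release, so consult the mujoco-py documentation for specifics:

pip install mujoco-py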

You can start training from scratch by running:

python -m norml.train_maml --config MOVE_POINT_ROTATE_MAML --logs maml_checkpoints

where the value of --config should be one of the configurations defined in config_maml.py. The config string has the form {ENV_NAME}_{ALG_NAME}, where ENV_NAME is one of MOVE_POINT_ROTATE, MOVE_POINT_ROTATE_SPARSE, CARTPOLE_SENSOR, or HALFCHEETAH_MOTOR, and ALG_NAME is one of DR, MAML, MAML_OFFSET, MAML_LAF, or NORML, as described in the paper.

The MOVE_POINT_ROTATE configurations are fast to train and can converge within minutes. Training MOVE_POINT_ROTATE_SPARSE and CARTPOLE_SENSOR can take up to a day. The HalfCheetah training was done with parallelized workers on a cloud server and can take a long time on a single machine.
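
For instance, following the {ENV_NAME}_{ALG_NAME} naming scheme above, training NoRML on the sparse move-point task should look like this (the config name here is constructed from that scheme rather than copied from the repository):

python -m norml.train_maml --config MOVE_POINT_ROTATE_SPARSE_NORML --logs maml_checkpoints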

We also provide a convenient script to evaluate training performance:

python -m norml.eval_maml \
--model_dir norml/example_checkpoints/move_point_rotate_sparse/norml/all_weights.ckpt-991 \
--output_dir maml_eval_results \
--render=True \
--num_finetune_steps 1 \
--test_task_index 0 \
--eval_finetune=True

You should see state/action logs and, if rendering is enabled, a rendered video in the maml_eval_results folder.
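
As a variation built only from the flags shown above (their semantics are inferred from the flag names), the same checkpoint can be evaluated without rendering and with additional fine-tune steps:

python -m norml.eval_maml \
--model_dir norml/example_checkpoints/move_point_rotate_sparse/norml/all_weights.ckpt-991 \
--output_dir maml_eval_results \
--render=False \
--num_finetune_steps 5 \
--test_task_index 0 \
--eval_finetune=True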

Citing

If you use this code in your research, please cite the following paper:

Yang, Y., Caluwaerts, K., Iscen, A., Tan, J. & Finn, C. (2019). NoRML: No-Reward Meta Learning.

@article{yang2019norml,
  title={NoRML: No-Reward Meta Learning},
  author={Yang, Yuxiang and Caluwaerts, Ken and Iscen, Atil and Tan, Jie and Finn, Chelsea},
  journal={arXiv preprint arXiv:1903.01063},
  year={2019}
}

Disclaimer: This is not an official Google product.
