**Overview**

This notebook explains how to run experiments with Sample-Factory and how to upload models to, and download models from, the Hugging Face Hub. We use OpenAI Gym's Lunar Lander environment as an example.


**Step 1: Install Dependencies**

To run this notebook, we need to install both `sample-factory` and the Lunar Lander environment using `pip`. Using the Hugging Face Hub requires additional setup; instructions can be found at https://alex-petrenko.github.io/sample-factory/get-started/huggingface/

In [None]:
!pip install sample-factory
!pip install "gym[box2d]"

**Step 2: Create Lunar Lander Environment and Specify Training Parameters**

First, we need to create the Lunar Lander training environment. We do so by registering it with `register_env`, passing Sample-Factory's `make_gym_env_func` as the function that creates the environment.

We also need to specify some parameters for our experiment. Every experiment must specify `algo`, the algorithm used for training; `env`, the environment to run on; and `experiment`, which determines where the model is saved after the experiment runs.

Other training parameters can be specified as well. A full list of parameters can be found by running Sample-Factory with the `--help` flag.

In [3]:
from sf_examples.train_gym_env import make_gym_env_func, parse_custom_args
from sample_factory.envs.env_utils import register_env

# Register Lunar Lander environment
register_env("LunarLanderContinuous-v2", make_gym_env_func)

# Initialize basic arguments for running the experiment. These parameters are required to run any experiment
# The parameters can also be specified in the command line
experiment_name = "lunar_lander_example"
argv = ["--algo=APPO", "--env=LunarLanderContinuous-v2", f"--experiment={experiment_name}"]
cfg = parse_custom_args(argv=argv, evaluation=False)

# The following parameters can be changed from the default
cfg.reward_scale = 0.05
cfg.train_for_env_steps = 5000000
cfg.gae_lambda = 0.99
cfg.num_workers = 20
cfg.num_envs_per_worker = 6
cfg.seed = 0
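
The comment in the cell above notes that these parameters can also be given on the command line. As a minimal sketch, assuming each `cfg` attribute has a `--name=value` flag of the same name (the pattern the three required flags follow), the overrides could be turned into extra `argv` entries. The helper `overrides_to_flags` is hypothetical and is not part of Sample-Factory.

```python
# Hypothetical helper (not part of Sample-Factory): turn a dict of
# parameter overrides into "--name=value" strings.
# Assumption: each cfg attribute has a command-line flag with the same name.

def overrides_to_flags(overrides):
    """Return one '--name=value' flag per override, in dict order."""
    return [f"--{name}={value}" for name, value in overrides.items()]


# Same values as the cfg assignments in the cell above
overrides = {
    "reward_scale": 0.05,
    "train_for_env_steps": 5000000,
    "gae_lambda": 0.99,
    "num_workers": 20,
    "num_envs_per_worker": 6,
    "seed": 0,
}

extra_flags = overrides_to_flags(overrides)
print(extra_flags)
```

Under that assumption, the resulting list could be appended to `argv` before calling `parse_custom_args`, instead of assigning the `cfg` attributes one by one.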

**Step 3: Run Experiment**

Next, we run training with the parameters specified above.

In [9]:
from sample_factory.train import run_rl

run_rl(cfg)

[2022-10-26 22:36:54,887][12030] Saved parameter configuration for experiment lunar_lander_example not found!
[2022-10-26 22:36:54,889][12030] Starting experiment from scratch!
[2022-10-26 22:36:54,894][12030] Experiment dir /home/andrew_huggingface_co/sample-factory/train_dir/lunar_lander_example already exists!
[2022-10-26 22:36:54,894][12030] Resuming existing experiment from /home/andrew_huggingface_co/sample-factory/train_dir/lunar_lander_example...
[2022-10-26 22:36:54,895][12030] Weights and Biases integration disabled
[2022-10-26 22:36:54,897][12030] Environment var CUDA_VISIBLE_DEVICES is 0
[2022-10-26 22:36:55,856][20411] Env info: EnvInfo(obs_space=Dict('obs': Box([-1.5       -1.5       -5.        -5.        -3.1415927 -5.
 -0.        -0.       ], [1.5       1.5       5.        5.        3.1415927 5.        1.
 1.       ], (8,), float32)), action_space=Box(-1.0, 1.0, (2,), float32), num_agents=1, gpu_actions=Fal

2

[2022-10-26 22:52:49,952][20457] Stopping RolloutWorker_w4...
[2022-10-26 22:52:49,953][20459] Stopping RolloutWorker_w5...


**Step 4: Evaluating the Model and Uploading to the Hugging Face Hub**

After training the model, we can use the `enjoy` function to evaluate how well it performs. `enjoy` can also generate a video with the `--save_video` flag and upload the model to the Hub with the `--push_to_hub` flag. When uploading, make sure you also specify `--hf_repository` with your Hugging Face username and repository name in the form `<username>/<repo_name>`.
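
Because a malformed `--hf_repository` value is an easy mistake, a small check before calling `enjoy` may help. The sketch below only builds and validates the `<username>/<repo_name>` string. `make_repo_id` is a hypothetical helper, and the rule it enforces (two non-empty parts with no `/` or surrounding whitespace) is an assumption based on the form given above, not the Hub's full naming rules.

```python
# Hypothetical helper (not part of Sample-Factory or the Hub client).

def make_repo_id(username, repo_name):
    """Build '<username>/<repo_name>' and reject obviously malformed parts."""
    for part in (username, repo_name):
        # Assumed rule: non-empty, no slash, no leading/trailing whitespace
        if not part or "/" in part or part != part.strip():
            raise ValueError(f"invalid repository component: {part!r}")
    return f"{username}/{repo_name}"


repo_id = make_repo_id("andrewzhang505", "lunar_lander_example")

# Flags taken from the text above; the repository ID is filled in from the helper
upload_flags = ["--push_to_hub", f"--hf_repository={repo_id}"]
print(upload_flags)
```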

In [2]:
from sample_factory.enjoy import enjoy

## CHANGE THIS
username = "andrewzhang505"

enjoy_args = ["--no_render", "--max_num_episodes=5", "--push_to_hub", f"--hf_repository={username}/{experiment_name}", "--save_video"]
cfg = parse_custom_args(argv=argv+enjoy_args, evaluation=True)
enjoy(cfg)

[2022-10-27 19:50:18,805][14511] Loading existing experiment configuration from /home/andrew_huggingface_co/sample-factory/train_dir/lunar_lander2/cfg.json
[2022-10-27 19:50:18,807][14511] Adding new argument 'fps'=0 that is not in the saved config file!
[2022-10-27 19:50:18,808][14511] Adding new argument 'eval_env_frameskip'=None that is not in the saved config file!
[2022-10-27 19:50:18,808][14511] Adding new argument 'no_render'=True that is not in the saved config file!
[2022-10-27 19:50:18,809][14511] Adding new argument 'save_video'=True that is not in the saved config file!
[2022-10-27 19:50:18,809][14511] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2022-10-27 19:50:18,810][14511] Adding new argument 'video_name'=None that is not in the saved config file!
[2022-10-27 19:50:18,810][14511] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config

(0, 126.57706604003906)

In [1]:
from IPython.display import Video

Video(f"./train_dir/{experiment_name}/replay.mp4")

**Step 5: Downloading Models from the Hub**

You can also download other models from the Hub for your own use. The following cell downloads the model to `./train_dir/sf2-lunar-lander/`; you can then use the model by specifying `--experiment=sf2-lunar-lander`.
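
The relationship between the repository ID, the download directory, and the `--experiment` value can be sketched as follows. `local_experiment` is a hypothetical helper; it assumes, as in the example above, that the experiment directory is named after the part of the repository ID that follows the `/`.

```python
from pathlib import PurePosixPath


# Hypothetical helper (not part of Sample-Factory).
def local_experiment(train_dir, repo_id):
    """Return (experiment_name, experiment_dir) for a '<username>/<repo_name>' ID."""
    # Assumption: the local directory is named after the repository name
    experiment_name = repo_id.split("/")[-1]
    return experiment_name, PurePosixPath(train_dir) / experiment_name


name, path = local_experiment("./train_dir", "andrewzhang505/sf2-lunar-lander")
print(f"--experiment={name}")
print(path)
```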

In [5]:
from sample_factory.huggingface.huggingface_utils import load_from_hf

load_from_hf("./train_dir", "andrewzhang505/sf2-lunar-lander")

download_args = ["--algo=APPO", "--env=LunarLanderContinuous-v2", "--experiment=sf2-lunar-lander", "--no_render", "--max_num_episodes=1", "--save_video"]
cfg = parse_custom_args(argv=download_args, evaluation=True)
enjoy(cfg)

Video("./train_dir/sf2-lunar-lander/replay.mp4")

/home/andrew_huggingface_co/sample-factory/./train_dir/sf2-lunar-lander is already a clone of https://huggingface.co/andrewzhang505/sf2-lunar-lander. Make sure you pull the latest changes with `repo.git_pull()`.
[37m[1m[2022-10-27 21:07:10,839][31193] The repository andrewzhang505/sf2-lunar-lander has been cloned to ./train_dir/sf2-lunar-lander[0m
[33m[2022-10-27 21:07:10,862][31193] Loading existing experiment configuration from /home/andrew_huggingface_co/sample-factory/train_dir/sf2-lunar-lander/cfg.json[0m
[36m[2022-10-27 21:07:10,864][31193] Overriding arg 'experiment' with value 'sf2-lunar-lander' passed from command line[0m
[36m[2022-10-27 21:07:10,864][31193] Adding new argument 'fps'=0 that is not in the saved config file![0m
[36m[2022-10-27 21:07:10,865][31193] Adding new argument 'eval_env_frameskip'=None that is not in the saved config file![0m
[36m[2022-10-27 21:07:10,865][31193] Adding new argument 'no_render'=True that is not in the saved config file![0m
[3

**Additional Resources**

For more information on using Sample-Factory, see the documentation at https://alex-petrenko.github.io/sample-factory/ and the GitHub repository at https://github.com/alex-petrenko/sample-factory/