**Overview**

This notebook shows how to run experiments with Sample-Factory and how to upload models to, and download them from, the Hugging Face Hub. We use OpenAI Gym's Lunar Lander environment as the example.


**Step 1: Install Dependencies**

To run this notebook, we need to install both `sample-factory` and the Lunar Lander environment using `pip`. Using the Hugging Face Hub requires additional setup; instructions can be found at https://alex-petrenko.github.io/sample-factory/get-started/huggingface/

In [None]:
!pip install sample-factory
!pip install gym[box2d]

In [1]:
# Imports and setup

from sf_examples.train_gym_env import make_gym_env_func, parse_custom_args
from sample_factory.envs.env_utils import register_env
from sample_factory.train import run_rl
from sample_factory.enjoy import enjoy
from IPython.display import Video
from sample_factory.huggingface.huggingface_utils import load_from_hf



**Step 2: Create Lunar Lander Environment and Specify Training Parameters**

First, we need to create the Lunar Lander training environment. We do so by registering it with `register_env`, passing Sample-Factory's `make_gym_env_func` as the function that creates the environment.

We also need to specify some parameters for our experiment. Every experiment must specify `algo`, the algorithm used for training; `env`, the environment to run on; and `experiment`, the name under which the model is saved after the run.

Other training parameters can be specified as well. A full list of parameters can be found by running Sample-Factory with the `--help` flag.
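The `--key=value` flags described above can be assembled from a dictionary before they are handed to the argument parser. The snippet below is only a minimal sketch: `to_argv` is a hypothetical helper written for this illustration and is not part of Sample-Factory.

```python
# Hypothetical helper (not part of Sample-Factory): convert a dict of settings
# into the "--key=value" strings that this notebook passes as argv.
def to_argv(params):
    return [f"--{key}={value}" for key, value in params.items()]


# The three parameters the text above lists as required for every experiment.
required = {
    "algo": "APPO",
    "env": "LunarLanderContinuous-v2",
    "experiment": "lunar_lander_example",
}

example_argv = to_argv(required)
print(example_argv)
```

Keeping the settings in a dictionary makes it easy to add or override individual flags before the list is built.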

In [3]:
# Register Lunar Lander environment
register_env("LunarLanderContinuous-v2", make_gym_env_func)

# Initialize the basic arguments for the experiment. These parameters are required to run any experiment.
# The parameters can also be specified on the command line.
experiment_name = "lunar_lander_example"
argv = ["--algo=APPO", "--env=LunarLanderContinuous-v2", f"--experiment={experiment_name}"]
cfg = parse_custom_args(argv=argv, evaluation=False)

# The following parameters can be changed from the default
cfg.reward_scale = 0.05
cfg.train_for_env_steps = 1000000
cfg.gae_lambda = 0.99
cfg.num_workers = 20
cfg.num_envs_per_worker = 6
cfg.seed = 0
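To get a feel for the scale of the settings above, the following sketch works out how many environment copies run in parallel and roughly how many steps each one contributes. It assumes that every worker hosts `num_envs_per_worker` environments and that the step budget is spread evenly across them; both are simplifying assumptions made for this illustration.

```python
# Values copied from the configuration cell above.
num_workers = 20
num_envs_per_worker = 6
train_for_env_steps = 1_000_000

# Assumption: each worker steps its own set of environment copies,
# so the total is a simple product.
total_envs = num_workers * num_envs_per_worker

# Assumption: the step budget is shared evenly by all environment copies.
steps_per_env = train_for_env_steps / total_envs

print(total_envs)            # 120
print(round(steps_per_env))  # 8333
```

If your machine has fewer CPU cores than `num_workers`, lowering that value is a reasonable first adjustment.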

**Step 3: Run Experiment**

Next, we run the experiment, which trains the model using the parameters we specified above.

In [3]:
run_rl(cfg)

[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]


0

**Step 4: Evaluating the Model and Uploading to the Hugging Face Hub**

After training the model, we can use the `enjoy` function to see how well it performs. `enjoy` can also generate a video with the `--save_video` flag and upload the model to the Hub with the `--push_to_hub` flag. When uploading, make sure you also specify `--hf_repository` with your Hugging Face username and repository name in the form `<username>/<repo_name>`.

In [4]:
## Change this to your Hugging Face username
username = "andrewzhang505"

enjoy_args = ["--no_render", "--max_num_episodes=5", "--push_to_hub", f"--hf_repository={username}/{experiment_name}", "--save_video"]
cfg = parse_custom_args(argv=argv+enjoy_args, evaluation=True)
enjoy(cfg)


(0, 93.18016967773437)

In [5]:
Video(f"./train_dir/{experiment_name}/replay.mp4", embed=True)
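The `--hf_repository` value used in this step has to follow the `<username>/<repo_name>` form. A small check before calling `enjoy` can catch a malformed value early. This is only a sketch: `hf_repository_flag` is a hypothetical helper written for this illustration, and its validation rules (non-empty parts, no slashes or spaces) are assumptions rather than the Hub's full naming rules.

```python
# Hypothetical helper (not part of Sample-Factory or the Hugging Face libraries).
def hf_repository_flag(username, repo_name):
    # Assumed minimal checks: each part must be non-empty and must not
    # contain a slash or a space.
    for part in (username, repo_name):
        if not part or "/" in part or " " in part:
            raise ValueError(f"invalid repository component: {part!r}")
    return f"--hf_repository={username}/{repo_name}"


flag = hf_repository_flag("andrewzhang505", "lunar_lander_example")
print(flag)
```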


**Step 5: Downloading Models from the Hub**

You can also download other models from the Hub for your own use. The following cell downloads the model to `./train_dir/sf2-lunar-lander/`; you can then use the model by specifying `--experiment=sf2-lunar-lander`.

In [7]:
load_from_hf("./train_dir", "andrewzhang505/sf2-lunar-lander")

download_args = ["--algo=APPO", "--env=LunarLanderContinuous-v2", "--experiment=sf2-lunar-lander", "--no_render", "--max_num_episodes=1", "--save_video"]
cfg = parse_custom_args(argv=download_args, evaluation=True)
enjoy(cfg)

Video("./train_dir/sf2-lunar-lander/replay.mp4", embed=True)
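The relationship between the repository ID, the download directory, and the `--experiment` value in this step can be sketched as follows. The helper `local_experiment` is hypothetical and only mirrors the mapping described in the text above (the repository name becomes the directory name under `./train_dir`); it does not call `load_from_hf` or touch the network.

```python
# Hypothetical helper: mirrors the mapping described in this step,
# without downloading anything.
def local_experiment(train_dir, repo_id):
    # The part after the last "/" is the repository name, e.g. "sf2-lunar-lander".
    repo_name = repo_id.split("/")[-1]
    path = f"{train_dir.rstrip('/')}/{repo_name}"
    return repo_name, path


experiment, path = local_experiment("./train_dir", "andrewzhang505/sf2-lunar-lander")
print(f"--experiment={experiment}")
print(path)
```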

**Additional Resources**

For more information on using Sample-Factory, check out our website at https://alex-petrenko.github.io/sample-factory/ and our GitHub repository at https://github.com/alex-petrenko/sample-factory/