# 2024 NeurIPS - MyoChallenge

## <center> Welcome to the [**2024 NeurIPS - MyoChallenge: Physiological Dexterity and Agility in Enhanced Humans**](https://sites.google.com/view/myosuite/myochallenge/myochallenge-2024) </center>

# 1. Setting up the environment

In [1]:
!pip install myosuite==2.5.0
!pip install stable-baselines3[extra]  --quiet
!pip install tqdm  --quiet
!pip install sk-video
%env MUJOCO_GL=egl
import mujoco


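Outside a notebook the `%env` magic is not available. A minimal sketch of a plain-Python equivalent follows, assuming the variable has to be set before `mujoco` is first imported. The helper name `select_mujoco_backend` is hypothetical, and the listed backend names are an assumption to check against the MuJoCo documentation for your installed version.

```python
import os

# Rendering backends commonly accepted through MUJOCO_GL
# (assumption: verify against the MuJoCo docs for your version).
KNOWN_BACKENDS = ("egl", "osmesa", "glfw")


def select_mujoco_backend(backend="egl"):
    """Set MUJOCO_GL for this process (hypothetical helper, not part of MyoSuite)."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend {backend!r}; expected one of {KNOWN_BACKENDS}")
    os.environ["MUJOCO_GL"] = backend
    return backend


# Call this before `import mujoco` so the setting is picked up.
select_mujoco_backend("egl")
```

On a headless machine without GPU drivers, `"osmesa"` (software rendering) may be the option that works, at the cost of slower rendering.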

### Define a helper to display rendered videos inside Colab

In [2]:
from IPython.display import HTML
from base64 import b64encode

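def show_video(video_path, video_width=400):
    """Embed an MP4 file in the notebook as a base64 data URL."""
    # Read in binary mode; the context manager closes the file afterwards
    with open(video_path, "rb") as f:
        video_file = f.read()

    video_url = f"data:video/mp4;base64,{b64encode(video_file).decode()}"
    return HTML(f"""<video autoplay width={video_width} controls><source src="{video_url}"></video>""")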
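The helper above embeds the whole video in the page as a base64 data URL, so no separate file has to be served. The sketch below shows that encoding step with the standard library only. It uses made-up bytes in place of a real MP4 file, and the function name `to_data_url` is illustrative.

```python
from base64 import b64decode, b64encode


def to_data_url(payload: bytes, mime_type: str = "video/mp4") -> str:
    """Encode raw bytes as a data URL, as show_video does for the MP4 file."""
    return f"data:{mime_type};base64,{b64encode(payload).decode()}"


fake_video = b"\x00\x01not-a-real-mp4"  # placeholder bytes, not real video data
url = to_data_url(fake_video)

# Everything after the first comma decodes back to the original bytes.
header, encoded = url.split(",", 1)
print(header)                            # data:video/mp4;base64
print(b64decode(encoded) == fake_video)  # True
```

Base64 inflates the payload by roughly a third, so this approach suits short clips rather than long recordings.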


### All the `MyoSuite` imports needed to run this tutorial

In [3]:
import myosuite
from myosuite.utils import gym
import skvideo.io
import numpy as np
import os
from stable_baselines3 import PPO
from tqdm.notebook import tqdm


pygame 2.6.0 (SDL 2.28.4, Python 3.10.12)
Hello from the pygame community. https://www.pygame.org/contribute.html
MyoSuite:> Registering Myo Envs


## Creating the MyoChallenge '24 environment and training your model

### Loading the Locomotion Challenge Env

As a basic example, we use PPO to train a policy in the locomotion environment, where the task is to walk over different types of terrain. The environment also provides some helper functions for initialization, so feel free to explore them.

In [4]:
# Create the MyoChallenge environment and train the model with PPO

"""
Preset environment modes are ['init', 'random', 'osl_init'].
Select one by passing the "reset_type" argument.

1. init - Resets the model to a neutral standing pose. The OSL state machine is initialized with 'e_stance'.

2. random - Resets the model to a random pose. The OSL state machine is initialized with 'e_stance'.
IMPT: The state machine is not guaranteed to be stable, since the joint positions may not match the thresholds in the state transitions.

3. osl_init - Resets the model to a pose sampled from a sample gait trajectory.
IMPT: The state machine is initialized according to the sampled pose. Sampled poses are within the state transition thresholds, so this mode is more stable than "random".
"""

env = gym.make('myoChallengeRunTrackP1-v0', reset_type='init')

model = PPO("MlpPolicy", env, verbose=0)
# A very short run that only checks the pipeline works; increase total_timesteps for real training
model.learn(total_timesteps=100)

    MyoSuite: A contact-rich simulation suite for musculoskeletal motor control
        Vittorio Caggiano, Huawei Wang, Guillaume Durandau, Massimo Sartori, Vikash Kumar
        L4DC-2019 | https://sites.google.com/view/myosuite


<stable_baselines3.ppo.ppo.PPO at 0x76f77551e2c0>
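
The three `reset_type` modes listed in the cell above can be guarded with a small wrapper that rejects misspelled mode names early. This is only a sketch: `make_run_track_env` is a hypothetical helper, and it assumes `gym.make` forwards `reset_type` to the environment exactly as in the cell above.

```python
RESET_TYPES = ("init", "random", "osl_init")


def make_run_track_env(reset_type="init", env_id="myoChallengeRunTrackP1-v0"):
    """Validate reset_type, then build the environment (hypothetical helper)."""
    if reset_type not in RESET_TYPES:
        raise ValueError(f"reset_type must be one of {RESET_TYPES}, got {reset_type!r}")
    # Imported lazily so the validation above also works without MyoSuite installed.
    from myosuite.utils import gym
    return gym.make(env_id, reset_type=reset_type)
```

For example, `make_run_track_env('osl_init')` would start each episode from a pose sampled from the gait trajectory, while a misspelled mode raises a `ValueError` before any environment is built.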

In [5]:
# Evaluate the policy
all_rewards = []
for _ in tqdm(range(5)):  # Randomization over different terrain types
    ep_rewards = []
    done = False
    env.reset()
    while not done:
        # Build the observation vector from the environment's current observation dictionary
        obs = env.obsdict2obsvec(env.obs_dict, env.obs_keys)[1]
        # Get the next action from the policy
        action, _ = model.predict(obs, deterministic=True)
        # Take the action; the Gymnasium-style step returns five values
        obs, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated
        ep_rewards.append(reward)
    all_rewards.append(np.sum(ep_rewards))
print(f"Average reward: {np.mean(all_rewards)} over {len(all_rewards)} episodes")

Average reward: 3.5469735733090557 over 5 episodes
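
The evaluation loop can also be factored into a reusable function. The sketch below assumes the Gymnasium-style API, where `reset` returns `(obs, info)` and `step` returns five values. `run_episode` and `CountdownEnv` are illustrative names, and the toy environment only stands in for the MyoChallenge one so that the snippet runs on its own.

```python
class CountdownEnv:
    """Toy stand-in environment: pays reward 1.0 per step, ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon  # time limit reached
        return self.t, 1.0, False, truncated, {}


def run_episode(env, policy, max_steps=10_000):
    """Roll out one episode and return the summed reward."""
    obs, info = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        if terminated or truncated:
            break
    return total


print(run_episode(CountdownEnv(horizon=3), policy=lambda obs: 0))  # 3.0
```

With the trained model, `policy` could wrap `model.predict`, for example `lambda obs: model.predict(obs, deterministic=True)[0]`.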


## Rendering your policy

You can render your policy on the task with the built-in renderer below.

In [6]:
# Render the trained policy
frames = []
for _ in tqdm(range(5)):  # Randomization over different terrain types
    env.reset()
    for _ in range(20):
        obs = env.obsdict2obsvec(env.obs_dict, env.obs_keys)[1]
        # Get the next action from the policy
        action, _ = model.predict(obs, deterministic=True)
        # Hide all geoms in group 1 by setting their alpha channel to 0
        geom_1_indices = np.where(env.sim.model.geom_group == 1)
        env.sim.model.geom_rgba[geom_1_indices, 3] = 0
        frame = env.sim.renderer.render_offscreen(
            width=400,
            height=400,
            camera_id=1)
        frames.append(frame)
        # Take the action; the Gymnasium-style step returns five values
        obs, reward, terminated, truncated, info = env.step(action)
        # Stop this rollout early if the episode ends
        if terminated or truncated:
            break

env.close()

os.makedirs('videos', exist_ok=True)
# Save the frames as an MP4 file and display it
skvideo.io.vwrite('videos/test_policy.mp4', np.asarray(frames), outputdict={"-pix_fmt": "yuv420p"})
show_video('videos/test_policy.mp4')
