# RL Environment Setup
### The first practical part of this course uses **Isaac Gym** through **Legged Gym** to train a policy with **Reinforcement Learning (RL)**.
### We start by setting up the environment on an **NVIDIA-powered Ubuntu PC**, configured for reinforcement learning with **Isaac Gym** and **Unitree RL Gym**.

### This involves installing the NVIDIA driver, setting up a Conda environment for policy training, installing PyTorch with CUDA, installing Isaac Gym, and verifying the setup with Unitree RL Gym.

### System requirements and GPU check
### - OS: Ubuntu 18.04 or later (this PC: Ubuntu 22.04)
### - GPU: NVIDIA GPU
### - Driver: Recommended 525+ (ensure installed)

# Run this command to check the GPU and driver

In [None]:

nvidia-smi
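The driver requirement above (525 or newer) can also be checked from Python. The following is a minimal sketch, not part of the original setup: the helper names are hypothetical, and it assumes `nvidia-smi` is on the `PATH` and accepts its `--query-gpu=driver_version --format=csv,noheader` options.

```python
import subprocess
from typing import Optional

MIN_DRIVER_MAJOR = 525  # minimum recommended in the requirements above


def driver_major(version: str) -> int:
    """Return the major component of a driver version such as '535.183.01'."""
    return int(version.strip().split(".")[0])


def driver_meets_minimum(version: str, minimum: int = MIN_DRIVER_MAJOR) -> bool:
    """True if the driver's major version is at least `minimum`."""
    return driver_major(version) >= minimum


def query_driver_version() -> Optional[str]:
    """Ask nvidia-smi for the driver version; None if it cannot be queried."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    # With several GPUs nvidia-smi prints one line per GPU; the driver is shared.
    return lines[0].strip() if lines else None
```

A typical use would be to call `query_driver_version()` and pass the result to `driver_meets_minimum()` before continuing with the installation.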

# Note: recent GPUs (such as the RTX 5070, sm_120) need a PyTorch build compiled against CUDA 12.8 or later, so the CUDA 12.1 build installed below may not support them.

# Create and Activate Conda Environment

In [None]:

### Install Miniconda if it is not already installed

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init --all
source ~/.bashrc


### Create and activate the environment
### (the first command deletes any existing leggGym-rl environment; skip it on a first install):
conda remove --name leggGym-rl --all
conda create -n leggGym-rl python=3.8 -y
conda activate leggGym-rl



# Install PyTorch with CUDA 12.1

In [None]:

conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia

## Specific case
./sh_cmd/install_rl_comp.sh conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia

## Quick check:

python3 -c "import torch; print('PyTorch version:', torch.__version__); print('CUDA available:', torch.cuda.is_available()); print('CUDA version:', torch.version.cuda); print('Current GPU:', torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU only')"
# Or, from a Python session:
import torch
print(torch.__version__, torch.cuda.is_available())

## Specific case
./sh_cmd/install_rl_comp.sh python -c "import torch; print(torch.__version__, torch.cuda.is_available())" 
## OR
./sh_cmd/run_cmd_in_conda38.sh python -c "import torch; print(torch.__version__, torch.cuda.is_available())" 
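`torch.cuda.is_available()` can return `True` even when the installed PyTorch build has no kernels for the GPU's architecture, which is the situation described for very recent GPUs earlier. The sketch below compares the device's compute capability with the architectures the build was compiled for. It is a simplified heuristic under stated assumptions: it only looks at `sm_XY` entries with the same major version and ignores PTX (`compute_XY`) forward compatibility, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def arch_supported(capability: Tuple[int, int], arch_list: Iterable[str]) -> bool:
    """True if some compiled 'sm_XY' entry can run on a device of `capability`.

    Simplified rule: same major version, and compiled minor <= device minor.
    """
    major, minor = capability
    for arch in arch_list:
        if not arch.startswith("sm_"):
            continue  # skip PTX entries such as 'compute_90'
        digits = arch[len("sm_"):].rstrip("abcdefghijklmnopqrstuvwxyz")
        if len(digits) < 2 or not digits.isdigit():
            continue
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        if a_major == major and a_minor <= minor:
            return True
    return False


def check_current_gpu() -> bool:
    """Run the check against GPU 0 of the local machine (requires PyTorch)."""
    import torch  # imported here so arch_supported() stays usable without it

    if not torch.cuda.is_available():
        return False
    return arch_supported(torch.cuda.get_device_capability(0),
                          torch.cuda.get_arch_list())
```

If `check_current_gpu()` returns `False` while `torch.cuda.is_available()` is `True`, the installed build most likely predates the GPU.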

# Download and Install Isaac Gym

In [None]:

# Reference
https://github.com/unitreerobotics/unitree_rl_gym

## Download link
https://developer.nvidia.com/isaac-gym

## Install (from the extracted isaacgym folder)

cd isaacgym/python
pip install -e .


## Verify: run the Isaac Gym example (a window with 1080 balls should open)
cd examples
python 1080_balls_of_solitude.py

# When running Isaac Gym with Python 3.8 in a Conda environment, you might see an error like:
ImportError: libpython3.8.so.1.0: cannot open shared object file

# This happens because some compiled modules (such as gym_38.so) need the Python shared library (libpython3.8.so.1.0) and the system cannot find it.
# Even if the file exists in the Conda environment, the Linux dynamic linker only looks there when LD_LIBRARY_PATH tells it to.

# Temporary fix (per session):
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
python your_script.py

# Permanent fix: create a file in the environment's activate.d folder
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
nano $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

# Add this line:
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib:$LD_LIBRARY_PATH"
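To confirm which half of the problem applies (library missing, or present but not on the search path), a short diagnostic can be run with the environment's own Python. This is a sketch with hypothetical helper names; it assumes `sys.prefix` equals `$CONDA_PREFIX`, which holds when the script is run by the activated environment's interpreter.

```python
import os
import sys
from typing import Optional


def dir_on_library_path(lib_dir: str, ld_library_path: Optional[str]) -> bool:
    """True if `lib_dir` is one of the colon-separated LD_LIBRARY_PATH entries."""
    if not ld_library_path:
        return False
    target = os.path.normpath(lib_dir)
    return any(os.path.normpath(p) == target
               for p in ld_library_path.split(":") if p)


def diagnose(prefix: str = sys.prefix) -> str:
    """Describe the state of libpython3.8.so.1.0 for the given environment prefix."""
    lib_dir = os.path.join(prefix, "lib")
    lib_file = os.path.join(lib_dir, "libpython3.8.so.1.0")
    if not os.path.exists(lib_file):
        return "missing: " + lib_file
    if dir_on_library_path(lib_dir, os.environ.get("LD_LIBRARY_PATH")):
        return "ok: " + lib_dir + " is on LD_LIBRARY_PATH"
    return "found " + lib_file + " but " + lib_dir + " is not on LD_LIBRARY_PATH"
```

Running `print(diagnose())` inside the activated environment reports whether the export above is still needed.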


## If errors persist, check isaacgym/docs/index.html; if the Python 3.8 shared library is missing from the system, install it:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install libpython3.8 libpython3.8-dev

# Install rsl_rl (RL algorithms); the recommended version is v1.0.2

In [None]:
cd ~/RL/LeggGym
git clone https://github.com/leggedrobotics/rsl_rl.git
cd rsl_rl
git checkout v1.0.2
pip install -e .

# Install unitree_rl_gym (the Unitree RL Gym framework)

In [None]:
cd ~/RL/LeggGym
git clone https://github.com/unitreerobotics/unitree_rl_gym.git
cd unitree_rl_gym
pip install -e .

# Install unitree_sdk2_python for Sim2Real

# Needed only if there is a plan for Sim2Real deployment (physical robot)

In [None]:
cd ~/RL/LeggGym
git clone https://github.com/unitreerobotics/unitree_sdk2_python.git
cd unitree_sdk2_python
pip install -e .

# Training Process

## The recommended environment configuration is as follows:

* Operating System: Ubuntu 18.04 or later (Isaac Gym does not run on Windows)
* GPU: NVIDIA (tested); AMD GPUs unverified
* Driver Version: ≥ 525 (ideally 535)
* Virtual Environment: Created using conda

In [None]:
conda activate leggGym-rl
./sh_cmd/run_cmd_in_conda38.sh

# Basic training; the results are stored under logs/ inside the unitree_rl_gym folder
python unitree_rl_gym/legged_gym/scripts/train.py --task=g1

# Visualizing with TensorBoard

In [None]:
# Install TensorBoard if needed
pip install tensorboard

# Launch TensorBoard, preferably in another terminal using Python 3.9 or later
tensorboard --logdir=logs/g1

# Open browser at http://localhost:6006
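Pointing TensorBoard at `logs/g1` shows every run at once. To look at only the most recent one, the run folder can be located first. The sketch below is an assumption-laden helper, not part of unitree_rl_gym: it assumes each run is a sub-folder of the log root, skips the `exported` folder created by the play script, and picks the folder with the newest modification time.

```python
import os
from typing import Optional


def latest_run(log_root: str) -> Optional[str]:
    """Return the most recently modified run folder under `log_root`, if any."""
    candidates = []
    for name in os.listdir(log_root):
        path = os.path.join(log_root, name)
        # 'exported' holds exported policies, not training logs.
        if name == "exported" or not os.path.isdir(path):
            continue
        candidates.append(path)
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```

The returned path can then be passed to `tensorboard --logdir=...` in place of `logs/g1`.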

# Additional parameters
## To run on CPU, add the arguments --sim_device=cpu --rl_device=cpu (simulation on CPU with RL on GPU is also possible).
## To run headless (no rendering), add --headless.

# The following command line arguments override the values set in the config files:

* --task TASK: Task name.
* --resume: Resume training from a checkpoint
* --experiment_name EXPERIMENT_NAME: Name of the experiment to run or load.
* --run_name RUN_NAME: Name of the run.
* --load_run LOAD_RUN: Name of the run to load when resume=True. If -1: will load the last run.
* --checkpoint CHECKPOINT: Saved model checkpoint number. If -1: will load the last checkpoint.
* --num_envs NUM_ENVS: Number of environments to create.
* --seed SEED: Random seed.
* --max_iterations MAX_ITERATIONS: Maximum number of training iterations.
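When launching many training runs from a script, the arguments listed above can be assembled programmatically. The following is a minimal sketch: `build_train_command` is a hypothetical helper, the flag names are the ones from the list, and the default script path assumes the command is run from inside the unitree_rl_gym folder.

```python
from typing import List


def build_train_command(task, headless=False, resume=False,
                        experiment_name=None, run_name=None, load_run=None,
                        checkpoint=None, num_envs=None, seed=None,
                        max_iterations=None,
                        script="legged_gym/scripts/train.py") -> List[str]:
    """Build the argument list for train.py from the documented options."""
    cmd = ["python", script, "--task=" + str(task)]
    # Boolean switches take no value.
    if headless:
        cmd.append("--headless")
    if resume:
        cmd.append("--resume")
    # Valued options are added only when explicitly set.
    valued = [
        ("experiment_name", experiment_name),
        ("run_name", run_name),
        ("load_run", load_run),
        ("checkpoint", checkpoint),
        ("num_envs", num_envs),
        ("seed", seed),
        ("max_iterations", max_iterations),
    ]
    for name, value in valued:
        if value is not None:
            cmd.append("--{}={}".format(name, value))
    return cmd
```

The resulting list can be handed directly to `subprocess.run(cmd, check=True)`; returning a list rather than a string avoids shell quoting issues with run names.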

# Advanced Training Options

In [None]:
# The commands below use paths relative to the unitree_rl_gym folder, so run them from inside it.
# Headless mode (faster, no GUI)
python legged_gym/scripts/train.py --task=g1 --headless

# Custom number of environments
python legged_gym/scripts/train.py --task=g1 --num_envs=2048 --headless

# Resume from checkpoint
python legged_gym/scripts/train.py --task=g1 --resume --load_run="Jan01_12-00-00"

# Custom experiment name
python legged_gym/scripts/train.py --task=g1 --experiment_name="g1_walking" --run_name="test_v1"

# Specific GPU device
python legged_gym/scripts/train.py --task=g1 --sim_device=cuda:0 --rl_device=cuda:0

# Set maximum iterations
python legged_gym/scripts/train.py --task=g1 --max_iterations=5000

# Testing Trained Policies (Play Mode)
### After training, the policy stored in the logs can be tested with visualization in Isaac Gym to check that its behavior makes sense:
## **Note: the play script exports the final policy of the training as policy_lstm_1.pt and places it in the unitree_rl_gym/logs/g1/exported/policies directory**

In [None]:
# Play with latest checkpoint 
python unitree_rl_gym/legged_gym/scripts/play.py --task=g1 --num_envs=1

# Sim2Sim: validate the policy in MuJoCo

In [None]:
python deploy/deploy_mujoco/deploy_mujoco.py {config_name}  # config_name is the configuration file; the default search path is unitree_rl_gym/deploy/deploy_mujoco/configs/

# Important: place the policy in unitree_rl_gym/deploy/pre_train/g1/ and update the policy name in the .yaml file
python unitree_rl_gym/deploy/deploy_mujoco/deploy_mujoco.py g1.yaml
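Copying the exported policy into the deployment folder can be scripted. The sketch below uses only the paths mentioned in this notebook (`logs/g1/exported/policies` and `deploy/pre_train/g1`); `stage_policy` is a hypothetical helper, and it does not edit the .yaml file, which still has to reference the copied file name.

```python
import os
import shutil


def stage_policy(repo_root: str, policy_name: str = "policy_lstm_1.pt") -> str:
    """Copy the exported policy into deploy/pre_train/g1 and return the new path."""
    src = os.path.join(repo_root, "logs", "g1", "exported", "policies", policy_name)
    dst_dir = os.path.join(repo_root, "deploy", "pre_train", "g1")
    if not os.path.isfile(src):
        raise FileNotFoundError(src)  # run play.py first so the policy is exported
    os.makedirs(dst_dir, exist_ok=True)
    # copy2 keeps timestamps, which helps tell policies from different runs apart.
    return shutil.copy2(src, dst_dir)
```

For example, `stage_policy(os.path.expanduser("~/RL/LeggGym/unitree_rl_gym"))` would stage the policy produced by the play step.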

## Possible issue

Exported policy as jit script to:  /home/jean/RL/LeggGym/unitree_rl_gym/logs/g1/exported/policies
^CTraceback (most recent call last):
  File "unitree_rl_gym/legged_gym/scripts/play.py", line 51, in <module>
    play(args)
  File "unitree_rl_gym/legged_gym/scripts/play.py", line 44, in play
    obs, _, rews, dones, infos = env.step(actions.detach())
  File "/home/jean/RL/LeggGym/unitree_rl_gym/legged_gym/envs/base/legged_robot.py", line 59, in step
    self.render()
  File "/home/jean/RL/LeggGym/unitree_rl_gym/legged_gym/envs/base/base_task.py", line 111, in render
    self.gym.draw_viewer(self.viewer, self.sim, True)
KeyboardInterrupt

(leggGym-rl) ~/RL/LeggGym$ python unitree_rl_gym/deploy/deploy_mujoco/deploy_mujoco.py g1.yaml
libGL error: MESA-LOADER: failed to open radeonsi: /usr/lib/dri/radeonsi_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: radeonsi
libGL error: MESA-LOADER: failed to open radeonsi: /usr/lib/dri/radeonsi_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: radeonsi
libGL error: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: swrast
/home/jean/anaconda3/envs/leggGym-rl/lib/python3.8/site-packages/glfw/__init__.py:917: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
ERROR: could not create window

Press Enter to exit ...


### The errors **GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'** and **ERROR: could not create window** indicate that **MuJoCo (via GLFW/OpenGL)** failed to create an **OpenGL** context on my computer because the **OpenGL drivers** are missing or not loaded correctly.

### To check which GPUs are present (NVIDIA or AMD), I run lspci | grep VGA; the result shows:



In [None]:
lspci | grep VGA

# Result 

01:00.0 VGA compatible controller: NVIDIA Corporation Device 28a0 (rev a1)
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 1900 (rev c4)

# The dual-GPU setup can confuse OpenGL and MuJoCo unless the drivers and environment variables are configured correctly
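The `lspci` output can be classified automatically to detect a hybrid setup like this one. A minimal sketch, with hypothetical function names, that only matches the vendor strings visible in the output above plus Intel:

```python
from typing import List


def gpu_vendors(lspci_output: str) -> List[str]:
    """Return one vendor label per 'VGA compatible controller' line."""
    vendors = []
    for line in lspci_output.splitlines():
        _, found, desc = line.partition("VGA compatible controller:")
        if not found:
            continue
        desc = desc.lower()
        if "nvidia" in desc:
            vendors.append("nvidia")
        elif "advanced micro devices" in desc or "amd" in desc:
            vendors.append("amd")
        elif "intel" in desc:
            vendors.append("intel")
        else:
            vendors.append("other")
    return vendors


def is_hybrid(lspci_output: str) -> bool:
    """True if GPUs from more than one vendor are present."""
    return len(set(gpu_vendors(lspci_output))) > 1
```

The text to analyse can be obtained with `subprocess.run(["lspci"], capture_output=True, text=True).stdout`.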

# Then 

nvidia-smi
glxinfo | grep "OpenGL"

# Result 

OpenGL vendor string: AMD
OpenGL renderer string: GFX1103_R1 (gfx1103_r1, LLVM 15.0.7, DRM 3.57, 6.8.0-87-generic)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 23.2.1-1ubuntu3.1~22.04.3

# This is a hybrid AMD/NVIDIA laptop: the AMD GPU drives the display and OpenGL is currently bound to AMD (Mesa). MuJoCo therefore tried to create its OpenGL context through AMD's Mesa driver instead of NVIDIA's EGL, and failed.
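The same check can be automated by reading the vendor line from the `glxinfo` output. A small sketch with hypothetical helper names; it assumes the `OpenGL vendor string:` line format shown in the result above.

```python
from typing import Optional


def opengl_vendor(glxinfo_output: str) -> Optional[str]:
    """Extract the value of the 'OpenGL vendor string:' line, if present."""
    for line in glxinfo_output.splitlines():
        if line.startswith("OpenGL vendor string:"):
            return line.split(":", 1)[1].strip()
    return None


def renders_on_nvidia(glxinfo_output: str) -> bool:
    """True if OpenGL is currently bound to an NVIDIA driver."""
    vendor = opengl_vendor(glxinfo_output)
    return vendor is not None and "nvidia" in vendor.lower()
```

When `renders_on_nvidia` returns `False` on a machine that has an NVIDIA GPU, the rendering environment variables discussed in this section are worth checking.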

## To work around this, I created a Python venv used only for MuJoCo

In [None]:
## Create a new virtual environment for MuJoCo
cd ~/RL/LeggGym 
python3.10 -m venv ~/mujoco-rl
source ~/mujoco-rl/bin/activate

# Install dependencies
pip install mujoco==3.2.3 gym numpy torch==2.3.1 matplotlib opencv-python tqdm
# Then export the path to unitree_rl_gym
export PYTHONPATH=$PYTHONPATH:/home/jean/RL/LeggGym/unitree_rl_gym

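# To make this permanent, add the following to ~/.bashrc
# (mujoco-rl is a venv, so the check uses VIRTUAL_ENV rather than CONDA_DEFAULT_ENV)
if [[ -n "$VIRTUAL_ENV" && "$(basename "$VIRTUAL_ENV")" == "mujoco-rl" ]]; then
    export PYTHONPATH=/home/jean/RL/LeggGym/unitree_rl_gym:$PYTHONPATH
    export MUJOCO_GL=egl
    export EGL_DEVICE_ID=0
    export DRI_PRIME=1
fi
# Then source it again with the venv active
source ~/.bashrc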

# Then run
python ~/RL/LeggGym/unitree_rl_gym/deploy/deploy_mujoco/deploy_mujoco.py g1.yaml



# =========================================
# Setup MuJoCo environment for LeggedGym
# Universal version for Linux (venv or conda)
# =========================================

# Step 1: Create & activate virtual environment (example using Python venv)
cd "$HOME/RL/LeggGym" || exit
python3.10 -m venv "$HOME/mujoco-rl"
source "$HOME/mujoco-rl/bin/activate"

# Step 2: Install dependencies
pip install mujoco==3.2.3 gym numpy torch==2.3.1 matplotlib opencv-python tqdm

# Step 3: Set PYTHONPATH dynamically
export LEGGM_PATH="$HOME/RL/LeggGym/unitree_rl_gym"
export PYTHONPATH="$LEGGM_PATH:$PYTHONPATH"

# Step 4: Set MuJoCo rendering variables if in mujoco-rl venv or conda env
if [[ -n "$VIRTUAL_ENV" && "$(basename "$VIRTUAL_ENV")" == "mujoco-rl" ]] || \
   [[ -n "$CONDA_DEFAULT_ENV" && "$CONDA_DEFAULT_ENV" == "mujoco-rl" ]]; then
    export MUJOCO_GL=egl
    export EGL_DEVICE_ID=0
    export DRI_PRIME=1
fi

# Step 5: Optional - make permanent by adding to ~/.bashrc
bashrc_entry=$(cat <<'EOF'

# LeggedGym + MuJoCo environment setup
export LEGGM_PATH="$HOME/RL/LeggGym/unitree_rl_gym"
export PYTHONPATH="$LEGGM_PATH:$PYTHONPATH"
if [[ -n "$VIRTUAL_ENV" && "$(basename "$VIRTUAL_ENV")" == "mujoco-rl" ]] || \
   [[ -n "$CONDA_DEFAULT_ENV" && "$CONDA_DEFAULT_ENV" == "mujoco-rl" ]]; then
    export MUJOCO_GL=egl
    export EGL_DEVICE_ID=0
    export DRI_PRIME=1
fi

EOF
)
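# Append only once. Match on the marker comment: grep treats a multi-line pattern as
# separate per-line patterns, so matching the whole block would succeed on any "fi" line.
grep -qF "# LeggedGym + MuJoCo environment setup" ~/.bashrc || echo "$bashrc_entry" >> ~/.bashrc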
source ~/.bashrc

# Step 6: Run deployment script
python "$LEGGM_PATH/deploy/deploy_mujoco/deploy_mujoco.py" g1.yaml
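When the deployment script is started from Python rather than from a configured shell, the same variables can be passed through the child process environment without touching `~/.bashrc`. This is a sketch under the assumptions of the script above: `mujoco_env` is a hypothetical helper, and the variable values are the ones exported in Steps 3 and 4.

```python
import os
from typing import Dict, Mapping


def mujoco_env(base: Mapping[str, str], repo_path: str) -> Dict[str, str]:
    """Return a copy of `base` with the MuJoCo rendering variables applied."""
    env = dict(base)  # copy, so the caller's environment is left untouched
    existing = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = repo_path + (os.pathsep + existing if existing else "")
    env.update({
        "MUJOCO_GL": "egl",    # same values as in the shell script above
        "EGL_DEVICE_ID": "0",
        "DRI_PRIME": "1",
    })
    return env
```

A possible call is `subprocess.run(["python", "deploy/deploy_mujoco/deploy_mujoco.py", "g1.yaml"], env=mujoco_env(os.environ, repo_path))`, where `repo_path` points to the unitree_rl_gym folder.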


# Sim2Real: transfer the trained controller to the real robot

## Make sure that:
#### The result from the second simulator (MuJoCo) looks correct before trying the policy on the real robot; otherwise, retrain after tuning the parameters
#### The robot is in debug mode

In [None]:
python deploy/deploy_real/deploy_real.py {net_interface} {config_name}

# net_interface: Network card name connected to the robot, e.g., enp3s0
# config_name: Configuration file located in deploy/deploy_real/configs/, e.g., g1.yaml
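Before starting a real-robot deployment, it can help to verify that the chosen network interface actually exists on the machine. A minimal sketch using the standard library: `interface_exists` is a hypothetical helper, and `socket.if_nameindex()` is used to list the local interfaces.

```python
import socket
from typing import Iterable, Optional


def interface_exists(name: str, interfaces: Optional[Iterable[str]] = None) -> bool:
    """True if `name` is among the given interfaces (default: this machine's)."""
    if interfaces is None:
        # if_nameindex() returns (index, name) pairs for every local interface.
        interfaces = [if_name for _, if_name in socket.if_nameindex()]
    return name in list(interfaces)
```

For instance, checking `interface_exists("enp3s0")` before running `deploy_real.py` catches a mistyped interface name early.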