<a href="https://colab.research.google.com/github/dhruv0000/neural-robot-dynamics/blob/main/train_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neural Robot Dynamics Training on Colab

This notebook demonstrates how to setup the environment, generate a dataset, and train the NeRD model.

In [9]:
# 1. Setup Environment
!git clone https://github.com/dhruv0000/neural-robot-dynamics.git
%cd neural-robot-dynamics
!pip install -r requirements.txt
!pip install warp-lang
!pip install rl_games

Cloning into 'neural-robot-dynamics'...
remote: Enumerating objects: 457, done.[K
remote: Counting objects: 100% (457/457), done.[K
remote: Compressing objects: 100% (334/334), done.[K
remote: Total 457 (delta 110), reused 425 (delta 79), pack-reused 0 (from 0)[K
Receiving objects: 100% (457/457), 17.29 MiB | 17.49 MiB/s, done.
Resolving deltas: 100% (110/110), done.
Filtering content: 100% (11/11), 202.03 MiB | 78.45 MiB/s, done.
/content/neural-robot-dynamics/train/neural-robot-dynamics
Collecting pyglet==2.1.6 (from -r requirements.txt (line 2))
  Using cached pyglet-2.1.6-py3-none-any.whl.metadata (7.7 kB)
Collecting ipdb (from -r requirements.txt (line 3))
  Using cached ipdb-0.13.13-py3-none-any.whl.metadata (14 kB)
Collecting h5py==3.11.0 (from -r requirements.txt (line 4))
  Using cached h5py-3.11.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.5 kB)
Collecting pyyaml==6.0.2 (from -r requirements.txt (line 5))
  Using cached PyYAML-6.0.2-cp312-cp312

In [10]:
# 2. Generate Dataset
# We generate a smaller dataset for demonstration purposes.

%cd generate

# Generate Training Data
!python generate_dataset_contact_free.py --env-name Cartpole --num-transitions 10000 --dataset-dir ../data/datasets/ --dataset-name trajectory_len-100_train.hdf5 --trajectory-length 100 --num-envs 64 --seed 0

# Generate Validation Data
!python generate_dataset_contact_free.py --env-name Cartpole --num-transitions 2000 --dataset-dir ../data/datasets/ --dataset-name trajectory_len-100_valid.hdf5 --trajectory-length 100 --num-envs 64 --seed 10

%cd ..

/content/neural-robot-dynamics/train/neural-robot-dynamics/generate
Warp 1.8.0 initialized:
   CUDA Toolkit 12.8, Driver 12.4
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "Tesla T4" (15 GiB, sm_75, mempool enabled)
   Kernel cache:
     /root/.cache/warp/1.8.0
[96m [NeuralEnvironment] Creating abstract contact environment: Cartpole. [0m
Creating 64 environments: 100% 64/64 [00:00<00:00, 198.69it/s]
Module warp.sim.integrator_featherstone 18b3327 load on device 'cuda:0' took 21390.81 ms  (compiled)
Module envs.abstract_contact_environment 8e8d790 load on device 'cuda:0' took 550.49 ms  (compiled)
Module integrators.integrator_neural ee402cd load on device 'cuda:0' took 758.75 ms  (compiled)
[91m [NeuralEnvironment] Created a DUMMY Neural Integrator. [0m
  0% 0/10000 [00:00<?, ?it/s]Module utils.warp_utils 294c46a load on device 'cuda:0' took 654.62 ms  (compiled)
Module warp.sim.articulation 770a52a load on device 'cuda:0' took 16015.36 ms  (compiled)
12800it [00:17, 861

In [11]:
# 3. Train Baseline Model (Transformer)
%cd train

import yaml
import os

# Load default config
with open('cfg/Cartpole/transformer.yaml', 'r') as f:
    cfg = yaml.safe_load(f)

# Override dataset paths to point to the generated data
cfg['algorithm']['dataset']['train_dataset_path'] = '../data/datasets/Cartpole/trajectory_len-100_train.hdf5'
cfg['algorithm']['dataset']['valid_datasets']['exp_trajectory'] = '../data/datasets/Cartpole/trajectory_len-100_valid.hdf5'

# Reduce training parameters for quick demonstration
cfg['algorithm']['num_epochs'] = 5
cfg['algorithm']['num_iters_per_epoch'] = 100

# Save the modified config
with open('colab_config.yaml', 'w') as f:
    yaml.dump(cfg, f)

# Run training
!python train.py --cfg colab_config.yaml --logdir ../data/logs/baseline

/content/neural-robot-dynamics/train/neural-robot-dynamics/train
Warp 1.8.0 initialized:
   CUDA Toolkit 12.8, Driver 12.4
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "Tesla T4" (15 GiB, sm_75, mempool enabled)
   Kernel cache:
     /root/.cache/warp/1.8.0
2025-11-23 22:43:51.544652: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1763937831.563765    2581 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1763937831.569606    2581 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1763937831.584377    2581 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more tha

In [12]:
# 4. Train Mamba Model
# We use the same config but add the --novelty mamba flag
!python train.py --cfg colab_config.yaml --novelty mamba --logdir ../data/logs/mamba

Warp 1.8.0 initialized:
   CUDA Toolkit 12.8, Driver 12.4
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "Tesla T4" (15 GiB, sm_75, mempool enabled)
   Kernel cache:
     /root/.cache/warp/1.8.0
2025-11-23 22:45:08.473715: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1763937908.492860    3362 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1763937908.498738    3362 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1763937908.513662    3362 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1763937908.513690    3362 computation_placer.

In [13]:
# 5. Train Unroll Model
# We use the same config but add the --novelty unroll flag
!python train.py --cfg colab_config.yaml --novelty unroll --logdir ../data/logs/unroll

Warp 1.8.0 initialized:
   CUDA Toolkit 12.8, Driver 12.4
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "Tesla T4" (15 GiB, sm_75, mempool enabled)
   Kernel cache:
     /root/.cache/warp/1.8.0
2025-11-23 22:49:00.396747: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1763938140.415896    4776 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1763938140.421748    4776 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1763938140.436626    4776 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1763938140.436666    4776 computation_placer.

In [14]:
# 6. RL Evaluation
# Evaluate the trained models using the pretrained RL policy
import os
import glob
import subprocess

def find_latest_model(model_type):
    base_log_dir = f'../data/logs/{model_type}'
    if not os.path.exists(base_log_dir):
        print(f'Log dir not found: {base_log_dir}')
        return None
    dirs = [d for d in glob.glob(os.path.join(base_log_dir, '*')) if os.path.isdir(d)]
    if not dirs:
        print(f'No logs found for {model_type}')
        return None
    latest_dir = sorted(dirs)[-1]
    model_path = os.path.join(latest_dir, 'nn', 'best_eval_model.pt')
    if not os.path.exists(model_path):
        print(f'Model file not found: {model_path}')
        return None
    return model_path

models = ['baseline', 'mamba', 'unroll']
for model in models:
    print(f'Evaluating {model.capitalize()} Model...')
    model_path = find_latest_model(model)
    if model_path:
        print(f'Using model: {model_path}')

        # Convert paths to absolute to avoid issues with subprocess cwd
        abs_model_path = os.path.abspath(model_path)
        abs_playback_path = os.path.abspath('../pretrained_models/RL_policies/Cartpole/0/nn/CartpolePPO.pth')
        abs_rl_cfg_path = os.path.abspath('../eval/eval_rl/cfg/Cartpole/cartpole.yaml')

        cmd = [
            'python', 'run_rl.py',
            '--rl-cfg', abs_rl_cfg_path,
            '--playback', abs_playback_path,
            '--num-envs', '1',
            '--num-games', '5',
            '--env-mode', 'neural',
            '--nerd-model-path', abs_model_path
        ]

        try:
            subprocess.run(cmd, cwd='../eval/eval_rl', check=True, capture_output=True, text=True)
        except subprocess.CalledProcessError as e:
            print(f'Error running RL evaluation for {model}:')
            print('STDOUT:', e.stdout)
            print('STDERR:', e.stderr)
            raise e
    else:
        print(f'Skipping {model} evaluation.')

Evaluating Baseline Model...
Using model: ../data/logs/baseline/11-23-2025-22-43-58/nn/best_eval_model.pt
Evaluating Mamba Model...
Using model: ../data/logs/mamba/11-23-2025-22-45-14/nn/best_eval_model.pt
Evaluating Unroll Model...
Using model: ../data/logs/unroll/11-23-2025-22-49-06/nn/best_eval_model.pt


# 7. Quantitative Analysis

We now perform the quantitative analysis as described in the paper experiments.
We evaluate:
1. **Long-Horizon Passive Motion**: Accuracy of the trained NeRD models over 100, 500, and 1000 steps.
2. **RL Policy Evaluation**: Performance of the pretrained RL policy using the NeRD models.

In [15]:
# 7.1 Long-Horizon Passive Motion Evaluation
# We evaluate the Baseline, Mamba, and Unroll models on Cartpole for 100, 500, and 1000 steps.

import os
import glob

def find_latest_model(model_type):
    base_log_dir = f'../data/logs/{model_type}'
    if not os.path.exists(base_log_dir):
        return None
    dirs = [d for d in glob.glob(os.path.join(base_log_dir, '*')) if os.path.isdir(d)]
    if not dirs:
        return None
    latest_dir = sorted(dirs)[-1]
    model_path = os.path.join(latest_dir, 'nn', 'best_eval_model.pt')
    if not os.path.exists(model_path):
        return None
    return model_path

models = ['baseline', 'mamba', 'unroll']
horizons = [100, 500, 1000]

for model_name in models:
    model_path = find_latest_model(model_name)
    if not model_path:
        print(f"Skipping {model_name} (model not found)")
        continue

    print(f"\n{'='*20} Evaluating {model_name.capitalize()} Model {'='*20}")
    for horizon in horizons:
        print(f"\n--- Horizon: {horizon} ---")
        # We use !python to ensure output is printed to the cell
        !python ../eval/eval_passive/eval_passive_motion.py \
            --env-name Cartpole \
            --model-path {model_path} \
            --env-mode neural \
            --num-envs 2048 \
            --num-rollouts 2048 \
            --rollout-horizon {horizon} \
            --seed 500



--- Horizon: 100 ---
Warp 1.8.0 initialized:
   CUDA Toolkit 12.8, Driver 12.4
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "Tesla T4" (15 GiB, sm_75, mempool enabled)
   Kernel cache:
     /root/.cache/warp/1.8.0
Traceback (most recent call last):
  File "/content/neural-robot-dynamics/train/neural-robot-dynamics/train/../eval/eval_passive/eval_passive_motion.py", line 97, in <module>
    model, robot_name = torch.load(args.model_path, map_location='cuda:0')
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/serialization.py", line 1529, in load
    raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, [1mdo those steps only if you trust the source of the checkpoint[0m. 
	(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from 

In [16]:
# 7.2 RL Policy Evaluation (Quantitative)
# We evaluate the policy using the trained NeRD models.
# We run for more games (2048) to get a statistically significant result as in the paper.

for model_name in models:
    model_path = find_latest_model(model_name)
    if not model_path:
        continue

    print(f"\n{'='*20} RL Evaluation: {model_name.capitalize()} Model {'='*20}")

    # Absolute paths for safety
    abs_model_path = os.path.abspath(model_path)
    abs_playback_path = os.path.abspath('../pretrained_models/RL_policies/Cartpole/0/nn/CartpolePPO.pth')
    abs_rl_cfg_path = os.path.abspath('../eval/eval_rl/cfg/Cartpole/cartpole.yaml')

    # Run RL evaluation
    !python ../eval/eval_rl/run_rl.py \
        --rl-cfg {abs_rl_cfg_path} \
        --playback {abs_playback_path} \
        --num-envs 2048 \
        --num-games 2048 \
        --env-mode neural \
        --nerd-model-path {abs_model_path}


Gym has been unmaintained since 2022 and does not support NumPy 2.0 amongst other critical functionality.
Please upgrade to Gymnasium, the maintained drop-in replacement of Gym, or contact the authors of your software and request that they upgrade.
See the migration guide at https://gymnasium.farama.org/introduction/migration_guide/ for additional information.
2025-11-23 22:55:03.005215: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1763938503.024862    6862 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1763938503.030778    6862 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1763938503.045544    6862 computation_placer.cc:177] computat