[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NeuracoreAI/neuracore/blob/docs/colab-notebook-bigym/examples/getting_started_with_neuracore.ipynb)

# 🤖 **Getting started with Neuracore**

This guide walks you step-by-step through collecting robot demonstration data in simulation and streaming it to **Neuracore** for storage, visualization, model training, and deployment.

In particular, you will use the [Bigym](https://github.com/NeuracoreAI/bigym/) simulation benchmark (MuJoCo-based) with a Unitree H1 humanoid, replay expert actions, and record joint states and camera images to your Neuracore account.

Run the cells in order from top to bottom. You can watch your robot and collected data appear in the [Neuracore dashboard](https://www.neuracore.com/) as you go.

---

### **Step-by-step guide for data collection, visualization, model training and deployment:**
1. **Log in** to Neuracore and use the Python client to talk to the Neuracore API.
2. **Register a robot** with Neuracore by providing an MJCF model, and see it show up on the **Robots** page and in the dashboard overview.
3. **Create a dataset** in Neuracore and find it on the **Data** page.
4. **Run episodes** in one of the Bigym simulation environments by replaying expert actions.
5. **Record and upload data**: record episodes on Neuracore with `nc.start_recording()`, and visualize logged data on the web dashboard.
6. **Retrieve a previously collected dataset** in a Python client with `nc.get_dataset()` and `dataset.synchronize()`.
7. **Train and deploy models** on your dataset using the web dashboard or the Python SDK.

In [None]:
#@markdown ### **Installing dependencies**
#@markdown - Neuracore from PyPI
#@markdown - Bigym from GitHub
#@markdown
#@markdown This may take a few minutes.

# Installing neuracore
!pip install neuracore==7.14.1 > /dev/null 2>&1

# Installing Rust: safetensors 0.3.3 (required by Bigym) has no pre-built wheel on Colab, so pip builds it from source, which requires the Rust toolchain.
!curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y -q > /dev/null 2>&1
import os
os.environ["PATH"] = os.path.expanduser("~/.cargo/bin") + os.pathsep + os.environ.get("PATH", "")

# Installing bigym --- Commit at the time of creation: 79420b0abad6fa7d7dfd98569cb485f73a2c8e3b
!git clone https://github.com/NeuracoreAI/bigym.git /content/bigym > /dev/null 2>&1
!pip install -e /content/bigym > /dev/null 2>&1

In [None]:
# @markdown ### **Restart runtime session**
# @markdown After installing dependencies, restart your runtime session via **Runtime** > **Restart session** (CTRL+M .).

In [None]:
# @markdown ### **Imports**
# @markdown **Troubleshooting**: if you get import errors, make sure you restarted the runtime session after installing the dependencies above. See the previous cell, or press "CTRL+M ."

import os
# Headless OpenGL for Colab (no display server)
os.environ["MUJOCO_GL"] = "egl"

import zipfile
import gdown
import time
from pathlib import Path

import numpy as np
import neuracore as nc
import bigym
from demonstrations.demo_store import DemoStore
from demonstrations.utils import Metadata

In [None]:
# @markdown ### **Configuration and utils**
# @markdown Load Bigym helpers (joint names, observation converters, headless make_env).

# Bigym env config
from bigym.action_modes import JointPositionActionMode
from bigym.envs.reach_target import ReachTarget
from bigym.utils.observation_config import CameraConfig, ObservationConfig

FREQUENCY = 20
DT = 1.0 / FREQUENCY

JOINT_NAMES = [
    "left_shoulder_pitch",
    "left_shoulder_roll",
    "left_shoulder_yaw",
    "left_elbow",
    "right_driver_joint",
    "right_coupler_joint",
    "right_spring_link_joint",
    "right_follower_joint",
    "left_driver_joint",
    "left_coupler_joint",
    "left_spring_link_joint",
    "left_follower_joint",
    "left_wrist",
    "right_shoulder_pitch",
    "right_shoulder_roll",
    "right_shoulder_yaw",
    "right_elbow",
    "right_driver_joint",
    "right_coupler_joint",
    "right_spring_link_joint",
    "right_follower_joint",
    "left_driver_joint",
    "left_coupler_joint",
    "left_spring_link_joint",
    "left_follower_joint",
    "right_wrist",
    "pelvis_x",
    "pelvis_y",
    "pelvis_rz",
    "h1_floating_base",
]
JOINT_ACTUATORS = [
    "floating_base_x",
    "floating_base_y",
    "floating_base_z",
    "left_shoulder_pitch",
    "left_shoulder_roll",
    "left_shoulder_yaw",
    "left_elbow",
    "left_wrist",
    "right_shoulder_pitch",
    "right_shoulder_roll",
    "right_shoulder_yaw",
    "right_elbow",
    "right_wrist",
    "gripper_left",
    "gripper_right",
]


def obs_to_joint_dict(obs, joint_names):
    """Split proprioception into joint position and velocity dicts (first half qpos, second half qvel)."""
    p = obs["proprioception"].astype(float)
    mid = len(p) // 2
    qpos = dict(zip(joint_names, p[:mid]))
    qvel = dict(zip(joint_names, p[mid:]))
    return qpos, qvel


def obs_to_imgs(obs):
    """Extract camera images from observation (CHW -> HWC)."""
    return {
        "head": obs["rgb_head"].transpose(1, 2, 0),
        "left_wrist": obs["rgb_left_wrist"].transpose(1, 2, 0),
        "right_wrist": obs["rgb_right_wrist"].transpose(1, 2, 0),
    }


def action_to_joint_action_dict(action, joint_names):
    """Convert action array to joint position dict."""
    return dict(zip(joint_names, action))


def make_env():
    """ReachTarget env with render_mode='rgb_array' for headless Colab."""
    return ReachTarget(
        action_mode=JointPositionActionMode(floating_base=True, absolute=True),
        observation_config=ObservationConfig(
            cameras=[
                CameraConfig("head", resolution=(84, 84)),
                CameraConfig("left_wrist", resolution=(84, 84)),
                CameraConfig("right_wrist", resolution=(84, 84)),
            ]
        ),
        control_frequency=FREQUENCY,
        render_mode="rgb_array",
    )
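
One thing to watch in the helpers above: `dict(zip(names, values))` keeps only the last value for a repeated name, and `JOINT_NAMES` as listed contains each gripper joint name (for example `right_driver_joint`) twice. The following is a minimal sketch of a check you could run before logging. `find_duplicate_names` is a hypothetical helper, not part of Bigym or Neuracore, and the short list is made-up sample data.

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name, count in counts.items() if count > 1]


# Made-up sample: "right_driver_joint" appears twice.
sample_names = ["left_elbow", "right_driver_joint", "left_wrist", "right_driver_joint"]
sample_values = [0.1, 0.2, 0.3, 0.4]

duplicates = find_duplicate_names(sample_names)
joint_dict = dict(zip(sample_names, sample_values))

print(duplicates)       # ['right_driver_joint']
print(len(joint_dict))  # 3: the later value 0.4 replaced the earlier 0.2
```

Calling `find_duplicate_names(JOINT_NAMES)` in this notebook would list the repeated gripper joints.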

In [None]:
#@markdown ### **Utility function to run one episode**
#@markdown Runs a single episode in the Bigym ReachTarget environment by replaying expert actions, and logs the data to Neuracore.

def run_episode(
    episode_idx: int,
    record: bool,
    demo_store: DemoStore,
) -> bool:
    """Run one demonstration episode and optionally record it."""
    print(f"\n=== Starting Episode {episode_idx} ===")

    env = make_env()
    metadata = Metadata.from_env(env)

    demo = demo_store.get_demos(metadata, amount=1, frequency=FREQUENCY)[0]

    obs, info = env.reset(seed=demo.seed)
    success = False
    t = time.time()

    try:
        if record:
            nc.start_recording()

        nc.log_custom_1d("my_custom_data", np.array([1, 2, 3, 4, 5]), timestamp=t)
        qpos, qvel = obs_to_joint_dict(obs, JOINT_NAMES)
        nc.log_joint_positions(qpos, timestamp=t)
        nc.log_joint_velocities(qvel, timestamp=t)
        images = obs_to_imgs(obs)
        nc.log_rgb("head", images["head"], timestamp=t)
        nc.log_language(
            "instruction",
            "Reach the target.",
            timestamp=t,
        )

        for step in demo._steps:
            obs, reward, terminated, truncated, info = env.step(step.info["demo_action"])
            print(f"Reward={reward}, terminated={terminated}, truncated={truncated}, info={info}")
            t += DT
            nc.log_custom_1d("my_custom_data", np.array([1, 2, 3, 4, 5]), timestamp=t)
            qpos, qvel = obs_to_joint_dict(obs, JOINT_NAMES)
            nc.log_joint_positions(qpos, timestamp=t)
            nc.log_joint_velocities(qvel, timestamp=t)
            images = obs_to_imgs(obs)
            nc.log_rgb("head", images["head"], timestamp=t)
            joint_action = action_to_joint_action_dict(step.info["demo_action"], JOINT_ACTUATORS)
            nc.log_joint_target_positions(joint_action, timestamp=t)
            if terminated and not truncated:
                success = True
                print("Episode terminated successfully.")
                break
            if truncated:
                print("Episode truncated (likely time limit).")
                break
    finally:
        if record:
            if success:
                print("Episode successful → finalizing recording...")
                nc.stop_recording(wait=True)
            else:
                print("Episode failed → cancelling recording...")
                nc.cancel_recording()
        env.close()

    print(f"=== Episode {episode_idx} done | success={success} ===")
    return success
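
The `try`/`finally` logic in `run_episode` (finalize the recording on success, cancel it otherwise) can also be factored into a context manager. This is only a sketch: `recording_session` and the `outcome` dict are hypothetical names, and the stub callables stand in for `nc.start_recording`, `nc.stop_recording`, and `nc.cancel_recording` so that the example runs without a Neuracore account.

```python
from contextlib import contextmanager


@contextmanager
def recording_session(start, stop, cancel):
    """Start a recording, then stop it on success or cancel it otherwise."""
    outcome = {"success": False}
    start()
    try:
        yield outcome
    finally:
        # Runs on normal exit and when the body raises.
        if outcome["success"]:
            stop()
        else:
            cancel()


def make_stubs(log):
    """Stub callables that record which recording call was made."""
    return (
        lambda: log.append("start"),
        lambda: log.append("stop"),
        lambda: log.append("cancel"),
    )


# Successful episode: the recording is finalized.
ok_events = []
with recording_session(*make_stubs(ok_events)) as outcome:
    outcome["success"] = True

# Failing episode: the recording is cancelled and the error still propagates.
failed_events = []
try:
    with recording_session(*make_stubs(failed_events)) as outcome:
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

print(ok_events)      # ['start', 'stop']
print(failed_events)  # ['start', 'cancel']
```

In this notebook, the three callables would be `nc.start_recording`, `lambda: nc.stop_recording(wait=True)`, and `nc.cancel_recording`.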

In [None]:
#@markdown ### **(One time) Download Bigym demonstrations**
#@markdown This cell downloads only the demonstration data for the Bigym **ReachTarget** environment.
#@markdown
#@markdown We use a **small subset** of the original Bigym demos (ReachTarget only) so that this notebook runs quickly.

demo_store = DemoStore()

# Override cache path so we can use e.g. Google Drive. Mount Drive first if using /content/drive/MyDrive/.
BIGYM_DEMO_BASE_PATH = Path("/content/bigym_demos")
SUBSET_DEMO_FILENAME = "demonstrations_subset_v2_v0.9.0"
demo_store._cache_path = BIGYM_DEMO_BASE_PATH / SUBSET_DEMO_FILENAME

# Skip download if already cached
if demo_store._cache_path.exists() and demo_store.cached:
    print("Bigym demos already cached on disk; skipping download.")
else:
    print("Downloading ReachTarget demo subset from Google Drive (one-time per session)...")
    demo_store._cache_path.parent.mkdir(parents=True, exist_ok=True)
    subset_zip_url = "https://drive.google.com/uc?id=1kLBOYRC489_fcMKCwFc4XOmBGIdkfsbV"
    subset_zip_path = BIGYM_DEMO_BASE_PATH / f"{SUBSET_DEMO_FILENAME}.zip"
    gdown.download(subset_zip_url, str(subset_zip_path), quiet=True)
    with zipfile.ZipFile(subset_zip_path, "r") as zf:
        zf.extractall(demo_store._cache_path.parent)
    subset_zip_path.unlink(missing_ok=True)
    demo_store.cached = True
    print("ReachTarget demos ready.")
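
The download cell follows a "skip if already cached" pattern. A minimal, standard-library sketch of the extraction half is below. `extract_once` is a hypothetical helper, and it assumes (as the cell above does) that the archive contains a single top-level folder named after the target directory.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_once(zip_path: Path, target_dir: Path) -> bool:
    """Extract zip_path next to target_dir unless target_dir already exists.

    Returns True if the archive was extracted, False if the cache was reused.
    """
    if target_dir.exists():
        return False
    target_dir.parent.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as zf:
        zf.extractall(target_dir.parent)
    return True


# Demo with a throwaway archive containing one top-level folder, "demos/".
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    zip_path = tmp_path / "demos.zip"
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("demos/episode_0.txt", "placeholder")

    target = tmp_path / "cache" / "demos"
    first_call = extract_once(zip_path, target)   # extracts the archive
    second_call = extract_once(zip_path, target)  # finds the cached folder
    extracted_file_exists = (target / "episode_0.txt").exists()

print(first_call, second_call, extracted_file_exists)
```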

In [None]:
# @markdown ### **Step 0: Log in to Neuracore**
# @markdown Log in to Neuracore to record data.
# @markdown If you don't have an account, sign up for free at [neuracore.com](https://neuracore.com) before logging in.

# The first time you run this, it prompts you for your account's email and password.
# You can safely run it again at any time to make sure you are logged in.
nc.login()

# Alternatively, from CLI
# !neuracore login

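# For future logins, you can also store your API key in an environment variable for automatic login.
# Note that `!export ...` would not persist, because each `!` command runs in its own subshell.
# os.environ["NEURACORE_API_KEY"] = "<nrc_XXXX>"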
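
A note on the `NEURACORE_API_KEY` variable: in a notebook, each `!` command runs in its own subshell, so a variable exported there is gone when that command finishes. Setting it through `os.environ` keeps it for the current Python process and any process it launches. The sketch below shows the difference using a child Python process. `DEMO_API_KEY` and its value are placeholders, not real credentials or Neuracore names.

```python
import os
import subprocess
import sys

# Start from a clean state for the demo variable.
os.environ.pop("DEMO_API_KEY", None)

# A child process sets the variable in its own environment only.
subprocess.run(
    [sys.executable, "-c", "import os; os.environ['DEMO_API_KEY'] = 'set-in-child'"],
    check=True,
)
visible_after_child = "DEMO_API_KEY" in os.environ  # still unset in the parent

# Setting it in this process persists and is inherited by child processes.
os.environ["DEMO_API_KEY"] = "nrc_placeholder"
child_view = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('DEMO_API_KEY', ''))"],
    check=True,
    capture_output=True,
    text=True,
).stdout.strip()

print(visible_after_child, child_view)
```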

In [None]:
#@markdown ### **Step 1: Connect the robot to Neuracore**
#@markdown This step registers the H1 robot with your Neuracore account using its MuJoCo MJCF (.xml) model. Run this cell.
#@markdown
#@markdown **After running:** Open your Neuracore dashboard by logging into [Neuracore](https://www.neuracore.com/). Shortly after running this cell, the robot will appear among **Connected Robots** on the "**Overview**" page of your dashboard. You can view all registered robots on the "**Robots**" page.
#@markdown
#@markdown <br/>
#@markdown <img src="https://drive.google.com/uc?id=1IdN3BYlanJsNGNsb5mi9pqFT9XD6_DJD" width="800" />

BIGYM_ROOT = "/content/bigym"
mjcf_path = Path(BIGYM_ROOT) / "bigym" / "envs" / "xmls" / "h1" / "h1.xml"

robot = nc.connect_robot(
    robot_name="Mujoco UnitreeH1 Example",
    mjcf_path=str(mjcf_path),
    overwrite=True,
)

print(f"Connected to robot: {robot.id}")
print(f"Organisation ID: {nc.get_current_org()}")

In [None]:
# @markdown ### **Step 2: Create a dataset**
# @markdown Create a new dataset in Neuracore that will hold your recorded robot data. Run this cell.
# @markdown
# @markdown **After running:** In the dashboard, go to the "**Data**" page. You will see the new dataset listed. The dataset is currently empty and has no recordings in it.
# @markdown Move to the next step to start populating the dataset with robot data.
# @markdown
# @markdown <br/>
#@markdown <img src="https://drive.google.com/uc?id=15UDgWZmSadvPSri_nK3-Dhc9QyeJKyNy" width="800" />

nc.create_dataset(
    name="Getting started Bigym Example",
    description="Data collection on the Bigym simulation environments",
)
print("Dataset created.")

In [None]:
#@markdown ### **Step 3: Run simulation episodes and record data**
#@markdown This step runs the ReachTarget environment for a number of episodes. Under the hood it uses Bigym to retrieve and replay expert actions for this task; at each step, it logs joint positions, velocities, camera images, and joint targets to Neuracore.
#@markdown
#@markdown **After running:** In the dataset page in your Neuracore dashboard, you will now see an item with a randomly-generated name under the "**Recordings**" section. You can run this cell multiple times to add more recordings to the same dataset.
#@markdown
#@markdown <br/>
#@markdown <img src="https://drive.google.com/uc?id=1w-wOH81_jdJsX5XvKZU4t7ugkTTkaibb" width="800" />

NUM_EPISODES = 1
RECORD = True

success_count = 0
try:
    for episode_idx in range(NUM_EPISODES):
        success = run_episode(
            episode_idx=episode_idx, record=RECORD, demo_store=demo_store
        )
        if success:
            success_count += 1
            print(f"Successful demos: {success_count}/{episode_idx + 1}")
except KeyboardInterrupt:
    print("\nInterrupted by user.")
    if RECORD:
        nc.cancel_recording()
finally:
    print(f"\nFinished running {NUM_EPISODES} episodes → {success_count} succeeded.")
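
In `run_episode`, timestamps are not read from the wall clock at every step: the start time is taken once and then advanced by `DT` per simulation step, so the logged data is spaced at the control frequency regardless of how fast the simulation runs on Colab. Below is a minimal sketch of that scheme that computes each timestamp from the step index, a small variation on the running `t += DT` used above. `step_timestamps` is a hypothetical helper.

```python
import math

FREQUENCY = 20        # Hz, as in the configuration cell
DT = 1.0 / FREQUENCY  # seconds per step


def step_timestamps(start_time: float, num_steps: int, dt: float = DT):
    """Timestamps for the reset observation plus num_steps environment steps."""
    return [start_time + i * dt for i in range(num_steps + 1)]


timestamps = step_timestamps(start_time=1000.0, num_steps=40)
duration = timestamps[-1] - timestamps[0]

print(len(timestamps))              # 41: reset observation + 40 steps
print(math.isclose(duration, 2.0))  # True: 40 steps at 20 Hz span 2 seconds
```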

In [None]:
#@markdown ### **Step 4: Visualize collected data**
#@markdown On the "**Data**" page of your Neuracore dashboard, click on the latest recording to open the **data visualizer**.
#@markdown
#@markdown Here, you can replay the episode and view the URDF animation, the RGB images, and the joint recordings.
#@markdown
#@markdown <br/>
#@markdown
#@markdown <img src="https://drive.google.com/uc?id=1TRhdLBwUPQTLlquxCk-XERl7kErfDer5" width="800" />

In [None]:
# @markdown ### **Step 5: Retrieve dataset from Neuracore cloud to a local client**
# @markdown You can pull data from Neuracore using `nc.get_dataset("my dataset")`.
# @markdown
# @markdown Run this cell to retrieve the previously recorded dataset "Getting started Bigym Example" into this notebook client.
# @markdown You can then visualize the data locally, or train your models with it.

from neuracore_types import DataType, RobotDataSpec
import imageio
from IPython.display import Video, display

DATASET_NAME = "Getting started Bigym Example"
dataset = nc.get_dataset(DATASET_NAME)

data_types_to_sync = [DataType.JOINT_POSITIONS, DataType.RGB_IMAGES]
robot_data_spec: RobotDataSpec = {}
for robot_id in dataset.robot_ids:
    full_spec = dataset.get_full_data_spec(robot_id)
    robot_data_spec[robot_id] = {
        dt: full_spec[dt] for dt in data_types_to_sync if dt in full_spec
    }

synced = dataset.synchronize(frequency=FREQUENCY, robot_data_spec=robot_data_spec)
print(f"Dataset '{DATASET_NAME}': {len(dataset)} episodes")

# First episode summary + video from head camera
first_ep = synced[0]
steps_list = list(first_ep)

print(f"First episode: {len(steps_list)} steps, duration {first_ep.end_time - first_ep.start_time:.2f} s")

frames = [step[DataType.RGB_IMAGES]["head"].frame for step in steps_list]
video_path = "/tmp/first_episode.mp4"
imageio.mimsave(video_path, frames, fps=FREQUENCY)
print("First episode video (head camera):")
display(Video(video_path, embed=True, width=420))

### **Step 6: Train a model on your dataset**

1. ☁️ **Cloud training (using Neuracore credits)**

   You can start a training job on **Neuracore's cloud** in two ways:

   - **From the dashboard UI**: go to the **Training** page on your [web dashboard](https://www.neuracore.com/), click **+ New training job**, select your dataset, algorithm, and resources, then launch.
   - **From Python**: using `nc.start_training_run(...)`. For example:

     ```python
     job_data = nc.start_training_run(
       name="MyTrainingJob",
       dataset_name="Getting started Bigym Example",
       algorithm_name="diffusion_policy",
       num_gpus=1,
       frequency=50,
       ...
     )
     ```

2. 💻 **Local training**

   To train **locally** on your own GPU, call the training script directly as documented in the [training docs](https://github.com/NeuracoreAI/neuracore/blob/main/docs/training.md). Here's a minimal example:

   ```bash
   python -m neuracore.ml.train algorithm=diffusion_policy dataset_name="Getting started Bigym Example"
   ```

---

### **Step 7: Model deployment and inference**

After your training job has finished, you can **use the trained model for inference** either **locally** or remotely via a **cloud endpoint**.

1. ☁️ **Cloud inference (using Neuracore credits)**

   To run inference on Neuracore's servers:

   - Go to the **Endpoints** tab on your [web dashboard](https://www.neuracore.com/) and create a deployment endpoint for a trained model.
   - Once the endpoint is **active**, use `nc.policy_remote_server(...)` from Python:

     ```python
     try:
         policy = nc.policy_remote_server("MyEndpointName")
         predictions = policy.predict(timeout=5)
     except nc.EndpointError:
         print("Endpoint not available. Please start it at neuracore.com/dashboard/endpoints")
     ```

2. 💻 **Local inference (run model on local GPU)**

   You can load a trained model locally with `nc.policy(...)` and then run it with `policy.predict(...)` (see the [tutorial](https://github.com/NeuracoreAI/neuracore/blob/main/docs/tutorial.md#model-inference) and [local endpoint examples](https://github.com/NeuracoreAI/neuracore/tree/main/examples)):

   - **By training run name** (pull model from Neuracore):

     ```python
     from neuracore_types import DataSpec, DataType

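     # CAMERA_NAMES is not defined earlier in this guide; set it to the cameras
     # you logged, e.g. CAMERA_NAMES = ["head"] for the recording steps above.
     MODEL_INPUT_ORDER: DataSpec = {
        DataType.JOINT_POSITIONS: JOINT_NAMES[:-1],
        DataType.RGB_IMAGES: CAMERA_NAMES,
     }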

     MODEL_OUTPUT_ORDER: DataSpec = {
        DataType.JOINT_TARGET_POSITIONS: JOINT_ACTUATORS,
     }

     policy = nc.policy(
         train_run_name="MyTrainingJob",
         model_input_order=MODEL_INPUT_ORDER,
         model_output_order=MODEL_OUTPUT_ORDER,
     )

     predictions = policy.predict(timeout=5)
     ```

   - **By local model file**:

     ```python
     policy = nc.policy(
         model_file="/path/to/model.nc.zip",
         model_input_order=MODEL_INPUT_ORDER,
         model_output_order=MODEL_OUTPUT_ORDER,
     )
     predictions = policy.predict(timeout=5)
     ```