StarkHacks 2026 — SO-101 ACT pick-and-place on MI300X

Task: "Pick the orange cell and place it into the empty slot of the green module." Stack: SO-101 leader-follower teleop → LeRobot v0.4.1 → ACT policy → AMD MI300X → HF Hub. Tracks: Best Use of AMD • Ford Industrial Robotics Grippers.

TL;DR

We built an end-to-end imitation-learning pipeline for a battery-cell pick-and-place task on an SO-101 leader-follower arm pair, trained an ACT policy on an AMD MI300X, and target autonomous playback on the same SO-101 follower that recorded the demos. Everything runs on AMD silicon: training on the MI300X via ROCm, and inference on a Radeon 890M iGPU, also via ROCm.

Artifacts:

Dataset → wbell7/starkhacks_cell_pickplace (58 episodes, 29,166 frames, 2 cameras)
Policy → wbell7/starkhacks_act_cell (ACT, 52M params; pushed on train completion)
Run → wandb/wrbell7/starkhacks/nbyv34do

Pipeline overview

┌───────────────────┐                           ┌───────────────────┐
│   SO-101 leader   │ ── human teleoperates ──▶ │  SO-101 follower  │
│  (my_leader_arm)  │                           │ (my_follower_arm) │
└───────────────────┘                           └───────────────────┘
                                                          │
                                                 captures │ 2 cams
                                                 30 fps   │ 640×480
                                                          ▼
                                        ┌───────────────────────────────┐
                                        │  top (UGREEN, overhead)       │
                                        │  wrist (ARC, follower wrist)  │
                                        └───────────────────────────────┘
                                                          │
                                                          ▼
                   ┌─────────────────────────────────────────────────────┐
                   │  record_stdin.py wraps lerobot-record               │
                   │   • pynput → stdin (Wayland fix)                    │
                   │   • 3-Enter per episode (start-gate keeps hands     │
                   │     out of frame; crash-durable parquet rotation)   │
                   └─────────────────────────────────────────────────────┘
                                                          │
                                           local parquet  │ + mp4 videos
                                                          ▼
                               ┌─────────────────────────────────────────┐
                               │ ~/.cache/huggingface/lerobot/           │
                               │   local/starkhacks_cell_pickplace/      │
                               └─────────────────────────────────────────┘
                                                          │
                                              push_to_hub │ hf_transfer +
                                                          │ upload_large_folder
                                                          ▼
                     ┌───────────────────────────────────────────────────┐
                     │  🤗  wbell7/starkhacks_cell_pickplace  (public)   │
                     └───────────────────────────────────────────────────┘
                                                          │
                                           snapshot_download (prefetch)
                                                          ▼
                              ┌──────────────────────────────────────────┐
                              │  MI300X VM — /root/.cache/huggingface/   │
                              │    lerobot/wbell7/starkhacks_cell_...    │
                              └──────────────────────────────────────────┘
                                                          │
                                     lerobot-train (ACT)  │
                                     50k steps · bf16 ·   │
                                     MIOpen+TunableOp     │
                                                          ▼
                              ┌──────────────────────────────────────────┐
                              │  ACT policy, 52M params → ckpt           │
                              └──────────────────────────────────────────┘
                                                          │
                                              push_to_hub │
                                                          ▼
                          ┌───────────────────────────────────────────────┐
                          │  🤗  wbell7/starkhacks_act_cell  (public)     │
                          └───────────────────────────────────────────────┘
                                                          │
                                  lerobot-record --policy.path
                                                          ▼
                          ┌───────────────────────────────────────────────┐
                          │  SO-101 follower drives itself — H34 ship     │
                          └───────────────────────────────────────────────┘

Background

Why SO-101? Affordable open-source teleop arm (Hugging Face + The Robot Studio). Usable out-of-the-box via LeRobot, with a leader-follower topology that produces clean demonstration data.

Why ACT? Action Chunking Transformer (Zhao et al., 2023) is the workhorse for small-data bimanual/single-arm imitation learning in LeRobot. It handles temporal multimodality via action chunks and trains well on tens to hundreds of demos.

Why MI300X? It is the training target for the hackathon's AMD track, and 192 GB of HBM3 on one GPU removes any memory pressure for ACT. With the ROCm bf16 + MIOpen + TunableOp recipe, we measured 0.209 s/step at batch 32 with warm caches, a 3.4× speedup over the 0.715 s/step fp32 baseline on the same hardware. A 50k-step run takes about 3 hours.

Why battery-cell pick-and-place? Maps directly to the brownfield handling that the Ford Industrial Robotics Grippers track cares about. Orange cell + green module is a stand-in for a factory line task (cylindrical cell insertion into a battery pack).

Hardware / software stack

| Layer | Component | Notes |
| --- | --- | --- |
| Leader arm | SO-101 (id `my_leader_arm`) | `/dev/so101_leader` → ttyACM1 |
| Follower arm | SO-101 (id `my_follower_arm`) | `/dev/so101_follower` → ttyACM0 |
| Top camera | UGREEN (USB UVC `0c45:2283`) | `/dev/cam_ugreen`, overhead |
| Wrist camera | ARC (USB UVC `05a3:9230`) | `/dev/cam_arc`, on follower wrist |
| Local host | Ryzen AI + Radeon 890M (gfx1150) | ROCm 6.x, 23 GiB RAM |
| Cloud training | MI300X VF, 192 GB HBM3 | DO droplet (IP stored locally), ROCm 7.2 |
| Framework | LeRobot v0.4.1 + PyTorch 2.7.1+rocm6.3 | Python 3.10 local, 3.12 on VM |
| Tracking | wandb project `starkhacks` | entity `wrbell7` |
| Artifacts | Hugging Face Hub (public) | `wbell7/*` |

Repository layout

~/starkhacks/
├─ README.md              # this file
├─ CLAUDE.md              # conventions + gotchas for future Claude sessions
├─ ROADMAP.md             # live phase tracker, ship-point checklist, done-log
├─ scripts/               # numbered runbook, 00_ → 08_
│  ├─ 00_anti_chaos.sh       # udev rules, remove brltty
│  ├─ 01_find_port.sh        # identify which arm is on which ttyACM*
│  ├─ 02_teleop.sh           # H12 ship — leader→follower mirror
│  ├─ 02a_teleop_raw.sh      # low-level teleop debug
│  ├─ 02b_teleop_cams.sh     # teleop + camera preview
│  ├─ 03_pipeline_sanity.sh  # ACT smoke on public SO-101 dataset
│  ├─ 04_record.sh           # 50 episodes, cameras (top, wrist)
│  ├─ 04b_validate_dataset.sh # post-record integrity + summary
│  ├─ 04c_view_episode.sh    # visual playback of one episode
│  ├─ 05_replay.sh           # H24 ship — open-loop replay
│  ├─ 06_train_smoke.sh      # 2k-step sanity train on our data
│  ├─ 07_train_full.sh       # full train (local iGPU fallback)
│  ├─ 08_eval.sh             # H34 ship — policy drives follower
│  ├─ record_stdin.py        # lerobot-record wrapper: stdin + start-gate + visual banners + durability
│  └─ README.md              # the runbook's own quickstart
├─ cloud/                 # MI300X Developer Cloud recipes
│  ├─ 00_mi300x_bootstrap.sh    # env (torch rocm, ffmpeg 7, lerobot)
│  ├─ 01_train_act_mi300x.sh    # full ACT recipe with the bf16+tune flags
│  └─ 01a_smoke_public.sh       # 200-step smoke on public aloha dataset
├─ amd_hackathon/         # AMD-track-specific materials, reference notebooks
├─ logs/                  # run logs + watchers (HF upload, train watcher, etc.)
└─ outputs/ (on MI300X VM)   # training checkpoints, tensorboard dumps

Installation (from scratch, for third parties)

The repo contains our code + glue + vendored AMD hackathon reference materials. Two external pieces must still be obtained: lerobot (pinned commit) and the ROCm + PyTorch stack. Scripts hardcode ~/starkhacks and ~/lerobot, so clone to those exact paths.

Prerequisites

Ubuntu 24.04 (or similar) with Wayland session
Python 3.10 via conda/miniforge
2× SO-101 arms (leader + follower), 2× USB cameras
HuggingFace + wandb accounts; AMD Developer Cloud for MI300X training

1. Clone this repo

git clone https://github.com/Garrett-R16/SH26_MindFlayer.git ~/starkhacks

2. Clone lerobot at the pinned commit

git clone https://github.com/huggingface/lerobot.git ~/lerobot
cd ~/lerobot
git checkout -b v0.4.1 a5b29d43

3. Create the conda env and install Python deps

conda create -n lerobot python=3.10 -y
conda activate lerobot
cd ~/lerobot
pip install -e '.[feetech]'
cd ~/starkhacks
pip install -r requirements.txt

For the ROCm-specific PyTorch build on a Radeon iGPU, install torch==2.7.1+rocm6.3 via the ROCm index first — otherwise pip will pull the CUDA wheel from PyPI (see Gotchas).

4. System dependencies

FFmpeg 7+ is required by lerobot for video encoding:

sudo add-apt-repository ppa:ubuntuhandbook1/ffmpeg7 -y
sudo apt update && sudo apt install -y ffmpeg

5. Authenticate HuggingFace + wandb

huggingface-cli login   # write-scope token
wandb login

6. Set up udev symlinks for stable device paths

cd ~/starkhacks
# Edit scripts/00_anti_chaos.sh with your arms' serial numbers first, then:
sudo bash scripts/00_anti_chaos.sh

This creates /dev/so101_follower, /dev/so101_leader, /dev/cam_ugreen, /dev/cam_arc.
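Before moving on, it can help to confirm that all four symlinks actually resolve. The following is a minimal sketch, not part of the repo: the helper name `check_devices` is hypothetical, and the only inputs taken from this README are the four device paths.

```python
import os

# Device paths created by scripts/00_anti_chaos.sh, as listed above.
EXPECTED_DEVICES = [
    "/dev/so101_follower",
    "/dev/so101_leader",
    "/dev/cam_ugreen",
    "/dev/cam_arc",
]


def check_devices(paths):
    """Map each path to its resolved target, or None if it is missing."""
    report = {}
    for path in paths:
        # os.path.exists follows symlinks, so a dangling udev link counts as missing.
        report[path] = os.path.realpath(path) if os.path.exists(path) else None
    return report


for path, target in check_devices(EXPECTED_DEVICES).items():
    print(f"{path:22s} -> {target or 'MISSING'}")
```

Any line printing MISSING suggests the udev rule did not match, most likely because the serial number edited into the script does not correspond to the plugged-in device.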

How to reproduce, from zero

0. Local box prep (one time)

# Identify ports, write udev, remove brltty
./scripts/01_find_port.sh
# edit 00_anti_chaos.sh with the two serials, then:
sudo ./scripts/00_anti_chaos.sh
# Verify teleop (ship H12)
./scripts/02_teleop.sh

1. Record a dataset

# 50 episodes, two cameras. Three-Enter cycle per episode (start, stop, end-reset).
./scripts/04_record.sh

# If the process crashed partway: resume from where you left off
RESUME=1 NUM_EPISODES=<how-many-more> ./scripts/04_record.sh

# Verify integrity after recording
./scripts/04b_validate_dataset.sh
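
As a rough cross-check alongside the validation script (whose internals are not shown in this README), the dataset totals quoted in the TL;DR imply an average episode length. The sketch below is plain arithmetic on those published numbers. It assumes the 30 fps capture rate from the pipeline diagram applies to the whole dataset and that the frame count is per timestep, not per camera. The function name is hypothetical.

```python
def summarize(num_episodes, num_frames, fps):
    """Derive average episode length and total duration from dataset totals."""
    frames_per_episode = num_frames / num_episodes
    return {
        "frames_per_episode": frames_per_episode,
        "seconds_per_episode": frames_per_episode / fps,
        "total_minutes": num_frames / fps / 60,
    }


# Totals from the TL;DR: 58 episodes, 29,166 frames; 30 fps from the diagram.
stats = summarize(num_episodes=58, num_frames=29_166, fps=30)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions this works out to roughly 17 seconds per episode and about 16 minutes of demonstration data in total. An episode far outside that range is worth a visual check with 04c_view_episode.sh.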

2. Replay to confirm the data is controllable (ship H24)

./scripts/05_replay.sh

3. Push dataset to the Hub

python -c "
from lerobot.datasets.lerobot_dataset import LeRobotDataset
d = LeRobotDataset('local/starkhacks_cell_pickplace',
                   root='$HOME/.cache/huggingface/lerobot/local/starkhacks_cell_pickplace')
d.repo_id = 'wbell7/starkhacks_cell_pickplace'
d.push_to_hub(tags=['lerobot','so101','starkhacks-2026'], private=False,
              upload_large_folder=True)
"

4. Train on MI300X

# On the cloud VM, first time:
bash cloud/00_mi300x_bootstrap.sh
# Sanity-check on a public dataset (a few minutes, covers MIOpen autotune):
bash cloud/01a_smoke_public.sh
# Full run on our data (~3 hours):
bash cloud/01_train_act_mi300x.sh

5. Evaluate on hardware (ship H34)

# Back on the local box:
./scripts/08_eval.sh
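
Because the camera key order must match between recording and inference (see Gotchas), a guard like the one below could run before the policy is allowed to drive the arm. This is a sketch under assumptions: the function name is hypothetical, and it only compares two ordered lists of camera names; it does not read any LeRobot config.

```python
def check_camera_order(recorded, inference):
    """Return the shared camera order, or raise if names or order differ.

    Both arguments may be lists of names or dicts keyed by camera name;
    Python dicts preserve insertion order, so list(d) yields keys in order.
    """
    rec, inf = list(recorded), list(inference)
    if rec != inf:
        raise ValueError(
            f"camera key order mismatch: recorded {rec}, inference {inf}"
        )
    return rec


# Order used in this project: UGREEN overhead first, ARC wrist second.
recorded = {"top": "/dev/cam_ugreen", "wrist": "/dev/cam_arc"}
print(check_camera_order(recorded, ["top", "wrist"]))
```

Raising instead of returning False is deliberate here: a silent mismatch is exactly the failure mode the gotcha describes.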

Performance recipe — how we got 3.4× on MI300X

From ROADMAP.md's benchmark log (same policy, same dataset, same batch, one variable at a time):

| Config | s/step | Speedup | Notes |
| --- | --- | --- | --- |
| fp32 baseline, batch 32 | 0.715 | 1.0× | cold MIOpen |
| bf16 only | 0.583 | 1.23× | via `ACCELERATE_MIXED_PRECISION=bf16` |
| bf16 + MIOpen + TunableOp, cold | 1.862 | 0.38× | autotune tax |
| bf16 + MIOpen + TunableOp, warm | 0.209 | 3.42× | caches persist at `/root/.miopen` |

The full env recipe lives in cloud/01_train_act_mi300x.sh:

export ACCELERATE_MIXED_PRECISION=bf16
export MIOPEN_FIND_MODE=3
export MIOPEN_FIND_ENFORCE=3
export MIOPEN_USER_DB_PATH=/root/.miopen
export MIOPEN_CUSTOM_CACHE_DIR=/root/.miopen
export PYTORCH_TUNABLEOP_ENABLED=1
export PYTORCH_TUNABLEOP_TUNING=1
export TORCH_BLAS_PREFER_HIPBLAS_LT=1
export HSA_NO_SCRATCH_RECLAIM=1
export GPU_MAX_HW_QUEUES=2

The first run at a given batch size pays roughly 5 minutes of autotuning; reruns reuse the caches and start warm.
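
The speedup column and the "~3 hours" estimate follow directly from the measured s/step values. The sketch below reproduces that arithmetic and adds a break-even estimate for the autotune cost. One assumption is not from the benchmark log: the autotune tax is modelled as a flat 300 s (the "~5 min" above), where in reality the cold steps are simply slower. Function names are hypothetical.

```python
# Measured seconds per step at batch 32, copied from the benchmark table.
BENCH = {
    "fp32 baseline": 0.715,
    "bf16 only": 0.583,
    "bf16 + MIOpen + TunableOp, cold": 1.862,
    "bf16 + MIOpen + TunableOp, warm": 0.209,
}


def speedup(baseline, candidate):
    """Speedup of candidate over baseline (values above 1 are faster)."""
    return baseline / candidate


def run_hours(steps, s_per_step, warmup_s=0.0):
    """Projected wall-clock hours, with an optional one-off warmup cost."""
    return (steps * s_per_step + warmup_s) / 3600


def break_even_steps(warmup_s, slow, fast):
    """Steps after which a one-off warmup is repaid by the faster step time."""
    return warmup_s / (slow - fast)


base = BENCH["fp32 baseline"]
for name, s in BENCH.items():
    print(f"{name:34s} {speedup(base, s):.2f}x")

warm = BENCH["bf16 + MIOpen + TunableOp, warm"]
print(f"50k steps, warm + 300 s autotune: {run_hours(50_000, warm, 300):.2f} h")
print(f"50k steps, fp32 baseline: {run_hours(50_000, base):.2f} h")
print(f"break-even vs bf16 only: {break_even_steps(300, BENCH['bf16 only'], warm):.0f} steps")
```

Under the flat-cost assumption the autotune pays for itself after about 800 steps relative to bf16 alone, which is why it is worthwhile for the 50k-step run but not for a very short smoke test.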

Gotchas we hit (and fixed)

pynput can't capture keys under Wayland. Workaround: scripts/record_stdin.py wraps lerobot-record, replaces the pynput listener with a stdin reader (<Enter> / n / q).
Hands in frame on the first recorded frames. The reset-phase Enter was also the start-next-episode Enter. record_stdin.py now gates every new episode on a second explicit Enter, with a big terminal banner (no speaker on this box — visual feedback only).
Out-of-memory crashes corrupted parquet footers mid-record twice, losing 15 and then 6 episodes. Two defenses added:
- --display_data=false (the Rerun viewer's in-memory buffer was the leak) + --dataset.num_image_writer_processes=1 (moves the PNG writer into a subprocess).
- meta/info.json: data_files_size_in_mb=1 and monkeypatched metadata_buffer_size=1, so the parquet writer rotates — and thus finalises its footer — after ≈ every episode. A future crash costs 1 episode, not 10.
upload_folder wedges on wifi swap. Old sockets stuck in CLOSE-WAIT for ~15 min. Kill + restart (files already committed to the hub are skipped on retry). We now use upload_large_folder=True + HF_HUB_ENABLE_HF_TRANSFER=1 for parallel chunked uploads.
wandb.login(key=…) rejects the wandb_v1_… token format (40-char hex check is legacy). WANDB_API_KEY as an env var bypasses that check and is what wandb.init() actually reads.
lerobot's deps pulled torch from PyPI over our ROCm wheel. Install torch first, then install lerobot with pip install -c torch_constraint.txt to pin the ROCm build.
Ubuntu 24.04 .bashrc returns early for non-interactive shells, so exports there never fire for SSH-invoked commands. Creds go in /etc/environment + /etc/profile.d/starkhacks_creds.sh.
--policy.push_to_hub=false is required locally — HF push is not configured on the local box. Omitting it makes training abort at the final checkpoint.
Camera key order is load-bearing. top, wrist exactly (UGREEN overhead, ARC on wrist). Must match between record and inference or SmolVLA-class policies fail silently; ACT likely the same.
wrist_roll calibration clipping. If teleop feels clamped, re-sweep both arms through their full wrist-roll range during calibration.
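
The stdin workaround and the start gate from the first two gotchas can be pictured as a small state machine driven by lines of input. The sketch below is an illustration only, not the code in scripts/record_stdin.py: the function name, phase names and event strings are hypothetical, and the `n` command is left out because its meaning is not specified here.

```python
import io


def run_gate(lines, num_episodes):
    """Replay an Enter-driven record cycle and return the resulting events.

    Each episode takes three Enters: start, stop, end-of-reset. The next
    episode cannot start until a fresh Enter arrives, which is the gate
    that keeps hands out of the first frames.
    """
    events = []
    phase = "waiting"
    done = 0
    for line in lines:
        cmd = line.strip().lower()
        if cmd == "q":
            events.append("quit")
            break
        if cmd:
            continue  # this sketch ignores everything except Enter and q
        if phase == "waiting":
            phase = "recording"
            events.append(f"start episode {done}")
        elif phase == "recording":
            phase = "resetting"
            events.append(f"stop episode {done}")
        else:
            phase = "waiting"
            done += 1
            events.append(f"reset done ({done}/{num_episodes})")
            if done == num_episodes:
                break
    return events


# Simulated keyboard input: one full episode, then quit mid-way through the second.
demo = run_gate(io.StringIO("\n\n\n\n\nq\n"), num_episodes=50)
print(demo)
```

For interactive use, `sys.stdin` can be passed in place of the `io.StringIO` object. Reading lines from stdin works the same under Wayland and X11, which is the point of the workaround.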

Automation glue built during this run

scripts/record_stdin.py — lerobot-record wrapper (stdin + banners + start gate + per-episode parquet rotation).
/tmp/upload_watcher.sh — polls local upload, fires VM prefetch when done.
/root/prefetch_and_train.sh (on VM) — waits for smoke to finish cleanly, prefetches the dataset from the Hub, then kicks off the full 50k-step train.
/tmp/train_watcher.sh — polls the VM training every 90 s, raises a notify-send popup on crash or completion, logs step transitions.
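
The core decision in a watcher like train_watcher.sh is classifying the latest log output as running, crashed or completed. The sketch below shows one way to do that in Python. It is not a port of the shell script: the `step=<n>` / `step:<n>` log pattern is an assumed format, not the actual LeRobot log layout, and the function names are hypothetical. Polling, SSH and notify-send are left out.

```python
import re

# Assumed log pattern, e.g. "step=1200" or "step:1200"; adjust to the real logs.
STEP_RE = re.compile(r"step[=:](\d+)")


def latest_step(log_text):
    """Return the last step number found in the log, or None if absent."""
    matches = STEP_RE.findall(log_text)
    return int(matches[-1]) if matches else None


def classify(log_text, total_steps):
    """Classify a training log as 'crashed', 'completed' or 'running'."""
    # A Python traceback header is treated as a crash, checked before progress.
    if "Traceback (most recent call last)" in log_text:
        return "crashed"
    step = latest_step(log_text)
    if step is not None and step >= total_steps:
        return "completed"
    return "running"


sample = "step=100 loss=0.91\nstep=200 loss=0.74\n"
print(latest_step(sample), classify(sample, total_steps=50_000))
```

A watcher loop would call `classify` on the tail of the log once per poll interval and raise a notification only when the result changes from "running".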

Links

Dataset: https://huggingface.co/datasets/wbell7/starkhacks_cell_pickplace
Policy (pushed at train-end): https://huggingface.co/wbell7/starkhacks_act_cell
Training run: https://wandb.ai/wrbell7/starkhacks/runs/nbyv34do
Phase tracker: ROADMAP.md
Epic plan (full strategy PDF): ~/Downloads/epic_plan.pdf
