Hackathon submission from 0xmandy.
Pirate is a vision-servoing stack for Unitree Go2 that makes the robot hunt down objects like a pirate hunting treasure. The robot scans the environment with its onboard camera, runs YOLO detection to find a target (bottle or ball), closes in with bearing-locked visual control, pushes the object, then celebrates with sport gestures.
The core loop is:
camera frame → YOLO detection → bearing + distance estimate
→ approach burst → realign → push → Hello → backup → Scrape
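As a rough illustration of the "bearing + distance estimate" step in the loop above, the sketch below turns a detection bounding box into a bearing and a range with a pinhole camera model. The function names, the field-of-view parameters, and the known-object-height range estimate are assumptions for illustration only; the Pirate repo may compute these quantities differently.

```python
import math


def bearing_from_bbox(x_min, x_max, image_width, horizontal_fov_deg):
    """Bearing to the bbox center in degrees (positive = right of image center).

    Assumes an ideal pinhole camera with the principal point at the image center.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    fx = (image_width / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    cx = (x_min + x_max) / 2.0
    return math.degrees(math.atan2(cx - image_width / 2.0, fx))


def distance_from_bbox_height(bbox_height_px, real_height_m, image_height, vertical_fov_deg):
    """Range estimate in meters from the apparent height of an object of known size.

    Hypothetical helper: assumes the object is upright and fully visible.
    """
    fy = (image_height / 2.0) / math.tan(math.radians(vertical_fov_deg) / 2.0)
    return fy * real_height_m / bbox_height_px


if __name__ == "__main__":
    # Example values are made up; they are not measurements from the Go2 camera.
    print(bearing_from_bbox(700, 780, 1280, 90.0))
    print(distance_from_bbox_height(90, 0.25, 720, 90.0))
```

A bearing of zero means the target is centered, so a bearing-locked controller would steer to drive this value toward zero while approaching.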
DimOS supplies the robot runtime (WebRTC connection, camera/odom/lidar streams, move commands, sport gesture API). Pirate adds the vision-servoing control layer and replayable decision-trace evidence on top, without modifying any DimOS core files.
- Project repo: https://github.com/0xmandy/pirate
- DimOS: https://github.com/dimensionalOS/dimos
- Decision traces: https://github.com/0xmandy/pirate/tree/main/artifacts/decision_traces
- Sensor stream profile: https://github.com/0xmandy/pirate/blob/main/artifacts/sensor_stream_profile.json
- Final demo video: https://github.com/0xmandy/pirate/tree/main/artifacts/showcase
- YOLO visual servoing: detect bottle/ball → extract bearing → drive toward it
- Closed-loop realign: if yaw error > 8° at standoff, back off and re-approach
- Push phase: forward burst at 0.35 m/s × 0.5 s, displacement measured from ball pose
- Full gesture sequence: approach → push → Hello (api_id=1016) → backup → Scrape (api_id=1029)
- MuJoCo sim proof: ball freejoint physics, push displacement 0.118–0.196 m confirmed
- Real hardware smoke test: odom ~19 Hz, camera ~10–12 Hz, lidar ~7–8 Hz
- Per-run decision trace JSON: yaw error, realign decision, push outcome, ball displacement
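The realign rule and the per-run trace described in the list above can be sketched as follows. The 8° yaw threshold and the 0.05 m push threshold come from this description; the `DecisionTrace` field names, the helper functions, and the choice of `>` versus `>=` at the boundaries are assumptions, not the schema or logic actually used in the repo.

```python
import json
from dataclasses import asdict, dataclass

REALIGN_YAW_THRESHOLD_DEG = 8.0  # from the description: realign if yaw error > 8 deg
PUSH_THRESHOLD_M = 0.05          # from the description: minimum ball displacement


def needs_realign(yaw_error_deg):
    """True when the yaw error at standoff is large enough to back off and re-approach."""
    return abs(yaw_error_deg) > REALIGN_YAW_THRESHOLD_DEG


def push_succeeded(ball_displacement_m):
    """True when the measured ball displacement meets the push threshold (assumed >=)."""
    return ball_displacement_m >= PUSH_THRESHOLD_M


@dataclass
class DecisionTrace:
    # Hypothetical field names; the real trace JSON may be structured differently.
    run_id: str
    yaw_error_before_deg: float
    realigned: bool
    yaw_error_after_deg: float
    ball_displacement_m: float
    outcome: str


def build_trace(run_id, yaw_before_deg, yaw_after_deg, ball_displacement_m):
    """Assemble one trace record from the quantities logged during a run."""
    return DecisionTrace(
        run_id=run_id,
        yaw_error_before_deg=yaw_before_deg,
        realigned=needs_realign(yaw_before_deg),
        yaw_error_after_deg=yaw_after_deg,
        ball_displacement_m=ball_displacement_m,
        outcome="SUCCESS" if push_succeeded(ball_displacement_m) else "FAIL",
    )


if __name__ == "__main__":
    # Numbers reported for run 001 in this description.
    trace = build_trace("run_001", 11.5, 7.1, 0.118)
    print(json.dumps(asdict(trace), indent=2))
```

Keeping the thresholds as named constants next to the trace record makes it easy to replay a stored trace and confirm that the same decision would be taken again.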
| Metric | Value |
|---|---|
| Push displacement — run 001 | 0.118 m (yaw err: 11.5° → realign → 7.1°) |
| Push displacement — run 003 | 0.196 m (yaw err: 9.6° → realign → 1.5°) |
| Vision approach success rate | 100% (2/2 valid runs) |
| Premature contact abort rate | ~33% (1/3); known fix: offset approach target_y |
| Real odom rate | ~19 Hz |
| Real camera rate | ~10–12 Hz |
| Confirmed gestures on hardware | Hello ✅ Scrape ✅ |
| Push threshold | 0.05 m (both runs exceed by 2–4×) |
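The "2–4×" margin in the last row follows directly from the reported displacements. A short check, using only numbers stated in this description (the variable names are illustrative):

```python
PUSH_THRESHOLD_M = 0.05

# Displacements reported for the two successful runs.
displacements_m = {"run_001": 0.118, "run_003": 0.196}

# Ratio of measured displacement to the push threshold.
margins = {run: round(d / PUSH_THRESHOLD_M, 2) for run, d in displacements_m.items()}

# Distance the robot itself covers during the push burst: 0.35 m/s for 0.5 s.
nominal_travel_m = 0.35 * 0.5

if __name__ == "__main__":
    print(margins)           # run_001 -> 2.36, run_003 -> 3.92
    print(nominal_travel_m)  # about 0.175 m
```

The ratios come out at roughly 2.4× and 3.9×, and the burst itself covers about 0.175 m of robot travel, which is the same order of magnitude as the measured ball displacements.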
DimOS already exposes everything the robot needs: streams, replay data, blueprints, camera frames, odometry, and Go2 control skills. Pirate adds a thin vision-and-action evidence layer on top:
- DimOS observes and executes
- YOLO guides the approach bearing
- Decision traces explain each run
That boundary keeps DimOS as the hardware/control system while making robot decisions inspectable and reproducible.
# Clone and syntax-check all scripts
make check
# Dry-run (no robot, no connection)
make dry-run
# MuJoCo sim — ball scene (sports ball, class 32)
make sim-run
# Real robot — water bottle
python examples/go2_bottle_demo_sequence.py \
--backend webrtc-ap --ball-distance 0.8 \
--execute --i-understand-this-is-real-robot

See `artifacts/decision_traces/` for per-run JSON evidence:
| File | Outcome | Key fact |
|---|---|---|
| `run_001_success.json` | SUCCESS | yaw 11.5° → realign → 7.1° → push 0.118 m |
| `run_002_abort.json` | ABORT | ball dev 0.813 m during approach → premature contact |
| `run_003_success.json` | SUCCESS | yaw 9.6° → realign → 1.5° → push 0.196 m |
This PR intentionally does not vendor the full project into DimOS. The full source, generated artifacts, and demo video live in the project repo linked above.