A cinematography skill library for the MARS robot. The robot's 6-joint arm becomes a programmable camera crane, gimbal, and jib — controlled by a natural-language agent that translates director commands into physical camera moves.
Talk to it like a film director. It moves like a DP.
The MARS robot has a 2MP RGB camera (1080p, 160° FOV, 30 FPS) mounted on the gripper of a 6-joint arm with ~40cm of reach. Instead of treating the arm as a manipulator, MovieBot treats it as a camera rig — a crane, gimbal, and dolly system that a director can command with natural language.
Director: "Give me a hero shot, low angle, circle them slowly"
↓
Cinematographer Agent (NLP → skill selection)
↓
orbit_subject(camera_tilt_degrees=-15, direction="ccw", duration_seconds=12)
↓
Arm positions camera low + angled up, base orbits subject in waypoints
↓
Raw footage → Runway/Veo video-to-video for final polish
┌─────────────────────────────────────────────────────┐
│  Director (human, natural language)                 │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│  Cinematographer Agent                              │
│  Translates intent → skill calls                    │
│  Confirms framing, suggests post-processing         │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│  Skill Library (15 skills)                          │
│  ┌─────────────────┐  ┌──────────────────────┐      │
│  │ ManipulationIF  │  │ MobilityIF           │      │
│  │ (arm = camera)  │  │ (base = dolly track) │      │
│  └─────────────────┘  └──────────────────────┘      │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│  Post-Processing Pipeline                           │
│  Runway Gen-3 / Veo / Seedance                      │
│  Video-to-video style transfer + stabilization      │
└─────────────────────────────────────────────────────┘
| Skill | What It Does | Film Reference |
|---|---|---|
| `orbit_subject` | Circle a subject with camera locked on center at each waypoint | Avengers hero shot, Requiem for a Dream |
| `orbit_smooth_arc` | Continuous smooth arc (camera tangent, not locked) | De Palma's Carrie prom |
| `dolly_in` | Push toward subject: the emotional escalation shot | Jaws, Goodfellas diner, Spielberg realization |
| `dolly_out` | Pull back: the reveal, the ending, the abandonment | Gone with the Wind, The Searchers |
| `static_frame` | Tripod lockoff with optional slow drift | Kubrick symmetry, Ozu tatami, Wes Anderson |
| `lateral_track` | Side-to-side tracking for parallax depth separation | Oldboy hallway, Snowpiercer |
| `kubrick_symmetrical` | Dead-center one-point-perspective push | The Shining, 2001, Full Metal Jacket |
| Skill | What It Does | Film Reference |
|---|---|---|
| `crane_up` | Arm rises while pitch adjusts down; always available | Gone with the Wind wounded soldiers |
| `crane_down` | Arm descends while pitch adjusts up | Descending arrival shots |
| `jib_swoop` | Diagonal: arm rises while base drives forward | Fincher's signature jib |
| Skill | What It Does | Film Reference |
|---|---|---|
| `whip_pan` | Fast rotational blur as a transition | PTA, Edgar Wright, Kurosawa |
| `reveal_tilt_up` | Start on detail, arm tilts up to reveal full subject | Darth Vader boots-to-helmet |
| `dolly_zoom` | Physical dolly + post-processed counter-zoom (Vertigo effect) | Jaws beach, Goodfellas |
| `oner_walk_and_talk` | Multi-leg tracking shot with per-leg camera angles | Goodfellas Copacabana, 1917, Birdman |
| Skill | What It Does | Film Reference |
|---|---|---|
| `handheld_simulated` | 6-DOF arm micro-jitter for documentary/urgent feel | Saving Private Ryan, Bourne films |
| Component | Spec |
|---|---|
| Camera | 2MP RGB, 1080p, 160° FOV, 30 FPS, gripper-mounted |
| Arm | 6 joints, ~40cm reach, Cartesian + IK control |
| Base | Differential drive, LiDAR navigation |
| Camera Feed | ROS2 topic /mars/arm/image_raw |
mars/
├── CINEMATOGRAPHY_SKILLS.md        # Technical reference
├── agents/
│   └── cinematographer.py          # NLP agent (director → skill calls)
└── skills/
    ├── arm_camera_poses.py         # Shared pose library (5 named poses + helpers)
    ├── orbit_subject.py            # OrbitSubject, OrbitSmoothArc
    ├── dolly_in.py                 # DollyIn
    ├── dolly_out.py                # DollyOut
    ├── static_frame.py             # StaticFrame
    ├── lateral_track.py            # LateralTrack
    ├── kubrick_symmetrical.py      # KubrickSymmetrical
    ├── crane_arm.py                # CraneUp, CraneDown, JibSwoop
    ├── whip_pan.py                 # WhipPan
    ├── reveal_tilt_up.py           # RevealTiltUp
    ├── dolly_zoom.py               # DollyZoom
    ├── oner_walk_and_talk.py       # OnerWalkAndTalk
    └── handheld_simulated.py       # HandheldSimulated
Every skill positions the arm using ManipulationInterface.move_to_cartesian_pose(x, y, z, roll, pitch, yaw, duration). The arm end-effector IS the camera — moving it changes the shot framing, angle, and height.
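To make the call shape concrete, here is a minimal sketch of a camera move built on `move_to_cartesian_pose`. `FakeManipulation` is a hypothetical stand-in that only records calls, and the units (meters, radians, seconds) and the 0.25 m forward offset are assumptions, not values confirmed by the MARS SDK.

```python
import math
from dataclasses import dataclass, field


@dataclass
class FakeManipulation:
    """Hypothetical stand-in for ManipulationInterface; it only records calls."""
    calls: list = field(default_factory=list)

    def move_to_cartesian_pose(self, x, y, z, roll, pitch, yaw, duration):
        self.calls.append((x, y, z, roll, pitch, yaw, duration))
        return True


def raise_camera(manipulation, start_z, rise, pitch_degrees, duration):
    """Raise the camera by `rise` while holding a fixed pitch."""
    # Assumed units: meters for position, radians for angles, seconds for duration.
    return manipulation.move_to_cartesian_pose(
        x=0.25, y=0.0, z=start_z + rise,          # 0.25 m forward is an assumed offset
        roll=0.0, pitch=math.radians(pitch_degrees), yaw=0.0,
        duration=duration,
    )


arm = FakeManipulation()
raise_camera(arm, start_z=0.20, rise=0.10, pitch_degrees=-15, duration=4.0)
print(arm.calls[0])
```

On the robot, the same function would take the real manipulation interface in place of the stub.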
The shared arm_camera_poses.py module provides:
- 5 named poses: `CAMERA_FORWARD_NEUTRAL`, `CAMERA_FORWARD_LOW` (heroic), `CAMERA_FORWARD_HIGH` (god shot), `CAMERA_STOWED` (transport), `CAMERA_BOOM_OUT` (max reach)
- `camera_pose_from_tilt(degrees)`: maps cinematographic tilt angles (-25° to +15°) to arm Cartesian poses
- `interpolate_camera_pose(start, end, t)`: smooth transitions between any two camera positions
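One way the two helpers could work is sketched below. This is not the module's actual implementation: the `CameraPose` type, the linear tilt-to-height mapping, the height limits, and the convention that negative pitch looks up are all assumptions chosen for illustration.

```python
import math
from dataclasses import astuple, dataclass

TILT_MIN_DEG, TILT_MAX_DEG = -25.0, 15.0   # tilt range quoted above
Z_LOW, Z_HIGH = 0.10, 0.35                 # hypothetical camera heights in meters


@dataclass(frozen=True)
class CameraPose:
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


def camera_pose_from_tilt(degrees: float) -> CameraPose:
    """Clamp the tilt, then map it linearly to a camera height and pitch."""
    tilt = max(TILT_MIN_DEG, min(TILT_MAX_DEG, degrees))
    fraction = (tilt - TILT_MIN_DEG) / (TILT_MAX_DEG - TILT_MIN_DEG)
    z = Z_LOW + fraction * (Z_HIGH - Z_LOW)      # low camera for low-angle shots
    return CameraPose(0.25, 0.0, z, 0.0, math.radians(tilt), 0.0)


def interpolate_camera_pose(start: CameraPose, end: CameraPose, t: float) -> CameraPose:
    """Blend two poses component-wise; t=0 returns start, t=1 returns end."""
    t = max(0.0, min(1.0, t))
    return CameraPose(*(a + (b - a) * t for a, b in zip(astuple(start), astuple(end))))


low = camera_pose_from_tilt(-25)
high = camera_pose_from_tilt(15)
mid = interpolate_camera_pose(low, high, 0.5)
print(mid)
```

Component-wise blending of angles is only safe while the angles stay well away from the ±180° wraparound, which holds for the small tilt range used here.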
The differential-drive base provides:
- Forward/backward dolly moves
- Rotational orbits and whip pans
- Lateral tracking (via rotate-drive-rotate)
- Combined moves (jib swoop = base forward + arm rise)
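Because a differential-drive base cannot move sideways, the rotate-drive-rotate pattern for lateral tracking can be sketched as follows. `FakeMobility` is a hypothetical recorder that mirrors the `rotate` and `send_cmd_vel` calls; the default speed and the convention that positive angles turn left are assumptions.

```python
import math
import time


class FakeMobility:
    """Hypothetical stand-in that logs commands instead of driving."""

    def __init__(self):
        self.log = []

    def rotate(self, angle_radians):                        # blocking on the robot
        self.log.append(("rotate", angle_radians))

    def send_cmd_vel(self, linear_x, angular_z, duration):  # non-blocking on the robot
        self.log.append(("cmd_vel", linear_x, angular_z, duration))


def lateral_track(mobility, distance_m, speed_mps=0.1, sleep=time.sleep):
    """Sidestep by distance_m (positive = left): turn 90°, drive, turn back."""
    turn = math.copysign(math.pi / 2, distance_m)
    duration = abs(distance_m) / speed_mps
    mobility.rotate(turn)                            # face the direction of travel
    mobility.send_cmd_vel(speed_mps, 0.0, duration)
    sleep(duration)                                  # wait out the non-blocking drive
    mobility.rotate(-turn)                           # restore the original heading


base = FakeMobility()
lateral_track(base, distance_m=-0.5, sleep=lambda seconds: None)
print(base.log)
```

While the base is turned sideways, the arm would presumably need an opposite yaw to keep the camera on the subject; that step is left out of this sketch.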
Crane skills validate target poses with solve_ik() before committing to a move. If the target is unreachable within the arm's workspace, the skill fails gracefully instead of hitting joint limits.
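The validate-then-move pattern can be sketched like this. `FakeArm` and its spherical reach check are hypothetical; only the `solve_ik` and `move_to_cartesian_pose` signatures and the ~40cm reach come from this README.

```python
class FakeArm:
    """Hypothetical arm: any target within REACH_M of the origin is reachable."""
    REACH_M = 0.40  # the ~40cm reach quoted for the MARS arm

    def __init__(self):
        self.moves = []

    def solve_ik(self, x, y, z, roll, pitch, yaw, timeout):
        reachable = (x * x + y * y + z * z) ** 0.5 <= self.REACH_M
        return [0.0] * 6 if reachable else None    # 6 joint values, or None

    def move_to_cartesian_pose(self, x, y, z, roll, pitch, yaw, duration):
        self.moves.append((x, y, z, roll, pitch, yaw, duration))
        return True


def safe_crane_move(arm, pose, duration, ik_timeout=1.0):
    """Move only if IK finds a solution; otherwise fail without moving."""
    if arm.solve_ik(*pose, timeout=ik_timeout) is None:
        return False, "target pose is outside the arm workspace"
    arm.move_to_cartesian_pose(*pose, duration=duration)
    return True, "ok"


arm = FakeArm()
ok_result = safe_crane_move(arm, (0.25, 0.0, 0.30, 0.0, 0.0, 0.0), duration=3.0)
bad_result = safe_crane_move(arm, (0.25, 0.0, 0.60, 0.0, 0.0, 0.0), duration=3.0)
print(ok_result, bad_result)
```

Returning a status and a message, rather than raising, lets the agent report the failure to the director and suggest a lower crane height.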
The Cinematographer agent translates natural language into skill calls:
"push in slow" → dolly_in(intensity="slow")
"crash in" → dolly_in(intensity="crash")
"hero shot" → orbit_subject(camera_tilt_degrees=-15)
"Kubrick hallway" → kubrick_symmetrical()
"villain reveal" → reveal_tilt_up(start_tilt_degrees=-25, end_tilt_degrees=10)
"crane up, god shot" → crane_up(rise_height_meters=0.20, end_camera_tilt_degrees=-20)
"handheld, urgent" → handheld_simulated(jitter_intensity="heavy")
"walk and talk, oner" → oner_walk_and_talk(...)
"vertigo shot" → dolly_zoom(direction="in")
"whip pan left" → whip_pan(direction="left", degrees=180)
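As a rough illustration of the mapping above, the sketch below uses a plain phrase lookup. The real agent is described as NLP-driven and is likely more flexible; `PHRASE_TABLE` and `parse_command` are hypothetical names, and the entries are copied from the examples in this README.

```python
# Phrase → (skill name, keyword arguments), copied from the examples above.
PHRASE_TABLE = [
    ("push in slow", "dolly_in", {"intensity": "slow"}),
    ("crash in", "dolly_in", {"intensity": "crash"}),
    ("hero shot", "orbit_subject", {"camera_tilt_degrees": -15}),
    ("kubrick hallway", "kubrick_symmetrical", {}),
    ("vertigo shot", "dolly_zoom", {"direction": "in"}),
    ("whip pan left", "whip_pan", {"direction": "left", "degrees": 180}),
]


def parse_command(text):
    """Return the first (skill, kwargs) whose phrase appears in the text, else None."""
    lowered = text.lower()
    for phrase, skill, kwargs in PHRASE_TABLE:
        if phrase in lowered:
            return skill, dict(kwargs)   # copy so callers can add overrides safely
    return None


print(parse_command("Give me a hero shot, nice and slow"))
# → ('orbit_subject', {'camera_tilt_degrees': -15})
print(parse_command("do a backflip"))
# → None
```

A lookup like this is easy to test but brittle; it is shown only to make the input and output of the translation step explicit.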
Shot composition defaults:
- Heroic angle: camera_tilt = -10° to -20° (looking up at subjects)
- Neutral: camera_tilt = 0°
- Vulnerable/dominating: camera_tilt = +10° to +15° (looking down)
- Slow/dramatic: 6–12s duration
- Urgent/action: 0.5–2s duration
Raw footage from the arm camera feeds into AI video tools:
| Shot Type | Post Method | Notes |
|---|---|---|
| Orbit, dolly, lateral, oner | Video-to-video | Preserves real motion, adds style |
| Static frame, crane | Either | Image-to-video can generate motion |
| Dolly zoom | Custom pipeline | Skill emits zoom ratio metadata for digital counter-zoom |
| Any shot | Stabilization | Runway/Veo smooths remaining jitter |
The dolly_zoom skill emits a JSON manifest with exact zoom ratios for the post pipeline to apply the Vertigo counter-zoom effect.
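The zoom ratios follow from simple geometry: under a pinhole approximation, on-screen subject size scales with zoom divided by distance, so the counter-zoom has to track the camera-to-subject distance. The sketch below assumes a constant-speed dolly and a hypothetical manifest layout (`effect`, `frames`, `zoom_ratio`); the actual schema emitted by the skill is not documented here, and a 160° lens will deviate from the pinhole model near the frame edges.

```python
import json


def dolly_zoom_manifest(start_distance_m, end_distance_m, frames):
    """Per-frame digital zoom that keeps the subject the same size on screen."""
    if frames < 2 or min(start_distance_m, end_distance_m) <= 0:
        raise ValueError("need at least 2 frames and positive distances")
    nearest = min(start_distance_m, end_distance_m)
    entries = []
    for i in range(frames):
        t = i / (frames - 1)
        distance = start_distance_m + t * (end_distance_m - start_distance_m)
        # Apparent size scales with zoom / distance, so zoom must track distance.
        # Dividing by the nearest distance keeps every ratio >= 1.0 (crop only).
        entries.append({"frame": i, "zoom_ratio": round(distance / nearest, 4)})
    return {"effect": "dolly_zoom", "frames": entries}


manifest = dolly_zoom_manifest(start_distance_m=2.0, end_distance_m=1.0, frames=5)
print(json.dumps(manifest, indent=2))
```

For a dolly-in from 2 m to 1 m, the first frame is cropped 2x and the last frame is uncropped, which produces the background-stretch effect while the subject stays fixed.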
Drop the mars/ directory into your MARS robot workspace:
cp -r mars/skills/ ~/skills/
cp -r mars/agents/ ~/agents/

Skills auto-register by class name. The agent references them by their `name` property (snake_case).
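Registration by class name with a snake_case `name` could look like the sketch below. The `Skill` base class, `SKILL_REGISTRY`, and the `__init_subclass__` hook are assumptions for illustration; the MARS skill framework may register classes differently.

```python
import re

SKILL_REGISTRY = {}


def snake_case(class_name):
    """CamelCase → snake_case, e.g. 'OrbitSubject' → 'orbit_subject'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


class Skill:
    """Hypothetical base class: every subclass registers itself on definition."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        SKILL_REGISTRY[snake_case(cls.__name__)] = cls

    @property
    def name(self):
        return snake_case(type(self).__name__)


class OrbitSubject(Skill):
    pass


class DollyIn(Skill):
    pass


print(sorted(SKILL_REGISTRY))   # → ['dolly_in', 'orbit_subject']
print(OrbitSubject().name)      # → orbit_subject
```

With this pattern the agent can look up `SKILL_REGISTRY["dolly_in"]` directly from the skill name it selected.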
manipulation.move_to_cartesian_pose(x, y, z, roll, pitch, yaw, duration)
manipulation.get_current_end_effector_pose() # → {position, orientation}
manipulation.goto_joint_state(joints) # 6 joint values
manipulation.solve_ik(x, y, z, roll, pitch, yaw, timeout) # → joints or None

mobility.rotate(angle_radians) # blocking
mobility.send_cmd_vel(linear_x, angular_z, duration) # non-blocking
mobility.rotate_in_place(angular_speed, duration) # non-blocking

The pose values in `arm_camera_poses.py` are conservative estimates within the 40cm reach envelope. On first deployment:
- Run each named pose and verify the camera frames subjects as expected
- Adjust the `CAMERA_FORWARD_HIGH` z-value if the arm can reach higher
- Verify that `camera_pose_from_tilt()` output matches your framing expectations
- The camera orientation relative to the end-effector may need a roll/pitch offset; the current assumption is that the camera faces outward along the end-effector's +X axis
RoboHacks 2026. The constraint was "make a robot shoot a short film with no human camera operator." The answer: treat the arm like a crane and talk to it like a DP.