thedotmack/MovieBot

MovieBot

A cinematography skill library for the MARS robot. The robot's 6-joint arm becomes a programmable camera crane, gimbal, and jib — controlled by a natural-language agent that translates director commands into physical camera moves.

Talk to it like a film director. It moves like a DP.

The Idea

The MARS robot has a 2MP RGB camera (1080p, 160° FOV, 30 FPS) mounted on the gripper of a 6-joint arm with ~40cm of reach. Instead of treating the arm as a manipulator, MovieBot treats it as a camera rig — a crane, gimbal, and dolly system that a director can command with natural language.

Director: "Give me a hero shot, low angle, circle them slowly"
    ↓
Cinematographer Agent (NLP → skill selection)
    ↓
orbit_subject(camera_tilt_degrees=-15, direction="ccw", duration_seconds=12)
    ↓
Arm positions camera low + angled up, base orbits subject in waypoints
    ↓
Raw footage → Runway/Veo video-to-video for final polish

Architecture

┌─────────────────────────────────────────────────────┐
│  Director (human, natural language)                  │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│  Cinematographer Agent                               │
│  Translates intent → skill calls                     │
│  Confirms framing, suggests post-processing          │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│  Skill Library (15 skills)                           │
│  ┌─────────────────┐  ┌──────────────────────┐      │
│  │ ManipulationIF  │  │ MobilityIF           │      │
│  │ (arm = camera)  │  │ (base = dolly track) │      │
│  └─────────────────┘  └──────────────────────┘      │
└─────────────────────────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│  Post-Processing Pipeline                            │
│  Runway Gen-3 / Veo / Seedance                       │
│  Video-to-video style transfer + stabilization       │
└─────────────────────────────────────────────────────┘

Skills

Core Shots (7)

| Skill | What It Does | Film Reference |
| --- | --- | --- |
| orbit_subject | Circle a subject with camera locked on center at each waypoint | Avengers hero shot, Requiem for a Dream |
| orbit_smooth_arc | Continuous smooth arc (camera tangent, not locked) | De Palma's Carrie prom |
| dolly_in | Push toward subject: the emotional escalation shot | Jaws, Goodfellas diner, Spielberg realization |
| dolly_out | Pull back: the reveal, the ending, the abandonment | Gone with the Wind, The Searchers |
| static_frame | Tripod lockoff with optional slow drift | Kubrick symmetry, Ozu tatami, Wes Anderson |
| lateral_track | Side-to-side tracking for parallax depth separation | Oldboy hallway, Snowpiercer |
| kubrick_symmetrical | Dead-center one-point-perspective push | The Shining, 2001, Full Metal Jacket |

Crane / Vertical (3)

| Skill | What It Does | Film Reference |
| --- | --- | --- |
| crane_up | Arm rises while pitch adjusts down (always available) | Gone with the Wind wounded soldiers |
| crane_down | Arm descends while pitch adjusts up | Descending arrival shots |
| jib_swoop | Diagonal: arm rises while base drives forward | Fincher's signature jib |

Advanced (4)

| Skill | What It Does | Film Reference |
| --- | --- | --- |
| whip_pan | Fast rotational blur as a transition | PTA, Edgar Wright, Kurosawa |
| reveal_tilt_up | Start on detail, arm tilts up to reveal full subject | Darth Vader boots-to-helmet |
| dolly_zoom | Physical dolly + post-processed counter-zoom (Vertigo effect) | Jaws beach, Goodfellas |
| oner_walk_and_talk | Multi-leg tracking shot with per-leg camera angles | Goodfellas Copacabana, 1917, Birdman |

Effects (1)

| Skill | What It Does | Film Reference |
| --- | --- | --- |
| handheld_simulated | 6-DOF arm micro-jitter for documentary/urgent feel | Saving Private Ryan, Bourne films |

Hardware

| Component | Spec |
| --- | --- |
| Camera | 2MP RGB, 1080p, 160° FOV, 30 FPS, gripper-mounted |
| Arm | 6 joints, ~40cm reach, Cartesian + IK control |
| Base | Differential drive, LiDAR navigation |
| Camera Feed | ROS2 topic /mars/arm/image_raw |

Project Structure

mars/
├── CINEMATOGRAPHY_SKILLS.md    # Technical reference
├── agents/
│   └── cinematographer.py      # NLP agent (director → skill calls)
└── skills/
    ├── arm_camera_poses.py     # Shared pose library (5 named poses + helpers)
    ├── orbit_subject.py        # OrbitSubject, OrbitSmoothArc
    ├── dolly_in.py             # DollyIn
    ├── dolly_out.py            # DollyOut
    ├── static_frame.py         # StaticFrame
    ├── lateral_track.py        # LateralTrack
    ├── kubrick_symmetrical.py  # KubrickSymmetrical
    ├── crane_arm.py            # CraneUp, CraneDown, JibSwoop
    ├── whip_pan.py             # WhipPan
    ├── reveal_tilt_up.py       # RevealTiltUp
    ├── dolly_zoom.py           # DollyZoom
    ├── oner_walk_and_talk.py   # OnerWalkAndTalk
    └── handheld_simulated.py   # HandheldSimulated

How It Works

The Arm Is the Camera

Every skill positions the arm using ManipulationInterface.move_to_cartesian_pose(x, y, z, roll, pitch, yaw, duration). The arm end-effector IS the camera — moving it changes the shot framing, angle, and height.

The shared arm_camera_poses.py module provides:

  • 5 named poses: CAMERA_FORWARD_NEUTRAL, CAMERA_FORWARD_LOW (heroic), CAMERA_FORWARD_HIGH (god shot), CAMERA_STOWED (transport), CAMERA_BOOM_OUT (max reach)
  • camera_pose_from_tilt(degrees) — maps cinematographic tilt angles (-25° to +15°) to arm Cartesian poses
  • interpolate_camera_pose(start, end, t) — smooth transitions between any two camera positions
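
The two helpers listed above could look roughly like the sketch below. Only the function names and the -25° to +15° tilt range come from this README. The CameraPose tuple, the reach and height constants, the linear tilt-to-height mapping, and the smoothstep easing are all assumptions made for illustration, not values taken from arm_camera_poses.py.

```python
import math
from typing import NamedTuple


class CameraPose(NamedTuple):
    """Hypothetical pose container: metres for position, radians for angles."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


# Tilt range quoted in this README; the heights and reach are placeholder values.
MIN_TILT_DEG, MAX_TILT_DEG = -25.0, 15.0
Z_AT_MIN_TILT, Z_AT_MAX_TILT = 0.10, 0.35  # assumed: low camera looks up, high looks down
FORWARD_REACH_M = 0.25                     # assumed forward offset of the camera


def camera_pose_from_tilt(degrees: float) -> CameraPose:
    """Clamp the tilt to the supported range and pick a matching camera height."""
    tilt = max(MIN_TILT_DEG, min(MAX_TILT_DEG, degrees))
    fraction = (tilt - MIN_TILT_DEG) / (MAX_TILT_DEG - MIN_TILT_DEG)
    z = Z_AT_MIN_TILT + fraction * (Z_AT_MAX_TILT - Z_AT_MIN_TILT)
    # Assumed sign convention: positive tilt looks down, so pitch = +tilt.
    return CameraPose(FORWARD_REACH_M, 0.0, z, 0.0, math.radians(tilt), 0.0)


def interpolate_camera_pose(start: CameraPose, end: CameraPose, t: float) -> CameraPose:
    """Blend two poses with smoothstep easing; t=0 gives start, t=1 gives end.

    Euler angles are blended linearly, which is only reasonable for the
    small tilt changes assumed here.
    """
    t = max(0.0, min(1.0, t))
    eased = t * t * (3.0 - 2.0 * t)
    return CameraPose(*(a + (b - a) * eased for a, b in zip(start, end)))
```

Sampling interpolate_camera_pose at increasing t values and sending each result to move_to_cartesian_pose would give an eased move between two framings. Whether the real module eases or blends linearly is not stated here.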

The Base Is the Dolly Track

The differential-drive base provides:

  • Forward/backward dolly moves
  • Rotational orbits and whip pans
  • Lateral tracking (via rotate-drive-rotate)
  • Combined moves (jib swoop = base forward + arm rise)
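
As a rough illustration of the rotate-drive-rotate pattern, the sketch below sequences the two MobilityInterface calls documented in the API Reference. RecordingMobility is a hypothetical stand-in used for dry runs, and the default speed, the injected sleep function, and the "positive angle turns left" convention are assumptions.

```python
import math
import time


class RecordingMobility:
    """Stand-in for MobilityInterface that only records calls (for dry runs)."""

    def __init__(self):
        self.calls = []

    def rotate(self, angle_radians):
        self.calls.append(("rotate", angle_radians))

    def send_cmd_vel(self, linear_x, angular_z, duration):
        self.calls.append(("send_cmd_vel", linear_x, angular_z, duration))


def lateral_track(mobility, distance_m, speed_mps=0.1, direction="left", sleep=time.sleep):
    """Sidestep by turning 90 degrees, driving straight, then turning back."""
    if distance_m <= 0 or speed_mps <= 0:
        raise ValueError("distance_m and speed_mps must be positive")
    # Assumed convention: positive angles turn the base to the left.
    turn = math.pi / 2 if direction == "left" else -math.pi / 2
    duration = distance_m / speed_mps

    mobility.rotate(turn)                            # blocking: face along the track
    mobility.send_cmd_vel(speed_mps, 0.0, duration)  # non-blocking: start driving
    sleep(duration)                                  # wait for the drive to finish
    mobility.rotate(-turn)                           # blocking: face the subject again
```

Because the base points along the track while it drives, a real skill would presumably also yaw the arm so the camera stays on the subject. That part is omitted here.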

IK Validation

Crane skills validate target poses with solve_ik() before committing to a move. If the target is unreachable within the arm's workspace, the skill fails gracefully instead of hitting joint limits.
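
The validate-then-move pattern can be sketched as follows. The solve_ik and move_to_cartesian_pose signatures are the ones listed in the API Reference. StubManipulation, its spherical reach check, the crane_to helper, and the return format are hypothetical and exist only to make the example runnable without a robot.

```python
class StubManipulation:
    """Stand-in for ManipulationInterface with a toy spherical reach check."""

    def __init__(self, max_reach_m=0.40):
        self.max_reach_m = max_reach_m
        self.moves = []

    def solve_ik(self, x, y, z, roll, pitch, yaw, timeout):
        # Toy rule: reachable if the target lies within max_reach_m of the arm base.
        reachable = (x * x + y * y + z * z) ** 0.5 <= self.max_reach_m
        return [0.0] * 6 if reachable else None

    def move_to_cartesian_pose(self, x, y, z, roll, pitch, yaw, duration):
        self.moves.append((x, y, z, roll, pitch, yaw, duration))


def crane_to(manipulation, target, duration=4.0, ik_timeout=1.0):
    """Move to target (x, y, z, roll, pitch, yaw) only if IK finds a solution."""
    if manipulation.solve_ik(*target, ik_timeout) is None:
        # Fail before any motion is commanded, so the arm never approaches a joint limit.
        return False, "target pose is outside the arm workspace"
    manipulation.move_to_cartesian_pose(*target, duration)
    return True, "moved"
```

Returning a status instead of raising keeps the decision with the caller, for example letting an agent suggest a lower rise height. How the actual skills report failure is not specified in this README.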

Talking to the Director Agent

The Cinematographer agent translates natural language into skill calls:

"push in slow"          → dolly_in(intensity="slow")
"crash in"              → dolly_in(intensity="crash")
"hero shot"             → orbit_subject(camera_tilt_degrees=-15)
"Kubrick hallway"       → kubrick_symmetrical()
"villain reveal"        → reveal_tilt_up(start_tilt_degrees=-25, end_tilt_degrees=10)
"crane up, god shot"    → crane_up(rise_height_meters=0.20, end_camera_tilt_degrees=-20)
"handheld, urgent"      → handheld_simulated(jitter_intensity="heavy")
"walk and talk, oner"   → oner_walk_and_talk(...)
"vertigo shot"          → dolly_zoom(direction="in")
"whip pan left"         → whip_pan(direction="left", degrees=180)

Shot composition defaults:

  • Heroic angle: camera_tilt = -10° to -20° (looking up at subjects)
  • Neutral: camera_tilt = 0°
  • Vulnerable/dominating: camera_tilt = +10° to +15° (looking down)
  • Slow/dramatic: 6–12s duration
  • Urgent/action: 0.5–2s duration
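
To make the phrase-to-skill mapping in this section concrete, here is a deliberately simple keyword lookup. The phrases and arguments are copied from the examples above. The matching logic is a stand-in: the README describes the agent as NLP-driven, and nothing here is meant to represent how cinematographer.py actually resolves commands.

```python
# Phrase table copied from the examples in this section; the lookup is illustrative.
PHRASE_TO_SKILL = {
    "push in slow": ("dolly_in", {"intensity": "slow"}),
    "crash in": ("dolly_in", {"intensity": "crash"}),
    "hero shot": ("orbit_subject", {"camera_tilt_degrees": -15}),
    "kubrick hallway": ("kubrick_symmetrical", {}),
    "vertigo shot": ("dolly_zoom", {"direction": "in"}),
    "whip pan left": ("whip_pan", {"direction": "left", "degrees": 180}),
}


def parse_director_command(text):
    """Return (skill_name, kwargs) for the first known phrase in text, else None."""
    # Lowercase and collapse whitespace so "HERO   shot" still matches.
    normalized = " ".join(text.lower().split())
    for phrase, (skill, kwargs) in PHRASE_TO_SKILL.items():
        if phrase in normalized:
            return skill, dict(kwargs)  # copy so callers cannot mutate the table
    return None


print(parse_director_command("Give me a hero shot, low angle"))
```

A lookup like this cannot combine modifiers such as "low angle" or "slowly" with the base shot, which is the gap a language-model agent would be expected to fill.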

Post-Processing Pipeline

Raw footage from the arm camera feeds into AI video tools:

| Shot Type | Post Method | Notes |
| --- | --- | --- |
| Orbit, dolly, lateral, oner | Video-to-video | Preserves real motion, adds style |
| Static frame, crane | Either | Image-to-video can generate motion |
| Dolly zoom | Custom pipeline | Skill emits zoom ratio metadata for digital counter-zoom |
| Any shot | Stabilization | Runway/Veo smooths remaining jitter |

The dolly_zoom skill emits a JSON manifest with exact zoom ratios for the post pipeline to apply the Vertigo counter-zoom effect.
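
One way such zoom ratios could be computed is sketched below. It assumes a pinhole-style model in which apparent subject size scales with zoom divided by distance, a constant dolly speed, and a made-up manifest layout (the keys "skill", "fps", "frames", "frame", and "zoom" are hypothetical). The real manifest schema is not documented here, and a 160° lens will deviate from the pinhole assumption toward the frame edges.

```python
import json


def counter_zoom_manifest(start_distance_m, end_distance_m, duration_s, fps=30):
    """Build per-frame digital zoom factors that hold the subject size constant."""
    if min(start_distance_m, end_distance_m) <= 0 or duration_s <= 0:
        raise ValueError("distances and duration must be positive")
    frame_count = max(2, round(duration_s * fps) + 1)
    nearest = min(start_distance_m, end_distance_m)
    frames = []
    for i in range(frame_count):
        progress = i / (frame_count - 1)
        distance = start_distance_m + progress * (end_distance_m - start_distance_m)
        # Apparent size scales with zoom / distance, so zoom must track distance.
        # Dividing by the nearest distance keeps every factor >= 1 (crop only).
        frames.append({"frame": i, "zoom": round(distance / nearest, 4)})
    return {"skill": "dolly_zoom", "fps": fps, "frames": frames}


manifest = counter_zoom_manifest(1.0, 0.5, duration_s=2.0)
print(json.dumps(manifest["frames"][:2]))
```

For a dolly-in from 1.0 m to 0.5 m the factor falls from 2.0 to 1.0: the footage starts cropped and widens as the camera approaches, so the subject holds its size while the background appears to stretch away.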

Deployment

Drop the mars/ directory into your MARS robot workspace:

mkdir -p ~/skills ~/agents
cp -r mars/skills/. ~/skills/   # trailing /. copies the directory contents, not the directory itself
cp -r mars/agents/. ~/agents/

Skills auto-register by class name. The agent references them by their name property (snake_case).
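
The registration mechanism itself belongs to the MARS runtime and is not shown in this README. As a hedged illustration of the general idea, the sketch below registers subclasses of a hypothetical Skill base class and derives a snake_case name from each class name. SKILL_REGISTRY, Skill, and to_snake_case are invented for this example.

```python
import re

SKILL_REGISTRY = {}


def to_snake_case(class_name):
    """Convert CamelCase to snake_case, e.g. OrbitSubject -> orbit_subject."""
    # Insert an underscore before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


class Skill:
    """Hypothetical base class: every subclass registers itself when defined."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        SKILL_REGISTRY[to_snake_case(cls.__name__)] = cls

    @property
    def name(self):
        return to_snake_case(type(self).__name__)


class OrbitSubject(Skill):
    pass


class OnerWalkAndTalk(Skill):
    pass


print(sorted(SKILL_REGISTRY))
```

This naive conversion splits every capital letter, so a class with an acronym in its name would need an explicit name override.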

API Reference

ManipulationInterface (arm/camera control)

manipulation.move_to_cartesian_pose(x, y, z, roll, pitch, yaw, duration)
manipulation.get_current_end_effector_pose()  # → {position, orientation}
manipulation.goto_joint_state(joints)          # 6 joint values
manipulation.solve_ik(x, y, z, roll, pitch, yaw, timeout)  # → joints or None

MobilityInterface (base/dolly control)

mobility.rotate(angle_radians)                        # blocking
mobility.send_cmd_vel(linear_x, angular_z, duration)  # non-blocking
mobility.rotate_in_place(angular_speed, duration)      # non-blocking
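
A combined move such as the jib swoop uses both interfaces at once. The sketch below relies on send_cmd_vel being non-blocking, as noted above, so the arm move can start while the base is still rolling. RecordingRobot is a hypothetical stand-in, and the layout of the pose dictionary (an (x, y, z) tuple plus a (roll, pitch, yaw) tuple), the default speeds, and the rise height are assumptions.

```python
class RecordingRobot:
    """Stand-in exposing calls from both README interfaces; it only logs them."""

    def __init__(self):
        self.log = []

    def send_cmd_vel(self, linear_x, angular_z, duration):
        self.log.append(("send_cmd_vel", linear_x, angular_z, duration))

    def get_current_end_effector_pose(self):
        # Assumed layout: position as (x, y, z), orientation as (roll, pitch, yaw).
        return {"position": (0.25, 0.0, 0.15), "orientation": (0.0, 0.0, 0.0)}

    def move_to_cartesian_pose(self, x, y, z, roll, pitch, yaw, duration):
        self.log.append(("move_to_cartesian_pose", x, y, z, roll, pitch, yaw, duration))


def jib_swoop(mobility, manipulation, rise_m=0.10, forward_speed_mps=0.05, duration_s=4.0):
    """Start the base rolling forward, then raise the arm over the same duration."""
    pose = manipulation.get_current_end_effector_pose()
    x, y, z = pose["position"]
    roll, pitch, yaw = pose["orientation"]
    mobility.send_cmd_vel(forward_speed_mps, 0.0, duration_s)  # non-blocking
    manipulation.move_to_cartesian_pose(x, y, z + rise_m, roll, pitch, yaw, duration_s)


robot = RecordingRobot()
jib_swoop(robot, robot)
```

A production version would likely run solve_ik on the raised pose first, in line with the IK validation described for the crane skills.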

Calibration Notes

The pose values in arm_camera_poses.py are conservative estimates within the 40cm reach envelope. On first deployment:

  1. Run each named pose and verify the camera frames subjects as expected
  2. Adjust CAMERA_FORWARD_HIGH z-value if the arm can reach higher
  3. Verify camera_pose_from_tilt() output matches your framing expectations
  4. Check whether the camera needs a roll/pitch offset relative to the end-effector. The poses currently assume the camera faces outward along the end-effector's +X axis

Built At

RoboHacks 2026. The constraint was "make a robot shoot a short film with no human camera operator." The answer: treat the arm like a crane and talk to it like a DP.
