No mouse. No touchpad. No hardware. Just a webcam.
Getting Started · Gesture Guide · Calibration · Architecture · Troubleshooting
AI Air Mouse uses your webcam and MediaPipe's hand landmark model to track your hands in real time and translate gestures into mouse and keyboard actions — entirely on your device, with zero cloud calls.
The interaction model is inspired by Apple Vision Pro: your thumb is always the trigger, your index finger is always the pointer, and there are no ambiguous multi-finger combinations. The right hand owns the mouse; the left hand optionally controls Windows system shortcuts. Together they support clicks, drag, scroll, zoom, window snapping, and more.
Webcam → MediaPipe (21 landmarks per hand) → Gesture State Machines → SendInput / pynput → Windows
Everything runs in a tight async loop. Frame capture never blocks inference. Cursor smoothing uses the 1-euro filter — zero perceptible lag at high speed, zero jitter at rest.
| Minimum | Recommended | |
|---|---|---|
| Python | 3.12 | 3.13 |
| Windows | 10 | 11 |
| Webcam | Any DirectShow camera | 60 FPS capable |
| RAM | 2 GB | 4 GB+ |
| Privileges | Standard user | Administrator (for priority tuning) |
git clone https://github.com/Rcidshacker/airmouse.git
cd airmouse
# Create virtual environment
python -m venv .venv
.\.venv\Scripts\pip install -r requirements.txt
# Verify all subsystems are healthy
.\.venv\Scripts\python test_startup.py
# Launch
.\.venv\Scripts\python main.pyPress Ctrl+C to exit. All Windows settings (timer resolution, pointer acceleration, process priority) are automatically restored on shutdown.
On startup you'll see logs like:
[INFO] core.tracker: MediaPipe HandLandmarker initialized (max_hands=2)
[INFO] core.gestures.orchestrator: GestureOrchestrator ready (dual-hand mode)
[INFO] airmouse.main: Running. Press Ctrl+C to exit.
[INFO] airmouse.main: FPS: 31.4
The right hand is always active. Index fingertip drives the cursor.
| Gesture | How to do it | Action |
|---|---|---|
| Point | Index finger extended, others relaxed | Move cursor |
| Pinch | Touch thumb tip to index tip, release | Left click |
| Double pinch | Two pinches within 0.4 s | Double click |
| Pinch + hold | Touch thumb to index, hold 0.4 s | Start drag — release to drop |
| Thumb + middle | Touch thumb tip to middle tip | Right click |
| Peace sign | Index + middle up, thumb + ring + pinky curled | Enter scroll mode |
| Wrist flick | (while in scroll mode) flick wrist up/down | Scroll — faster flick = more ticks |
| Fist | Close all fingers | Freeze cursor — open hand to resume |
Tip: The fist gesture is your rest state. Drop your hand, make a fist, and nothing will move or click until you extend your fingers again.
The left hand adds a system shortcut layer on top of right-hand mouse control. You can ignore it entirely — right-hand behavior is unaffected.
| Gesture | Action |
|---|---|
| Open palm (all 5 fingers extended) | Win+D — Show Desktop |
| V-sign (index + middle up, others curled) | Alt+Tab — Cycle Windows |
| Four fingers (index–pinky up, thumb curled) | Win+Tab — Task View |
| Thumb + index pinch | Win — Start Menu |
| Fist (held) | Hold Ctrl — combine with right-hand click for multi-select |
| Fist (released) | Release Ctrl |
| Index pointing + wrist swipe right | Alt+Left — Go Back |
Ctrl + Click workflow: Make a left-hand fist to hold Ctrl, then left-click with your right hand to multi-select items. Release the fist to drop Ctrl.
Activated only when both hands are visible.
| Gesture | Action |
|---|---|
| Both hands pinch + spread wrists apart | Ctrl+= — Zoom In |
| Both hands pinch + bring wrists together | Ctrl+− — Zoom Out |
| Left open palm + right fist swipe right | Win+Right — Snap window right |
| Left open palm + right fist swipe left | Win+Left — Snap window left |
| Both fists held for 1 second | Win+L — Lock Screen |
All parameters live in config.py — never scattered across modules.
Cursor feel
| Symptom | Parameter | Fix |
|---|---|---|
| Cursor shakes at rest | ONE_EURO_MINCUTOFF |
Decrease → try 0.3 |
| Cursor lags during fast motion | ONE_EURO_BETA |
Increase → try 0.05 |
| Cursor drifts near edges | TRACKPAD_MARGIN |
Increase |
Click sensitivity
| Symptom | Parameter | Fix |
|---|---|---|
| Left click not triggering | THUMB_INDEX_CLICK_DIST |
Increase → try 0.055 |
| Accidental left clicks | THUMB_INDEX_CLICK_DIST |
Decrease → try 0.04 |
| Right click not triggering | THUMB_MIDDLE_CLICK_DIST |
Increase |
| Drag activates too quickly | DRAG_HOLD_SECONDS |
Increase → try 0.5 |
Scroll & zoom
| Symptom | Parameter | Fix |
|---|---|---|
| Scroll too sensitive | WRIST_VELOCITY_THRESHOLD |
Increase → try 0.012 |
| Scroll not registering | WRIST_VELOCITY_THRESHOLD |
Decrease → try 0.005 |
| Zoom fires too easily | ZOOM_WRIST_DELTA |
Increase → try 0.07 |
Performance
| Symptom | Parameter | Fix |
|---|---|---|
| Low FPS | CAMERA_WIDTH / CAMERA_HEIGHT |
Reduce to 640 / 480 |
| Camera buffer delay | CAMERA_BUFFER_SIZE |
Keep at 1 |
MediaPipe dual-hand inference is heavier than single-hand. ~30 FPS at 1280×720 is normal.
┌─────────────────────────────────────────────────────────────────┐
│ main.py │
│ Windows OS tuning → inference loop → cleanup on exit │
└───────────────┬─────────────────────────────────────────────────┘
│
┌───────────┼───────────────────────────────┐
│ │ │
▼ ▼ ▼
camera.py tracker.py actuator.py
Async MediaPipe SendInput (move)
DirectShow HandLandmarker ──────────► pynput (click/key)
daemon → HandsResult
thread (left, right)
│
▼
gestures/orchestrator.py
Routes HandsResult to sub-processors
│
┌───────────┼───────────────┐
│ │ │
▼ ▼ ▼
right_hand left_hand two_hand
Cursor Win shortcuts Zoom / Snap
Clicks Alt+Tab Lock screen
Scroll Ctrl modifier
Fist lock
│
▼
filter.py + display.py
1-euro smooth DPI-aware
cursor coord map
Thumb as exclusive trigger — Index stays as pointer; thumb fires everything. No ambiguous combos like "is that a peace sign or a scroll trigger?"
Wrist velocity for scroll — Flicking your wrist is anatomically natural and speed-proportional. Faster flick → 1–5 scroll ticks, clamped and cooldown-gated.
Fist = rest state — Put your hand down and make a fist. Cursor freezes, no phantom clicks. Open your fingers to resume exactly where you left off.
SendInput over mouse_event — mouse_event is a deprecated Vista-era compatibility shim. SendInput with MOUSEEVENTF_VIRTUALDESK correctly spans all monitors in a virtual desktop.
DirectShow over MSMF — CAP_DSHOW delivers lower capture latency than the default Media Foundation backend on Windows.
1-euro filter over Kalman — Adaptive speed-dependent low-pass filter. Strict reduction in cursor lag vs. Kalman at equivalent jitter suppression (Casiez et al., CHI 2012).
MediaPipe handedness flip — MediaPipe labels hands from the camera's point of view. Camera-"Left" = user's right (webcam is a mirror). The tracker flips the label before populating HandsResult.right.
airmouse/
├── main.py Entry point — Windows setup + inference loop
├── config.py Every tunable constant, one file
├── requirements.txt Dependencies
├── test_startup.py Smoke tests — run before first use
├── models/ hand_landmarker.task (git-ignored, ~9 MB)
└── core/
├── camera.py Async frame capture via daemon thread
├── filter.py 1-euro filter implementation
├── display.py Multi-monitor virtual desktop + DPI mapping
├── tracker.py MediaPipe wrapper → HandsResult dataclass
├── actuator.py Mouse + keyboard injection
└── gestures/
├── orchestrator.py Coordinates all three processors
├── right_hand.py 5-state machine: IDLE/PENDING/DRAG/SCROLL/LOCKED
├── left_hand.py System shortcuts + Ctrl modifier state
├── two_hand.py Zoom / snap / lock-screen
└── utils.py dist3d, is_fist, is_peace_sign, is_open_palm, …
Camera fails to open
- Open Device Manager and confirm the webcam appears under Cameras or Imaging devices
- Try setting
CAMERA_INDEX = 1or2inconfig.pyif you have multiple cameras - Ensure privacy shutters are open on laptops with built-in shutters
- Check that no other app (Teams, Zoom, OBS) has exclusive camera access
Cursor maps to wrong screen positions
The app sets PROCESS_PER_MONITOR_DPI_AWARE at startup. If you use mixed DPI scaling across monitors:
- Set all displays to the same scale (100% or 125%) for initial testing
- Ensure your primary monitor is set correctly in Windows Display Settings
Gestures not recognized
- Lighting: Diffuse, even lighting works best. Avoid strong backlighting.
- Distance: Keep your hand 40–70 cm from the camera.
- Background: High-contrast backgrounds help MediaPipe. Avoid skin-tone walls.
- Thresholds: Tweak
THUMB_INDEX_CLICK_DISTorWRIST_VELOCITY_THRESHOLDinconfig.py.
Low FPS / stuttering
- Dual-hand MediaPipe inference is ~2× heavier than single-hand — 30 FPS at 1280×720 is expected
- Reduce resolution to 640×480 in
config.pyfor ~50% FPS improvement - Close other webcam-consuming applications
- Run as Administrator for
HIGH_PRIORITY_CLASSand MMCSS "Games" scheduling
Permission warnings in logs
HIGH_PRIORITY_CLASS and MMCSS thread registration require Administrator privileges. The application functions normally without elevation but may drop frames under CPU load. Run the terminal as Administrator for best performance.
MIT License — do whatever you want, just keep the attribution.