# pointer

A computer vision-based mouse controller that uses hand tracking for cursor movement and gesture-based clicking. Built with MediaPipe and OpenCV.

## Features

- Real-time hand detection and tracking using MediaPipe
- Cursor control via index finger movement
- Click detection through thumb-finger proximity
- Configurable motion smoothing to reduce jitter
- Motion prediction for handling brief occlusions
- Coordinate calibration system for improved accuracy
- Multiple interaction modes (single-click and drag-hold)
## Requirements

- Python 3.11 or 3.12
- Webcam
- [uv](https://github.com/astral-sh/uv) (dependencies are declared in `pyproject.toml` and locked in `uv.lock`)
## Installation

Requires uv. The project is pinned to Python 3.11 / 3.12.

```bash
git clone <repository-url>
cd pointer
uv sync
```
## Usage

Launch the application with:

```bash
uv run pointer
# or
uv run python -m finger_tracker
```

For programmatic usage:

```python
from finger_tracker import FingerTracker

tracker = FingerTracker()
tracker.run()
```

## Gestures

- Move index finger to control cursor
- Thumb ↔ index finger PIP (first knuckle) — left click / drag
- Thumb ↔ middle finger PIP — right click
- Closed fist, move vertically — scroll
- Open palm held for ~1 second — toggle pause (cursor + clicks frozen)
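The click gestures above hinge on thumb-to-knuckle proximity. A minimal sketch of how such a classifier could work, using MediaPipe's standard hand-landmark indices (thumb tip = 4, index PIP = 6, middle PIP = 10); the function names and the `0.05` threshold are illustrative, not the project's actual API:

```python
import math

# MediaPipe hand-landmark indices (per the official hand model):
THUMB_TIP = 4
INDEX_PIP = 6    # index finger first knuckle
MIDDLE_PIP = 10  # middle finger first knuckle

def pinch_distance(landmarks, a, b):
    """Euclidean distance between two normalized (x, y) landmarks."""
    ax, ay = landmarks[a]
    bx, by = landmarks[b]
    return math.hypot(ax - bx, ay - by)

def classify_click(landmarks, threshold=0.05):
    """Return 'left', 'right', or None based on thumb-to-PIP proximity."""
    if pinch_distance(landmarks, THUMB_TIP, INDEX_PIP) < threshold:
        return "left"
    if pinch_distance(landmarks, THUMB_TIP, MIDDLE_PIP) < threshold:
        return "right"
    return None
```

In practice the project also debounces these transitions (see `gestures.py`), so a single pinch does not fire on every frame.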
## Keyboard Shortcuts

| Key | Action |
| --- | --- |
| `q` | Exit application |
| `m` | Toggle between HOLD (drag) and CLICK modes |
| `c` | Enter calibration mode |
| `d` | Delete saved calibration |
| `s` | Toggle smoothing on/off |
| `f` | Cycle smoothing filter (one_euro → legacy → kalman) |
| `[` / `]` | Adjust smoothing parameter (filter-dependent) |
| `+` / `-` | Adjust click sensitivity threshold |
| `r` | Reset mouse state |
## Calibration

1. Press `c` to enter calibration mode
2. Position index finger at each corner of the desired tracking area
3. Press `SPACE` at each corner (4 points required)
4. Calibration data is saved automatically to `calibration.json`
5. Press `ESC` to cancel
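The four calibrated corners define a sub-region of the camera frame that is stretched over the whole screen. A rough sketch of that mapping, assuming simple bounding-box interpolation (function names are hypothetical; the real logic lives in `calibration.py`):

```python
def make_mapper(corners, screen_w, screen_h):
    """Build a camera-to-screen mapping from four calibrated corner points.

    corners: four (x, y) points in normalized camera space, captured at
    the corners of the desired tracking area during calibration.
    """
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def to_screen(x, y):
        # Clamp to the calibrated region, then interpolate linearly.
        u = min(max((x - x_min) / (x_max - x_min), 0.0), 1.0)
        v = min(max((y - y_min) / (y_max - y_min), 0.0), 1.0)
        return u * screen_w, v * screen_h

    return to_screen
```

Because of the clamp, a finger that strays outside the calibrated area pins the cursor to the nearest screen edge instead of jumping.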
## Configuration

Custom configurations can be passed to the tracker:

```python
from finger_tracker import Config, FingerTracker

config = Config(
    click_distance_threshold=10,
    smoothing_factor=0.7,
    click_mode='hold',
    prediction_enabled=True,
)

tracker = FingerTracker(config)
tracker.run()
```

## Project Structure

```
pointer/
├── finger_tracker/
│   ├── __init__.py      # Package exports
│   ├── __main__.py      # Module entry point
│   ├── config.py        # Immutable Config dataclass
│   ├── state.py         # Per-frame FrameState dataclass
│   ├── capture.py       # Sync + threaded camera capture
│   ├── calibration.py   # Coordinate mapping + calibration
│   ├── detection.py     # MediaPipe hand detection + downscale
│   ├── gestures.py      # Gesture classifier + debouncer
│   ├── mouse.py         # PyAutoGUI mouse control + hysteresis/debounce
│   ├── smoothing.py     # Smoother facade with pluggable strategies
│   ├── filters/         # One-Euro, Kalman, legacy MA+EMA strategies
│   └── tracker.py       # Orchestrator (update/render split)
├── tests/               # pytest suite
├── pyproject.toml
├── uv.lock
└── README.md
```
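`smoothing.py` is described as a facade over pluggable filter strategies. A minimal sketch of that pattern, with a single EMA strategy standing in for the real One-Euro / Kalman / legacy filters (all names here are illustrative, not the project's API):

```python
from typing import Protocol

class SmoothingStrategy(Protocol):
    def apply(self, x: float, y: float) -> tuple[float, float]: ...

class EmaStrategy:
    """Exponential moving average over successive cursor positions."""
    def __init__(self, alpha: float = 0.5):
        self.alpha = alpha
        self.state = None

    def apply(self, x, y):
        if self.state is None:
            self.state = (x, y)  # first sample passes through unchanged
        else:
            sx, sy = self.state
            self.state = (self.alpha * x + (1 - self.alpha) * sx,
                          self.alpha * y + (1 - self.alpha) * sy)
        return self.state

class Smoother:
    """Facade that delegates to the active strategy and can cycle filters."""
    def __init__(self, strategies):
        self.strategies = strategies          # name -> strategy instance
        self.active = next(iter(strategies))  # first registered strategy

    def cycle(self):
        names = list(self.strategies)
        self.active = names[(names.index(self.active) + 1) % len(names)]

    def apply(self, x, y):
        return self.strategies[self.active].apply(x, y)
```

The facade is what lets the `f` shortcut swap filters at runtime without the tracker knowing which filter is active.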
## How It Works

The application processes webcam frames through the following pipeline:

1. Hand landmark detection using MediaPipe
2. Extraction of index finger tip and thumb positions
3. Motion smoothing via moving average and exponential smoothing
4. Velocity-based position prediction during occlusions
5. Coordinate mapping from camera space to screen space
6. Click detection via Euclidean distance measurement
7. Mouse control through PyAutoGUI
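The velocity-based prediction step can be sketched as follows: when detection fails, the last observed velocity is extrapolated for a bounded number of frames, after which the cursor simply holds still. Class and parameter names are hypothetical:

```python
class MotionPredictor:
    """Extrapolate the last known velocity when the hand briefly disappears."""
    def __init__(self, max_predicted_frames=5):
        self.max_predicted_frames = max_predicted_frames
        self.last = None            # last known (x, y)
        self.velocity = (0.0, 0.0)  # per-frame displacement
        self.missed = 0             # consecutive frames without detection

    def update(self, point):
        """Feed a detected (x, y) point, or None when detection failed."""
        if point is not None:
            if self.last is not None:
                self.velocity = (point[0] - self.last[0],
                                 point[1] - self.last[1])
            self.last = point
            self.missed = 0
            return point
        # Occlusion: coast along the last velocity for a few frames only.
        if self.last is None or self.missed >= self.max_predicted_frames:
            return None
        self.missed += 1
        self.last = (self.last[0] + self.velocity[0],
                     self.last[1] + self.velocity[1])
        return self.last
```

Capping the number of predicted frames keeps a momentary occlusion smooth without letting the cursor drift off-screen when the hand actually leaves the frame.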
## Troubleshooting

**Jittery cursor**
- Increase smoothing by pressing `[`
- Ensure adequate lighting conditions
- Keep hand within camera field of view

**Clicks not registering**
- Increase the click threshold with `+`
- Touch thumb to the index finger knuckle (not the fingertip)
- Monitor the yellow distance indicator in the video feed

**Laggy cursor**
- Decrease smoothing with `]`
- Close resource-intensive applications
- Reduce camera resolution (modify source)

**Stuck mouse state**
- Press `r` to reset
- Switch to CLICK mode with `m`
## License

MIT