| Branch | Version | Status |
|---|---|---|
master |
0.3.2 |
| Platform | Python | System Dependencies |
|---|---|---|
| Linux (X11/Wayland) | 3.12+ |
libgl1, libglib2.0-0, notify-send, xdotool, ydotool |
| macOS (ARM64) | N/A |
N/A |
| Windows (x86_64) | N/A |
N/A |
- Docs: [Link to documentation]
Gesture Vision is a gesture control system, developed in Python to automate desktop actions through real-time gesture recognition.
The system works through a computer vision pipeline where MediaPipe processes the camera stream to detect hand landmarks, coordinated by an asynchronous FastAPI backend. This provides precise event control with minimal impact on system resources.
- High Precision Tracking: MediaPipe Hand Landmarker implementation for low-latency hand landmark detection.
- Robust Trigger Control: Validation system based on gesture hold time and cooldown periods to eliminate false positives.
- Swipe Gestures: Workspace switching with horizontal hand movements - left/right.
- Screenshot Capture: Detection of the "peace" gesture (index and middle fingers up) to take screenshots.
- Invisible Mode (Background): Fully background operation without preview windows, optimizing GPU/CPU usage.
- Keyboard Shortcuts: Associate gestures with system shortcuts like ctrl+shift+q to execute any action.
- X11 and Wayland Support: Automatically detects whether to use xdotool (X11) or ydotool (Wayland).
- Native Notifications: Integration with desktop notification system to instantly confirm actions.
- Native Deployment: Simplified installation via
pipwith dedicated CLI commands for service management. - Web Panel: Graphical interface to configure gestures, shortcuts, and timing.
# Install from PyPI
pip install gesturevision# First time: automatically install system dependencies
gesturevision-start --install-deps
# Start the system
gesturevision-startAccess the web panel: http://localhost:8080
There you can:
- Start/stop the system
- Configure gesture and cooldown timing
- Add gestures and associate keyboard shortcuts
| Gesture | Description |
|---|---|
| ✌️ Peace | Index and middle fingers up |
| ✊ Fist | Closed hand |
| ✋ Open Hand | 5 fingers extended |
| 👍 Three Fingers | Index, middle, and ring fingers up |
| ☝️ Point | Only index finger up |
You can use any of the predefined shortcuts or write your own:
- Screenshot:
Print - Close window:
alt+F4 - Minimize all:
super+d - Lock screen:
super+l - Switch window:
alt+Tab - New tab:
ctrl+t - Close tab:
ctrl+w - Copy/Paste:
ctrl+c/ctrl+v - And many more...
src/gesturevision/main_vision.py: Gesture detection and capture engine.src/gesturevision/api/: FastAPI backend for control and configuration.src/gesturevision/static/: Control panel user interface.pyproject.toml: Package dependencies and entry points definition.gesturevision/assets/hand_landmarker.task: Pre-trained AI model.
Settings are stored in: ~/gesturevision/config.json
Structure:
{
"gesture_hold_seconds": 1.0,
"cooldown_seconds": 1.5,
"gestures": [
{
"name": "Peace (screenshot)",
"gesture_type": "peace_sign",
"required_hold_seconds": 1.0,
"enabled": true,
"action": "screenshot",
"shortcut": null
}
]
}docker-compose up --buildAccess the panel at: http://localhost:8080

