AirVolumeControl is a real-time gesture-based system that allows users to control their computer’s volume without touching the keyboard or mouse.
Using MediaPipe hand tracking and OpenCV, the program detects the distance between the thumb and index finger and dynamically maps it to the system volume level.
The project also includes a futuristic holographic UI overlay, making the interaction visually engaging and intuitive.
This project demonstrates concepts from Computer Vision, Human–Computer Interaction (HCI), and real-time gesture recognition systems.
- ✋ Real-time hand tracking
- 🎚 Touchless volume control
- 🎨 Holographic hand skeleton visualization
- 📊 Dynamic gradient volume bar
- ⚡ Live FPS counter
- 🟢 Gesture engagement detection
- 🧠 Smooth volume interpolation
- 🔄 Automatic MediaPipe model download
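The smooth volume interpolation listed above can be sketched as simple per-frame exponential smoothing. This is an illustrative implementation, not the project's exact code, and the `alpha` value is an assumed constant:

```python
def smooth(current: float, target: float, alpha: float = 0.2) -> float:
    """Move a fraction `alpha` of the way toward the target each frame.

    Smaller alpha -> slower, steadier volume changes; alpha = 1 -> no smoothing.
    """
    return current + alpha * (target - current)

# Example: easing a volume value toward a new target over three frames
vol = 0.0
for _ in range(3):
    vol = smooth(vol, 1.0, alpha=0.5)  # 0.5, then 0.75, then 0.875
```

Because each frame only moves part of the way toward the raw gesture reading, jitter in the hand tracking does not translate into jumpy volume changes.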
- Python
- OpenCV
- MediaPipe Tasks API
- NumPy
- PyCAW (Python Core Audio Windows library)
- Computer Vision
```mermaid
graph TD
    A[Webcam Input] --> B[OpenCV Frame Capture]
    B --> C[MediaPipe Hand Landmark Detection]
    C --> D[Thumb Tip Landmark]
    C --> E[Index Tip Landmark]
    D --> F[Distance Calculation]
    E --> F
    F --> G[Distance to Volume Mapping]
    G --> H[PyCAW Audio API]
    H --> I[System Volume Adjustment]
    B --> J[UI Rendering]
    J --> K[Holographic Hand Skeleton]
    J --> L[Volume Bar Visualization]
    J --> M[FPS Counter]
    J --> N[Gesture Engagement Banner]
```
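The thumb-tip and index-tip stages of the pipeline above can be sketched as follows. Indices 4 and 8 are MediaPipe's standard landmark IDs for the thumb tip and index fingertip; the `landmarks` sequence is assumed to hold (x, y) pairs in normalized image coordinates, as the hand landmark detector produces:

```python
import math

THUMB_TIP = 4   # MediaPipe hand landmark index for the thumb tip
INDEX_TIP = 8   # MediaPipe hand landmark index for the index fingertip

def pinch_distance(landmarks):
    """Euclidean distance between the thumb and index fingertips.

    `landmarks` is a sequence of (x, y) pairs in normalized [0, 1]
    image coordinates, one entry per detected hand landmark.
    """
    tx, ty = landmarks[THUMB_TIP]
    ix, iy = landmarks[INDEX_TIP]
    return math.hypot(ix - tx, iy - ty)
```

This single scalar is the only gesture signal the rest of the pipeline needs: everything downstream maps it to a volume level.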
```mermaid
flowchart LR
    Camera --> FrameCapture
    FrameCapture --> HandDetection
    HandDetection --> LandmarkExtraction
    LandmarkExtraction --> DistanceMeasurement
    DistanceMeasurement --> VolumeInterpolation
    VolumeInterpolation --> SystemAudioControl
    SystemAudioControl --> UIOverlay
```
- Python 3.8+
- Webcam
- Windows OS (required for PyCAW audio control)
```bash
git clone https://github.com/kshitiz-arc/AirVolumeControl.git
cd AirVolumeControl
pip install -r requirements.txt
python AirVolumeControl.py
```

Press Q to exit.
| Gesture | Action |
|---|---|
| Thumb and index finger close | Volume decrease |
| Thumb and index finger apart | Volume increase |
The system measures the distance between finger landmarks and maps it smoothly to the system volume range.
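A minimal sketch of that mapping using `numpy.interp`. The distance bounds here are illustrative assumptions, not the project's calibrated values:

```python
import numpy as np

def distance_to_volume(dist, d_min=0.03, d_max=0.25):
    """Map a pinch distance to a 0..1 volume scalar.

    Distances at or below d_min give volume 0; at or above d_max give 1.
    np.interp clamps outside the input range, so no extra clipping is needed.
    On Windows, the resulting scalar would be passed to the PyCAW audio
    endpoint (e.g. via SetMasterVolumeLevelScalar) to set system volume.
    """
    return float(np.interp(dist, [d_min, d_max], [0.0, 1.0]))
```

Clamping at both ends keeps small tracking jitter near a fully closed or fully open pinch from oscillating the volume.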
Potential extensions include:
- Multi-gesture control system
- Gesture-based media playback control
- Cross-platform audio support (Linux / macOS)
- AI-based gesture classification
- Augmented reality interface overlays
- GPU acceleration for improved performance
This project demonstrates ideas used in:
- Human–Computer Interaction (HCI)
- Computer Vision Interfaces
- Gesture Recognition Systems
- Smart Environment Control
- Contactless Interaction Systems
Kshitiz
Mathematics Graduate · Computer Vision Enthusiast · Research Developer
GitHub: https://github.com/kshitiz-arc
If you find this project interesting, consider starring the repository.
This project is released under the MIT License.