The Edge-AI Spatial Computing Interface
Kinetic OS is a privacy-first, zero-hardware spatial operating system. By bridging the gap between biological intent and digital action, Kinetic transforms any standard 2D webcam into a high-fidelity, low-latency gesture recognition engine. No physical mouse. No drivers. Just pure spatial mathematics.
Built by Team CodeDriven | 2026
Most webcam-based gesture trackers suffer from the "Midas Touch" (accidental clicks) and hand-drift. Kinetic OS solves this through advanced Human-Computer Interaction (HCI) mathematics:
- Hysteresis Thresholding: Implements a strict mathematical "dead zone" for all pinch gestures. A gesture engages when the normalized pinch distance drops below `0.04` and won't release until it rises above `0.06`, absorbing mid-air hand tremor and eliminating UI flickering.
- Vector Axis-Locking (The Analog Clutch): Kinetic isolates X and Y coordinate deltas upon clutch engagement. If you are scrubbing a timeline horizontally, vertical hand drift is mathematically ignored. The "two gears, one clutch" collision is eradicated.
- Spatial Array Hand-Sorting: Bypasses the unreliable handedness guessing of the underlying ML model. By sorting the detected-hand array by X-axis coordinate, the engine stays stable through crossed arms and extreme camera angles.
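The three mechanisms above can be sketched in a few lines of JavaScript. The engage/release thresholds (`0.04`/`0.06`) come from the text; the function names, the wrist-based sort key, and the dominant-axis rule are illustrative assumptions, not the project's actual implementation:

```javascript
// Normalized distance between two landmarks {x, y} (MediaPipe-style 0..1 coords).
function dist(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// Hysteresis pinch gate: once engaged, the pinch only releases past the
// *wider* threshold, so tremor inside the 0.04-0.06 band never flickers the state.
function createPinchGate(engageAt = 0.04, releaseAt = 0.06) {
  let engaged = false;
  return function update(thumbTip, indexTip) {
    const d = dist(thumbTip, indexTip);
    if (!engaged && d < engageAt) engaged = true;
    else if (engaged && d > releaseAt) engaged = false;
    return engaged;
  };
}

// Axis-locking clutch: keep only the dominant axis's delta, zero the other,
// so vertical drift is ignored during a horizontal scrub (and vice versa).
function axisLockedDelta(start, current) {
  const dx = current.x - start.x;
  const dy = current.y - start.y;
  return Math.abs(dx) >= Math.abs(dy) ? { dx, dy: 0 } : { dx: 0, dy };
}

// Spatial hand sorting: order detected hands left-to-right by wrist X
// instead of trusting the model's handedness label.
function sortHands(hands) {
  return [...hands].sort((a, b) => a.wrist.x - b.wrist.x);
}
```

The key property of the gate is that a pinch distance oscillating between the two thresholds cannot toggle the state — it has to fully cross the band in one direction.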
Kinetic OS is divided into four highly specialized modules, operating over a unified Glassmorphic UI with dynamic DOM raycasting.
A complete replacement for physical HID devices.
- Tri-State Logic Engine: Smoothly transitions between Hovering, Scrolling, and Clicking modes.
- Smart Page Scroll: Uses `document.elementFromPoint` raycasting to dynamically scroll whichever specific `div` or tab your virtual cursor is currently targeting.
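A hedged sketch of the Smart Page Scroll idea: raycast from the virtual cursor with `document.elementFromPoint`, then walk up the DOM to the first ancestor that can actually scroll. The function names and the overflow check are assumptions, not the project's real code:

```javascript
// Pure helper: find the nearest scrollable ancestor. Works on any object
// exposing scrollHeight/clientHeight/parentElement, so it is testable
// without a browser.
function findScrollTarget(el) {
  while (el) {
    if (el.scrollHeight > el.clientHeight) return el;
    el = el.parentElement;
  }
  return null;
}

// Browser-side glue (requires a DOM): raycast under the virtual cursor
// and scroll whatever element is actually there.
function scrollAtCursor(cursorX, cursorY, deltaY) {
  const hit = document.elementFromPoint(cursorX, cursorY);
  const target = findScrollTarget(hit);
  if (target) target.scrollTop += deltaY;
}
```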
A gesture-native media environment engineered for zero-friction consumption.
- Dual-Core Player Abstraction: Seamlessly routes between native HTML5 video and the YouTube IFrame API without dropping gesture states. Protected by a transparent "Ghost Layer" to prevent iframe event swallowing.
- Mobile-Spatial Key Mapping: Mimics smartphone UX in 3D space. Double-pinch on the left or right side of the air-space to instantly skip ±10 s.
- Kinetic Snapshots (Gemini 2.5): Pinch your left pinky to drop a timestamp bookmark. The OS calls the Gemini 2.5 Flash API to instantly generate context-aware summaries and plot insights for that exact moment.
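The double-pinch skip can be sketched as a mapping from normalized pinch position to a seek offset. The 0.5 midline and the 10-second step mirror the text; the function names and the clamping behavior are assumptions:

```javascript
// x is the pinch position in normalized [0, 1] camera coordinates:
// left half seeks backward, right half forward.
function skipOffsetFor(x, stepSeconds = 10) {
  return x < 0.5 ? -stepSeconds : stepSeconds;
}

// Browser-side glue: apply the offset to an HTML5 <video>, clamped
// so a skip near the edges cannot seek out of range.
function applySkip(video, x) {
  const t = video.currentTime + skipOffsetFor(x);
  video.currentTime = Math.min(Math.max(t, 0), video.duration);
}
```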
A dedicated neural trainer to teach Kinetic your own secret language.
- K=5 KNN Classifier: Train custom geometric hand poses directly in the browser.
- Distance Normalization: Automatically scales landmark geometry to ensure gestures work whether you are 1 foot or 10 feet away from the camera.
- Brain Export: Download your custom dataset as a lightweight `kinetic_brain.json` file to instantly upgrade the OS's vocabulary.
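A minimal sketch of the K=5 nearest-neighbour pose classifier with distance normalization. The scaling scheme (dividing all landmark offsets by the wrist-to-middle-fingertip span) is an assumption — any consistent reference distance gives the same camera-distance invariance — and the function names are illustrative:

```javascript
// Flatten {x, y} landmarks into a scale- and position-normalized vector.
// Index 0 is the wrist and index 12 the middle fingertip in MediaPipe's
// layout; using them as the reference span is an assumption here.
function normalizeLandmarks(landmarks) {
  const wrist = landmarks[0];
  const ref = landmarks[12];
  const span = Math.hypot(ref.x - wrist.x, ref.y - wrist.y) || 1;
  return landmarks.flatMap(p => [(p.x - wrist.x) / span, (p.y - wrist.y) / span]);
}

function euclidean(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += (a[i] - b[i]) ** 2;
  return Math.sqrt(s);
}

// Classify by majority vote among the k nearest training samples,
// where each sample is { label, vector }.
function knnClassify(samples, vector, k = 5) {
  const votes = {};
  samples
    .map(s => ({ label: s.label, d: euclidean(s.vector, vector) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k)
    .forEach(n => { votes[n.label] = (votes[n.label] || 0) + 1; });
  return Object.entries(votes).sort((a, b) => b[1] - a[1])[0][0];
}
```

Because normalization divides out the hand's on-screen size, the same pose produces (nearly) the same vector whether the hand fills the frame or is far from the camera.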
Converts biological sign language into visible text and synthesized audio.
- Real-Time ISL Translation: Maps geometric hand landmarks to an extensive sign language dictionary.
- Gemini Next-Word Prediction: Massively accelerates spatial conversations by predicting complete sentences from partial gesture inputs.
- Diegetic Waveform Visualizer: Real-time audio waveform rendering for visual feedback during speech synthesis.
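The waveform visualizer can be sketched with a standard Web Audio `AnalyserNode`, which exposes byte-domain samples (0-255, silence centered at 128) that are drawn as a canvas polyline. The sample-to-pixel mapping is factored out so it is testable without a browser; everything here is an illustrative use of the standard API, not the project's actual code:

```javascript
// Pure helper: map a byte-domain sample (0-255) to a canvas y coordinate
// (0 = top of canvas).
function sampleToY(sample, canvasHeight) {
  return (sample / 255) * canvasHeight;
}

// Browser-side loop (requires a DOM and an AudioContext wired so that
// `analyser` receives the speech-synthesis output).
function drawWaveform(analyser, canvas) {
  const ctx = canvas.getContext('2d');
  const data = new Uint8Array(analyser.fftSize);
  (function frame() {
    analyser.getByteTimeDomainData(data);
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.beginPath();
    data.forEach((s, i) => {
      const x = (i / data.length) * canvas.width;
      const y = sampleToY(s, canvas.height);
      i === 0 ? ctx.moveTo(x, y) : ctx.lineTo(x, y);
    });
    ctx.stroke();
    requestAnimationFrame(frame);
  })();
}
```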
Kinetic V3 abandons the standard "web" feel for a cinematic, Cyberpunk HUD aesthetic:
- Dynamic Canvas Ambilight: A hidden `1x1` canvas extracts the dominant RGB color of the active video frame every 500 ms, casting a dynamic, bleeding neon aura around the player.
- Diegetic Web Audio: Utilizes native `AudioContext` oscillators to generate synthesized, low-latency spatial feedback (synth hums for analog clutches, sharp blips for UI skips).
- Parallax Hologram Illusion: The video player subtly tilts (a perspective 3D transform) based on your hand's X/Y position, creating the illusion of a floating hologram.
- Color-Coded Cursor States: The virtual cursor physically morphs and shifts neon colors (Green=Click, Purple=Scrub, Orange=Scroll) for subconscious state-locking feedback.
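The Ambilight trick works because drawing a video frame into a `1x1` canvas makes the browser downsample the whole frame into a single averaged pixel, which can then be read back as the dominant color. The 500 ms interval comes from the text; the `box-shadow` glow target and function names are assumptions:

```javascript
// Pure helper: turn the first three bytes of an RGBA pixel into a CSS color.
function pixelToCss([r, g, b]) {
  return `rgb(${r}, ${g}, ${b})`;
}

// Browser-side glue (requires a DOM): sample the video every 500 ms and
// cast the averaged frame color as a neon aura around `glowEl`.
function startAmbilight(video, glowEl) {
  const canvas = document.createElement('canvas');
  canvas.width = canvas.height = 1;
  const ctx = canvas.getContext('2d', { willReadFrequently: true });
  setInterval(() => {
    ctx.drawImage(video, 0, 0, 1, 1); // browser averages the frame into 1 pixel
    const pixel = ctx.getImageData(0, 0, 1, 1).data;
    glowEl.style.boxShadow = `0 0 80px 20px ${pixelToCss(pixel)}`;
  }, 500);
}
```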
- Low-Latency WebAssembly: The core MediaPipe tracking models run locally on your machine's CPU/GPU via WASM, with no network round-trip in the gesture loop.
- 100% Private (Edge AI): Your camera feed never leaves your device. No video frames are ever transmitted to the cloud. What happens on your machine stays on your machine.
Because Kinetic Tube utilizes local DOM reading and Canvas extraction, running it via the `file://` protocol will trigger browser CORS (Cross-Origin Resource Sharing) security blocks for YouTube links.
You must run Kinetic OS over a local server:
- Clone or download this repository.
- Open your terminal in the project directory.
- Start a local server:
  - Python: `python -m http.server 8080`
  - Node.js: `npx http-server`
- Open your browser and navigate to `http://localhost:8080/index.html`.
- Allow Camera Access when prompted.
- Initiate the Terminal Boot Sequence and enter the experience.
Architected and Designed by Team CodeDriven.