Developer-Amar/Project-Kinetic

🌐 KINETIC OS | V3 Master Build | Submission for Google Solution Challenge 2026

The Edge-AI Spatial Computing Interface

Kinetic OS is a privacy-first, zero-hardware spatial operating system. By bridging the gap between biological intent and digital action, Kinetic transforms any standard 2D webcam into a high-fidelity, zero-latency gesture recognition engine. No physical mouse. No drivers. Just pure spatial mathematics.

Built by Team CodeDriven | 2026


🚀 The Engineering Breakthroughs

Most webcam-based gesture trackers suffer from the "Midas Touch" (accidental clicks) and hand-drift. Kinetic OS solves this through advanced Human-Computer Interaction (HCI) mathematics:

  • Hysteresis Thresholding: Implements a strict mathematical "dead zone" for all pinch gestures. A gesture engages when the normalized fingertip distance drops below 0.04 and won't release until it exceeds 0.06, absorbing mid-air hand tremor and eliminating UI flickering.
  • Vector Axis-Locking (The Analog Clutch): Isolates the X and Y coordinate deltas the moment a clutch engages. If you are scrubbing a timeline horizontally, vertical hand drift is mathematically ignored. The "two gears, one clutch" collision is eradicated.
  • Spatial Array Hand-Sorting: Bypasses the tracking model's unreliable native handedness classification. By sorting the detected-hand array on X-axis coordinates, the engine stays crash-proof against crossed arms and extreme camera angles.
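The three techniques above can be sketched roughly as follows. The 0.04 / 0.06 thresholds come from the text; class, function, and parameter names are illustrative, not the project's actual API:

```javascript
// Hysteresis thresholding: engage below 0.04, release only above 0.06.
// The gap between the two thresholds absorbs hand tremor, so the gesture
// state never flickers at the boundary.
class PinchGate {
  constructor(engage = 0.04, release = 0.06) {
    this.engage = engage;
    this.release = release;
    this.active = false;
  }
  update(distance) {
    if (!this.active && distance < this.engage) this.active = true;
    else if (this.active && distance > this.release) this.active = false;
    return this.active;
  }
}

// Vector axis-locking: once a clutch engages, keep only the dominant axis
// so drift on the other axis is ignored.
function lockAxis(dx, dy) {
  return Math.abs(dx) >= Math.abs(dy) ? { dx, dy: 0 } : { dx: 0, dy };
}

// Spatial hand-sorting: order detected hands left-to-right by the X of
// their first landmark instead of trusting the model's handedness label.
function sortHands(hands) {
  return [...hands].sort((a, b) => a[0].x - b[0].x);
}
```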

🧠 The Kinetic Ecosystem: Four Core Engines

Kinetic OS is divided into four highly specialized modules, operating over a unified Glassmorphic UI with dynamic DOM raycasting.

1. Kinetic Interact (The Geometric Mouse)

A complete replacement for physical HID devices.

  • Tri-State Logic Engine: Smoothly transitions between Hovering, Scrolling, and Clicking modes.
  • Smart Page Scroll: Uses document.elementFromPoint raycasting to dynamically scroll whichever div or tab your virtual cursor is currently targeting.
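A minimal sketch of how these two pieces could fit together. The pinch-to-mode mapping and all names here are assumptions for illustration; only the use of `document.elementFromPoint` is taken from the text:

```javascript
// Tri-state resolver (assumed mapping): which pinches are engaged
// determines whether the cursor hovers, clicks, or scrolls.
const MODES = { HOVER: 'hover', CLICK: 'click', SCROLL: 'scroll' };

function resolveMode(indexPinch, middlePinch) {
  if (indexPinch && middlePinch) return MODES.SCROLL;
  if (indexPinch) return MODES.CLICK;
  return MODES.HOVER;
}

// Smart scroll (browser-only): raycast the cursor position to the element
// under it, then scroll that element rather than the whole page.
function scrollUnderCursor(x, y, deltaY) {
  const target = document.elementFromPoint(x, y);
  if (target) target.scrollBy({ top: deltaY, behavior: 'smooth' });
}
```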

2. Kinetic Tube (Spatial Media Hub)

A gesture-native media environment engineered for zero-friction consumption.

  • Dual-Core Player Abstraction: Seamlessly routes between native HTML5 video and the YouTube IFrame API without dropping gesture states. Protected by a transparent "Ghost Layer" to prevent iframe event swallowing.
  • Mobile-Spatial Key Mapping: Mimics smartphone UX in 3D space. Double-pinch the left/right side of the air-space to instantly skip +/- 10s.
  • Kinetic Snapshots (Gemini 2.5): Pinch your left pinky to drop a timestamp bookmark. The OS calls the Gemini 2.5 Flash API to instantly generate context-aware summaries and plot insights of that exact moment.
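The player abstraction and the spatial skip mapping can be sketched like this. The YouTube method names (`getCurrentTime`, `seekTo`) are from the public YouTube IFrame API; the facade shape and the left/right split logic are illustrative assumptions:

```javascript
// Dual-core player facade: gesture handlers call one object, which routes
// to either a native HTML5 <video> element or a YouTube IFrame player.
class DualCorePlayer {
  constructor(backend) {
    this.backend = backend; // { kind: 'html5', el } or { kind: 'youtube', yt }
  }
  seekBy(seconds) {
    if (this.backend.kind === 'html5') {
      this.backend.el.currentTime += seconds;
    } else {
      const t = this.backend.yt.getCurrentTime();
      this.backend.yt.seekTo(t + seconds, true);
    }
  }
}

// Mobile-spatial key mapping: a double-pinch on the left half of the
// air-space rewinds 10s, on the right half skips ahead 10s.
function seekOffsetFor(x, frameWidth) {
  return x < frameWidth / 2 ? -10 : 10;
}
```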

3. Kinetic Train (The Architecture Forge)

A dedicated neural trainer to teach Kinetic your own secret language.

  • K=5 KNN Classifier: Train custom geometric hand poses directly in the browser.
  • Distance Normalization: Automatically scales landmark geometry to ensure gestures work whether you are 1 foot or 10 feet away from the camera.
  • Brain Export: Download your custom dataset as a lightweight kinetic_brain.json file to instantly upgrade the OS's vocabulary.
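A compact sketch of the trainer's classification path: normalize landmark geometry so camera distance cancels out, then run a K=5 nearest-neighbour vote. K=5 and the normalization goal are from the text; the choice of the first landmark (wrist) as origin and the exact scaling reference are assumptions:

```javascript
// Distance normalization: translate so the wrist (landmark 0) is the
// origin, then divide by the largest wrist-to-landmark distance, making
// the pose scale-invariant.
function normalize(landmarks) {
  const [wrist] = landmarks;
  const shifted = landmarks.map(p => ({ x: p.x - wrist.x, y: p.y - wrist.y }));
  const scale = Math.max(...shifted.map(p => Math.hypot(p.x, p.y))) || 1;
  return shifted.map(p => ({ x: p.x / scale, y: p.y / scale }));
}

// K-nearest-neighbour classifier: squared distance in normalized landmark
// space, majority vote among the k closest training samples.
function classify(sample, dataset, k = 5) {
  const flat = pts => pts.flatMap(p => [p.x, p.y]);
  const s = flat(normalize(sample));
  const nearest = dataset
    .map(({ label, landmarks }) => ({
      label,
      d: flat(normalize(landmarks)).reduce((sum, v, i) => sum + (v - s[i]) ** 2, 0),
    }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k);
  const votes = {};
  for (const { label } of nearest) votes[label] = (votes[label] || 0) + 1;
  return Object.entries(votes).sort((a, b) => b[1] - a[1])[0][0];
}
```

Because every sample is normalized before comparison, the same pose recorded close to the camera and far from it flattens to the same feature vector, which is what lets one `kinetic_brain.json` work at any distance.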

4. Kinetic Speak (Neural Communicator)

Converts sign language gestures into on-screen text and synthesized audio.

  • Real-Time ISL Translation: Maps geometric hand landmarks to an extensive sign language dictionary.
  • Gemini Next-Word Prediction: Massively accelerates spatial conversations by predicting complete sentences from partial gesture inputs.
  • Diegetic Waveform Visualizer: Real-time audio waveform rendering for visual feedback during speech synthesis.

🎨 Premium UI / UX Architecture

Kinetic V3 abandons the standard "web" feel for a cinematic, Cyberpunk HUD aesthetic:

  • Dynamic Canvas Ambilight: A hidden 1x1 canvas samples the active video frame every 500ms and extracts its dominant RGB color, casting a dynamic, bleeding neon aura around the player.
  • Diegetic Web Audio: Utilizes native AudioContext oscillators to generate synthesized, low-latency spatial feedback (synth hums for analog clutches, sharp blips for UI skips).
  • Parallax Hologram Illusion: The video player subtly tilts (perspective 3D transform) based on your hand's X/Y position, creating the illusion of a floating hologram.
  • Color-Coded Cursor States: The virtual cursor physically morphs and shifts neon colors (Green=Click, Purple=Scrub, Orange=Scroll) for subconscious state-locking feedback.
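The Ambilight and parallax math reduce to a couple of small pure functions. In the app the pixel data would come from the hidden canvas via `getImageData()`; here the sketch operates directly on a raw RGBA byte array, and the tilt range (±6°) is an illustrative assumption:

```javascript
// Ambilight extraction: average the RGBA pixels of a (downscaled) frame
// to get the dominant color for the neon aura.
function dominantColor(rgba) {
  let r = 0, g = 0, b = 0;
  const pixels = rgba.length / 4;
  for (let i = 0; i < rgba.length; i += 4) {
    r += rgba[i]; g += rgba[i + 1]; b += rgba[i + 2];
  }
  const avg = v => Math.round(v / pixels);
  return `rgb(${avg(r)}, ${avg(g)}, ${avg(b)})`;
}

// Parallax tilt: map the hand's position inside the frame to small
// rotateX/rotateY angles around the player's center.
function parallaxTilt(x, y, width, height, maxDeg = 6) {
  return {
    ry: (x / width - 0.5) * 2 * maxDeg,   // left/right hand -> yaw
    rx: (0.5 - y / height) * 2 * maxDeg,  // up/down hand -> pitch
  };
}
```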

⚙️ Privacy & Performance

  • Zero Latency WebAssembly: The core MediaPipe tracking models run locally on your machine's CPU/GPU via WASM.
  • 100% Private (Edge AI): Your camera feed never leaves your device. No video frames are ever transmitted to the cloud. What happens on your machine stays on your machine.

🛠️ Installation & Booting the OS

Because Kinetic Tube utilizes local DOM reading and Canvas extraction, running it via the file:// protocol will trigger browser CORS (Cross-Origin Resource Sharing) security blocks for YouTube links.

You must run Kinetic OS over a local server:

  1. Clone or download this repository.
  2. Open your terminal in the project directory.
  3. Start a local server:
    • Python: python -m http.server 8080
    • Node.js: npx http-server
  4. Open your browser and navigate to http://localhost:8080/index.html.
  5. Allow Camera Access when prompted.
  6. Initiate the Terminal Boot Sequence and enter the experience.

Architected and Designed by Team CodeDriven.
