
Gesture Control

A real-time hand gesture recognition system that uses machine learning to control a ball on screen. Built with Next.js, TensorFlow.js, and MediaPipe.

Screenshots

Game Screen

Manage Gestures

Features

  • Custom Gesture Training: Train your own hand gestures (UP, DOWN, LEFT, RIGHT, FREEZE)
  • Real-time Recognition: Low-latency gesture detection using TensorFlow.js
  • ML Model: Neural network with enhanced feature extraction (directional angles, displacement vectors)
  • Performance Optimized: 10fps video processing, 400ms prediction throttle, model warm-up
  • Data Persistence: Auto-save training data to localStorage and server
  • Data Optimization: Reduce training samples for better performance
  • Clean UI: Minimal developer-friendly design

Tech Stack

  • Frontend: Next.js 16 (App Router), React 19, TypeScript
  • ML/AI: TensorFlow.js, MediaPipe Hand Landmarker
  • Styling: CSS with custom properties
  • Runtime: Bun (Node.js compatible)

Installation

# Clone the repository
git clone https://github.com/yourusername/gesture-works.git
cd gesture-works

# Install dependencies
bun install

# Run development server
bun run dev

Open http://localhost:3000 in your browser.

Usage

Playing the Game

  1. Visit the homepage - the game loads immediately
  2. Use your trained gestures to control the ball:
    • UP (↑): Move ball upward
    • DOWN (↓): Move ball downward
    • LEFT (←): Move ball left
    • RIGHT (→): Move ball right
    • FREEZE (■): Stop ball movement
  3. Ball wraps around screen edges
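
The wrap-around in step 3 can be a simple modulo on the ball position. A minimal sketch (the actual code lives in components/GameScreen.tsx; the canvas size and variable names here are only illustrative):

// Illustrative wrap-around update; GameScreen.tsx may structure this differently
const CANVAS_WIDTH = 800;  // assumed canvas width for this example
const BALL_SPEED = 5;

// Wrap a coordinate so the ball re-enters from the opposite edge
function wrap(value: number, max: number): number {
  return ((value % max) + max) % max;  // also handles negative values off the left/top edge
}

// Example: moving LEFT past the edge wraps to the right side
let x = 2;
x = wrap(x - BALL_SPEED, CANVAS_WIDTH);  // 2 - 5 = -3  ->  797
console.log(x);  // 797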

Training Custom Gestures

  1. Navigate to /manage or click "Manage Gestures" in the header
  2. Click a gesture button (UP, DOWN, LEFT, RIGHT, or FREEZE)
  3. Perform your custom gesture in front of the camera
  4. System captures 15 samples automatically
  5. Repeat for all 5 gestures
  6. Training data auto-saves to both localStorage and server
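
The auto-save in step 6 writes the same payload to two places. A hedged sketch (the real logic lives in hooks/useGestureTraining.ts; the localStorage key and helper name are made up for illustration):

// Hypothetical auto-save helper; key name and payload shape are illustrative
type GestureName = 'UP' | 'DOWN' | 'LEFT' | 'RIGHT' | 'FREEZE';
type TrainingData = Record<GestureName, number[][]>;

async function persistTrainingData(data: TrainingData): Promise<void> {
  // Local copy survives page reloads even if the server call fails
  localStorage.setItem('gesture-training-data', JSON.stringify(data));

  // Server copy ends up in public/default-gestures.json via the API route
  await fetch('/api/save-gestures', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(data),
  });
}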

Quick Start Gestures:

  • Point index finger in direction for UP/DOWN/LEFT/RIGHT
  • Open palm (all fingers) for FREEZE

Managing Training Data

Optimize Data: Reduces training samples from 30 to 15 per gesture for better performance

Reset Training: Deletes all training data to start fresh

Architecture

Project Structure

gesture-works/
├── app/
│   ├── api/
│   │   └── save-gestures/
│   │       └── route.ts          # Server-side JSON persistence
│   ├── manage/
│   │   └── page.tsx               # Training screen route
│   ├── components.css             # Component styles
│   ├── globals.css                # Global styles
│   ├── layout.tsx                 # Root layout
│   └── page.tsx                   # Game screen (landing)
├── components/
│   ├── GameScreen.tsx             # Ball control game
│   ├── GestureSnakeGame.tsx       # (deprecated)
│   └── TrainingScreen.tsx         # Gesture training UI
├── hooks/
│   ├── useGestureTraining.ts      # ML model & training logic
│   └── useHandTracking.ts         # MediaPipe hand detection
├── public/
│   ├── default-gestures.json      # Persisted training data
│   └── mediapipe/                 # MediaPipe WASM files
├── next.config.js
├── package.json
└── tsconfig.json

ML Model Architecture

Feature Extraction:

  • 21 hand landmarks (x, y, z coordinates) = 63 base features
  • Enhanced with directional angles (sin/cos components)
  • Displacement vectors from palm center
  • Total: ~80 features per sample
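
A rough sketch of that feature layout, assuming MediaPipe's 21-landmark hand model (index 0 is the wrist, 4/8/12/16/20 are the fingertips); the exact features in hooks/useGestureTraining.ts may differ:

// Illustrative feature extraction; ordering and scaling may differ from the real hook
interface Landmark { x: number; y: number; z: number }

function extractFeatures(landmarks: Landmark[]): number[] {
  // 21 landmarks * 3 coordinates = 63 base features
  const base = landmarks.flatMap((p) => [p.x, p.y, p.z]);

  // Palm center approximated from the wrist (0) and finger MCP joints (5, 9, 13, 17)
  const palmIds = [0, 5, 9, 13, 17];
  const cx = palmIds.reduce((s, i) => s + landmarks[i].x, 0) / palmIds.length;
  const cy = palmIds.reduce((s, i) => s + landmarks[i].y, 0) / palmIds.length;

  // Displacement vector and directional angle for each fingertip
  const tips = [4, 8, 12, 16, 20];
  const extra: number[] = [];
  for (const i of tips) {
    const dx = landmarks[i].x - cx;
    const dy = landmarks[i].y - cy;
    const angle = Math.atan2(dy, dx);
    extra.push(dx, dy, Math.sin(angle), Math.cos(angle));
  }

  return [...base, ...extra];  // 63 + 20 = 83, roughly the "~80" noted above
}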

Neural Network:

  • Input layer: 63+ features
  • Hidden layer 1: 64 units (ReLU)
  • Hidden layer 2: 32 units (ReLU)
  • Output layer: 5 units (Softmax)
  • Training: 30 epochs, Adam optimizer
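
The same architecture expressed with the TensorFlow.js layers API; this is a sketch that mirrors the numbers above, not necessarily the exact code in hooks/useGestureTraining.ts:

import * as tf from '@tensorflow/tfjs';

function buildModel(numFeatures: number): tf.Sequential {
  const model = tf.sequential();
  model.add(tf.layers.dense({ inputShape: [numFeatures], units: 64, activation: 'relu' }));
  model.add(tf.layers.dense({ units: 32, activation: 'relu' }));
  model.add(tf.layers.dense({ units: 5, activation: 'softmax' }));  // one unit per gesture
  model.compile({
    optimizer: tf.train.adam(),
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy'],
  });
  return model;
}

// Training call matching "30 epochs, Adam optimizer":
// await model.fit(xs, ys, { epochs: 30, shuffle: true });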

Performance Optimizations:

  • Model warm-up with 3 dummy predictions (WebGL shader compilation)
  • Prediction throttle: 400ms (2.5 predictions/sec)
  • Video rendering: 10fps (every 6th frame)
  • Confidence threshold: 0.3 (30%)
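
The 400ms throttle amounts to skipping predictions until enough time has passed. A minimal sketch (names are illustrative; the real hook may use refs and requestAnimationFrame):

const PREDICTION_THROTTLE_MS = 400;
let lastPrediction = 0;

function maybePredict(now: number, predict: () => void): void {
  if (now - lastPrediction >= PREDICTION_THROTTLE_MS) {
    lastPrediction = now;
    predict();  // at most ~2.5 predictions per second
  }
}

// Usage inside the detection loop (handler names are hypothetical):
// maybePredict(performance.now(), () => runGestureModel(latestLandmarks));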

API Endpoints

POST /api/save-gestures

Saves training data to the server.

Request Body:

{
  "UP": [[...], [...], ...],
  "DOWN": [[...], [...], ...],
  "LEFT": [[...], [...], ...],
  "RIGHT": [[...], [...], ...],
  "FREEZE": [[...], [...], ...]
}

Response:

{
  "success": true,
  "message": "Training data saved successfully"
}

GET /api/save-gestures

Loads training data from the server.

Response:

{
  "UP": [[...], [...], ...],
  "DOWN": [...],
  ...
}
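
A hedged sketch of what app/api/save-gestures/route.ts does; per the project structure it persists the JSON to public/default-gestures.json, but the real handler may add validation and error handling:

import { promises as fs } from 'fs';
import path from 'path';
import { NextResponse } from 'next/server';

const DATA_FILE = path.join(process.cwd(), 'public', 'default-gestures.json');

export async function POST(request: Request) {
  const data = await request.json();
  await fs.writeFile(DATA_FILE, JSON.stringify(data, null, 2));
  return NextResponse.json({ success: true, message: 'Training data saved successfully' });
}

export async function GET() {
  const raw = await fs.readFile(DATA_FILE, 'utf-8');
  return NextResponse.json(JSON.parse(raw));
}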

Performance Notes

Initial Slowness

The first 1-2 seconds after loading may be slow due to:

  • WebGL shader compilation in TensorFlow.js
  • MediaPipe model initialization

Solution: the model warm-up runs 3 dummy predictions during initialization so the WebGL shaders are compiled before the first real prediction.
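
A sketch of such a warm-up, assuming a standard TensorFlow.js LayersModel and a known feature count:

import * as tf from '@tensorflow/tfjs';

async function warmUpModel(model: tf.LayersModel, numFeatures: number): Promise<void> {
  for (let i = 0; i < 3; i++) {
    const dummy = tf.zeros([1, numFeatures]);
    const result = model.predict(dummy) as tf.Tensor;
    await result.data();  // force execution so the WebGL shaders get compiled
    dummy.dispose();
    result.dispose();     // free the GPU memory used by the dummy tensors
  }
}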

Lag Prevention

  • Throttled predictions (400ms intervals)
  • Reduced video frame rate (10fps)
  • Cached canvas gradients
  • Disabled alpha channel on canvas
  • Sample rate logging (10% of predictions)
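
The canvas items above boil down to creating an opaque context and building gradients once, outside the draw loop. An illustrative sketch (colors and radius are placeholders, not the actual values in GameScreen.tsx):

// Opaque context skips alpha blending; the gradient is created once and reused every frame
const canvas = document.querySelector('canvas')!;
const ctx = canvas.getContext('2d', { alpha: false })!;

const ballGradient = ctx.createRadialGradient(0, 0, 0, 0, 0, 20);
ballGradient.addColorStop(0, '#4f8cff');
ballGradient.addColorStop(1, '#1a3d7c');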

Data Optimization

If default-gestures.json becomes large (>10,000 lines):

  1. Go to /manage
  2. Click "Optimize Data"
  3. Reduces samples from 30 to 15 per gesture
  4. Maintains accuracy while improving performance
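
Conceptually, "Optimize Data" is a downsampling pass over each gesture's samples. An illustrative version (the real implementation may select samples differently):

// Keep an evenly spaced subset of samples, e.g. 30 -> 15
function optimizeSamples(samples: number[][], target = 15): number[][] {
  if (samples.length <= target) return samples;
  const step = samples.length / target;
  return Array.from({ length: target }, (_, i) => samples[Math.floor(i * step)]);
}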

Browser Compatibility

  • Chrome/Edge: Full support (recommended)
  • Firefox: Full support
  • Safari: Requires MediaPipe WASM polyfill

Requirements:

  • WebGL 2.0 support
  • Camera access permission
  • Modern ES6+ JavaScript support

Configuration

Adjust Performance Settings

Edit hooks/useGestureTraining.ts:

const SAMPLES_PER_GESTURE = 15;        // Training samples
const EPOCHS = 30;                     // Training epochs
const PREDICTION_THROTTLE_MS = 400;    // Prediction interval

Edit components/GameScreen.tsx:

const BALL_SPEED = 5;                  // Ball movement speed
const PREDICTION_THROTTLE_MS = 400;    // Gesture detection rate

Customize Gestures

Edit hooks/useGestureTraining.ts:

const GESTURES = ['UP', 'DOWN', 'LEFT', 'RIGHT', 'FREEZE'];

Add new gesture recognition in components/GameScreen.tsx.
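
A hypothetical example of wiring an extra gesture into the game loop; 'BOOST' and applyGesture are invented for illustration:

// Extend the gesture list and handle the new label in GameScreen.tsx
const GESTURES = ['UP', 'DOWN', 'LEFT', 'RIGHT', 'FREEZE', 'BOOST'] as const;
type Gesture = (typeof GESTURES)[number];

function applyGesture(gesture: Gesture, velocity: { x: number; y: number }): void {
  switch (gesture) {
    case 'UP':     velocity.y = -1; break;
    case 'DOWN':   velocity.y = 1;  break;
    case 'LEFT':   velocity.x = -1; break;
    case 'RIGHT':  velocity.x = 1;  break;
    case 'FREEZE': velocity.x = 0;  velocity.y = 0; break;
    case 'BOOST':  velocity.x *= 2; velocity.y *= 2; break;  // new gesture behavior
  }
}

Note that the output layer (5 softmax units) and the training data must also be extended to cover any new gesture.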

Development

Commands

bun run dev          # Start development server
bun run build        # Build for production
bun run start        # Start production server
bun run lint         # Run ESLint

Debug Mode

Enable debug logging in hooks/useGestureTraining.ts:

// Increase sample rate from 10% to 100%
if (Math.random() < 1.0) {  // Change from 0.1
  console.log('🎯 Predictions:', predictions);
}

Troubleshooting

Camera Not Working

  • Grant camera permissions in browser settings
  • Check if another app is using the camera
  • Try Chrome/Edge for best compatibility

Gestures Not Recognized

  • Ensure all 5 gestures are trained (15 samples each)
  • Check camera has good lighting
  • Position hand clearly in frame
  • Try retraining gestures with exaggerated movements

Performance Issues

  • Click "Optimize Data" in /manage
  • Close other browser tabs
  • Check GPU acceleration is enabled in browser
  • Reduce BALL_SPEED in GameScreen.tsx

Model Not Loading

  • Check browser console for errors
  • Verify MediaPipe files exist in /public/mediapipe/
  • Clear browser cache and reload

License

MIT

Credits

  • MediaPipe by Google
  • TensorFlow.js by Google
  • Next.js by Vercel

Built with ❤️ using Next.js and TensorFlow.js
