A real-time American Sign Language (ASL) alphabet recognition system using computer vision and deep learning. The system uses MediaPipe for hand landmark detection and a custom neural network for gesture classification, achieving 98.46% accuracy on test data.
- Real-time ASL recognition - Recognizes 26 ASL alphabet letters from live webcam input
- High accuracy - 98.46% test accuracy with robust hand landmark extraction
- Efficient processing - Processes video at 15-20 FPS with sub-100ms latency
- Confidence scores - Displays prediction confidence for each letter
- Visual feedback - Shows hand landmarks and predictions overlaid on video
- Python 3.x
- PyTorch - Neural network framework
- MediaPipe - Hand landmark detection
- OpenCV - Webcam capture and image processing
- NumPy - Data processing
- scikit-learn - Data splitting
- Dataset: 78,000+ ASL alphabet images
- Training samples: 60,576 (after quality filtering)
- Test accuracy: 98.46%
- Validation accuracy: 98.39%
- Model architecture: 3-layer feedforward neural network (63 → 128 → 64 → 26)
- Clone the repository

```bash
git clone https://github.com/Aassi1/asl-translator.git
cd asl-translator
```

- Create a virtual environment

```bash
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Mac/Linux
```

- Install dependencies

```bash
pip install -r requirements.txt
```

- Train the model

```bash
python src/train_model.py
```

This will:
- Load and preprocess the dataset
- Train the neural network for 50 epochs
- Save the trained model to `models/asl_model.pth`
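The training recipe (50 epochs, saving weights at the end) can be sketched as below. The synthetic `X`/`y` tensors stand in for the real preprocessed landmarks and labels, and the inline `nn.Sequential` is illustrative, not the repo's actual training code:

```python
import torch
from torch import nn

# Stand-ins for the real data: 63-dim landmark vectors, 26 letter classes.
X = torch.randn(256, 63)
y = torch.randint(0, 26, (256,))

# Illustrative model matching the documented 63 -> 128 -> 64 -> 26 layout.
model = nn.Sequential(
    nn.Linear(63, 128), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(64, 26),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(50):
    optimizer.zero_grad()
    loss = criterion(model(X), y)   # cross-entropy on raw logits
    loss.backward()
    optimizer.step()

torch.save(model.state_dict(), "asl_model.pth")  # weights only
```

Saving `state_dict()` rather than the whole module keeps the checkpoint portable across code changes.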
- Run the real-time demo

```bash
python src/realtime_inference.py
```

- Show your hand to the webcam and make ASL letters
- The predicted letter and confidence score will appear on screen
- Press 'q' to quit
```
asl-translator/
├── data/
│   ├── asl_alphabet_train/    # Training dataset
│   ├── landmarks.npy          # Preprocessed hand landmarks
│   └── labels.npy             # Corresponding labels
├── models/
│   └── asl_model.pth          # Trained model weights
├── src/
│   ├── preprocess_dataset.py  # Dataset preprocessing pipeline
│   ├── train_model.py         # Model training script
│   └── realtime_inference.py  # Real-time webcam demo
├── notebooks/                 # Exploration notebooks
├── requirements.txt
└── README.md
```
- Hand Detection: MediaPipe detects hands in webcam frames and extracts 21 landmark points
- Feature Extraction: Landmarks are converted to 63-dimensional vectors (21 points × 3 coordinates)
- Classification: Neural network predicts which of 26 letters is being signed
- Display: Prediction and confidence score are shown on the video feed
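The feature-extraction step above can be sketched in plain NumPy. The wrist-relative centering shown here is a common normalization and an assumption on my part, not necessarily what `preprocess_dataset.py` does:

```python
import numpy as np

def landmarks_to_features(landmarks: np.ndarray) -> np.ndarray:
    """Flatten 21 hand landmarks (x, y, z) into a 63-dim feature vector.

    `landmarks` is a (21, 3) array, one row per MediaPipe landmark.
    Subtracting the wrist landmark (index 0) is an assumed normalization.
    """
    assert landmarks.shape == (21, 3)
    centered = landmarks - landmarks[0]           # wrist-relative coordinates
    return centered.flatten().astype(np.float32)  # shape (63,)

# Example with random stand-in landmarks
features = landmarks_to_features(np.random.rand(21, 3))
```

A translation-normalized vector like this makes the classifier insensitive to where the hand sits in the frame.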
- Input layer: 63 features (hand landmark coordinates)
- Hidden layer 1: 128 neurons + ReLU + Dropout(0.3)
- Hidden layer 2: 64 neurons + ReLU + Dropout(0.3)
- Output layer: 26 neurons (one per letter)
- Loss function: Cross-entropy
- Optimizer: Adam (lr=0.001)
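The architecture listed above maps directly onto a small PyTorch module. This is a sketch matching the documented layers; the class name `ASLClassifier` is illustrative, not the repo's actual class:

```python
import torch
from torch import nn

class ASLClassifier(nn.Module):
    """Feedforward network: 63 -> 128 -> 64 -> 26 (illustrative name)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(63, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(64, 26),   # raw logits, one per letter
        )

    def forward(self, x):
        return self.net(x)

model = ASLClassifier().eval()           # eval() disables dropout at inference
logits = model(torch.randn(1, 63))       # one 63-dim landmark vector
probs = torch.softmax(logits, dim=1)     # confidence scores shown on screen
```

The output layer emits raw logits because `nn.CrossEntropyLoss` applies the softmax internally during training; softmax is only needed at inference time to display confidences.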
- Expand to recognize full words and phrases (dynamic signs)
- Add support for numbers and common words
- Implement sentence formation with word spacing detection
- Deploy as web application
- Support for multiple sign languages
See requirements.txt for full dependencies. Main requirements:
- torch>=2.0.0
- opencv-python>=4.8.0
- mediapipe>=0.10.0
- numpy>=1.24.0
- scikit-learn>=1.3.0
MIT License
- ASL Alphabet dataset from Kaggle
- MediaPipe by Google for hand tracking
- PyTorch for deep learning framework