This project implements a real-time sign language detection system that uses Mediapipe for landmark detection and an LSTM-based deep learning model trained to recognize hand gestures.
It supports collecting a custom dataset, training and evaluating the model, and running real-time predictions via webcam.
## Project Structure

```
.
├── collect.py   # Collect training data from webcam (saves keypoints)
├── config.py    # Configuration (actions, paths, sequence settings)
├── evaluate.py  # Evaluate trained model performance
├── model.py     # Model definition and training logic
├── run.py       # Run real-time sign language detection via webcam
├── utils.py     # Utility functions for preprocessing and visualization
├── MP_DATA/     # Auto-generated dataset directory
├── model/       # Saved trained model files
└── Logs/        # TensorBoard logs
```
## Features

- **Dataset Collection**: collects webcam input, extracts Mediapipe landmarks (pose, face, hands), and saves the keypoints as structured `.npy` files (see the extraction sketch after this list).
- **Configurable Actions**: default gestures are `hello`, `thanks`, and `iloveyou`; you can add more in `config.py`.
- **Deep Learning Model (LSTM)**: a stacked LSTM network is trained on landmark sequences to classify gestures.
- **Evaluation**: generates confusion matrices and accuracy scores for trained models.
- **Real-Time Inference**: recognizes signs live from the webcam, overlays predictions, and displays probability bars.
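Here is a minimal sketch of the keypoint extraction step, assuming the project flattens MediaPipe Holistic landmarks into one fixed-length vector per frame. The `extract_keypoints` name and the exact layout are illustrative, derived from the pose/face/hands description above rather than copied from `utils.py`:

```python
import cv2
import mediapipe as mp
import numpy as np

mp_holistic = mp.solutions.holistic

def extract_keypoints(results):
    """Flatten pose, face, and hand landmarks into one vector.

    Missing detections become zeros, so every frame has the same length:
    33*4 + 468*3 + 21*3 + 21*3 = 1662 values.
    """
    pose = (np.array([[p.x, p.y, p.z, p.visibility]
                      for p in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[p.x, p.y, p.z]
                      for p in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    left = (np.array([[p.x, p.y, p.z]
                      for p in results.left_hand_landmarks.landmark]).flatten()
            if results.left_hand_landmarks else np.zeros(21 * 3))
    right = (np.array([[p.x, p.y, p.z]
                       for p in results.right_hand_landmarks.landmark]).flatten()
             if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, left, right])

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as holistic:
    ok, frame = cap.read()
    if ok:
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        keypoints = extract_keypoints(results)   # shape: (1662,)
cap.release()
```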
## Getting Started

**1. Clone the repo**

```bash
git clone https://github.com/yourusername/sign-language-detection.git
cd sign-language-detection
```

**2. Install dependencies**

```bash
pip install -r requirements.txt
```
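The contents of `requirements.txt` are not reproduced here, but given the Requirements section below it likely resolves to something along these lines (package choices and pins are assumptions):

```text
tensorflow>=2.0
mediapipe
opencv-python
numpy
scikit-learn   # assumed for evaluate.py's confusion matrices and accuracy scores
```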
**3. Collect training data**

```bash
python collect.py
```

This creates a dataset under `MP_DATA/` with your defined actions.
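With the default settings (30 sequences × 30 frames per action, per "How It Works" below), the generated layout should look roughly like this:

```
MP_DATA/
├── hello/
│   ├── 0/            # one folder per sequence (0–29)
│   │   ├── 0.npy     # one keypoint file per frame (0–29)
│   │   └── ...
│   └── ...
├── thanks/
└── iloveyou/
```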
**4. Train (or load) the model**

```bash
python model.py
```

If a trained model already exists at `model/sign_language_detection_model.keras`, it is loaded; otherwise a new model is trained and saved.
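A minimal sketch of that load-or-train flow, assuming standard Keras calls; the `X.npy`/`y.npy` file names and the epoch count are hypothetical stand-ins, not the project's actual code:

```python
import os
import numpy as np
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.models import load_model

from model import build_model   # the project's architecture factory (see Customization)

MODEL_PATH = "model/sign_language_detection_model.keras"

if os.path.exists(MODEL_PATH):
    model = load_model(MODEL_PATH)                      # reuse the saved model
else:
    X = np.load("X.npy")                                # keypoint sequences built from MP_DATA/
    y = np.load("y.npy")                                # matching one-hot action labels
    model = build_model()
    model.fit(X, y, epochs=200,
              callbacks=[TensorBoard(log_dir="Logs")])  # feeds the Logs/ directory
    model.save(MODEL_PATH)
```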
**5. Check model performance**

```bash
python evaluate.py
```
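The metrics it reports amount to something like the following scikit-learn sketch (the script's actual code and variable names may differ):

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from tensorflow.keras.models import load_model

model = load_model("model/sign_language_detection_model.keras")
X_test = np.load("X_test.npy")   # held-out keypoint sequences (hypothetical file name)
y_test = np.load("y_test.npy")   # matching one-hot labels (hypothetical file name)

y_pred = np.argmax(model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)

print(confusion_matrix(y_true, y_pred))            # rows: true action, cols: predicted
print("accuracy:", accuracy_score(y_true, y_pred))
```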
**6. Run the live system via webcam**

```bash
python run.py
```

Press `q` to quit the video feed.
## How It Works

- Actions defined: `['hello', 'thanks', 'iloveyou']`.
- Each action → 30 sequences × 30 frames per sequence.
- The model learns spatiotemporal patterns in the landmark sequences.
- During real-time detection, predictions update dynamically with probabilities displayed (see the inference sketch after this list).
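A sketch of the sliding-window inference loop this implies: keep the last 30 frames of keypoints and classify the window on every new frame. Names are illustrative, not `run.py`'s exact code:

```python
from collections import deque
import numpy as np

ACTIONS = ["hello", "thanks", "iloveyou"]
window = deque(maxlen=30)                  # holds the last sequence_length frames

def predict_frame(model, keypoints):
    """keypoints: the flattened landmark vector for the newest frame."""
    window.append(keypoints)
    if len(window) < window.maxlen:
        return None, None                  # still warming up, no prediction yet
    seq = np.expand_dims(np.array(window), axis=0)    # (1, 30, num_keypoints)
    probs = model.predict(seq, verbose=0)[0]
    return ACTIONS[int(np.argmax(probs))], probs      # label + per-action bars
```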
## Customization

- **Add new actions**: edit the `actions` array in `config.py`.
- **Adjust sequence length & dataset size**: change `sequence_length` and `num_sequences` in `config.py`.
- **Model architecture**: modify `build_model()` in `model.py` (sketched below).
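For reference, `build_model()` plausibly looks something like the stacked LSTM below. The layer sizes are illustrative assumptions; only the overall shape (an LSTM stack over 30-frame keypoint sequences with a softmax over the actions) follows from this README:

```python
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

def build_model(num_actions=3, sequence_length=30, num_keypoints=1662):
    model = Sequential([
        LSTM(64, return_sequences=True, activation="relu",
             input_shape=(sequence_length, num_keypoints)),
        LSTM(128, return_sequences=True, activation="relu"),
        LSTM(64, return_sequences=False, activation="relu"),   # last LSTM emits one vector
        Dense(64, activation="relu"),
        Dense(32, activation="relu"),
        Dense(num_actions, activation="softmax"),              # one probability per action
    ])
    model.compile(optimizer="Adam", loss="categorical_crossentropy",
                  metrics=["categorical_accuracy"])
    return model
```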
## Requirements

- Python 3.8+
- Webcam access
- TensorFlow 2.x
- Mediapipe
- OpenCV
## License

This project is licensed under the MIT License – feel free to use and modify.