This system detects hand gestures in real time using a webcam and classifies them into predefined labels. It preprocesses the dataset, trains multiple models using GridSearchCV for hyperparameter tuning, and selects the best-performing model for deployment.
The input to the project is a CSV file containing hand landmarks (e.g., x, y, z coordinates of keypoints) extracted from the HaGRID dataset using MediaPipe. The output will be a trained machine learning model capable of classifying hand gestures into predefined classes.
We will gain hands-on experience in data preprocessing, visualization, machine learning model training, and performance evaluation.
Demo video: `ProjectML_SheriF_Alexandria_AI45.mp4`
To run this project, install the required dependencies:
```bash
pip install opencv-python mediapipe pandas numpy joblib scikit-learn xgboost
```

The dataset consists of hand landmark coordinates (X, Y) extracted using MediaPipe Hands. It includes labeled gestures used for training the models.
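As a minimal sketch, the landmark CSV can be loaded with pandas before training. The file name `hand_landmarks.csv` and the `label` column name are assumptions for illustration; adjust them to match the actual dataset export.

```python
# Minimal loading sketch (file name and "label" column are assumed, not taken from the project).
import pandas as pd

df = pd.read_csv("hand_landmarks.csv")   # hypothetical path to the exported landmark CSV
X = df.drop(columns=["label"])           # landmark coordinate columns (X, Y per keypoint)
y = df["label"]                          # gesture class labels

print(X.shape)
print(y.value_counts())
```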
Five machine learning models (SVM, Logistic Regression, Random Forest, XGBoost, and Decision Tree) were trained using GridSearchCV for hyperparameter tuning.
The dataset was split into training (90%) and testing (10%) subsets using train_test_split with stratification.
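A hedged sketch of this step is shown below, reusing `X` and `y` from the loading sketch above. The SVM parameter grid is illustrative only and is not the project's exact grid.

```python
# Sketch of the stratified 90/10 split and GridSearchCV tuning (parameter grid is an assumption).
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=42
)

param_grid = {"C": [0.1, 1, 10], "kernel": ["rbf", "linear"]}
grid = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy", n_jobs=-1)
grid.fit(X_train, y_train)

print(grid.best_params_, grid.best_score_)
```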
Each model was tuned with GridSearchCV and evaluated on the held-out test set using accuracy, precision, recall, and F1-score. The results for the best hyperparameter configuration of each model are summarized below:
| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| SVM | 0.97 | 0.97 | 0.97 | 0.97 |
| Logistic Regression | 0.89 | 0.89 | 0.89 | 0.89 |
| Random Forest | 0.85 | 0.85 | 0.85 | 0.85 |
| XGBoost | 0.72 | 0.72 | 0.72 | 0.72 |
| Decision Tree | 0.65 | 0.66 | 0.63 | 0.63 |
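Metrics like those in the table can be reproduced with scikit-learn's `classification_report`. The sketch below assumes the `grid` object from the tuning sketch above.

```python
# Evaluate the tuned model on the held-out test set (reuses grid and the split from above).
from sklearn.metrics import classification_report

y_pred = grid.best_estimator_.predict(X_test)
print(classification_report(y_test, y_pred))  # per-class precision, recall, and F1-score
```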
After training, the best model is used for real-time classification. MediaPipe extracts hand landmarks, and the trained model predicts the gesture. Results are displayed on the webcam feed.
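The following is an illustrative sketch of that loop, not the project's `hand_gesture_detection.py` itself. The saved model path `best_model.pkl` and the use of (x, y) landmark coordinates as features are assumptions.

```python
# Illustrative real-time loop: webcam -> MediaPipe landmarks -> trained classifier -> overlay.
import cv2
import joblib
import mediapipe as mp
import numpy as np

model = joblib.load("best_model.pkl")               # hypothetical path to the saved best model
hands = mp.solutions.hands.Hands(max_num_hands=1)   # detect a single hand per frame

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV captures BGR.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        landmarks = results.multi_hand_landmarks[0].landmark
        # Flatten (x, y) of each landmark into a single feature row, matching the training layout.
        features = np.array([[p.x, p.y] for p in landmarks]).flatten().reshape(1, -1)
        gesture = model.predict(features)[0]
        cv2.putText(frame, str(gesture), (10, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
    cv2.imshow("Hand Gesture Detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):           # press q to exit
        break

cap.release()
cv2.destroyAllWindows()
```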
Run the real-time gesture detection script:

```bash
python hand_gesture_detection.py
```

Press `q` to exit.
