This project aims to recognize American Sign Language (ASL) gestures using Convolutional Neural Networks (CNNs). The dataset used for this project is the Sign Language MNIST dataset, which contains grayscale images of hand gestures representing the letters A through Y (J and Z are excluded because they require motion).
The dataset consists of two CSV files:

- `sign_mnist_train.csv`: Training data containing labeled images for model training.
- `sign_mnist_test.csv`: Testing data containing labeled images for model evaluation.
Each image in the dataset is a 28x28 grayscale image, with pixel values ranging from 0 to 255.
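A minimal sketch of loading the two CSV files with Pandas. It assumes the standard Sign Language MNIST layout (a `label` column followed by 784 pixel columns) and that the files sit in the working directory; adjust the paths for your setup.

```python
import pandas as pd

# Load the training and testing CSV files (paths are assumptions; adjust as needed).
train_df = pd.read_csv("sign_mnist_train.csv")
test_df = pd.read_csv("sign_mnist_test.csv")

# Each row holds a label followed by 784 pixel values (a flattened 28x28 image).
print(train_df.shape)
print(train_df["label"].nunique())  # number of distinct gesture classes
```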
The project uses the following Python libraries:

- NumPy
- Pandas
- Matplotlib
- Seaborn
- Plotly Express
- PIL (Python Imaging Library)
- TensorFlow
- Keras
- Scikit-learn
The distribution of labels in the training and testing datasets is visualized with histograms. Additionally, sample images from each label are displayed to give a sense of what each gesture looks like.
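A sketch of the label-distribution plot, assuming the `train_df` DataFrame from the loading step above; Seaborn's `countplot` is one way to draw the histogram of class counts.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Plot how many training images exist for each gesture class.
plt.figure(figsize=(12, 4))
sns.countplot(x=train_df["label"])
plt.title("Label distribution in the training set")
plt.xlabel("Label")
plt.ylabel("Count")
plt.show()
```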
Preprocessing steps (a code sketch follows the list):
- The data is split into features (X) and labels (y).
- Feature scaling is performed by dividing pixel values by 255 to normalize them between 0 and 1.
- Labels are one-hot encoded to prepare them for classification.
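A minimal preprocessing sketch, assuming the DataFrames from the loading step. It uses scikit-learn's `LabelBinarizer` for one-hot encoding, which is an assumption about the exact encoder; Keras's `to_categorical` would work similarly.

```python
from sklearn.preprocessing import LabelBinarizer

# Separate labels (y) from pixel features (X).
y_train = train_df["label"].values
y_test = test_df["label"].values
X_train = train_df.drop(columns=["label"]).values
X_test = test_df.drop(columns=["label"]).values

# Scale pixel values to [0, 1] and reshape to 28x28x1 for the CNN.
X_train = (X_train / 255.0).reshape(-1, 28, 28, 1)
X_test = (X_test / 255.0).reshape(-1, 28, 28, 1)

# One-hot encode the labels.
lb = LabelBinarizer()
y_train = lb.fit_transform(y_train)
y_test = lb.transform(y_test)
```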
The CNN model architecture is as follows (a code sketch appears after the list):
- Two sets of convolutional layers followed by max-pooling and dropout layers.
- Flatten layer to convert 2D feature maps into 1D feature vectors.
- Dense layers with ReLU activation for classification.
- Softmax activation in the output layer for multiclass classification.
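A sketch of the architecture described above, built with the Keras Sequential API. The filter counts, kernel sizes, dense width, and dropout rates are illustrative assumptions rather than the exact values used in the project.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

num_classes = y_train.shape[1]  # derived from the one-hot encoded labels

model = Sequential([
    # First convolution block: conv -> max-pooling -> dropout.
    Conv2D(32, (3, 3), activation="relu", padding="same", input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Dropout(0.25),

    # Second convolution block.
    Conv2D(64, (3, 3), activation="relu", padding="same"),
    MaxPooling2D((2, 2)),
    Dropout(0.25),

    # Flatten the 2D feature maps and classify with dense layers.
    Flatten(),
    Dense(128, activation="relu"),
    Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```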
ImageDataGenerator from Keras is used for data augmentation, including rotation, zoom, and shift.
The model is trained using a generator to feed augmented data batches. A learning rate reduction callback is used to adjust the learning rate during training based on validation accuracy.
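A sketch of the augmentation and training setup. The rotation, zoom, and shift ranges, batch size, epoch count, and the `ReduceLROnPlateau` parameters are assumptions chosen to match the description above, not the project's exact values.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Augment the training images with small random rotations, zooms, and shifts.
datagen = ImageDataGenerator(
    rotation_range=10,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
)
datagen.fit(X_train)

# Reduce the learning rate when validation accuracy stops improving.
lr_reduction = ReduceLROnPlateau(
    monitor="val_accuracy", patience=2, factor=0.5, min_lr=1e-5, verbose=1
)

history = model.fit(
    datagen.flow(X_train, y_train, batch_size=128),
    epochs=20,
    validation_data=(X_test, y_test),
    callbacks=[lr_reduction],
)
```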
- Training and testing accuracy and loss are visualized using plots.
- A confusion matrix is generated to evaluate the model's performance (see the sketch after this list).
- The model achieves high accuracy on the testing dataset.
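A sketch of the evaluation plots, assuming the `history` object and test arrays from the steps above; the heatmap styling is an illustrative choice.

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Training vs. validation accuracy across epochs.
plt.plot(history.history["accuracy"], label="train accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()

# Confusion matrix on the test set.
y_pred = np.argmax(model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)
cm = confusion_matrix(y_true, y_pred)
sns.heatmap(cm, cmap="Blues")
plt.xlabel("Predicted label")
plt.ylabel("True label")
plt.show()
```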
The trained model is saved in the HDF5 format for future use.
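Saving and reloading the model in HDF5 format; the filename here is an assumption.

```python
# Save the trained model to an HDF5 file.
model.save("asl_cnn_model.h5")

# Later, reload it for inference.
from tensorflow.keras.models import load_model
model = load_model("asl_cnn_model.h5")
```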
Sample predictions are made on the first 10 images from the testing dataset. Actual and predicted labels are displayed alongside the corresponding images.
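A sketch of the sample-prediction display, assuming the `lb` encoder and test arrays defined earlier; `lb.classes_` maps the predicted indices back to the original label values.

```python
import numpy as np
import matplotlib.pyplot as plt

# Predict the first 10 test images and compare actual vs. predicted labels.
sample_images = X_test[:10]
actual = lb.classes_[np.argmax(y_test[:10], axis=1)]
predicted = lb.classes_[np.argmax(model.predict(sample_images), axis=1)]

fig, axes = plt.subplots(1, 10, figsize=(20, 3))
for i, ax in enumerate(axes):
    ax.imshow(sample_images[i].reshape(28, 28), cmap="gray")
    ax.set_title(f"actual: {actual[i]}\npred: {predicted[i]}")
    ax.axis("off")
plt.show()
```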