This repository contains the implementation of transfer learning for large image classification using a MobileNet model. The goal is to fine-tune the model on a custom dataset with at least 3 classes, each having a minimum of 100 images.
- Collected images using a phone/camera, ensuring a diverse set of classes.
- Split the dataset into training, validation, and test sets.
- Built an input pipeline to preprocess and augment the training data.
- Implemented data augmentation techniques to enhance the model's robustness.
- Utilized the MobileNet model pre-trained on ImageNet.
- Fine-tuned the model on the custom dataset, adjusting the final layers for the new classification task (see the sketch below).
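A minimal sketch of this pipeline, assuming a directory-per-class layout under hypothetical `data/train` and `data/val` folders and three classes; the augmentation choices, optimizer, and epoch count are illustrative rather than the exact settings used in this repository:

```python
import tensorflow as tf

IMG_SIZE = (224, 224)
NUM_CLASSES = 3  # at least 3 classes, per the project requirements

# Directory-per-class datasets (paths are placeholders).
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data/val", image_size=IMG_SIZE, batch_size=32)

# On-the-fly data augmentation (active only during training).
augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# MobileNet pre-trained on ImageNet, without its classification head.
base_model = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, input_shape=IMG_SIZE + (3,))
base_model.trainable = False  # freeze the pre-trained layers

inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
x = augmentation(inputs)
x = tf.keras.applications.mobilenet.preprocess_input(x)
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # integer labels from the directory loader
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)
```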
Classification report for the CNN trained from scratch (overall accuracy 0.65):

```
              precision    recall  f1-score   support

           0       0.79      0.80      0.80        80
           1       0.30      0.30      0.30        20
           2       0.00      0.00      0.00         8

    accuracy                           0.65       108
   macro avg       0.36      0.37      0.37       108
weighted avg       0.64      0.65      0.64       108
```
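The report above follows the format produced by scikit-learn's classification_report. As a hedged sketch (scikit-learn is not listed among this project's dependencies, and test_ds / model are placeholder names for an unshuffled test dataset and a trained classifier):

```python
import numpy as np
from sklearn.metrics import classification_report

# The test dataset must not be reshuffled, so labels and predictions stay aligned.
y_true = np.concatenate([labels.numpy() for _, labels in test_ds])
y_pred = np.argmax(model.predict(test_ds), axis=1)

print(classification_report(y_true, y_pred))
```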
The transfer-learning model based on MobileNet achieves an accuracy of 0.74, while the CNN built from scratch reaches only 0.65, so MobileNet is correct more often overall.
- Precision for class 0: 0.74
- Recall for class 0: 1.00
- F1-score for class 0: 0.85
- Precision, recall, and F1-score for class 0 are lower compared to MobileNet (0.79, 0.80, 0.80, respectively).
- Overall lower values for precision, recall, and F1-score across classes.
MobileNet exhibits higher macro and weighted averages for precision, recall, and F1-score compared to the CNN from scratch. This indicates better performance across all classes, taking into account class imbalances.
The CNN from scratch shows significant misclassifications, especially for class 1, where precision, recall, and F1-score are considerably lower. In contrast, MobileNet achieves better precision and recall for class 1.
The CNN from scratch struggles with the class imbalance, as seen in the lower precision, recall, and F1-score for class 1. MobileNet, having learned from a diverse set of classes during pre-training, demonstrates more balanced performance across classes.
MobileNet achieves better accuracy and metrics across multiple classes, indicating superior generalization ability. The CNN from scratch, with a simpler architecture, faces challenges in capturing diverse features necessary for accurate predictions.
In summary, MobileNet's higher accuracy, precision, recall, and F1-score underscore the advantage of transfer learning: a pre-trained model brings features learned during large-scale pre-training and generalizes better to new tasks, especially when task-specific data is limited.
This repository provides a comprehensive guide on fine-tuning the DistilBERT model for text classification using TensorFlow and the Hugging Face Transformers library. Text classification is a common natural language processing (NLP) task, and DistilBERT, a distilled version of BERT (Bidirectional Encoder Representations from Transformers), offers a lightweight yet powerful solution.
Ensure you have the following prerequisites installed:
- Python 3.6 or later
- TensorFlow
- Transformers library from Hugging Face
- Pandas
- Seaborn
- Matplotlib
- Plotly
- NLTK
- tqdm
You can install the required libraries with:

```
pip install tensorflow transformers pandas seaborn matplotlib plotly nltk tqdm
```
This project assumes that you have a prepared dataset stored in the train_texts and test_texts variables. It is essential to ensure that your dataset is appropriately preprocessed and split into training and testing sets before proceeding with the model training.
DistilBERT tokenization is a crucial step in preparing the data for training. The process is performed using the DistilBertTokenizer from the Hugging Face Transformers library. The tokenizer is initialized with a pre-trained DistilBERT model, specifically 'distilbert-base-uncased'. During tokenization, the training and testing texts are transformed into sequences of tokens, and the sequences are encoded with padding and truncation to align with the model's input requirements.
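A sketch of this step, assuming train_texts and test_texts are Python lists of strings; the exact tokenizer arguments used in the repository may differ:

```python
from transformers import DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")

# Encode the raw texts into padded, truncated token-ID tensors for TensorFlow.
train_encodings = tokenizer(list(train_texts), truncation=True,
                            padding=True, return_tensors="tf")
test_encodings = tokenizer(list(test_texts), truncation=True,
                           padding=True, return_tensors="tf")
```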
The base DistilBERT model for sequence classification is loaded using TFDistilBertForSequenceClassification from the Hugging Face Transformers library. Following the model initialization, it is compiled with an Adam optimizer and categorical cross-entropy loss, setting the stage for training.
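A sketch of this step; the number of labels and the learning rate are assumptions (num_labels=5 mirrors the five-unit output layer shown later):

```python
import tensorflow as tf
from transformers import TFDistilBertForSequenceClassification

# Load DistilBERT with a freshly initialized classification head.
model = TFDistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=5)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),  # learning rate is an assumption
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),  # assumes one-hot labels
    metrics=["accuracy"],
)
```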
To make the fine-tuning more adaptable to the specific classification task, a custom architecture named CustomDistilBERTModel extends the base DistilBERT model with an additional dense layer (ReLU activation) and an output layer (softmax activation).
The CustomDistilBERTModel class is defined as follows:
```python
import tensorflow as tf


class CustomDistilBERTModel(tf.keras.Model):
    def __init__(self, base_model):
        super(CustomDistilBERTModel, self).__init__()
        self.base_model = base_model
        # Additional dense layer with ReLU activation on top of the base model's logits
        self.dense_layer = tf.keras.layers.Dense(256, activation='relu')
        # Output layer with softmax over the five target classes
        self.output_layer = tf.keras.layers.Dense(5, activation='softmax')

    def call(self, inputs, training=False):
        # Run the base DistilBERT classifier and take its raw logits
        logits = self.base_model(inputs, training=training).logits
        dense_output = self.dense_layer(logits)
        predictions = self.output_layer(dense_output)
        return predictions
```
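A hedged usage sketch of training and evaluating the custom model; the hyperparameters and the train_labels / test_labels arrays (one-hot encoded labels matching the encodings above) are assumptions, not values from the repository:

```python
# Wrap the base DistilBERT classifier loaded earlier in the custom architecture.
custom_model = CustomDistilBERTModel(model)

custom_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),  # assumed learning rate
    loss="categorical_crossentropy",  # the custom head outputs softmax probabilities
    metrics=["accuracy"],
)

# train_labels / test_labels are hypothetical one-hot label arrays.
custom_model.fit(dict(train_encodings), train_labels, epochs=3, batch_size=16)

loss, accuracy = custom_model.evaluate(dict(test_encodings), test_labels)
print(f"Test accuracy: {accuracy:.4f}")
```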
Results: the custom model achieves a test accuracy of 94.38%, showcasing the effectiveness of fine-tuning DistilBERT for this text classification task.
Feel free to adapt and modify the code based on your specific dataset and requirements.