This project applies deep learning techniques to both image and text datasets. It walks through the full workflow of dataset preprocessing, model training, and evaluation. The project is structured into two main parts:
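- Face recognition on the Olivetti Faces Dataset, using models ranging from a simple baseline classifier, through PCA and LDA with an SVM, to a CNN based on LeNet-5.
- Sentiment analysis of the IMDB Movie Review Dataset, using models ranging from Naive Bayes to Recurrent Neural Networks (RNNs), together with data augmentation.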
- Olivetti Faces Dataset (AT&T Laboratories Cambridge): A collection of face images for testing facial recognition methods.
- IMDB Movie Review Dataset: A dataset for binary sentiment classification, with 25,000 highly polar movie reviews for training and 25,000 for testing.
- Data Preprocessing: Splitting the Olivetti Faces dataset into training and testing sets.
- Baseline Model: Implementing a simple classifier on the face images to set a baseline.
- Dimensionality Reduction and Classification: Using PCA and LDA for feature extraction, followed by SVM for classification.
- Deep Learning: Implementing a CNN based on the LeNet-5 architecture.
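
The classical part of the image steps above (split, PCA and LDA for feature extraction, SVM for classification) might look roughly like the following sketch. It assumes scikit-learn, which the steps above do not name. The function names `build_face_classifier` and `evaluate`, the 50 PCA components, the linear kernel, and the 75/25 stratified split are illustrative choices, not necessarily the notebook's settings.

```python
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def build_face_classifier(n_pca_components=50):
    """Chain PCA -> LDA -> SVM (hyperparameters are assumptions)."""
    return make_pipeline(
        # PCA compresses the raw pixels into a few uncorrelated components.
        PCA(n_components=n_pca_components, whiten=True, random_state=0),
        # LDA projects those components onto class-separating directions.
        LinearDiscriminantAnalysis(),
        # A linear SVM classifies the projected features.
        SVC(kernel="linear"),
    )


def evaluate(X, y, n_pca_components=50, test_size=0.25, seed=0):
    """Split the data, fit the pipeline, and return test accuracy."""
    # Stratify so every identity appears in both the train and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    # PCA cannot keep more components than training samples or features.
    n_components = min(n_pca_components, X_train.shape[0] - 1, X_train.shape[1])
    model = build_face_classifier(n_components)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)
```

To try it on the real data, one option is `faces = sklearn.datasets.fetch_olivetti_faces()` followed by `evaluate(faces.data, faces.target)`; the first call downloads the dataset.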
- Preprocessing: Cleaning and preparing the IMDB review text.
- Baseline Model: Naive Bayes classifier for establishing a performance benchmark.
- RNN: Developing a vanilla RNN model for sentiment analysis.
- Data Augmentation: Augmenting the training reviews to improve model performance.
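
As a rough illustration of the text steps above, the sketch below cleans a review, trains a small multinomial Naive Bayes baseline, and augments a tokenized review. It uses only the Python standard library. The names `clean_text`, `NaiveBayes`, and `random_swap` are hypothetical, and random word swapping is just one possible augmentation, since the steps above do not say which technique is used.

```python
import math
import random
import re
from collections import Counter, defaultdict


def clean_text(review):
    """Drop HTML tags such as <br />, lowercase, keep letters only, tokenize."""
    review = re.sub(r"<[^>]+>", " ", review)
    review = re.sub(r"[^a-z\s]", " ", review.lower())
    return review.split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.class_counts = Counter(labels)      # label -> number of documents
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for w in tokens:
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def random_swap(tokens, n_swaps=1, rng=None):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    rng = rng or random.Random()
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens
```

A typical use would be `NaiveBayes().fit([clean_text(r) for r in reviews], labels)`, with `random_swap` applied only to training reviews so the test set stays untouched.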
- Clone the repository to your local machine.
- Ensure all dependencies are installed.
- Open AIProject6.ipynb in Jupyter Notebook or Google Colab.
- Run the cells sequentially to reproduce the results.