A machine learning and deep learning project that classifies news articles as either "true" or "fake" using various models, from simple baselines to advanced neural networks.
This project implements and compares different approaches to fake news detection:
- Majority Class Classifier (Baseline)
- Logistic Regression with TF-IDF
- Fully Connected Neural Network
- Advanced LSTM Model with Attention Mechanism
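For reference, the majority class baseline at the top of this list can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the function names and the toy labels are hypothetical, and the label encoding (1 = fake, 0 = true) is an assumption.

```python
from collections import Counter


def fit_majority_class(train_labels):
    """Return the most frequent label seen in the training set."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy labels (hypothetical): 1 = fake, 0 = true
train_labels = [1, 1, 1, 0, 0]
test_labels = [1, 0, 1, 1]

majority = fit_majority_class(train_labels)
# The baseline ignores the article text and always predicts the same class.
predictions = [majority] * len(test_labels)
baseline_accuracy = accuracy(predictions, test_labels)
```

Any learned model should beat this number to be worth its cost, which is why the baseline is a useful first row in a comparison.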
The best-performing model (Advanced LSTM) achieved:
- Accuracy: 96.19%
- Precision: 94.88%
- Recall: 98.33%
- F1-Score: 96.58%
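These four metrics all derive from confusion-matrix counts. The sketch below is illustrative only: it assumes "fake" is the positive class (the source does not say which class is positive), and the function names and example counts are hypothetical. It also checks that the reported F1-score agrees with the reported precision and recall up to rounding.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0 so that no division by zero occurs.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, only to exercise the function
example_metrics = classification_metrics(tp=8, fp=2, fn=1, tn=9)

# Consistency check against the reported precision and recall
f1_check = f1_from(0.9488, 0.9833)
```

Because precision and recall are themselves rounded, `f1_check` can differ from the reported F1-score in the last digit.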
- Technical Report - Detailed description of the models, methodology, and results
- Project Presentation - Visual overview of the project's key components and findings
- Text preprocessing with NLTK
- Pre-trained Word2Vec embeddings
- Bidirectional LSTM with attention mechanism
- Comprehensive model evaluation metrics
- Training visualization tools
The advanced model includes:
- Embedding Layer (Word2Vec)
- Bidirectional LSTM
- Attention Mechanism
- Dropout Regularization
- Dense Layers
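One way these components could fit together in PyTorch is sketched below. This is not the project's implementation: the class name, layer sizes, dropout rate, the single linear attention scorer, and the use of index 0 for padding are all assumptions, and a random matrix stands in for the Word2Vec vectors.

```python
import torch
import torch.nn as nn


class AttentionBiLSTM(nn.Module):
    """Hypothetical sketch: embedding -> BiLSTM -> attention -> dropout -> dense."""

    def __init__(self, embedding_matrix, hidden_dim=128, dropout=0.5):
        super().__init__()
        # Embedding layer initialised from pre-trained vectors and kept frozen
        self.embedding = nn.Embedding.from_pretrained(
            embedding_matrix, freeze=True, padding_idx=0
        )
        self.lstm = nn.LSTM(
            embedding_matrix.size(1), hidden_dim,
            batch_first=True, bidirectional=True,
        )
        # One attention score per time step
        self.attention = nn.Linear(2 * hidden_dim, 1)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, token_ids):
        # Assumes every sequence has at least one non-padding token
        mask = token_ids != 0                            # (batch, seq)
        embedded = self.embedding(token_ids)             # (batch, seq, emb)
        outputs, _ = self.lstm(embedded)                 # (batch, seq, 2*hidden)
        scores = self.attention(outputs).squeeze(-1)     # (batch, seq)
        scores = scores.masked_fill(~mask, float("-inf"))
        weights = torch.softmax(scores, dim=1)           # padding gets weight 0
        # Weighted sum of LSTM outputs gives one vector per article
        context = torch.bmm(weights.unsqueeze(1), outputs).squeeze(1)
        logits = self.classifier(self.dropout(context)).squeeze(-1)
        return logits, weights


torch.manual_seed(0)
embedding_matrix = torch.randn(50, 16)  # random stand-in for Word2Vec vectors
model = AttentionBiLSTM(embedding_matrix, hidden_dim=8)
model.eval()

batch = torch.tensor([[4, 9, 2, 0, 0], [7, 3, 5, 8, 1]])  # 0 = padding
with torch.no_grad():
    logits, weights = model(batch)
```

The model returns raw logits, so a loss such as `nn.BCEWithLogitsLoss` would be a natural pairing. Returning the attention weights alongside the logits makes it possible to inspect which tokens influenced a prediction.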
The project uses two datasets:
- Sameer Patel, Fake News Detection dataset
- WELFake dataset
The project depends on:
- PyTorch
- NLTK
- NumPy
- Pandas
- Scikit-learn
- Matplotlib
- Gensim (for Word2Vec)
- Clone the repository
- Install dependencies:
  pip install -r requirements.txt
- Download the pre-trained Word2Vec embeddings
- Download the two datasets (Sameer Patel, Fake News Detection dataset; WELFake dataset)
- Run the data_preprocessing notebook
- Run the training scripts for different models