This repository contains a Jupyter notebook for a Fake News Detection project using machine learning techniques. The notebook is designed to run on Google Colab, leveraging a GPU for accelerated processing.
Make sure the required libraries are installed by running the following command:
pip install pandas numpy matplotlib seaborn wordcloud nltk gensim plotly scikit-learn
The notebook is divided into the following sections:
- Loading the Dataset
  - Reads and loads the True and Fake news datasets from Kaggle.
- Setting up a target and merging datasets
  - Adds a target column indicating whether the news is true (1) or fake (0).
- Checking the number of null values
  - Identifies and handles null values in the dataset.
- Data Cleaning
  - Removes stopwords and performs data cleaning.
- Exploratory Data Analysis (EDA)
  - Analyzes and visualizes the distribution of true and fake news, the most covered issues, a word cloud, and the maximum word count in a title.
- Data Preprocessing
  - Further cleans and prepares the data for model training.
- Model Building
  - Utilizes four algorithms for training and evaluation:
    - Logistic Regression
    - Decision Tree Classifier
    - Gradient Boosting Classifier
    - Random Forest Classifier
- Manual Testing
  - Allows users to input news for manual testing and provides predictions from the trained models.
- Hindi to English News Translation
  - Translates Hindi news to English for testing.
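
The labeling, merging, null-check, and cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy rows, the column names `title` and `text`, the `clean_text` helper, and the tiny `STOPWORDS` set are all assumptions made so the example stays self-contained. The notebook itself reads the Kaggle CSV files and presumably relies on NLTK's stopword list.

```python
import re
import string

import pandas as pd

# Hypothetical toy rows standing in for the Kaggle True/Fake CSV files.
true_df = pd.DataFrame({
    "title": ["Senate passes budget bill", "Central bank holds rates"],
    "text": ["The Senate approved the budget on Tuesday.", None],
})
fake_df = pd.DataFrame({
    "title": ["Aliens endorse candidate", "Miracle cure found!!!"],
    "text": ["Shocking report claims aliens voted.",
             "Doctors hate this trick http://example.com"],
})

# Label each source: 1 = true news, 0 = fake news.
true_df["target"] = 1
fake_df["target"] = 0

# Merge both sources and shuffle so the classes are interleaved.
news = pd.concat([true_df, fake_df], ignore_index=True)
news = news.sample(frac=1, random_state=42).reset_index(drop=True)

# Count nulls per column, then replace missing text with an empty string.
null_counts = news.isnull().sum()
news["text"] = news["text"].fillna("")

# Illustrative stopword set (assumption); a real run would likely use
# a full English stopword list such as the one shipped with NLTK.
STOPWORDS = {"the", "a", "an", "on", "this", "of", "and", "to", "in"}


def clean_text(text: str) -> str:
    """Lowercase, strip URLs, punctuation and digits, then drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)
    text = re.sub(r"\d+", " ", text)
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


# Combine title and body into one cleaned column for later vectorization.
news["clean"] = (news["title"] + " " + news["text"]).apply(clean_text)
print(null_counts)
print(news[["clean", "target"]])
```

Shuffling after the merge keeps a later train/test split from seeing all true articles before all fake ones.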
- Open the notebook in Google Colab using the provided badge.
- Run each cell sequentially to execute the code and observe the results.
- Optionally, use the manual testing section to input news for predictions.
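
For orientation, the model-building and manual-testing sections might look something like the sketch below. It assumes TF-IDF features and default hyperparameters, neither of which is confirmed by this README. The toy corpus and the helper names `output_label` and `manual_testing` are hypothetical. Predictions from a corpus this small are not meaningful; the point is only the flow from input text to one verdict per model.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus; the notebook trains on the Kaggle datasets instead.
texts = [
    "parliament approves annual budget after debate",
    "central bank keeps interest rates unchanged",
    "court upholds ruling on election law",
    "ministry publishes quarterly trade figures",
    "aliens secretly control the election results",
    "miracle pill cures every disease overnight",
    "celebrity reveals moon landing was staged",
    "shocking secret doctors do not want you to know",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = true news, 0 = fake news

# Turn raw text into TF-IDF features (assumed vectorization choice).
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(texts)

# The four algorithms named in the Model Building section.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}
for model in models.values():
    model.fit(features, labels)


def output_label(pred: int) -> str:
    """Map the numeric target back to a readable verdict."""
    return "True News" if pred == 1 else "Fake News"


def manual_testing(news: str) -> dict:
    """Vectorize one news string and collect each model's verdict."""
    vec = vectorizer.transform([news])
    return {
        name: output_label(int(model.predict(vec)[0]))
        for name, model in models.items()
    }


results = manual_testing("parliament approves new trade budget")
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

Reusing the same fitted `vectorizer` for new input matters: fitting a fresh one on the test string would produce features that do not match what the models were trained on.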
Feel free to contribute, provide feedback or report issues.