PythonProf69/Fake-new-prediction

Fake News Detector

This project contains a machine learning pipeline to classify news articles as "REAL" or "FAKE". It includes a script to train a Logistic Regression model and a separate script to use that model for predicting the authenticity of new, user-provided text.


📋 Features

  • Text Preprocessing: Cleans and prepares text data using stemming and stopword removal.
  • TF-IDF Vectorization: Converts text articles into a numerical format suitable for machine learning.
  • Model Training: Trains a Logistic Regression classifier and saves the trained components.
  • Real-time Prediction: Allows a user to input any news text and get an instant prediction.
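The preprocessing step listed above can be sketched roughly as follows. This is a minimal illustration, not the code in train_model.py: the function name `preprocess`, the letters-only regex, and the small fallback stopword set (used only when the NLTK corpus has not been downloaded) are assumptions made for the example.

```python
import re

from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer

try:
    STOPWORDS = set(stopwords.words("english"))
except LookupError:
    # Corpus not downloaded yet: fall back to a tiny illustrative subset.
    STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}

STEMMER = PorterStemmer()


def preprocess(text):
    """Lowercase, keep letters only, drop stopwords, and stem each word."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", text).lower()
    stems = [STEMMER.stem(word) for word in letters_only.split()
             if word not in STOPWORDS]
    return " ".join(stems)


print(preprocess("The cats are running!"))
```

Whatever cleaning is applied during training has to be applied identically to user input at prediction time, otherwise the vectorizer sees tokens it never learned.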

📂 Project Structure

.
├── news.csv
├── news_classifier_model.pkl
├── tfidf_vectorizer.pkl
├── predictor.py
└── train_model.py

⚙️ Setup and Installation

Prerequisites

  • Python 3.7+
  • pip package manager

1. Clone the Repository (Optional)

If the project is hosted in a Git repository, clone it. Otherwise, place all the project files in a single folder.

git clone <your-repository-url>
cd <your-repository-name>

2. Install Dependencies

It's recommended to use a virtual environment.

# Create and activate a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

# Install the required libraries
pip install pandas numpy scikit-learn nltk joblib

3. Download NLTK Data

The scripts require the NLTK stopwords corpus. Run the following in a Python interpreter to download it:

import nltk
nltk.download('stopwords')

🚀 Usage

The process is divided into two main steps: training the model and then using it for prediction.

Step 1: Train the Model

First, run the training script. It processes news.csv, trains the classifier, and saves the model and vectorizer to disk as .pkl files.

python train_model.py

When the script finishes, it prints the model's accuracy on the training and test data to the console and creates two files:

  • news_classifier_model.pkl
  • tfidf_vectorizer.pkl
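For orientation, the train-then-save flow might look roughly like the sketch below. It is not the contents of train_model.py: the four toy sentences stand in for news.csv, the default hyperparameters and the use of `joblib.dump` are assumptions, and the train/test split is omitted because the toy data is too small to split meaningfully.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in for news.csv (made-up sentences, not real data).
texts = [
    "government publishes annual budget report",
    "city council approves new transit plan",
    "miracle pill cures every disease overnight",
    "aliens secretly control the stock market",
]
labels = ["REAL", "REAL", "FAKE", "FAKE"]

# Fit the vectorizer and the classifier on the training text.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)
model = LogisticRegression()
model.fit(X, labels)
print("Training accuracy:", model.score(X, labels))

# Save both objects; a temporary folder is used here to avoid clutter.
out_dir = Path(tempfile.mkdtemp())
joblib.dump(model, out_dir / "news_classifier_model.pkl")
joblib.dump(vectorizer, out_dir / "tfidf_vectorizer.pkl")

# Reload them, as a prediction script would, and classify new text.
loaded_model = joblib.load(out_dir / "news_classifier_model.pkl")
loaded_vectorizer = joblib.load(out_dir / "tfidf_vectorizer.pkl")
sample = ["council approves budget plan"]
prediction = loaded_model.predict(loaded_vectorizer.transform(sample))[0]
print("Prediction:", prediction)
```

Both files are needed at prediction time: the model only understands feature vectors produced by the same fitted vectorizer.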

Step 2: Make a Prediction

Once the model and vectorizer are saved, you can use the predictor.py script to classify new articles.

Run the script from your terminal:

python predictor.py

The script will prompt you to enter the news text. Paste the article content and press Enter. The model will then output its prediction.

Example Interaction:

Loading model and vectorizer...
Files loaded successfully.
Enter the news text to check:

<...paste your news article text here...>

--- Prediction ---
🚨 The model predicts that this news is FAKE.

🛠️ Model Details

  • Algorithm: Logistic Regression
  • Feature Extraction: Term Frequency-Inverse Document Frequency (TF-IDF)
  • Core Libraries: Scikit-learn, Pandas, NLTK
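To make the TF-IDF weighting concrete, here is a small pure-Python sketch of the smoothed formula that scikit-learn's TfidfVectorizer applies by default: idf = ln((1 + n) / (1 + df)) + 1, followed by L2 normalization of each row. The whitespace tokenizer, the function name `tfidf`, and the three sample sentences are simplifications invented for the example; the project itself presumably relies on the library rather than code like this.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows) with one L2-normalized TF-IDF row per doc."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vocab = sorted(df)
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in weights)) or 1.0
        rows.append([w / norm for w in weights])
    return vocab, rows


docs = [
    "fake news spreads fast",
    "real news travels slowly",
    "fake claims spread",
]
vocab, rows = tfidf(docs)
for term, weight in zip(vocab, rows[0]):
    if weight:
        print(f"{term}: {weight:.3f}")
```

In the first document, "news" and "fake" each appear in two of the three documents, so they receive lower weights than "spreads" and "fast", which appear in only one.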

📄 File Descriptions

  • train_model.py: The main script for training the model and saving the pipeline components.
  • predictor.py: A script that loads the saved model to make real-time predictions on user input.
  • news.csv: The dataset used for training the model. It must contain title, text, and label columns.
  • news_classifier_model.pkl: The saved, trained Logistic Regression model object.
  • tfidf_vectorizer.pkl: The saved TF-IDF vectorizer object, necessary to transform new text correctly.
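Because training depends on the title, text, and label columns, a quick header check before training can catch a malformed dataset early. The sketch below uses only the standard library; the helper name `missing_columns` and the in-memory sample are hypothetical, and in practice the function would be given an open handle to news.csv.

```python
import csv
import io

REQUIRED_COLUMNS = {"title", "text", "label"}


def missing_columns(csv_file):
    """Return the required column names absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return sorted(REQUIRED_COLUMNS - header)


# In-memory stand-in for news.csv.
sample = io.StringIO("title,text,label\nSome headline,Some body text,REAL\n")
print(missing_columns(sample))
```

For the real file, the call would be along the lines of `with open("news.csv", newline="", encoding="utf-8") as f: missing_columns(f)`; the encoding is a guess and may need adjusting for the actual dataset.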
