# 📰 **Fake News Generator & Detector**

This educational, awareness-focused Generative AI application explores AI's dual role in creating and detecting fake news. Users can generate fake news headlines from a selected category (e.g., politics, tech), a subject or entity (e.g., Apple, Government), and a requested number of headlines. A separate detector lets users paste a news snippet and check whether a fine-tuned BERT model classifies it as likely fake or real.

**The application is built using:**

Transformers library
(For GPT-2 based fake news generation and BERT-based fake news classification)

Gradio
(To create a clean, tabbed web interface for both generation and detection)

Google Colab / Python
(For backend development, model training/fine-tuning, and deployment)

**This project aims to:**

✨ Demonstrate the capabilities and risks of large language models

🔍 Raise awareness about misinformation in media

🧠 Promote responsible AI development through practical exploration

This project showcases how AI can both contribute to and combat misinformation, serving as a tool for learning, experimentation, and ethics in AI.

**Step 1: Install Required Libraries**

In [None]:
!pip install transformers gradio --quiet
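
After installing, it can help to confirm which versions were actually picked up, since the `generate()` arguments and the Gradio API have changed between releases. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the packages this notebook relies on.
for name in ("transformers", "gradio", "torch"):
    print(name, installed_version(name) or "not installed")
```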

**Step 2: Load GPT-2 Model and Tokenizer**

In [None]:
# GPT-2: Fake News Generator

from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

gpt2_tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
gpt2_model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)
gpt2_model.eval()


GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D(nf=2304, nx=768)
          (c_proj): Conv1D(nf=768, nx=768)
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D(nf=3072, nx=768)
          (c_proj): Conv1D(nf=768, nx=3072)
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=768, out_features=50257, bias=False)
)

**Step 3: Define Fake News Headline Generation Function**

In [None]:
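def generate_fake_headlines(category, subject, num_headlines=1, max_length=20):
    """Sample `num_headlines` headlines from GPT-2; `max_length` caps the new tokens per headline."""
    prompt = f"Write a fake {category} news headline about {subject}:\n"
    # Tokenize once; the tokenizer also returns the attention mask that generate() uses.
    inputs = gpt2_tokenizer(prompt, return_tensors="pt").to(device)
    prompt_length = inputs["input_ids"].shape[1]

    headlines = []
    for _ in range(num_headlines):
        with torch.no_grad():  # inference only, no gradients needed
            output = gpt2_model.generate(
                **inputs,
                max_new_tokens=max_length,  # counts generated tokens only, not the prompt
                temperature=0.9,
                top_p=0.9,
                do_sample=True,
                pad_token_id=gpt2_tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
                no_repeat_ngram_size=2,
                # early_stopping is omitted: it only affects beam search, not sampling
            )

        # Decode only the newly generated tokens and keep the first line.
        generated_text = gpt2_tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
        headline = generated_text.strip().split("\n")[0]
        headlines.append(headline)

    return headlines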
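
The `temperature` and `top_p` arguments above control how the next token is sampled. As a rough illustration of the idea (not the Transformers library's internal implementation), the sketch below applies temperature scaling and nucleus (top-p) filtering to a toy four-token vocabulary in plain Python. The function names and the toy logits are hypothetical.

```python
import math
import random


def apply_temperature(logits, temperature=0.9):
    """Divide logits by the temperature, then convert them to probabilities (softmax)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the most likely tokens until their cumulative mass reaches top_p, then renormalize."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {index: p / total for index, p in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-token vocabulary
probs = apply_temperature(toy_logits, temperature=0.9)
nucleus = top_p_filter(probs, top_p=0.9)

# Sample the next token only from the tokens that survived the filter.
next_token = random.choices(list(nucleus), weights=list(nucleus.values()), k=1)[0]
print("kept token ids:", sorted(nucleus), "sampled:", next_token)
```

With these toy values the least likely token (index 3) falls outside the nucleus and can never be sampled, which is how `top_p` trims the long tail of implausible continuations.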

**Step 4: Create Helper Function for Gradio Interface**

In [None]:
def gradio_fake_news_generator(category, subject, num_headlines):
    # Gradio passes the textbox value as a string, so convert and validate it first.
    try:
        num = int(num_headlines)
    except (TypeError, ValueError):
        return "Invalid input for number of headlines."
    if num <= 0 or num > 10:
        return "Please enter a number between 1 and 10."

    headlines = generate_fake_headlines(category, subject, num)
    return "\n".join(f"{i+1}. {h}" for i, h in enumerate(headlines))
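
The validation step can also be tested on its own, without loading GPT-2. The sketch below pulls the same checks into a hypothetical helper, `parse_headline_count`, which is not part of the original notebook. It returns either a valid count or an error message, so the behaviour for typical textbox inputs is easy to see.

```python
def parse_headline_count(raw, low=1, high=10):
    """Return (count, None) on success or (None, error_message) on failure."""
    try:
        count = int(str(raw).strip())
    except ValueError:
        return None, "Invalid input for number of headlines."
    if not low <= count <= high:
        return None, f"Please enter a number between {low} and {high}."
    return count, None


# Typical values a user might type into the textbox.
for raw in ["3", " 10 ", "0", "eleven", "2.5"]:
    print(repr(raw), "->", parse_headline_count(raw))
```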

**Step 5: Load BERT Model for Fake News Detection**

In [9]:
# 🔹 BERT: Fake News Detector

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from google.colab import userdata  # Colab-only helper for reading notebook secrets

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the Hugging Face token from Colab secrets, if one has been configured
try:
    HF_TOKEN = userdata.get('HF_TOKEN')
except userdata.SecretNotFoundError:
    HF_TOKEN = None  # no secret set; the download still works if the model repository is public

bert_tokenizer = AutoTokenizer.from_pretrained("Pulk17/Fake-News-Detection", token=HF_TOKEN)
bert_model = AutoModelForSequenceClassification.from_pretrained("Pulk17/Fake-News-Detection", token=HF_TOKEN).to(device)
bert_model.eval()


BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSdpaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e

**Step 6: Define News Classification Function using BERT**

In [10]:
def detect_fake_news(text):
    inputs = bert_tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512).to(device)
    with torch.no_grad():  # inference only, so skip gradient tracking
        outputs = bert_model(**inputs)
    probs = F.softmax(outputs.logits, dim=1)
    predicted_class = torch.argmax(probs, dim=1).item()
    confidence = torch.max(probs).item()
    # Assumes class index 0 means FAKE and 1 means REAL; check bert_model.config.id2label to confirm.
    label = "FAKE 🟥" if predicted_class == 0 else "REAL 🟩"
    return f"Prediction: {label}  ({confidence*100:.2f}% confidence)"
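
To make the confidence figure concrete, the sketch below reproduces the softmax and argmax steps of `detect_fake_news` in plain Python on a pair of made-up logits. The helper names and the logit values are hypothetical, and the index-to-label mapping (0 = FAKE, 1 = REAL) is the same assumption the function above makes.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits, id2label=None):
    """Pick the most probable class and report its probability as the confidence."""
    id2label = id2label or {0: "FAKE", 1: "REAL"}  # assumed mapping, as in detect_fake_news
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return id2label[predicted], probs[predicted]


label, confidence = label_from_logits([2.0, -1.0])  # made-up logits for one input
print(f"{label} ({confidence * 100:.2f}% confidence)")  # FAKE (95.26% confidence)
```

The confidence here is only the model's probability for its chosen class. It says nothing about whether the classifier is well calibrated on news outside its training data.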

**Step 7: Build Gradio Interface for Generator & Detector**

In [11]:
# 🔹 Gradio Interface
import gradio as gr

# Generator tab
generator_interface = gr.Interface(
    fn=gradio_fake_news_generator,
    inputs=[
        gr.Textbox(label="News Category (e.g., politics, tech)"),
        gr.Textbox(label="Main Subject / Entity (e.g., Apple, Government)"),
        gr.Textbox(label="Number of Headlines (1-10)")
    ],
    outputs=gr.Textbox(label="Generated Fake News Headlines"),
    title="📰 Fake News Generator",
    description="Generate fake news headlines using GPT-2"
)

# Detector tab
detector_interface = gr.Interface(
    fn=detect_fake_news,
    inputs=gr.Textbox(lines=5, label="Enter full news text"),
    outputs=gr.Textbox(label="Fake News Detection Result"),
    title="🔍 Fake News Detector",
    description="Detect whether a news article is FAKE or REAL using BERT"
)

# Combine into tabs
gr.TabbedInterface([generator_interface, detector_interface],
                   tab_names=["📰 Generator", "🔍 Detector"]).launch()



