# Detecting and Generating Fake News using GenAI & NLP Models (GPT-2 & BERT)

This student project explores the dual role of artificial intelligence in both **creating** and **identifying** misinformation. It pairs two pretrained NLP models to show how language models apply to each side of the problem:

- **News Headline Generation**: The GPT-2 model is used to craft synthetic news headlines based on a short text prompt. This highlights how generative models can produce convincing fake content.

- **Fake News Classification**: A BERT-based classifier, fine-tuned on fake news datasets, is employed to assess whether a given headline is *fake* or *real*, showing how machine learning can help combat disinformation.

---

## Instructions:
1. Run each step in the notebook sequentially.
2. Use the interactive text menu in Step 4 to:
   - Generate synthetic news headlines.
   - Analyze and detect fake or real headlines.

## Step 1 : Install & Import Required Libraries

In [1]:
!pip install transformers datasets torch gradio --quiet

from transformers import GPT2LMHeadModel, GPT2Tokenizer
from transformers import BertTokenizer, BertForSequenceClassification, pipeline
import torch
import gradio as gr


## Step 2 : Load Pre-Trained Models

In [2]:
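# Load the base "gpt2" checkpoint for headline generation
gpt2_tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
gpt2_model = GPT2LMHeadModel.from_pretrained("gpt2")
# GPT-2 has no dedicated padding token, so reuse the end-of-sequence token
gpt2_model.config.pad_token_id = gpt2_tokenizer.eos_token_id

# Load the fine-tuned fake-news classifier from the Hugging Face Hub
clf = pipeline("text-classification", model="jy46604790/Fake-News-Bert-Detect", tokenizer="jy46604790/Fake-News-Bert-Detect")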

Device set to use cuda:0


## Step 3 : Define Generation and Detection Functions

In [3]:
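def generate_fake_news(prompt="Breaking News:"):
    """Sample five GPT-2 continuations of `prompt` and return them as a numbered list."""
    input_ids = gpt2_tokenizer.encode(prompt, return_tensors='pt')
    # A single unpadded prompt, so every token is attended to
    attention_mask = torch.ones_like(input_ids)

    outputs = gpt2_model.generate(
        input_ids,
        attention_mask=attention_mask,
        max_length=30,            # total length in tokens, prompt included
        num_return_sequences=5,
        do_sample=True,           # sample instead of decoding greedily
        temperature=0.9,
        top_k=50,                 # keep only the 50 most likely next tokens
        top_p=0.95,               # then keep the smallest set covering 95% of the probability mass
        pad_token_id=gpt2_tokenizer.eos_token_id
    )

    headlines = [gpt2_tokenizer.decode(output, skip_special_tokens=True) for output in outputs]
    return "\n".join(f"{i+1}. {h}" for i, h in enumerate(headlines))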


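def detect_fake_news(text):
    """Classify `text` and return a formatted FAKE/REAL verdict with its confidence."""
    result = clf(text[:500])[0]  # cap the input at 500 characters
    # This mapping assumes the classifier reports fake news as LABEL_0
    label = "FAKE" if result['label'] == "LABEL_0" else "REAL"
    confidence = round(result['score'] * 100, 2)
    return f"📝 Input: {text}\n Prediction: {label}\n Confidence: {confidence}%"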
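
Sampled GPT-2 continuations are free-running text, so a "headline" can spill over a line break into a new paragraph (the sample run in Step 4 contains such cases). The following is a minimal sketch of one way to trim each continuation to a single headline-sized line. The helper name `clean_headline` and the 20-word cap are assumptions made for illustration; they are not part of the original notebook.

```python
def clean_headline(generated: str, max_words: int = 20) -> str:
    """Reduce a raw continuation to a single-line, headline-sized string.

    Hypothetical helper: the name and the default word cap are illustrative.
    """
    # Keep only the first non-empty line; anything after a line break is dropped.
    first_line = next(
        (line.strip() for line in generated.splitlines() if line.strip()), ""
    )
    # Collapse repeated whitespace and cap the number of words.
    words = first_line.split()
    return " ".join(words[:max_words])


raw = 'Sports was very nice. I think I was great in that game too."\n\nOn Monday, at a press conference'
print(clean_headline(raw))
```

One possible use is to apply `clean_headline` to each decoded string inside `generate_fake_news` before numbering the results. Cutting at the first line break is a deliberately simple heuristic and will not always yield a well-formed headline.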

## Step 4 : Run the Fake News Generator & Detector

In [5]:
print("\nFake News Generator & Detector\n")

while True:
    print("\nChoose an option:")
    print("1 - Generate Fake News Headlines")
    print("2 - Detect Fake/Real Headline")
    print("Type 'exit' to quit.")

    choice = input("Enter 1 / 2 / exit: ").strip().lower()

    if choice == '1':
        prompt = input("Enter a prompt: ").strip()
        print("\nGenerated Headlines:\n")
        print(generate_fake_news(prompt if prompt else "Breaking News:"))

    elif choice == '2':
        text = input("Enter a news headline: ").strip()
        print("\nDetection Result:\n")
        print(detect_fake_news(text))

    elif choice == 'exit':
        print("Exiting... Goodbye!")
        break
    else:
        print("Invalid input. Try again.")
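
The two menu options can also be chained, so that every generated headline is passed straight to the detector. The sketch below shows one way to do that. It assumes the classifier is a callable that takes a list of strings and returns one `{"label", "score"}` dict per string, which is how a Hugging Face text-classification pipeline such as `clf` behaves when given a list. The names `classify_headlines`, `LABEL_NAMES`, and `stub_classifier` are hypothetical; the stub only stands in for `clf` so the example runs without downloading a model.

```python
# Assumed mapping, mirroring detect_fake_news: LABEL_0 means fake, LABEL_1 means real.
LABEL_NAMES = {"LABEL_0": "FAKE", "LABEL_1": "REAL"}


def classify_headlines(headlines, classifier):
    """Return (headline, verdict, confidence %) for each headline.

    `classifier` is any callable that maps a list of strings to a list of
    {"label": ..., "score": ...} dicts, one per input string.
    """
    if not headlines:
        return []
    # Same 500-character cap that detect_fake_news applies.
    results = classifier([h[:500] for h in headlines])
    return [
        (h, LABEL_NAMES.get(r["label"], r["label"]), round(r["score"] * 100, 2))
        for h, r in zip(headlines, results)
    ]


def stub_classifier(texts):
    """Hypothetical stand-in for `clf`; flags any text containing '!' as fake."""
    return [
        {"label": "LABEL_0" if "!" in t else "LABEL_1", "score": 0.9}
        for t in texts
    ]


for headline, verdict, confidence in classify_headlines(
    ["Aliens land!", "Council approves budget"], stub_classifier
):
    print(f"{verdict} ({confidence}%): {headline}")
```

In the notebook, passing `clf` in place of `stub_classifier` would run the real model over the batch.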



Fake News Generator & Detector


Choose an option:
1 - Generate Fake News Headlines
2 - Detect Fake/Real Headline
Type 'exit' to quit.
Enter 1 / 2 / exit: 1
Enter a prompt: Sports

Generated Headlines:

1. Sports was very nice. I think I was great in that game too."

On Monday, at a press conference in New York City, Sanders
2. Sports's final game, a five-minute drive from Boston, where the Bruins won their last two games and will play host to the Montreal Canadiens on
3. Sports-News/AP Photo/John Minchillo The Associated Press 7/20 President Donald Trump speaks at the Democratic National Convention in Philadelphia, Tuesday
4. Sports.com) is committed to providing a safe, secure and efficient online sports league for all ages.

In the NFL, the National Football
5. Sports; or (4) any other program or person, (a) with respect to the program or person in respect of which the Director has the

Choose an option:
1 - Generate Fake News Headlines
2 - Detect Fake/Real Headline
Type 'exit' to 