### FACT-CHECKING PROMPT (**PROMPT ENGINEERING**)
#### 01. PROMPT_05

In [5]:
import json

def construct_fact_checking_prompt(enriched_entry):
    """
    Constructs a well-formatted prompt for fact-checking using DeepSeek LLM.
    
    Args:
        enriched_entry (dict): A dictionary containing article metadata, enriched content, and linguistic analysis.
    
    Returns:
        str: A formatted string prompt for the LLM.
    """
    
    # Extract relevant fields from enriched_entry
    title = enriched_entry.get("title", "Unknown Title")
    url = enriched_entry.get("url", "Unknown URL")
    published_date = enriched_entry.get("published_date", "Unknown Date")
    source_name = enriched_entry.get("source", "Unknown Source")
    author = enriched_entry.get("author", "Unknown Author")
    category = enriched_entry.get("category", "Unknown Category")
    summary = enriched_entry.get("enriched_content", "No summary available")[:500]  # Truncate if too long
    
    tfidf_outliers = json.loads(enriched_entry.get("TF-IDF Outliers", "[]"))  # Convert back to list
    tfidf_outliers_str = ", ".join(tfidf_outliers) if tfidf_outliers else "None"
    
    grammar_errors = enriched_entry.get("Grammar Errors", 0)
    sentence_count = enriched_entry.get("Sentence Count", 0)
    sentiment_polarity = enriched_entry.get("Sentiment Polarity", "Unknown")
    sentiment_subjectivity = enriched_entry.get("Sentiment Subjectivity", "Unknown")
    # Retrieved for completeness; not interpolated into the prompt template below.
    fact_checking_summary = enriched_entry.get("fact_checking_summary", "No fact-checking data available.")


    # Construct the prompt_05
    prompt = f"""
    You are a fact-checking AI analyzing the credibility of a news article. Below is the structured information:
    
    📰 **Article Information:**
    - **Title:** {title}
    - **URL:** {url}
    - **Published Date:** {published_date}
    - **Source:** {source_name}
    - **Author:** {author}
    - **Category:** {category}

    🔹 **Article Summary (Extracted via AI):**
    "{summary}"
    
    📊 **Linguistic Analysis:**
    - **TF-IDF Outlier Keywords (Unique/Unusual Words):** {tfidf_outliers_str}
    - **Grammar Issues:** {grammar_errors} errors
    - **Sentence Count:** {sentence_count}
    - **Sentiment Analysis:** 
        - **Polarity (Scale -1 to 1):** {sentiment_polarity}
        - **Subjectivity (Scale 0 to 1, higher = opinionated):** {sentiment_subjectivity}

    🎯 **Your Task:**
    1️⃣ **Assess the credibility of this article** based on the provided content and linguistic analysis.  
    2️⃣ **Use the TF-IDF outlier words** to determine if the article contains **unusual phrasing or misleading language.**  
    3️⃣ **Analyze sentiment:** Does the emotional tone suggest bias, fear-mongering, or objectivity?  
    4️⃣ **Evaluate readability & grammar:** Is the article professionally written, or does it contain errors typical of misinformation?  
    5️⃣ **Compare against reliable sources** if possible, to determine factual accuracy.  
    
    🏆 **Final Response Format:**
    - **Credibility Score:** (Scale 0-100, where 100 = totally credible, 0 = completely false)
    - **Verdict:** (Choose one: "True", "False", or "Misleading")
    - **Explanation:** (2-3 sentences summarizing why you assigned this rating)
    """
    
    return prompt


# Example Usage
example_entry = {
    "title": "Breaking News: AI Solves World Hunger",
    "url": "https://news.example.com/ai-hunger",
    "published_date": "2025-02-18",
    "source": "Example News",
    "author": "John Doe",
    "category": "Technology",
    "enriched_content": "AI has made significant advancements... (summary content here)...\n\nFact-Checking Data:\n- Verified by multiple sources",
    "TF-IDF Outliers": json.dumps(["AI", "breakthrough", "hunger crisis"]),
    "Grammar Errors": 2,
    "Sentence Count": 25,
    "Sentiment Polarity": 0.7,
    "Sentiment Subjectivity": 0.4,
    "fact_checking_summary": "Verified by multiple sources."
}

# Generate the prompt
prompt_text = construct_fact_checking_prompt(example_entry)
print(prompt_text)



    You are a fact-checking AI analyzing the credibility of a news article. Below is the structured information:
    
    📰 **Article Information:**
    - **Title:** Breaking News: AI Solves World Hunger
    - **URL:** https://news.example.com/ai-hunger
    - **Published Date:** 2025-02-18
    - **Source:** Example News
    - **Author:** John Doe
    - **Category:** Technology

    🔹 **Article Summary (Extracted via AI):**
    "AI has made significant advancements... (summary content here)...

Fact-Checking Data:
- Verified by multiple sources"
    
    📊 **Linguistic Analysis:**
    - **TF-IDF Outlier Keywords (Unique/Unusual Words):** AI, breakthrough, hunger crisis
    - **Grammar Issues:** 2 errors
    - **Sentence Count:** 25
    - **Sentiment Analysis:** 
        - **Polarity (Scale -1 to 1):** 0.7
        - **Subjectivity (Scale 0 to 1, higher = opinionated):** 0.4

    🎯 **Your Task:**
    1️⃣ **Assess the credibility of this article** based on the provided content and linguis

#### 02. PROMPT_06

In [6]:
import json

def construct_fact_checking_prompt(enriched_entry):
    """
    Constructs a well-formatted prompt for fact-checking using DeepSeek LLM.
    
    Args:
        enriched_entry (dict): A dictionary containing article metadata, enriched content, and linguistic analysis.
    
    Returns:
        str: A formatted string prompt for the LLM.
    """
    
    # Extract relevant fields from enriched_entry
    title = enriched_entry.get("title", "Unknown Title")
    url = enriched_entry.get("url", "Unknown URL")
    published_date = enriched_entry.get("published_date", "Unknown Date")
    source_name = enriched_entry.get("source", "Unknown Source")
    author = enriched_entry.get("author", "Unknown Author")
    category = enriched_entry.get("category", "Unknown Category")
    summary = enriched_entry.get("enriched_content", "No summary available")[:500]  # Truncate if too long
    
    tfidf_outliers = json.loads(enriched_entry.get("TF-IDF Outliers", "[]"))  # Convert back to list
    tfidf_outliers_str = ", ".join(tfidf_outliers) if tfidf_outliers else "None"
    
    grammar_errors = enriched_entry.get("Grammar Errors", 0)
    sentence_count = enriched_entry.get("Sentence Count", 0)
    sentiment_polarity = enriched_entry.get("Sentiment Polarity", "Unknown")
    sentiment_subjectivity = enriched_entry.get("Sentiment Subjectivity", "Unknown")
    # Retrieved for completeness; not interpolated into the prompt template below.
    fact_checking_summary = enriched_entry.get("fact_checking_summary", "No fact-checking data available.")


    
    # Construct prompt_06
    prompt = f"""
    You are a fact-checking AI analyzing the credibility of a news article. Below is the structured information:
    
    📰 **Article Information:**
    - **Title:** {title}
    - **URL:** {url}
    - **Published Date:** {published_date}
    - **Source:** {source_name}
    - **Author:** {author}
    - **Category:** {category}

    🔹 **Article Summary (Extracted via AI):**
    "{summary}"
    
    📊 **Linguistic Analysis:**
    - **TF-IDF Outlier Keywords (Unique/Unusual Words):** {tfidf_outliers_str}
    - **Grammar Issues:** {grammar_errors} errors
    - **Sentence Count:** {sentence_count}
    - **Sentiment Analysis:** 
        - **Polarity (Scale -1 to 1):** {sentiment_polarity}
        - **Subjectivity (Scale 0 to 1, higher = opinionated):** {sentiment_subjectivity}

 ## 🎯 Task: Evaluate Credibility and Truthfulness  
    Based on the provided information, conduct a critical analysis following these points:  

    1️⃣ **Credibility Assessment:**  
       - Is the source reliable?  
       - Does the author have legitimate or recognized credentials in the field?  
       - Does the article follow a logical and professional structure, or does it appear poorly written?  

    2️⃣ **Detection of Misleading or Sensationalist Language:**  
       - Analyze the unusual words detected by TF-IDF. Are these terms uncommon in serious journalism?  
       - Does the article use exaggerated language to manipulate the reader’s emotions?  

    3️⃣ **Bias and Subjectivity:**  
       - Does the content appear neutral, or does it attempt to influence the reader’s opinion?  
       - Are there phrases that exaggerate, alarm, or contain subjective judgments?  

    4️⃣ **Verification with Other Sources:**  
       - If a key fact is mentioned, is there verifiable evidence from reliable sources?  
       - Are there missing expert citations or solid references?  

    5️⃣ **Linguistic Quality:**  
       - Does the text contain unusual grammatical errors for legitimate news articles?  
       - Does it appear to be an automatically generated or poorly translated text?  

    ---  
    
    ## 📌 **Expected Response Format**  
    - **Credibility Score (0-100):** (100 = Fully credible, 0 = Completely false)  
    - **Final Verdict:** ("True", "False", or "Misleading")  
    - **Detailed Explanation (3-5 sentences):** Justify the evaluation based on findings.  

    ⚠️ **If the information is insufficient, indicate that more context or additional sources are needed.**  
    """
    
    return prompt

# Example Usage
example_entry = {
    "title": "Breaking News: AI Solves World Hunger",
    "url": "https://news.example.com/ai-hunger",
    "published_date": "2025-02-18",
    "source": "Example News",
    "author": "John Doe",
    "category": "Technology",
    "enriched_content": "AI has made significant advancements... (summary content here)...\n\nFact-Checking Data:\n- Verified by multiple sources",
    "TF-IDF Outliers": json.dumps(["AI", "breakthrough", "hunger crisis"]),
    "Grammar Errors": 2,
    "Sentence Count": 25,
    "Sentiment Polarity": 0.7,
    "Sentiment Subjectivity": 0.4,
    "fact_checking_summary": "Verified by multiple sources."
}

# Generate the prompt
prompt_text = construct_fact_checking_prompt(example_entry)
print(prompt_text)



    You are a fact-checking AI analyzing the credibility of a news article. Below is the structured information:
    
    📰 **Article Information:**
    - **Title:** Breaking News: AI Solves World Hunger
    - **URL:** https://news.example.com/ai-hunger
    - **Published Date:** 2025-02-18
    - **Source:** Example News
    - **Author:** John Doe
    - **Category:** Technology

    🔹 **Article Summary (Extracted via AI):**
    "AI has made significant advancements... (summary content here)...

Fact-Checking Data:
- Verified by multiple sources"
    
    📊 **Linguistic Analysis:**
    - **TF-IDF Outlier Keywords (Unique/Unusual Words):** AI, breakthrough, hunger crisis
    - **Grammar Issues:** 2 errors
    - **Sentence Count:** 25
    - **Sentiment Analysis:** 
        - **Polarity (Scale -1 to 1):** 0.7
        - **Subjectivity (Scale 0 to 1, higher = opinionated):** 0.4

 ## 🎯 Task: Evaluate Credibility and Truthfulness  
    Based on the provided information, conduct a critical ana
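
The printed output above shows that every template line carries the function body's indentation, while the interpolated multi-line summary falls back to column 0. One possible way to avoid this, shown here as a minimal sketch rather than the notebook's actual method, is to dedent the template once with `textwrap.dedent` and only then fill in the values. The template below is a shortened, hypothetical version of the prompt, and `build_dedented_prompt` is an illustrative name.

```python
import textwrap

# Hypothetical, shortened template; the field names mirror the notebook's prompt.
# Dedenting happens BEFORE interpolation, so multi-line values cannot
# disturb the common-margin detection.
PROMPT_TEMPLATE = textwrap.dedent("""\
    You are a fact-checking AI analyzing the credibility of a news article.

    ## Article Information:
    - **Title:** {title}

    ## Article Summary (Extracted via AI):
    "{summary}"
    """)


def build_dedented_prompt(title, summary):
    """Fill the already-dedented template with the article fields."""
    return PROMPT_TEMPLATE.format(title=title, summary=summary)


demo_prompt = build_dedented_prompt(
    "Breaking News: AI Solves World Hunger",
    "AI has made significant advancements...\n\nFact-Checking Data:\n- Verified by multiple sources",
)
print(demo_prompt)
```

Dedenting after interpolation would not help here, because the summary's unindented lines would reduce the common margin to zero.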

#### 03. PROMPT_07

In [7]:
import json

def construct_fact_checking_prompt(enriched_entry):
    """
    Constructs a well-formatted prompt for fact-checking using DeepSeek LLM.
    
    Args:
        enriched_entry (dict): A dictionary containing article metadata, enriched content, and linguistic analysis.
    
    Returns:
        str: A formatted string prompt for the LLM.
    """
    
    # Extract relevant fields from enriched_entry
    title = enriched_entry.get("title", "Unknown Title")
    url = enriched_entry.get("url", "Unknown URL")
    published_date = enriched_entry.get("published_date", "Unknown Date")
    source_name = enriched_entry.get("source", "Unknown Source")
    author = enriched_entry.get("author", "Unknown Author")
    category = enriched_entry.get("category", "Unknown Category")
    summary = enriched_entry.get("enriched_content", "No summary available")[:500]  # Truncate if too long
    
    tfidf_outliers = json.loads(enriched_entry.get("TF-IDF Outliers", "[]"))  # Convert back to list
    tfidf_outliers_str = ", ".join(tfidf_outliers) if tfidf_outliers else "None"
    
    grammar_errors = enriched_entry.get("Grammar Errors", 0)
    sentence_count = enriched_entry.get("Sentence Count", 0)
    sentiment_polarity = enriched_entry.get("Sentiment Polarity", "Unknown")
    sentiment_subjectivity = enriched_entry.get("Sentiment Subjectivity", "Unknown")
    # Retrieved for completeness; not interpolated into the prompt template below.
    fact_checking_summary = enriched_entry.get("fact_checking_summary", "No fact-checking data available.")


    
    # Construct prompt_07
    prompt = f"""
    You are a fact-checking AI analyzing the credibility of a news article. Below is the structured information:
    
    📰 **Article Information:**
    - **Title:** {title}
    - **URL:** {url}
    - **Published Date:** {published_date}
    - **Source:** {source_name}
    - **Author:** {author}
    - **Category:** {category}

    🔹 **Article Summary (Extracted via AI):**
    "{summary}"
    
    📊 **Linguistic Analysis:**
    - **TF-IDF Outlier Keywords (Unique/Unusual Words):** {tfidf_outliers_str}
    - **Grammar Issues:** {grammar_errors} errors
    - **Sentence Count:** {sentence_count}
    - **Sentiment Analysis:** 
        - **Polarity (Scale -1 to 1):** {sentiment_polarity}
        - **Subjectivity (Scale 0 to 1, higher = opinionated):** {sentiment_subjectivity}

 ## 🎯  Task: Evaluate Credibility and Truthfulness
    Based on the provided information, conduct a critical analysis following these detailed tasks:

    1. Source Credibility Analysis:
       - Assess the reliability of the source and its reputation in journalism.
       - Identify any potential conflicts of interest or biases within the source.
       - Determine if the author has known expertise or credentials in the subject matter.

    2. Content Verification:
       - Cross-check key claims with verifiable and authoritative sources.
       - Identify any exaggerations, misleading statements, or unverified claims.
       - Evaluate if the article presents evidence and factual backing for its assertions.

    3. Detection of Misleading or Sensationalist Language:
       - Analyze the tone and wording of the article to identify emotional manipulation.
       - Assess the presence of exaggerated or alarmist phrases that may indicate bias.
       - Determine whether the article includes balanced viewpoints or only presents one-sided perspectives.

    4. Bias and Subjectivity Evaluation:
       - Determine whether the article contains subjective language or ideological framing.
       - Identify any patterns of bias based on the article's structure, language, and omitted information.
       - Consider whether the article serves an agenda beyond objective reporting.

    5. External Source Comparison:
       - Check if other reputable news sources report on the same topic and whether their coverage aligns.
       - Identify inconsistencies in reporting across different sources.
       - Evaluate whether primary sources (official statements, research papers, etc.) support the claims made.

    6. Quality of Writing and Presentation:
       - Analyze the grammatical accuracy and coherence of the text.
       - Identify any unusual phrasing that may indicate automatic content generation or poor translation.
       - Determine if the article follows standard journalistic practices in terms of citations and formatting.

    ---
    
    Expected Response Format:
    - Credibility Score (0-100): (100 = Fully credible, 0 = Completely false)
    - Final Verdict: ("True", "False", or "Misleading")
    - Detailed Explanation (4-6 sentences): Justify the evaluation based on findings, referencing key factors from the analysis.

    If the information is insufficient to determine credibility, specify what additional context or sources would be necessary to reach a conclusive assessment.
    """
    
    return prompt

# Example Usage
example_entry = {
    "title": "Breaking News: AI Solves World Hunger",
    "url": "https://news.example.com/ai-hunger",
    "published_date": "2025-02-18",
    "source": "Example News",
    "author": "John Doe",
    "category": "Technology",
    "enriched_content": "AI has made significant advancements... (summary content here)...\n\nFact-Checking Data:\n- Verified by multiple sources",
    "TF-IDF Outliers": json.dumps(["AI", "breakthrough", "hunger crisis"]),
    "Grammar Errors": 2,
    "Sentence Count": 25,
    "Sentiment Polarity": 0.7,
    "Sentiment Subjectivity": 0.4,
    "fact_checking_summary": "Verified by multiple sources."
}

# Generate the prompt
prompt_text = construct_fact_checking_prompt(example_entry)
print(prompt_text)


    You are a fact-checking AI analyzing the credibility of a news article. Below is the structured information:
    
    📰 **Article Information:**
    - **Title:** Breaking News: AI Solves World Hunger
    - **URL:** https://news.example.com/ai-hunger
    - **Published Date:** 2025-02-18
    - **Source:** Example News
    - **Author:** John Doe
    - **Category:** Technology

    🔹 **Article Summary (Extracted via AI):**
    "AI has made significant advancements... (summary content here)...

Fact-Checking Data:
- Verified by multiple sources"
    
    📊 **Linguistic Analysis:**
    - **TF-IDF Outlier Keywords (Unique/Unusual Words):** AI, breakthrough, hunger crisis
    - **Grammar Issues:** 2 errors
    - **Sentence Count:** 25
    - **Sentiment Analysis:** 
        - **Polarity (Scale -1 to 1):** 0.7
        - **Subjectivity (Scale 0 to 1, higher = opinionated):** 0.4

 ## 🎯  Task: Evaluate Credibility and Truthfulness
    Based on the provided information, conduct a critical anal
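
The prompt builders call `json.loads` on the `TF-IDF Outliers` field, which works when the value is a JSON-encoded list, as in `example_entry`. If the upstream data ever supplies a plain list, a comma-separated string, or a doubly encoded string (all assumptions about inputs not shown in this notebook), a tolerant normalizer along these lines could be used first. `normalize_outliers` is a hypothetical helper, not part of the original code.

```python
import json


def normalize_outliers(value):
    """Return the outlier keywords as a list of strings, whatever shape they arrive in."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    if isinstance(value, str):
        try:
            decoded = json.loads(value)
        except json.JSONDecodeError:
            # Not JSON: treat the value as a comma-separated string.
            return [part.strip() for part in value.split(",") if part.strip()]
        if isinstance(decoded, str):
            # The value was JSON-encoded more than once; decode again.
            return normalize_outliers(decoded)
        if isinstance(decoded, list):
            return [str(item) for item in decoded]
    # Any other type (numbers, dicts, ...) is treated as "no keywords".
    return []


keywords = normalize_outliers(json.dumps(["AI", "breakthrough", "hunger crisis"]))
print(", ".join(keywords) if keywords else "None")
```

Inside the builder, the result would take the place of the direct `json.loads(...)` call before the keywords are joined into `tfidf_outliers_str`.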

#### 04. PROMPT_08 (FINAL)
I've enhanced the prompt by adding Chain-of-Thought reasoning and Few-Shot learning examples to guide the model in generating more accurate fact-checking responses.

In [8]:
import json

def construct_fact_checking_prompt(enriched_entry):
    """
    Constructs a well-formatted prompt for fact-checking using DeepSeek LLM.
    
    Args:
        enriched_entry (dict): A dictionary containing article metadata, enriched content, and linguistic analysis.
    
    Returns:
        str: A formatted string prompt for the LLM.
    """
    
    # Extract relevant fields from enriched_entry
    title = enriched_entry.get("title", "Unknown Title")
    url = enriched_entry.get("url", "Unknown URL")
    published_date = enriched_entry.get("published_date", "Unknown Date")
    source_name = enriched_entry.get("source", "Unknown Source")
    author = enriched_entry.get("author", "Unknown Author")
    category = enriched_entry.get("category", "Unknown Category")
    summary = enriched_entry.get("enriched_content", "No summary available")[:500]  # Truncate if too long
    
    tfidf_outliers = json.loads(enriched_entry.get("TF-IDF Outliers", "[]"))  # Convert back to list
    tfidf_outliers_str = ", ".join(tfidf_outliers) if tfidf_outliers else "None"
    
    grammar_errors = enriched_entry.get("Grammar Errors", 0)
    sentence_count = enriched_entry.get("Sentence Count", 0)
    sentiment_polarity = enriched_entry.get("Sentiment Polarity", "Unknown")
    sentiment_subjectivity = enriched_entry.get("Sentiment Subjectivity", "Unknown")
    # Retrieved for completeness; not interpolated into the prompt template below.
    fact_checking_summary = enriched_entry.get("fact_checking_summary", "No fact-checking data available.")


    
    # Construct prompt_08
    prompt = f"""
    You are a fact-checking AI analyzing the credibility of a news article. Follow the structured process below to evaluate its reliability step by step.
    
    ## Article Information:
    - **Title:** {title}
    - **URL:** {url}
    - **Published Date:** {published_date}
    - **Source:** {source_name}
    - **Author:** {author}
    - **Category:** {category}
    
    ## Article Summary (Extracted via AI):
    "{summary}"
    
    ## Linguistic Analysis:
    - **TF-IDF Outlier Keywords (Unusual Terms):** {tfidf_outliers_str}
    - **Grammar Issues:** {grammar_errors} errors
    - **Sentence Count:** {sentence_count}
    - **Sentiment Analysis:**
        - **Polarity (Scale -1 to 1):** {sentiment_polarity}
        - **Subjectivity (Scale 0 to 1, where 1 = highly opinionated):** {sentiment_subjectivity}
    
    ---
    
    ## Step-by-Step Fact-Checking Process:
    ### 1. Source Credibility Analysis
    - Is the source reputable, unbiased, and known for accurate reporting?
    - Does the author have expertise in this subject?
    - Are there any known biases or conflicts of interest?
    
    ### 2. Content Verification
    - Identify key factual claims in the article.
    - Cross-check these claims with authoritative sources (official reports, academic papers, reputable news organizations).
    - Are there missing citations or unverifiable claims?
    
    ### 3. Misleading or Sensationalist Language Detection
    - Does the article use emotionally charged or manipulative language?
    - Are there exaggerated statements, alarmist phrases, or one-sided narratives?
    - Compare the tone to neutral, fact-based reporting.
    
    ### 4. Bias and Subjectivity Evaluation
    - Assess if the article favors a particular viewpoint.
    - Does it omit relevant facts that would provide a balanced perspective?
    - Look for patterns of ideological framing.
    
    ### 5. External Source Comparison
    - Are other reputable sources reporting on the same topic?
    - If so, do their accounts align with this article?
    - Are primary sources (official documents, expert opinions) cited and accurately represented?
    
    ### 6. Quality of Writing and Presentation
    - Does the article follow proper journalistic standards?
    - Are there spelling, grammar, or formatting issues that suggest a lack of editorial oversight?
    - Is there evidence of automated content generation or poor translation?
    
    ---
    
    ## Expected Response Format (Use Chain-of-Thought Reasoning)
    
    ### Example Analysis (Few-Shot Learning Reference)
    **Article Title:** "Company XYZ Announces Revolutionary Cancer Treatment"
    
    **Step 1 - Source Credibility:**
    - The article is from "HealthNewsNow.com," which is not a well-known medical journal.
    - The author has no medical credentials.
    
    **Step 2 - Content Verification:**
    - The claim that "XYZ's drug cures cancer in 90% of cases" is not supported by any clinical trial data.
    - No references to peer-reviewed research were found.
    
    **Step 3 - Sensationalist Language:**
    - The article repeatedly uses phrases like "miracle cure" and "breakthrough," suggesting hype over factual accuracy.
    
    **Step 4 - Bias and Framing:**
    - The piece heavily promotes the company's stock, indicating a financial motive.
    
    **Step 5 - External Source Comparison:**
    - No major medical journals or regulatory agencies have published similar findings.
    
    **Step 6 - Writing Quality:**
    - Several grammar errors and an informal tone reduce credibility.
    
    **Final Verdict:** "Misleading"
    **Credibility Score:** 25/100
    **Explanation:** The article lacks verifiable sources, contains promotional bias, and exaggerates claims without evidence.
    
    ---
    
    Now, apply this step-by-step method to the given article and provide your evaluation.
    
    **Final Expected Output:**
    - **Credibility Score (0-100):** (100 = Fully credible, 0 = Completely false)
    - **Final Verdict:** ("True", "False", or "Misleading")
    - **Detailed Explanation (4-6 sentences)** referencing key findings from each fact-checking step.
    - **If insufficient information is available, specify what additional sources or context are needed.**
    """
    
    return prompt

# Example Usage
example_entry = {
    "title": "Breaking News: AI Solves World Hunger",
    "url": "https://news.example.com/ai-hunger",
    "published_date": "2025-02-18",
    "source": "Example News",
    "author": "John Doe",
    "category": "Technology",
    "enriched_content": "AI has made significant advancements... (summary content here)...\n\nFact-Checking Data:\n- Verified by multiple sources",
    "TF-IDF Outliers": json.dumps(["AI", "breakthrough", "hunger crisis"]),
    "Grammar Errors": 2,
    "Sentence Count": 25,
    "Sentiment Polarity": 0.7,
    "Sentiment Subjectivity": 0.4,
    "fact_checking_summary": "Verified by multiple sources."
}

# Generate the prompt
prompt_text = construct_fact_checking_prompt(example_entry)
print(prompt_text)


    You are a fact-checking AI analyzing the credibility of a news article. Follow the structured process below to evaluate its reliability step by step.
    
    ## Article Information:
    - **Title:** Breaking News: AI Solves World Hunger
    - **URL:** https://news.example.com/ai-hunger
    - **Published Date:** 2025-02-18
    - **Source:** Example News
    - **Author:** John Doe
    - **Category:** Technology
    
    ## Article Summary (Extracted via AI):
    "AI has made significant advancements... (summary content here)...

Fact-Checking Data:
- Verified by multiple sources"
    
    ## Linguistic Analysis:
    - **TF-IDF Outlier Keywords (Unusual Terms):** AI, breakthrough, hunger crisis
    - **Grammar Issues:** 2 errors
    - **Sentence Count:** 25
    - **Sentiment Analysis:**
        - **Polarity (Scale -1 to 1):** 0.7
        - **Subjectivity (Scale 0 to 1, where 1 = highly opinionated):** 0.4
    
    ---
    
    ## Step-by-Step Fact-Checking Process:
    ### 1. Source

### SEND PROMPT TO THE MODEL

#### 01. DeepSeek R1

In [9]:
import ollama

# Generate the fact-checking prompt
prompt_text = construct_fact_checking_prompt(example_entry)

# Send the prompt to DeepSeek LLM using Ollama
response = ollama.chat(
    model="deepseek-r1",
    messages=[{"role": "user", "content": prompt_text}]
)

# Print the response
print(response['message']['content'])


<think>
Okay, so I'm trying to figure out the credibility of this news article that says AI has solved world hunger. The title is pretty dramatic: "AI Solves World Hunger." Let's break down the process step by step.

First off, the source is "Example News," which seems a bit generic and not very reputable. I don't recognize it as a well-known news outlet, so that raises some red flags for me. Usually, high-quality news sources have names like The New York Times or BBC News. This one might be a fake or a lesser-known site.

The author is John Doe, who just did a quick check on his LinkedIn—it seems he's an AI expert but doesn't have any notable credentials in global hunger research either. His background might not add much credibility since it's focused on AI without expertise in food distribution systems.

Looking at the date of publication, it's from 2025-02-18. That's pretty recent, so I wonder if this is a newly released piece that hasn't been widely vetted yet. Sometimes breakthrou
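
The response above opens with a `<think>` block containing the model's reasoning, while the prompts ask for a credibility score and a verdict in a labelled format. A minimal sketch of how those two fields might be pulled out of the raw text is given below. It assumes the model reproduces the labels "Credibility Score" and "Verdict" roughly as requested, which is not guaranteed. The function names `strip_reasoning` and `parse_verdict` are hypothetical, and the sample reply is hand-written for illustration, not real model output.

```python
import re


def strip_reasoning(text):
    """Remove a complete <think>...</think> block, if one is present."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def parse_verdict(text):
    """Extract the score and verdict from a reply; missing fields become None."""
    answer = strip_reasoning(text)
    # Skip an echoed "(0-100)" scale so its digits are not mistaken for the score.
    score = re.search(r"Credibility Score(?:\s*\(0-100\))?[^\d]*(\d{1,3})", answer)
    verdict = re.search(
        r"Verdict[^A-Za-z]*(True|False|Misleading)", answer, flags=re.IGNORECASE
    )
    return {
        "credibility_score": int(score.group(1)) if score else None,
        "verdict": verdict.group(1).capitalize() if verdict else None,
    }


# Hand-written sample reply, for illustration only.
sample_reply = (
    "<think>Some reasoning about the source and claims.</think>\n"
    "- **Credibility Score (0-100):** 25\n"
    '- **Final Verdict:** "Misleading"\n'
    "- **Detailed Explanation:** The article lacks verifiable sources.\n"
)
parsed = parse_verdict(sample_reply)
print(parsed)
```

Returning `None` for a missing field keeps malformed replies visible instead of silently assigning a default score.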

### AUTOMATE PROCESS AND SAVE RESULTS

In [None]:
import json
import ollama
import chromadb  # Ensure chromadb is installed and properly set up
import os
from datetime import datetime

def fetch_entries_from_chromadb(collection_name="news_articles", max_entries=5):
    """
    Fetches up to `max_entries` enriched entries from ChromaDB.

    Args:
        collection_name (str): The ChromaDB collection to query.
        max_entries (int): The maximum number of entries to fetch.

    Returns:
        list: A list of retrieved entries.
    """
    client = chromadb.PersistentClient(path="./chroma_db")  # Update path as needed
    collection = client.get_collection(name=collection_name)
    
    # Retrieve up to max_entries entries. Note: get() takes no sort argument,
    # so these are not guaranteed to be the most recently published articles.
    results = collection.get(include=["metadatas", "documents"], limit=max_entries)

    # Convert each (metadata, document) pair into the format the prompt builder expects
    entries = []
    for metadata, document in zip(results["metadatas"], results["documents"]):
        entry = {
            "title": metadata.get("title", "Unknown Title"),
            "url": metadata.get("url", "Unknown URL"),
            "published_date": metadata.get("published_date", "Unknown Date"),
            "source": metadata.get("source", "Unknown Source"),
            "author": metadata.get("author", "Unknown Author"),
            "category": metadata.get("category", "Unknown Category"),
            "enriched_content": document,
            "TF-IDF Outliers": json.dumps(metadata.get("tfidf_outliers", [])),
            "Grammar Errors": metadata.get("grammar_errors", 0),
            "Sentence Count": metadata.get("sentence_count", 0),
            "Sentiment Polarity": metadata.get("sentiment_polarity", "Unknown"),
            "Sentiment Subjectivity": metadata.get("sentiment_subjectivity", "Unknown"),
            "fact_checking_summary": metadata.get("fact_checking_summary", "No fact-checking data available."),
        }
        entries.append(entry)
    
    return entries

def construct_fact_checking_prompt(entry):
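    """
    Constructs a short fact-checking prompt for the LLM.

    Note: this definition replaces the PROMPT_08 builder of the same name
    defined earlier in the notebook, so the batch run below uses this
    simplified prompt.

    Args:
        entry (dict): A dictionary containing article details.

    Returns:
        str: The formatted prompt.
    """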
    return f"Analyze the following news article:\n\nTitle: {entry['title']}\nSource: {entry['source']}\nPublished: {entry['published_date']}\nContent: {entry['enriched_content']}\n\nIs this news article reliable? Provide a fact-checking analysis."

def get_unique_filename(base_path="PROMPT_RESULTS"):
    """
    Generates a unique filename with the format: Prompt_Return_YYYY-MM-DD_N.json.

    Args:
        base_path (str): The directory where files will be stored.

    Returns:
        str: The full path of the unique filename.
    """
    # Ensure the directory exists
    os.makedirs(base_path, exist_ok=True)

    # Get current date
    date_str = datetime.now().strftime("%Y-%m-%d")
    file_number = 1

    while True:
        filename = f"Prompt_Return_{date_str}_{file_number}.json"
        full_path = os.path.join(base_path, filename)
        if not os.path.exists(full_path):
            return full_path
        file_number += 1

def analyze_multiple_articles(chroma_entries, model="deepseek-r1"):
    """
    Sends multiple articles from ChromaDB to the DeepSeek LLM for analysis and saves the results.

    Args:
        chroma_entries (list): List of enriched article entries from ChromaDB.
        model (str): The LLM model to use.

    Returns:
        list: A list of responses from the LLM.
    """
    responses = []
    
    for entry in chroma_entries:
        prompt_text = construct_fact_checking_prompt(entry)
        response = ollama.chat(
            model=model,
            messages=[{"role": "user", "content": prompt_text}]
        )
        response_content = response['message']['content']
        responses.append({
            "title": entry.get("title", "Unknown Title"),
            "url": entry.get("url", "Unknown URL"),
            "published_date": entry.get("published_date", "Unknown Date"),
            "source": entry.get("source", "Unknown Source"),
            "author": entry.get("author", "Unknown Author"),
            "fact_check_result": response_content
        })
    
    # Generate unique filename
    output_file = get_unique_filename()

    # Save responses to a JSON file
    with open(output_file, "w", encoding="utf-8") as file:
        json.dump(responses, file, indent=4, ensure_ascii=False)

    return responses

# ** Fetch limited entries from ChromaDB and send them for analysis **
chroma_entries = fetch_entries_from_chromadb(max_entries=5)
results = analyze_multiple_articles(chroma_entries)

# Output results
print(json.dumps(results, indent=4, ensure_ascii=False))




[
    {
        "title": "Presidents Are Often Judged By History Through The Lens Of Morality",
        "url": "https://newsone.com/5939034/presidents-are-judged-by-history-through-the-lens-of-morality/",
        "published_date": "2025-02-17T14:33:46+00:00",
        "source": "Unknown source",
        "author": "George R. Goethals, University of Richmond",
        "fact_check_result": "<think>\nAlright, so I need to figure out if the given news article is reliable by doing a fact-checking analysis. Let me start by reading through the article carefully and then verifying each piece of information against credible sources or existing knowledge.\n\nFirst, the article's title mentions that presidents are judged by history through the lens of morality. It says that Lincoln is highly rated because of his leadership during the Civil War. The source is an unknown entity from 2025, which already seems a bit odd since news articles from that future date aren't available yet. Maybe it's just pla


# Fundamental Differences Between the Four Prompts  

## Prompt 5  
- Focuses on **historical context** and **content consistency**.  
- Lacks **deep source credibility verification** and **external fact-checking mechanisms**.  
- Limited **bias and misleading language analysis**.  

## Prompt 6  
- Introduces **source credibility evaluation** and **basic bias detection**.  
- Begins to analyze **methodological clarity** and **fact-checking transparency**.  
- Lacks **structured Chain-of-Thought reasoning**.  

## Prompt 7  
- Expands **source credibility checks** with **institutional transparency and citation verification**.  
- Includes **misleading language detection**, **tone manipulation analysis**, and **media influence evaluation**.  
- Integrates **fact-checking cross-references** with **official reports and authoritative sources**.  
- Considers **paywalls, 404 errors, and content manipulation**.  
- Uses **Chain-of-Thought reasoning** but lacks structured examples.  

## Prompt 8  
- Fully structured **step-by-step fact-checking framework**.  
- Uses **Chain-of-Thought reasoning** and **Few-Shot learning examples**.  
- Evaluates **bias, misleading language, external validation, and source credibility**.  
- **Most comprehensive prompt**, guiding the model through a **detailed verification process**.  
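
As a rough illustration of how Few-Shot examples and Chain-of-Thought steps can be combined in one prompt, here is a minimal sketch. It is not the actual Prompt 8: the function `build_cot_few_shot_prompt`, the example data, and the step wording are all assumptions made for illustration.

```python
def build_cot_few_shot_prompt(article_text, examples, steps):
    """Assemble a prompt from worked examples (few-shot) and numbered
    reasoning steps (chain-of-thought). Hypothetical sketch only."""
    example_blocks = []
    for i, ex in enumerate(examples, start=1):
        example_blocks.append(
            f"Example {i}:\n"
            f"Article: {ex['article']}\n"
            f"Reasoning: {ex['reasoning']}\n"
            f"Verdict: {ex['verdict']}"
        )
    numbered_steps = "\n".join(f"{n}. {s}" for n, s in enumerate(steps, start=1))
    return (
        "You are a fact-checking AI.\n\n"
        + "\n\n".join(example_blocks)
        + "\n\nNow analyze the following article step by step:\n"
        + numbered_steps
        + f"\n\nArticle: {article_text}\n"
        + "Reasoning:"
    )

# Placeholder data, invented for this sketch
examples = [
    {
        "article": "A blog claims a city banned all cars overnight.",
        "reasoning": "No official announcement or reputable outlet confirms it.",
        "verdict": "Likely false",
    }
]
steps = [
    "Evaluate source credibility",
    "Check for bias and misleading language",
    "Cross-reference claims with authoritative sources",
    "State a verdict",
]
prompt = build_cot_few_shot_prompt("Sample article text.", examples, steps)
```

The worked example shows the model the expected output shape, while the numbered steps fix the order of the checks.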

---

# Comparative Summary of Results  

## Presidential Rankings Article  
- **Prompt 5**: Basic historical context and content analysis.  
- **Prompt 6**: Adds bias detection and methodological scrutiny.  
- **Prompt 7**: Enhances source transparency checks.  
- **Prompt 8**: Uses structured fact-checking, citing **historical data validation**.  
- **Best Performance**: **Prompt 8**, for its structured evaluation and Chain-of-Thought reasoning.  

## Prince and Princess of Wales' Mustique Villa  
- **Prompt 5**: Identifies lack of reputable sources.  
- **Prompt 6**: Adds **404 error analysis** and considers content removal.  
- **Prompt 7**: Investigates **possible content manipulation**.  
- **Prompt 8**: **Examines digital forensic aspects** and **evaluates source credibility**.  
- **Best Performance**: **Prompt 8**, for comprehensive **source validation and misinformation detection**.  

## Scottish Museum Government Funding  
- **Prompt 5**: Checks general **government funding credibility**.  
- **Prompt 6**: Adds **external source validation**.  
- **Prompt 7**: Evaluates **paywalls and content accessibility**.  
- **Prompt 8**: Expands into **cross-referencing with institutional reports**.  
- **Best Performance**: **Prompt 8**, for detailed **fact-checking and external validation**.  

## The Lathums Album Release  
- **Prompt 5**: Checks **content consistency** but lacks source verification.  
- **Prompt 6**: Introduces **fact-checking against past media coverage**.  
- **Prompt 7**: Examines **marketing influences on media narratives**.  
- **Prompt 8**: **Cross-references with promotional strategies and industry standards**.  
- **Best Performance**: **Prompt 8**, for evaluating **industry patterns and media accuracy**.  

## Trump and FAA Air Traffic Control Firings  
- **Prompt 5**: Identifies **timeline inconsistencies**.  
- **Prompt 6**: Adds **basic fact-checking against historical events**.  
- **Prompt 7**: Expands **government accountability verification**.  
- **Prompt 8**: **Uses official aviation reports and cross-referencing techniques**.  
- **Best Performance**: **Prompt 8**, for in-depth **fact-checking and misinformation risk analysis**.  
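
A side-by-side comparison like the one above is easier to reproduce when every prompt is run against the same articles in one pass. The sketch below shows one possible way to do that. The names `compare_prompts`, `prompt_builders`, and `ask` are hypothetical; in the notebook, `ask` could wrap the `ollama.chat` call, but here it is a stub so the example runs without a model.

```python
def compare_prompts(entries, prompt_builders, ask):
    """Run every prompt builder on every entry and collect the responses.

    entries: list of article dicts
    prompt_builders: dict mapping a prompt name to a function(entry) -> str
    ask: function(prompt_text) -> str that queries the model (assumed interface)
    """
    results = {}
    for entry in entries:
        title = entry.get("title", "Unknown Title")
        results[title] = {
            name: ask(build(entry)) for name, build in prompt_builders.items()
        }
    return results

# Stub data and stub model call, invented for this sketch
entries = [{"title": "Article A"}, {"title": "Article B"}]
prompt_builders = {
    "prompt_5": lambda e: f"Basic check: {e['title']}",
    "prompt_8": lambda e: f"Step-by-step check: {e['title']}",
}
fake_ask = lambda prompt_text: f"response to [{prompt_text}]"

comparison = compare_prompts(entries, prompt_builders, fake_ask)
```

Passing the model call in as a parameter keeps the comparison logic testable and makes it straightforward to swap models.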

---

# Final Verdict: Best Prompt for Fake News Detection  
**Prompt 8** (the updated prompt used in File 8) is the most effective because it:  
- **Combines Chain-of-Thought reasoning with Few-Shot learning examples**.  
- **Includes a structured fact-checking process for source credibility, bias, and misleading language detection**.  
- **Incorporates external verification techniques, including cross-referencing authoritative sources**.  
- **Considers digital accessibility barriers, such as paywalls, content removals, and misinformation tactics**.  

Of the four prompts compared, this one is the **best fit for an LLM-based fake news detection system**, because it guides the model toward **accurate, detailed, and structured verification**.  




