# An AI Agent for Detecting and Eliminating Logical Fallacies in News Articles

## Introduction

### What is an Agent?
An agent, in the context of artificial intelligence (AI), is a system or program that performs tasks autonomously on behalf of a user or another system. IBM's widely cited definition adds how it does so: "An artificial intelligence (AI) agent refers to a system or program that is capable of autonomously performing tasks on behalf of a user or another system by designing its workflow and utilizing available tools." Agents suit tasks that are too complex or too tedious for humans to perform manually, which makes them useful in applications such as data analysis, customer service, and content moderation.

### Uses of Agents
Agents can be employed in a multitude of scenarios, including but not limited to:
- Automating repetitive tasks
- Analyzing large datasets
- Providing customer support through chatbots
- Moderating content to ensure compliance with guidelines
- Enhancing decision-making processes by providing insights and recommendations

## Project Introduction

### Objective
In this project, we aim to develop an agent that can identify and remove logical fallacies from news articles. Logical fallacies are errors in reasoning that undermine the logic of an argument. By eliminating these fallacies, we can improve the quality and reliability of news content.

### Logic and Workflow
The agent will follow a structured workflow to achieve its objective:
1. **Input Data**: The agent will take news articles as input.
2. **Fallacy Detection**: Using a predefined list of logical fallacies and their descriptions (stored in `fallacies_df`), the agent will scan the articles to identify any instances of these fallacies.
3. **Fallacy Removal**: Once identified, the agent will either flag or remove the fallacies from the text.
4. **Output Data**: The cleaned articles, free from logical fallacies, will be provided as output.

### Applicability
The logic and workflow used in this project can be easily adapted to create similar agentic tools for other applications, such as:
- Detecting and correcting grammatical errors
- Identifying and removing biased language
- Ensuring compliance with editorial guidelines

By leveraging the power of AI agents, we can automate complex tasks and enhance the quality of content across various domains.

### Methodology overview for this high-level agent to check the news for logical fallacies:
1. Pulls relevant news articles from the Serper API using a user-supplied search topic

2. Chain 1: Calls a model to summarize the article, with the pre-defined logical fallacies from a pandas dataframe supplied as context

3. Chain 2: Calls the model again to review the summary against the same fallacy definitions and report the most impactful fallacy found

Remember:
The agent doesn't have to analyze every logical fallacy, or even the complete news article. A successful use case for this agent is one where anything it catches is important for the user to be aware of (and to verify for accuracy).
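
The three steps above can be sketched as plain Python with the search, fetch, and model calls passed in as functions. This is a minimal sketch under stated assumptions, not the notebook's implementation: `run_pipeline`, `fake_search`, `fake_fetch`, and `fake_llm` are hypothetical names, and the stand-ins replace the Serper and model calls so the flow can be exercised offline.

```python
from typing import Callable


def run_pipeline(
    query: str,
    search: Callable[[str], list],
    fetch: Callable[[str], str],
    llm: Callable[[str], str],
    fallacies_text: str,
) -> dict:
    """Search -> summarize (chain 1) -> review the summary (chain 2)."""
    hits = search(query)
    if not hits:
        raise ValueError(f"no articles found for {query!r}")
    first = hits[0]
    article = fetch(first["link"])

    # Chain 1: summarize the article, with the fallacy definitions as context.
    summary = llm(f"Summarize.\nFallacies: {fallacies_text}\nArticle: {article}")
    # Chain 2: review the summary against the same definitions.
    analysis = llm(f"Review.\nFallacies: {fallacies_text}\nSummary: {summary}")

    return {
        "title": first["title"],
        "url": first["link"],
        "summary": summary,
        "analysis": analysis,
    }


# Hypothetical offline stand-ins for the search API, page loader, and model.
def fake_search(query):
    return [{"title": "Example", "link": "https://example.com/a"}]


def fake_fetch(url):
    return "Example article text."


def fake_llm(prompt):
    return "SUMMARY" if prompt.startswith("Summarize") else "ANALYSIS"


result = run_pipeline(
    "global trade", fake_search, fake_fetch, fake_llm, "False Causality: ..."
)
print(result["analysis"])
```

Passing the external calls in as arguments keeps the two-stage flow testable without API keys; in the notebook those roles are played by the Serper wrapper, `WebBaseLoader`, and the two chains.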


## Setup

1. We need the following API keys and packages
     * An [OpenAI API key](https://www.datacamp.com/tutorial/openai-o1-api)
     * A [Serper API key](https://serper.dev/)
     * The LangChain modules imported below (`langchain`, `langchain_openai`, `langchain_community`)



## Imports

In [17]:
# Ingesting packages this tutorial will utilize
import os
import pandas as pd 

from langchain.chains import LLMChain
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.utilities import GoogleSerperAPIWrapper

# Removing warnings for demo purposes
import warnings
warnings.filterwarnings('ignore')

## Ingest API keys and Load our csv file as a Dataset

In [18]:
# Ingesting API keys
openai_api_key = os.environ["OPENAI_API_KEY"]
serper_api_key = os.environ["SERPER_API_KEY"]
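
Indexing `os.environ` directly raises a `KeyError` when a variable is unset. A small pre-flight check can report every missing key at once. This is a minimal sketch; `missing_keys` and `REQUIRED_KEYS` are hypothetical names introduced here for illustration.

```python
import os

# The two variables read by the cell above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SERPER_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not env.get(name)]


missing = missing_keys(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```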

In [19]:
# Importing CSV of fallacies as pandas dataframe
fallacies_df = pd.read_csv('fallacies.csv')
fallacies_df

| Unnamed: 0 | Fallacy | Description |
|---|---|---|
| 0 | Adhominem | Attacks on the character or personal traits of... |
| 1 | Adpopulum | The fallacy that something must be true or cor... |
| 2 | Appeal to Emotion | An attempt to win support for an argument by e... |
| 3 | Fallacy of Extension | Making broad, sweeping generalizations and ext... |
| 4 | Intentional Fallacy | Falsely supporting a conclusion by claiming to... |
| 5 | False Causality | Jumping to conclusions about causation between... |
| 6 | False Dilemma | Presenting only two possible options or sides ... |
| 7 | Hasty Generalization | Making a broad inference or generalization to ... |
| 8 | Illogical Arrangement | Constructing an argument in a flawed, illogica... |
| 9 | Fallacy of Credibility | Dismissing or attacking the credibility of the... |
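
When a dataframe is interpolated into a prompt string, pandas uses its default text display, which can shorten long cell values with `...` as in the output above. One way to make sure the model sees each full description is to render the rows as text first. This is a minimal sketch that assumes the `Fallacy` and `Description` columns shown above; `format_fallacies` is a hypothetical helper, and the sample rows are placeholders, not the contents of `fallacies.csv`.

```python
import csv
import io


def format_fallacies(rows):
    """Render rows with 'Fallacy' and 'Description' keys, one per line."""
    return "\n".join(f"- {row['Fallacy']}: {row['Description']}" for row in rows)


# Illustrative stand-in for fallacies.csv; the descriptions are placeholders.
sample_csv = io.StringIO(
    "Fallacy,Description\n"
    "False Dilemma,Placeholder description one\n"
    "Hasty Generalization,Placeholder description two\n"
)
rows = list(csv.DictReader(sample_csv))
fallacies_text = format_fallacies(rows)
print(fallacies_text)
```

With the notebook's dataframe, `fallacies_df.to_dict("records")` should give row dictionaries of the same shape, and the resulting string could be passed to the chains in place of the dataframe.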


## Set up the model

We define the `ChatOpenAI` model parameters that LangChain will use for each call

In [20]:
# Define the model
llm = ChatOpenAI(
    temperature=0,
    model='gpt-4.1-mini', 
    max_tokens=16000,
    openai_api_key=openai_api_key
)

## Create the first prompt template

This template contains the instructions for the first model call, which summarizes the news article
* Use `content`, which is taken from the webpage using WebBaseLoader in LangChain in the code below
* Use `fallacies_df` as imported earlier
* The output is stored in a variable called `summary`

In [21]:
# first prompt template
template1 = """You are a communications expert. Given this news article, summarize it in five very clear sentences and be accurate. Do not make things up. Check to make sure you are correct. Think step by step.

Fallacies to check: {fallacies_df} 

Article: {content}

Communications expert:
Summary:
"""

## Assign the first prompt template to a chain for LangChain

Connecting the prompt template and model parameters with one defined chain 

In [22]:
# The first chain
chain1 = LLMChain(
    llm = llm,
    prompt = PromptTemplate(
        input_variables=["content", "fallacies_df"],
        template = template1
    )
)
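
To preview the text a chain sends to the model, the placeholders can be filled with plain `str.format`, since `PromptTemplate` uses the same `{name}` placeholder syntax by default. The sketch below uses an abbreviated copy of the first template; `template1_demo` and the sample values are illustrative, not the notebook's data.

```python
# Abbreviated copy of the first template, for illustration only.
template1_demo = """Summarize this article in five sentences.

Fallacies to check: {fallacies_df}

Article: {content}

Summary:
"""

# Fill both placeholders the same way the chain does at call time.
rendered = template1_demo.format(
    fallacies_df="- False Dilemma: (description)",
    content="Example article text.",
)
print(rendered)
```

Printing the rendered prompt before spending tokens is a cheap way to confirm that both the article text and the fallacy list arrive in the expected form.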

## Create the second prompt template

Defines the second model call, which acts as a scholastic reviewer of the first model call's output. The two calls need to be separated so that each one has the entire context window available for a single goal; otherwise the two actions would bleed together and could lose information when they hit the limits of the context window.
* Use `fallacies_df` a second time so that the model uses the same definitions as the first chain and not any fallacies in its training data
* Use `summary` from the first prompt template output that was used in the first chain

In [23]:
# The second prompt template
template2 = """
You are an ethics professor analyzing this article summary. Review it using the fallacy definitions again. Be succinct and easy to read, but intelligently draw upon all ethics rules. Provide:
1. The single most impactful fallacy found out of fallacies in the summary
2. Why this fallacy might mislead readers
3. One possible alternative interpretation of why the fallacy could have been included, as a counterfactual and counterpoint to finding the key fallacy.

Article summary: {summary}
Fallacies to consider: {fallacies_df}

Professor:
"""

## Assign the second prompt template to a second chain

This is the same methodology as the first chain

In [24]:
# The second chain
chain2 = LLMChain(
    llm = llm,
    prompt = PromptTemplate(
        input_variables=["summary", "fallacies_df"],
        template = template2
    )
)
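
A chain's `input_variables` should name exactly the placeholders in its template, which for the second template are `summary` and `fallacies_df`. A quick check can be sketched with the standard library; `placeholders` and `check_variables` are hypothetical helpers, and `demo` is a shortened stand-in for the second template.

```python
from string import Formatter


def placeholders(template):
    """Return the set of {name} fields used in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def check_variables(template, declared):
    """Return (missing, unused) names, each as a sorted list.

    missing: placeholders in the template that were not declared.
    unused: declared names that never appear in the template.
    """
    used = placeholders(template)
    return sorted(used - set(declared)), sorted(set(declared) - used)


demo = "Article summary: {summary}\nFallacies to consider: {fallacies_df}\n"

# A mismatched declaration is reported on both sides.
print(check_variables(demo, ["content", "fallacies_df"]))
# → (['summary'], ['content'])
```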

## Define parameters for the news search query

This uses the Google Serper API to return news articles from the past month (`qdr:m1`)

In [25]:
# Parameters for google serper API
search = GoogleSerperAPIWrapper(
    type="news",
    tbs="qdr:m1",  
    serper_api_key=serper_api_key
)

## Get the article and load article content

For this project, we define a search topic and filter on a domain that currently has no paywall. The domain chosen here is a currently relevant one to check for logical fallacies, but it can be any domain you choose.

The code takes the first result, which is the most recent one given this API's default reverse-chronological order. For this tutorial only, to stay within the model's rate limits, it keeps just the first 3,000 characters of the page content; that filter can be edited or removed as needed.

In [26]:
# Define search topic
search_topic = "global trade"

# Get search results and load with WebBaseLoader and loader.load
search_results = search.results(f"site:whitehouse.gov {search_topic}")
article_url = search_results['news'][0]['link']
article_title = search_results['news'][0]['title']
loader = WebBaseLoader(article_url)
article_text = ' '.join(loader.load()[0].page_content[:3000].split())
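
Two details of the cell above are worth guarding against: indexing `['news'][0]` fails when the search returns nothing, and slicing to 3,000 characters before collapsing whitespace can leave a shorter string that ends mid-word. The helpers below are a minimal sketch of one alternative. `first_news_item` and `clean_and_truncate` are hypothetical names, and the result shape (a `news` list of dictionaries with `link` and `title`) is taken from the cell above.

```python
def first_news_item(results):
    """Return (url, title) of the first news hit, or raise if there is none."""
    news = results.get("news") or []
    if not news:
        raise ValueError("search returned no news results")
    return news[0]["link"], news[0]["title"]


def clean_and_truncate(text, limit=3000):
    """Collapse whitespace first, then cut at a word boundary within `limit`."""
    cleaned = " ".join(text.split())
    if len(cleaned) <= limit:
        return cleaned
    cut = cleaned[:limit]
    # Drop a trailing partial word if the cut landed mid-word.
    if cleaned[limit] != " " and " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip()


print(clean_and_truncate("hello   world\n\nagain", limit=8))
```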

## Analyze the article calling the 2 chains

Each chain should be called with the appropriate input and outputs.

In [27]:
# Analyze article by running both chains
summary = chain1.run(content=article_text, fallacies_df=fallacies_df)
analysis = chain2.run(summary=summary, fallacies_df=fallacies_df)

## Output

Print the result variables in a structured format: `article_title`, `article_url`, `analysis`

In [28]:
# Print results
print("Title:", article_title)
print("URL:", article_url)

print("\nAnalysis Results:")
print(analysis)

Title: Fact Sheet: President Donald J. Trump Increases Section 232 Tariffs on Steel and Aluminum
URL: https://www.whitehouse.gov/fact-sheets/2025/06/fact-sheet-president-donald-j-trump-increases-section-232-tariffs-on-steel-and-aluminum/

Analysis Results:
1. **Most impactful fallacy:**  
**False Causality** (Fallacy #5) — The summary implies that increasing tariffs directly protects national security by addressing unfair trade practices and excess global production, suggesting a straightforward cause-effect relationship without sufficient evidence.

2. **Why this fallacy might mislead readers:**  
Readers may be led to believe that the tariff increase is the definitive and sole solution to national security threats posed by steel and aluminum imports. This oversimplifies complex economic and security dynamics, potentially obscuring other factors or consequences, such as retaliation, increased costs for consumers, or alternative security measures.

3. **Alternative interpretation of wh
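
Because the second prompt asks for three numbered items, the analysis text can be split into sections for separate display or storage. This is a minimal sketch that assumes each item starts a line with its number followed by a period, and that no other line in the response begins that way. `split_numbered_sections` is a hypothetical helper, and `sample` is a shortened illustration rather than actual model output.

```python
import re


def split_numbered_sections(text):
    """Map each leading item number (1, 2, 3, ...) to the text of that item."""
    parts = re.split(r"(?m)^(\d+)\.\s+", text)
    # parts looks like [preamble, "1", body1, "2", body2, ...]
    return {int(num): body.strip() for num, body in zip(parts[1::2], parts[2::2])}


sample = (
    "1. **Most impactful fallacy:**\nFalse Causality\n\n"
    "2. **Why this fallacy might mislead readers:**\nIt oversimplifies.\n"
)
sections = split_numbered_sections(sample)
print(sections[1])
```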

## Conclusion

In this project, we aimed to analyze an article by identifying logical fallacies within its content. The process involved several key steps:

1. **Data Preparation**: We started with a dataframe `fallacies_df` that contains a list of common logical fallacies and their descriptions. This dataframe served as a reference for identifying fallacies in the article.

2. **Article Analysis**: We used two chains to process the article:
   - **Chain 1**: This chain summarized the article content and identified potential fallacies based on the `fallacies_df`.
   - **Chain 2**: This chain took the summary from Chain 1 and performed a more detailed analysis, cross-referencing the fallacies identified in the summary with the descriptions in `fallacies_df`.

3. **Result Presentation**: Finally, we printed the title and URL of the article along with the detailed analysis results.

### Insights

- **Fallacy Identification**: In the sample run above, the agent flagged False Causality as the most impactful fallacy in the article summary. Output like this can be particularly useful for critical reading and for improving the quality of arguments in writing.
- **Automation Potential**: The use of chains to automate the summarization and analysis process demonstrates the potential for scalable and efficient content analysis.
- **Data Utilization**: Leveraging a structured dataframe of fallacies allowed for systematic and consistent identification of logical errors, highlighting the importance of well-organized reference data in analytical tasks.

### Detailed Results

Here are the results we obtained from the analysis:

- **Title**: The title of the article analyzed.
- **URL**: The URL where the article can be accessed.
- **Analysis**: The single most impactful fallacy identified in the article summary, why it might mislead readers, and one alternative interpretation of why it was included.

The analysis results highlight where the article's reasoning is most open to challenge. This feedback can be instrumental in refining arguments and ensuring the integrity of the content.