# Credibility Sub-Agent Test Notebook

This notebook tests the credibility sub-agent, which is responsible for:
- **Claim Verification**: Checking if claims are supported by reliable evidence
- **Source Assessment**: Evaluating the trustworthiness of sources used
- **Consistency Checking**: Looking for contradictions or inconsistencies
- **Bias Identification**: Noting potential biases in sources or analysis

The main agent uses this sub-agent to:
- Verify research findings from the web research agent
- Assess source quality and reliability
- Identify potential misinformation or biased reporting
- Rate overall trustworthiness of gathered information

## Prerequisites

This notebook connects to the LangGraph server, which provides:
- **Cloud-hosted store** - Persistent memory across sessions (same as LangSmith Studio)
- **Cloud-hosted checkpointer** - Conversation state persistence

### Setup Steps

1. **Start the LangGraph server** (from the `deep-agent/` directory):
   ```bash
   cd deep-agent
   langgraph dev
   ```

2. **Wait for the server** to be ready at `http://localhost:2024`

3. **Run this notebook** - it connects to the same agent as LangSmith Studio
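
Step 2 can be automated with a small polling helper. This is a minimal sketch, not part of the original notebook: the `/ok` health-check path, the timeout values, and the names `server_is_up` and `wait_for_server` are assumptions for illustration, so adjust them to whatever your server actually exposes.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def server_is_up(url: str) -> bool:
    """Return True if an HTTP GET to `url` returns a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as "not ready yet".
        return False


def wait_for_server(
    url: str = "http://localhost:2024/ok",  # assumed health-check path
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    probe: Callable[[str], bool] = server_is_up,
) -> bool:
    """Poll `probe(url)` until it succeeds or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_server()` before the connection cell would return `False` instead of hanging if `langgraph dev` was never started. The `probe` parameter exists only so the loop can be exercised without a running server.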

### Why This Architecture?

When you run `langgraph dev`, LangGraph connects to LangSmith's cloud infrastructure.
This means:
- Agent state persists across sessions
- No local PostgreSQL database required
- Assistant ID is auto-discovered so this notebook works for anyone
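
Auto-discovery in this notebook takes the first assistant the server returns. If several graphs are registered, a slightly stricter selection may be useful. The sketch below assumes each assistant record carries a `graph_id` key; the helper name `pick_assistant` and the graph name `"credibility"` in the usage note are hypothetical.

```python
from typing import Optional


def pick_assistant(assistants: list[dict], graph_id: Optional[str] = None) -> dict:
    """Pick an assistant record, optionally by graph ID (assumed field name)."""
    if not assistants:
        raise RuntimeError("No assistants registered on the server.")
    if graph_id is None:
        # Same behaviour as the notebook: fall back to the first assistant.
        return assistants[0]
    for assistant in assistants:
        if assistant.get("graph_id") == graph_id:
            return assistant
    known = ", ".join(sorted(str(a.get("graph_id")) for a in assistants))
    raise LookupError(f"No assistant for graph {graph_id!r}; known graphs: {known}")
```

For example, `pick_assistant(assistants, "credibility")["assistant_id"]` would select the assistant bound to a graph of that name, and raise a `LookupError` listing the available graphs if none matches.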


## Setup

In [None]:
# Recreate the scratchpad folders from scratch (any existing contents are deleted)
from pathlib import Path
import shutil

scratchpad = Path("../scratchpad")
for folder in ["data", "images", "notes", "plots", "reports"]:
    path = scratchpad / folder
    if path.exists():
        shutil.rmtree(path)
    path.mkdir(parents=True)
    
print("Scratchpad folders ready (data, images, notes, plots, reports)")

In [None]:

from langgraph_sdk import get_sync_client
from dotenv import load_dotenv

load_dotenv()

# Connect to local LangGraph server (must have `langgraph dev` running)
LANGGRAPH_URL = "http://localhost:2024"

try:
    client = get_sync_client(url=LANGGRAPH_URL)
    assistants = client.assistants.search(limit=10)
    if not assistants:
        raise RuntimeError("No assistants registered. Run `langgraph dev` from the deep-agent/ directory and try again.")

    ASSISTANT_ID = assistants[0]["assistant_id"]

    print(f"Connected to LangGraph server at {LANGGRAPH_URL}")
    print("Available assistants:")
    for a in assistants:
        marker = " (selected)" if a["assistant_id"] == ASSISTANT_ID else ""
        print(f"  - {a['assistant_id']}: {a.get('name', 'unnamed')}{marker}")
    print(f"Using assistant_id: {ASSISTANT_ID}")
except Exception as e:
    print(f"Could not connect to LangGraph server at {LANGGRAPH_URL}")
    print(f"Error: {e}")
    print("Make sure to run 'langgraph dev' from the deep-agent/ directory first.")
    raise SystemExit(1)


In [None]:
# Assistant ID selected during connection above
print(f'Active assistant_id: {ASSISTANT_ID}')


In [None]:
import json
import time
from typing import Optional

from IPython.display import display, Markdown


def truncate(text, limit=2000):
    """Shorten long output for display; non-string content is stringified first."""
    text = text if isinstance(text, str) else str(text)
    return text[:limit] + "\n..." if len(text) > limit else text


def test_credibility_agent(message: str, thread_id: Optional[str] = None):
    """Run the credibility agent via LangGraph SDK and display all intermediate steps."""
    thread_id = thread_id or f"test-{int(time.time())}"

    # Create the thread if it does not exist yet
    try:
        client.threads.get(thread_id)
    except Exception:
        client.threads.create(thread_id=thread_id)

    display(Markdown(f"## Task\n```\n{message.strip()}\n```\n---"))

    final_response = None

    # Stream per-node state updates from the agent graph
    for chunk in client.runs.stream(
        thread_id=thread_id,
        assistant_id=ASSISTANT_ID,
        input={"messages": [{"role": "user", "content": message}]},
        stream_mode="updates",
    ):
        if chunk.event == "updates":
            for node_name, node_output in chunk.data.items():
                # Skip updates that carry no dict payload (e.g. empty node output)
                if not isinstance(node_output, dict):
                    continue
                messages = node_output.get("messages", [])
                for msg in messages:
                    msg_type = msg.get("type")

                    # Tool calls: render the arguments as real JSON
                    if msg_type == "ai" and msg.get("tool_calls"):
                        for tc in msg["tool_calls"]:
                            name = tc.get("name")
                            args = tc.get("args", {})
                            display(Markdown(f"### Tool Call: `{name}`\n```json\n{truncate(json.dumps(args, indent=2, default=str), 500)}\n```"))

                    # Tool responses
                    elif msg_type == "tool":
                        content = msg.get("content", "")
                        display(Markdown(f"### Tool Response\n```\n{truncate(content)}\n```\n---"))

                    # Final AI response (an AI message with content and no tool calls)
                    elif msg_type == "ai" and msg.get("content") and not msg.get("tool_calls"):
                        final_response = msg["content"]
                        display(Markdown(f"## Response\n{final_response}"))

    return final_response
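
The helper above builds readable thread IDs such as `test-<timestamp>`. Some LangGraph server setups may only accept UUID-formatted thread IDs; that is an assumption worth checking against your server version. If yours does, one option is to derive a stable UUID from the readable label. The function name `thread_id_for` and the `credibility-notebook/` prefix are hypothetical.

```python
import uuid


def thread_id_for(label: str) -> str:
    """Map a readable label to a stable UUID string (same label, same ID)."""
    # uuid5 is deterministic: it hashes the namespace plus the given name.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"credibility-notebook/{label}"))
```

With this, `test_credibility_agent(message, thread_id=thread_id_for("example-1"))` would reuse the same thread on every run, while adding a timestamp to the label would still produce a fresh thread each time.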


---
# Example 1: Single Claim Verification (Simple)

**Context**: A single factual claim about a company needs verification.

**Sub-agent role**: Verify a straightforward claim using web search to find corroborating evidence.

In [None]:
# Example 1: Simple claim verification
example_1_message = """Please verify the following claim:

"Apple Inc. is the most valuable publicly traded company in the world by market capitalization."

Check if this claim is accurate as of the current date.
"""

response_1 = test_credibility_agent(example_1_message, thread_id=f"example-1-{int(time.time())}")

---
# Example 2: Research Output Assessment (Medium)

**Context**: The web research agent has produced findings about a stock that need verification.

**Sub-agent role**: Assess the credibility of multiple claims and sources from a research report.

In [None]:
# Example 2: Research output assessment
example_2_message = """Please assess the credibility of this research output:

---
## NVIDIA Stock Analysis Summary

**Key Findings:**
1. NVIDIA's data center revenue grew over 200% year-over-year in their most recent quarter
2. The company has captured approximately 80-90% of the AI chip market
3. Major tech companies including Microsoft, Google, and Amazon are all using NVIDIA GPUs for AI workloads
4. The stock has risen over 200% in 2024

**Sources Used:**
- NVIDIA Investor Relations
- Reuters
- Bloomberg
- TechCrunch

**Original Question:** "Why has NVIDIA stock performed so well recently?"
---

Verify the key claims and assess whether this research adequately answers the original question.
"""

response_2 = test_credibility_agent(example_2_message, thread_id=f"example-2-{int(time.time())}")

---
# Example 3: Contradictory Information Assessment (Complex)

**Context**: Research has uncovered conflicting information about a stock from different sources.

**Sub-agent role**: Analyze contradictory claims, assess source reliability, and determine which narrative is more credible.

In [None]:
# Example 3: Contradictory information assessment
example_3_message = """Please assess the credibility of these conflicting research findings:

---
## Tesla Stock Analysis - Conflicting Views

**Bullish Research (Source: Tesla fan blog "TeslaDaily.com"):**
- "Tesla will dominate the robotaxi market by 2025"
- "Full Self-Driving is already safer than human drivers"
- "Tesla's margins are industry-leading and sustainable"
- "Competition from legacy automakers is irrelevant"

**Bearish Research (Source: Short-seller report from "Hindenburg Research"):**
- "Tesla's FSD claims are exaggerated and potentially dangerous"
- "Margins are declining due to price cuts and competition"
- "Chinese EV makers are taking significant market share"
- "Robotaxi timeline has been repeatedly delayed"

**Neutral Research (Source: Goldman Sachs equity research):**
- "Tesla maintains technology leadership but faces margin pressure"
- "FSD progress is notable but regulatory approval timeline uncertain"
- "Competition is intensifying but Tesla's brand remains strong"

**Original Question:** "What is the outlook for Tesla stock over the next 12 months?"
---

Please:
1. Assess the credibility of each source
2. Identify which claims can be verified vs which are speculative
3. Note any potential biases in each source
4. Determine if this research provides a balanced view to answer the original question
"""

response_3 = test_credibility_agent(example_3_message, thread_id=f"example-3-{int(time.time())}")

---
# Notes

## Expected Outputs

For each example, the credibility agent should provide:

### Claim Verification
- VERIFIED / PARTIALLY VERIFIED / UNVERIFIED / CONTRADICTED status for each claim
- Evidence supporting the verification status

### Source Assessment
- Overall source quality rating (1-5)
- Concerns about any sources

### Answer Quality
- Whether the research answers the original question
- Missing elements

### Recommendations
- Suggested corrections or additions
- Areas needing more research

### Final Verdict
- Trustworthy and defensible? (Yes/With caveats/Needs work)
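
The status labels listed under Claim Verification can be tallied from a response to get a quick summary. This is a rough sketch that assumes the agent writes the labels verbatim in upper case, which the notebook does not guarantee; `count_statuses` is a hypothetical helper name.

```python
import re
from collections import Counter

# Longer labels come first so "PARTIALLY VERIFIED" is not counted as "VERIFIED".
# The word boundary keeps "UNVERIFIED" from matching the bare "VERIFIED" branch.
STATUS_PATTERN = re.compile(
    r"\b(PARTIALLY VERIFIED|UNVERIFIED|CONTRADICTED|VERIFIED)\b"
)


def count_statuses(response_text: str) -> Counter:
    """Count how often each verification status appears in the response."""
    return Counter(STATUS_PATTERN.findall(response_text))
```

For instance, `count_statuses(response_2)["CONTRADICTED"]` gives the number of contradicted claims in the Example 2 response, or `0` if the label never appears.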

## Credibility Criteria Used

**High Credibility Sources:**
- Peer-reviewed research
- Official government/institutional data
- Established news organizations
- Primary sources and original documents

**Lower Credibility Sources:**
- Blogs and opinion pieces
- Social media
- Sites with heavy advertising
- Sources with clear conflicts of interest

## Integration with Main Agent

In production, the main agent would:
1. Receive research from `web-research-agent`
2. Send research to `credibility-agent` for verification
3. Use credibility feedback to refine or supplement research
4. Only include verified information in final reports
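
The four steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the main agent's actual implementation: `research` and `verify` are hypothetical callables standing in for `web-research-agent` and `credibility-agent`, and the claim-to-status dictionary is an assumed exchange format.

```python
from typing import Callable


def build_report(
    question: str,
    research: Callable[[str], list[str]],           # stand-in for web-research-agent
    verify: Callable[[list[str]], dict[str, str]],  # stand-in for credibility-agent
    max_rounds: int = 2,
) -> list[str]:
    """Return only the claims that the credibility check marked VERIFIED."""
    verified: list[str] = []
    query = question
    for _ in range(max_rounds):
        claims = research(query)    # step 1: receive research
        verdicts = verify(claims)   # step 2: send it for verification
        for claim in claims:
            if verdicts.get(claim) == "VERIFIED" and claim not in verified:
                verified.append(claim)
        rejected = [c for c in claims if verdicts.get(c) != "VERIFIED"]
        if not rejected:
            break
        # step 3: use the feedback to request follow-up research
        query = f"{question}\nRe-check: " + "; ".join(rejected)
    return verified                 # step 4: only verified claims remain
```

Capping the loop with `max_rounds` keeps a claim that can never be verified from triggering research indefinitely.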