<font color = teal>**Agentic System for Literature-Grounded Causal Reasoning**</font>

**Workflow Overview**

```
User Question
      ↓
LLM builds search query
(causal-focused keywords)
      ↓
Search scientific literature (arXiv)
      ↓
Extract causal sentences
(LLM, strict JSON output)
      ↓
Build causal graph
(from extracted evidence)
      ↓
Evidence-based reasoning
(over the causal graph)
      ↓
Verdict + explanation + uncertainty
```


In [None]:
# Silence all Python warnings (this also hides deprecation notices) and cap the height of scrollable Jupyter output

import warnings
from IPython.display import display, HTML

warnings.filterwarnings("ignore")

display(HTML("""
<style>
.output_scroll { max-height: 600px !important; }
</style>
"""))

Imports: LLM client, literature search, and text processing

In [None]:
from openai import OpenAI
import numpy as np
import matplotlib.pyplot as plt
import json
import arxiv
import re
import networkx as nx
from tqdm.auto import tqdm
import os
import pandas as pd

1. Initialize the OpenAI client using an API key.
- The API key is generated at https://platform.openai.com/account/api-keys and should be stored securely (e.g., as an environment variable), not hard-coded.

In [None]:
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
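
Reading the key with `os.environ["OPENAI_API_KEY"]` raises a bare `KeyError` when the variable is unset. The sketch below shows one way to fail with a clearer message. `require_env` is a hypothetical helper written for this notebook, not part of the OpenAI SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message.

    Hypothetical helper for this notebook; not part of any library.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Export it in the shell before starting Jupyter."
        )
    return value
```

It could then be used as `client = OpenAI(api_key=require_env("OPENAI_API_KEY"))`.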

2. LLM-Assisted Literature Search Query Generation

- This function uses a large language model to convert a causal question about two variables into a concise, causality-focused arXiv search query.
- The generated query is optimized to retrieve scientific papers that discuss whether intervening on one variable affects the other.

In [None]:
def gpt_build_search_query(client, X: str, Y: str) -> str:
    prompt = f"""
You are constructing a search query for scientific literature.

Goal:
Find papers discussing whether changing or intervening on X affects Y.

Variables:
X = {X}
Y = {Y}

Include keywords such as:
causal, effect, influence, intervention, mechanism, impact, change, risk, dependency

Return ONLY a concise arXiv-style query string.
"""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
        max_tokens=40
    )
    return resp.choices[0].message.content.strip()
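
Chat models sometimes wrap a short answer in quotes, a code fence, or a `Query:` label even when asked for a bare string. The sketch below normalizes such a reply before it is sent to arXiv. `clean_query` is a hypothetical helper, and the wrappers it strips are assumptions about typical model output, not observed behaviour of this pipeline.

```python
import re


def clean_query(raw: str, max_len: int = 200) -> str:
    """Normalize an LLM reply into a single-line search query (illustrative)."""
    text = (raw or "").strip()
    # Drop Markdown code-fence lines if the model added them.
    text = "\n".join(
        line for line in text.splitlines() if not line.strip().startswith("```")
    )
    # Collapse all whitespace, including newlines, to single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove a leading label such as "Query:" or "Search query:".
    text = re.sub(r"^(search\s+)?query\s*:\s*", "", text, flags=re.IGNORECASE)
    # Strip one pair of surrounding quotes, but only if the quote character
    # does not also occur inside (so '"a" AND "b"' is left untouched).
    if (
        len(text) >= 2
        and text[0] == text[-1]
        and text[0] in "\"'`"
        and text[0] not in text[1:-1]
    ):
        text = text[1:-1].strip()
    return text[:max_len]
```

With this in place, the last line of `gpt_build_search_query` could return `clean_query(resp.choices[0].message.content)` instead of the raw stripped string.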


2. Literature Retrieval from arXiv

- This function searches the arXiv database using a provided query and returns the most relevant scientific papers up to a specified limit.
- For each paper, it extracts key metadata—including the title, abstract, arXiv ID, DOI (if available), and publication date—to support downstream evidence extraction and causal reasoning.
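
If results from several queries are merged, the same paper can appear more than once, possibly under different version suffixes (`v1`, `v2`). The sketch below shows one way to de-duplicate, assuming each paper is a dict with an `arxiv_id` value such as `2107.05580v1`. `dedupe_papers` is a hypothetical helper, not part of the `arxiv` package.

```python
import re

# Matches a trailing arXiv version marker such as "v1" or "v12".
_VERSION_SUFFIX = re.compile(r"v\d+$")


def dedupe_papers(papers: list) -> list:
    """Keep the first occurrence of each paper, ignoring the version suffix."""
    seen = set()
    unique = []
    for paper in papers:
        base_id = _VERSION_SUFFIX.sub("", paper["arxiv_id"])
        if base_id not in seen:
            seen.add(base_id)
            unique.append(paper)
    return unique
```

Because the first occurrence is kept, the relevance order of the input list is preserved.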

In [None]:
def search_arxiv(query: str, max_results: int = 15):
    # Named arxiv_client to avoid confusion with the OpenAI `client` used elsewhere
    arxiv_client = arxiv.Client()
    search = arxiv.Search(
        query=query,
        max_results=max_results,
        sort_by=arxiv.SortCriterion.Relevance
    )

    papers = []
    for r in arxiv_client.results(search):
        papers.append({
            "title": r.title,
            "abstract": r.summary,
            "url": r.entry_id,
            # Split on "/abs/" so old-style IDs keep their archive prefix (e.g. "hep-th/9901001v1")
            "arxiv_id": r.entry_id.split("/abs/")[-1],
            "doi": getattr(r, "doi", None),
            "published": getattr(r, "published", None),
            "source": "arXiv"
        })
    return papers


# Split scientific text into clean, individual sentences for evidence extraction

_SENT_SPLIT = re.compile(r'(?<=[.!?])\s+')

def split_sentences(text: str):
    text = re.sub(r"\s+", " ", (text or "").strip())
    return [s.strip() for s in _SENT_SPLIT.split(text) if s.strip()]
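
The regex splitter treats every period followed by whitespace as a sentence boundary, so abbreviations such as "e.g." or "Fig." cut a sentence short. The example below repeats the function so that it runs on its own and shows the effect on a made-up sample string.

```python
import re

# Same pattern as above: split on whitespace that follows ., ! or ?
_SENT_SPLIT = re.compile(r'(?<=[.!?])\s+')


def split_sentences(text: str):
    text = re.sub(r"\s+", " ", (text or "").strip())
    return [s.strip() for s in _SENT_SPLIT.split(text) if s.strip()]


sample = "Doping increases the lattice constant. The trend holds, e.g. for Na and K."
for sentence in split_sentences(sample):
    print(repr(sentence))
# 'Doping increases the lattice constant.'
# 'The trend holds, e.g.'
# 'for Na and K.'
```

A fragment like `'for Na and K.'` can still reach the extraction prompt, so abstracts with many abbreviations may yield fewer usable sentences.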

3. Causal Evidence Extraction from Scientific Abstracts
- This function uses a language model to scan paper abstracts and extract exact sentences that support a causal relationship from X to Y, based on an intervention-style definition of causality.
- The model is constrained to return strict JSON, enabling reliable parsing and ensuring that only explicit, sentence-level evidence is collected for downstream reasoning.

In [None]:
def extract_causal_evidence(client, X: str, Y: str, papers: list, max_sentences_per_paper: int = 3):
    evidence = []

    for p in tqdm(papers, desc="Extracting causal evidence", unit="paper"):
        sentences = split_sentences(p["abstract"])

        prompt = f"""
You are extracting STRICT causal evidence from a scientific abstract.

We are ONLY interested in sentences that EXPLICITLY assert
a causal effect from X to Y.

STRICT DEFINITION:
A sentence supports "X → Y" ONLY IF it states that:
- changing, modifying, intervening on, or manipulating X
- directly causes, leads to, increases, decreases, or determines Y

STRICT MATCHING RULE (MANDATORY):

- The sentence MUST explicitly mention X or a clear synonym of X
- AND MUST explicitly mention Y or a clear synonym of Y
- If Y is not explicitly mentioned, DO NOT extract the sentence
- Do NOT infer Y from downstream effects, intermediate variables, or context


DO NOT extract sentences that:
- merely mention association, correlation, relevance, or importance
- describe examples, applications, goals, preferences, or optimization
- discuss usage, scheduling, or motivation without causal testing
- imply causality without explicit causal verbs
- reverse the direction (Y → X)

X = "{X}"
Y = "{Y}"

ALLOWED causal verbs (must appear clearly):
"causes", "leads to", "results in", "induces", "increases", "decreases",
"affects", "determines", "controls", "drives", "modulates"

TASK:
From the sentences below, extract up to {max_sentences_per_paper}
sentences that STRICTLY satisfy the definition above.

OUTPUT FORMAT (STRICT JSON ONLY):

If supporting sentences exist:
{{
  "support_sentences": ["exact sentence 1", "exact sentence 2"],
  "note": "short justification (optional)"
}}

If NO sentence strictly satisfies the definition:
{{
  "support_sentences": [],
  "note": "no explicit causal claim"
}}

SENTENCES:
{json.dumps(sentences, ensure_ascii=False)}
"""

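        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            max_tokens=250,
            # JSON mode: the reply is a single JSON object rather than free text
            response_format={"type": "json_object"},
        )

        try:
            obj = json.loads(resp.choices[0].message.content)
            support = obj.get("support_sentences", [])
        except (json.JSONDecodeError, TypeError, AttributeError):
            # Malformed or truncated reply: report it instead of skipping silently
            tqdm.write(f"Skipped (unparseable reply): {p['title']}")
            continue

        for s in support:
            evidence.append({
                "sentence": s,
                "title": p["title"],
                "arxiv_id": p["arxiv_id"],
                "doi": p["doi"],
                "url": p["url"]
            })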

    return evidence
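
The prompt asks for exact sentences, but a model may paraphrase, or wrap its JSON in a Markdown code fence. The sketch below shows two small guards: a tolerant JSON parser and a check that each returned sentence actually occurs in the abstract. Both `parse_json_reply` and `keep_verbatim` are hypothetical helpers, and the sample reply and abstract are made up for illustration.

```python
import json
import re


def parse_json_reply(raw: str) -> dict:
    """Parse a model reply as JSON, tolerating a surrounding Markdown code fence."""
    text = (raw or "").strip()
    fenced = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1)
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return {}
    return obj if isinstance(obj, dict) else {}


def _norm(s: str) -> str:
    """Collapse whitespace so line breaks do not defeat the comparison."""
    return re.sub(r"\s+", " ", s).strip()


def keep_verbatim(candidates: list, abstract: str) -> list:
    """Keep only candidate sentences that occur word for word in the abstract."""
    haystack = _norm(abstract or "")
    kept = []
    for s in candidates:
        if isinstance(s, str) and _norm(s) and _norm(s) in haystack:
            kept.append(s)
    return kept


reply = '```json\n{"support_sentences": ["Strain increases polarization.", "Strain is important."]}\n```'
abstract = "We study thin films.  Strain increases\npolarization. Results are discussed."
obj = parse_json_reply(reply)
print(keep_verbatim(obj.get("support_sentences", []), abstract))
# ['Strain increases polarization.']
```

The second candidate is dropped because it does not appear in the abstract. Filtering this way keeps every cited sentence traceable to its source text.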


4. Evidence-Based Causal Graph Construction

- This builds a directed causal graph where edges are supported by extracted sentences from the literature.

In [None]:
class LiteratureCausalGraph:
    def __init__(self):
        self.graph = nx.DiGraph()
        self.edge_evidence = {}

    def add_edge(self, src, tgt, ev):
        self.graph.add_edge(src, tgt)
        self.edge_evidence.setdefault((src, tgt), []).append(ev)


def build_graph_from_evidence(X: str, Y: str, evidence: list):
    g = LiteratureCausalGraph()
    for ev in evidence:
        g.add_edge(X, Y, ev)
    return g
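
Because `add_edge` appends every sentence to the same `(src, tgt)` key, a single paper can contribute several entries to one edge. The sketch below, which assumes the `edge_evidence` dict layout used by `LiteratureCausalGraph`, separates the sentence count from the distinct-paper count. `summarize_edge_evidence` is a hypothetical helper, and the toy entries use placeholder IDs, not real papers.

```python
def summarize_edge_evidence(edge_evidence: dict) -> dict:
    """Count sentences and distinct source papers for each (source, target) edge."""
    summary = {}
    for edge, items in edge_evidence.items():
        summary[edge] = {
            "n_sentences": len(items),
            "n_papers": len({ev["arxiv_id"] for ev in items}),
        }
    return summary


# Toy evidence in the same shape as the entries built by extract_causal_evidence.
toy = {
    ("strain", "polarization"): [
        {"sentence": "Strain increases polarization.", "arxiv_id": "0000.00001v1"},
        {"sentence": "Tensile strain induces polarization.", "arxiv_id": "0000.00001v1"},
        {"sentence": "Strain controls polarization.", "arxiv_id": "0000.00002v1"},
    ]
}
print(summarize_edge_evidence(toy))
# {('strain', 'polarization'): {'n_sentences': 3, 'n_papers': 2}}
```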



5. Evidence-Based Causal Reasoning and Verdict

- This function evaluates the extracted literature evidence to determine whether there is explicit support for a causal relationship from X to Y.
- It produces a clear verdict along with cited evidence, a summary of the papers searched, and an explicit statement of uncertainty and limitations.

In [None]:
def literature_reasoning(graph: LiteratureCausalGraph, X: str, Y: str, papers: list):
    n = len(papers)
    edge = (X, Y)
    found = graph.edge_evidence.get(edge, [])

    paper_refs = []
    for i, p in enumerate(papers, 1):
        ref = f"arXiv:{p['arxiv_id']}"
        if p.get("doi"):
            ref += f" | DOI:{p['doi']}"
        paper_refs.append(f"[{i}] {p['title']} ({ref})")

    if found:
        ev_lines = []
        for i, ev in enumerate(found, 1):
            ref = f"arXiv:{ev['arxiv_id']}"
            if ev.get("doi"):
                ref += f", DOI:{ev['doi']}"
            ev_lines.append(f'({i}) "{ev["sentence"]}" — {ref}')

        return {
            "verdict": "YES (literature-supported)",
            "reason": (
                f"Searched {n} arXiv papers.\n\n"
                "Papers searched:\n" + "\n".join(paper_refs) + "\n\n"
                f"Supporting sentences for {X} → {Y}:\n" + "\n".join(ev_lines)
            ),
            "uncertainty": (
                "Evidence is extracted from abstracts only.\n"
                "Some abstracts describe associations rather than interventions.\n"
                "Stronger claims require full-text analysis and study design validation."
            )
        }

    return {
        "verdict": "NO DIRECT ABSTRACT EVIDENCE",
        "reason": (
            f"Searched {n} arXiv papers.\n\n"
            "Papers searched:\n" + "\n".join(paper_refs) + "\n\n"
            f"No abstract sentence explicitly supported {X} → {Y}."
        ),
        "uncertainty": (
            "Absence of evidence ≠ absence of causality.\n"
            "The claim may exist in full text or non-arXiv sources (e.g., journals, patents)."
        )
    }
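
The decision rule is deliberately simple: one supporting sentence is enough for a YES verdict. The stripped-down sketch below isolates that rule on toy input. `verdict_from_evidence` is a hypothetical name used only for illustration.

```python
def verdict_from_evidence(found: list) -> str:
    """Mirror the binary rule: any supporting sentence yields a YES verdict."""
    return "YES (literature-supported)" if found else "NO DIRECT ABSTRACT EVIDENCE"


print(verdict_from_evidence([]))
# NO DIRECT ABSTRACT EVIDENCE
print(verdict_from_evidence([{"sentence": "X increases Y."}]))
# YES (literature-supported)
```

Since a single sentence flips the verdict, the number of supporting sentences and of distinct papers is worth reading alongside it.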


6. Visualization of the Literature-Derived Causal Graph

- This function visualizes the causal graph constructed from literature evidence, displaying variables as nodes and supported causal relationships as directed edges.
- Edges are drawn with clear arrowheads and spacing to ensure interpretability, making the evidence-backed causal structure easy to inspect.

In [None]:
def plot_literature_graph(
    lit_graph: LiteratureCausalGraph,
    save_path: str = None,
    show_empty: bool = True,
):
    G = lit_graph.graph

    # ---- Handle empty graph case safely ----
    if G.number_of_edges() == 0:
        print("Graph is empty.")

        if show_empty:
            fig, ax = plt.subplots(figsize=(4.5, 4.5))
            ax.text(
                0.5, 0.5,
                "No causal edges\nfound in literature",
                ha="center",
                va="center",
                fontsize=12,
                fontweight="bold",
                transform=ax.transAxes
            )
            ax.set_title("Literature-Derived Causal Graph")
            ax.axis("off")

            if save_path is not None:
                fig.savefig(save_path, dpi=300, bbox_inches="tight")

            return fig

        return None

    # ---- Non-empty graph plotting ----
    pos = nx.spring_layout(G, seed=42, k=0.6)

    fig, ax = plt.subplots(figsize=(4.5, 4.5))

    nx.draw_networkx_nodes(
        G, pos,
        ax=ax,
        node_size=3000,
        node_color="lightblue",
        edgecolors="black"
    )

    nx.draw_networkx_edges(
        G, pos,
        ax=ax,
        arrows=True,
        arrowstyle="-|>",
        arrowsize=18,
        width=2,
        edge_color="red",
        min_source_margin=40,
        min_target_margin=40
    )

    nx.draw_networkx_labels(
        G, pos,
        ax=ax,
        font_size=10,
        font_weight="bold"
    )

    ax.margins(0.50)
    ax.set_title("Literature-Derived Causal Graph")
    ax.axis("off")

    if save_path is not None:
        fig.savefig(save_path, dpi=300, bbox_inches="tight")

    return fig


7. Running the Literature-Grounded Causal Analysis

- This block executes the full causal reasoning workflow end to end: it generates a literature search query for the variables of interest, retrieves relevant papers, extracts causal evidence, constructs a causal graph, and produces a final verdict with supporting citations and uncertainty.
- By changing the values of X and Y, the same pipeline can be reused to analyze different causal hypotheses in a consistent and transparent manner.

In [None]:
# X = "cation substitution"
# Y = "lattice parameter"

# query = gpt_build_search_query(client, X, Y)
# papers = search_arxiv(query, max_results=50)
# evidence = extract_causal_evidence(client, X, Y, papers)
# lit_graph = build_graph_from_evidence(X, Y, evidence)

# result = literature_reasoning(lit_graph, X, Y, papers)

# print("\nVERDICT:", result["verdict"])
# print("\nREASONING:\n", result["reason"])
# print("\nUNCERTAINTY:\n", result["uncertainty"])

# plot_literature_graph(lit_graph)


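8. Saving Results and Running the Pipeline over Multiple Variable Pairs

- The helpers below write each run's verdict, metadata, and graph image to a per-pair folder, then loop over a list of variable pairs.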

In [None]:
def safe_name(text):
    return text.lower().replace(" ", "_").replace("/", "_")
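
`safe_name` replaces only spaces and slashes, so characters such as `:`, `?`, or parentheses would pass through into folder names. The sketch below is a stricter variant that assumes lowercase ASCII folder names are acceptable. `safe_slug` is a hypothetical helper, and unlike `safe_name` it also turns hyphens into underscores.

```python
import re


def safe_slug(text: str, max_len: int = 60) -> str:
    """Turn free text into a lowercase ASCII folder name (illustrative variant)."""
    # Replace every run of characters outside a-z and 0-9 with one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")
    # Truncate, then trim again in case the cut landed on an underscore.
    return slug[:max_len].strip("_") or "unnamed"


print(safe_slug("In-plane polarization (P_x): a/b?"))
# in_plane_polarization_p_x_a_b
```

Switching helpers changes existing folder names (for example `in-plane_polarization` becomes `in_plane_polarization`), so earlier outputs would not be overwritten in place.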


In [None]:
def run_and_save_literature_reasoning(
    client,
    X,
    Y,
    base_dir="llm_outputs",
    max_results=20
):
    # ---- folder setup ----
    query_name = f"{safe_name(X)}__{safe_name(Y)}"
    output_dir = os.path.join(base_dir, "queries", query_name)
    os.makedirs(output_dir, exist_ok=True)

    # ---- pipeline ----
    query = gpt_build_search_query(client, X, Y)
    papers = search_arxiv(query, max_results=max_results)
    evidence = extract_causal_evidence(client, X, Y, papers)
    lit_graph = build_graph_from_evidence(X, Y, evidence)
    result = literature_reasoning(lit_graph, X, Y, papers)

    # ---- save JSON outputs ----
    # ensure_ascii=False keeps characters such as "→" and "≠" readable in the files
    verdict_path = os.path.join(output_dir, "verdict.json")
    with open(verdict_path, "w", encoding="utf-8") as f:
        json.dump(result, f, indent=2, ensure_ascii=False)

    metadata = {
        "X": X,
        "Y": Y,
        "query": query,
        "num_papers": len(papers),
    }
    with open(os.path.join(output_dir, "metadata.json"), "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2, ensure_ascii=False)

    # ---- save plot ----
    fig = plot_literature_graph(lit_graph)
    fig_path = os.path.join(output_dir, "literature_graph.png")
    fig.savefig(fig_path, dpi=300, bbox_inches="tight")

    # ---- console output ----
    print(f"\nSaved results to: {output_dir}")
    print("\nVERDICT:", result["verdict"])
    print("\nUNCERTAINTY:", result["uncertainty"])

    return result
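
Once several pairs have been run, the saved `verdict.json` and `metadata.json` files can be gathered into one summary table. The sketch below assumes the folder layout written by `run_and_save_literature_reasoning` (`<base_dir>/queries/<pair>/`). `collect_verdicts` is a hypothetical helper.

```python
import json
import os


def collect_verdicts(base_dir: str = "llm_outputs") -> list:
    """Read every saved run under base_dir/queries into a list of row dicts."""
    rows = []
    queries_dir = os.path.join(base_dir, "queries")
    if not os.path.isdir(queries_dir):
        return rows
    for name in sorted(os.listdir(queries_dir)):
        run_dir = os.path.join(queries_dir, name)
        meta_path = os.path.join(run_dir, "metadata.json")
        verdict_path = os.path.join(run_dir, "verdict.json")
        if not (os.path.isfile(meta_path) and os.path.isfile(verdict_path)):
            continue  # skip folders from incomplete runs
        with open(meta_path, encoding="utf-8") as f:
            meta = json.load(f)
        with open(verdict_path, encoding="utf-8") as f:
            verdict = json.load(f)
        rows.append({
            "X": meta.get("X"),
            "Y": meta.get("Y"),
            "num_papers": meta.get("num_papers"),
            "verdict": verdict.get("verdict"),
        })
    return rows
```

The returned list of dicts can be passed directly to `pd.DataFrame(...)` to view the results as a table.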


In [None]:
pairs = [
    ("alkali cations", "metal cations"),
    ("alkali cations", "crystallographic lattice parameters"),
    ("alkali cations", "lattice angle"),
    ("alkali cations", "unit cell volume"),
    ("alkali cations", "in-plane polarization"),

    ("metal cations", "crystallographic lattice parameters"),
    ("metal cations", "lattice angle"),
    ("metal cations", "unit cell volume"),
    ("metal cations", "in-plane polarization"),

    ("crystallographic lattice parameters", "lattice angle"),
    ("crystallographic lattice parameters", "unit cell volume"),
    ("crystallographic lattice parameters", "in-plane polarization"),

    ("lattice angle", "unit cell volume"),
    ("lattice angle", "in-plane polarization"),

    ("unit cell volume", "in-plane polarization"),
]


for X, Y in pairs:
    run_and_save_literature_reasoning(client, X, Y)
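
The `pairs` list above contains every unordered pair of the six variables, each oriented from the earlier variable to the later one. Assuming that ordering is intended, the same list can be generated instead of typed out, as sketched below.

```python
from itertools import combinations

variables = [
    "alkali cations",
    "metal cations",
    "crystallographic lattice parameters",
    "lattice angle",
    "unit cell volume",
    "in-plane polarization",
]

# combinations() yields each unordered pair once, following the list order.
pairs = list(combinations(variables, 2))
print(len(pairs))
# 15
```

Testing the reverse direction of each pair as well would call for `itertools.permutations(variables, 2)`, which yields 30 ordered pairs.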