# TruLens + LangGraph + DSPy

This experimental notebook is a work in progress on using TruLens feedback functions to evaluate LangGraph/DSPy applications. It is not intended for production use; treat the code as exploratory.
TruLens feedback functions can be used alongside DSPy as metrics for evaluation and optimization, both for prompt engineering and for examining agent workflows.
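
As a rough illustration of the "feedback function as metric" idea, the sketch below wraps a relevance feedback function in the `metric(example, prediction, trace=None)` shape that DSPy evaluators and optimizers call. The `StubProvider` class, the `threshold` value, and the `query`/`answer` field names are assumptions made so the example runs offline; in this notebook, the TruLens `provider` created in a later cell would take the stub's place.

```python
from types import SimpleNamespace


class StubProvider:
    """Offline stand-in for a TruLens provider (hypothetical, for illustration)."""

    def relevance_with_cot_reasons(self, prompt, response):
        # Mimics the (score, reasons) tuple shape shown in this notebook's output.
        score = 1.0 if "apple" in response.lower() else 0.0
        return score, {"reason": "stub keyword check"}


def make_relevance_metric(provider, threshold=0.5):
    """Build a DSPy-style metric from a relevance feedback function."""

    def metric(example, prediction, trace=None):
        score, _reasons = provider.relevance_with_cot_reasons(
            example.query, prediction.answer
        )
        # By DSPy convention, a non-None trace means the metric is being used
        # to accept or reject a demonstration, so return a boolean.
        if trace is not None:
            return score >= threshold
        return score

    return metric


metric = make_relevance_metric(StubProvider())
example = SimpleNamespace(query="Is Apple a good investment?")
good = SimpleNamespace(answer="Apple reported strong iPhone sales.")
bad = SimpleNamespace(answer="It is sunny today.")
print(metric(example, good))           # float score, as used for evaluation
print(metric(example, bad, trace=[]))  # boolean, as used when a trace is passed
```

Returning the raw score for evaluation and a thresholded boolean during optimization keeps one feedback function usable in both roles.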

In [None]:
import json
import operator
import os
from typing import Annotated, List, TypedDict
import warnings

from dotenv import load_dotenv
import dspy
from langchain_community.tools.ddg_search.tool import DuckDuckGoSearchResults
from langchain_core.tools import StructuredTool
from langchain_core.tools import Tool
from langgraph.graph import StateGraph
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=UserWarning)

# load env variables
load_dotenv()

In [None]:
# Configure DSPy
lm = dspy.LM(
    model="gpt-5",
    temperature=1.0,
    max_tokens=16000,
    api_key=os.getenv("OPENAI_API_KEY"),
)
dspy.configure(lm=lm)

In [None]:
# Define LangGraph state
class FinanceState(TypedDict):
    # Plain str: an operator.add reducer here would concatenate the query with
    # itself whenever a node returns the full state.
    query: str
    intents: List[str]
    results: dict
    answer: str
    citations: List[str]


# Initialize tools
news_search_tool = DuckDuckGoSearchResults(
    backend="news", max_results=5, verbose=True, output_format="list"
)
text_search_tool = DuckDuckGoSearchResults(
    backend="text", max_results=5, verbose=True, output_format="list"
)

n_search_tool = Tool(
    name="News Search",
    func=news_search_tool.invoke,
    description="Search the web for recent news articles.",
)

t_search_tool = Tool(
    name="Text Search",
    func=text_search_tool.invoke,
    description="Search the web for financial data.",
)


# Selenium-based browser tool
def browse_page_selenium(url: str, query: str) -> str:
    """Browse a webpage and return the content."""
    # Configure Chrome options
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--disable-gpu")
    driver = webdriver.Chrome(options=chrome_options)
    try:
        driver.get(url)
        content = driver.find_element("tag name", "body").text
        return content
    except Exception as e:
        return f"Error browsing {url}: {str(e)}"
    finally:
        driver.quit()


browse_tool = StructuredTool.from_function(
    name="Browse Webpage",
    func=browse_page_selenium,
    description="Navigate to a URL and extract specific financial data or news.",
)
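
In a LangGraph state schema, a key annotated as `Annotated[T, reducer]` is combined with each node's update via `reducer(old, new)`, while a plain key is overwritten. The plain-Python model below is a simplified sketch of that merge rule, not LangGraph's implementation; `apply_update` and the sample values are hypothetical. It shows why a node that returns the whole state re-adds every reducer-managed key, so a string key with `operator.add` would end up concatenated with itself.

```python
import operator


def apply_update(state, update, reducers):
    """Merge a node's update into state: reduce annotated keys, overwrite others."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged


reducers = {"citations": operator.add}  # models Annotated[List[str], operator.add]
state = {"query": "AAPL price", "citations": ["https://a.example"]}

# A partial update appends to the reducer-managed key and leaves the rest alone.
state = apply_update(state, {"citations": ["https://b.example"]}, reducers)
print(state["citations"])

# Returning the full state as the update duplicates reducer-managed values.
doubled = apply_update(state, dict(state), reducers)
print(doubled["citations"])
```

Under this model, nodes that return only the keys they changed avoid the duplication.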

In [None]:
# Define DSPy modules
class QueryAnalyzer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predict = dspy.Predict(
            signature="query -> intents",
            prompt="""
            Classify the query into one or more of the following intents: stock_data, fundamentals, news.
            If the query does not match any of these, return an empty list.
            Return the intents as a JSON list.

            Examples:
            - Query: "What is Apple's current stock price?" Intents: ["stock_data"]
            - Query: "What is the latest news on Tesla?" Intents: ["news"]
            - Query: "What is the P/E ratio of Microsoft?" Intents: ["fundamentals"]
            - Query: "What is the market cap of Amazon?" Intents: ["stock_data"]
            - Query: "What is the weather today?" Intents: []

            Query: {query} Intents:
            """,
        )

    def forward(self, query: str):
        result = self.predict(query=query)
        try:
            intents = json.loads(result.intents)
        except json.JSONDecodeError:
            intents = [result.intents] if result.intents else []
        return dspy.Prediction(intents=intents)


class StockDataModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.search = dspy.Predict(
            signature="query -> search_query",
            prompt="""
            Generate a search query to find stock prices or trends for the given query.
            Focus on real-time or historical stock data.

            Examples:
            - Query: "What is Apple's current stock price?" Search Query: "Apple stock price"
            - Query: "Tesla stock trend last month" Search Query: "Tesla stock price history last month"

            Query: {query} Search Query:
            """,
        )
        self.extract = dspy.Predict(
            signature="contents, query -> stock_data",
            prompt="""
            Extract stock prices or trends from multiple webpage contents based on the query.
            If numerical data (e.g., prices), compute the average if consistent, or select the most recent.
            If trends, summarize the direction (e.g., rising, falling).
            Return the aggregated result.

            Examples:
            - Query: "What is Apple's current stock price?" Contents: ["...AAPL $210.42...", "...AAPL $210.50..."] Stock Data: "AAPL $210.46"
            - Query: "Tesla stock trend last month" Contents: ["...TSLA rose 5%...", "...TSLA up 4.8%..."] Stock Data: "TSLA rose approximately 4.9% last month"

            Contents: {contents} Query: {query} Stock Data:
            """,
        )

    def forward(self, query: str):
        search_query = self.search(query=query).search_query
        search_results = text_search_tool.run(search_query)
        urls = [
            result["link"] for result in search_results if "link" in result
        ][:3]  # take top 3 results
        if not urls:
            return {"stock_data": "No relevant stock data found", "sources": []}
        contents = []
        sources = []
        for url in urls:
            content = browse_tool.run({"url": url, "query": query})
            if not content.startswith("Error"):
                contents.append(content)
                sources.append(url)
        if not contents:
            return {"stock_data": "No relevant stock data found", "sources": []}
        result = self.extract(contents=contents, query=query)
        return {"stock_data": result.stock_data, "sources": sources}


class FundamentalsModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.search = dspy.Predict(
            signature="query -> search_query",
            prompt="""
            Generate a search query to find fundamental financial metrics or statements for the given query.
            Focus on metrics like P/E ratio, market cap, EPS, etc., or on financial statements.

            Examples:
            - Query: "What is the P/E ratio of Apple?" Search Query: "Apple P/E ratio"
            - Query: "What is the market cap of Tesla?" Search Query: "Tesla market cap"

            Query: {query} Search Query:
            """,
        )
        self.extract = dspy.Predict(
            signature="contents, query -> fundamentals_data",
            prompt="""
            Extract financial metrics from multiple webpage contents based on the query.
            For numerical metrics (e.g., P/E ratio, market cap, EPS), select the most frequent or average if consistent.
            For statements, summarize the key points.
            Return the aggregated result.

            Examples:
            - Query: "What is the P/E ratio of Apple?" Contents: ["...Apple's P/E ratio is 20..."] Fundamentals Data: "Apple's P/E ratio is 20"
            - Query: "What is the market cap of Tesla?" Contents: ["...Tesla's market cap is $1 trillion...", "...Tesla market cap $1.2 trillion..."] Fundamentals Data: "Tesla's market cap is approximately $1.1 trillion"

            Contents: {contents} Query: {query} Fundamentals Data:
            """,
        )

    def forward(self, query: str) -> dict:
        search_query = self.search(query=query).search_query
        search_results = text_search_tool.run(search_query)
        urls = [
            result["link"] for result in search_results if "link" in result
        ][:3]  # take top 3 results
        if not urls:
            return {
                "fundamentals_data": "No relevant fundamentals data found",
                "sources": [],
            }
        contents = []
        sources = []
        for url in urls:
            content = browse_tool.run({"url": url, "query": query})
            if not content.startswith("Error"):
                contents.append(content)
                sources.append(url)
        if not contents:
            return {
                "fundamentals_data": "No relevant fundamentals data found",
                "sources": [],
            }
        result = self.extract(contents=contents, query=query)
        return {
            "fundamentals_data": result.fundamentals_data,
            "sources": sources,
        }


class NewsAnalysisModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.search = dspy.Predict(
            signature="query -> search_query",
            prompt="""
            Generate a search query to find news articles or analyst opinions for the given query.
            Focus on recent company news and earnings reports.

            Examples:
            - Query: "What is the latest earnings news on Apple?" Search Query: "Apple earnings news"
            - Query: "Is Tesla a good investment?" Search Query: "Tesla investment analysis news"

            Query: {query} Search Query:
            """,
        )
        self.extract = dspy.Predict(
            signature="contents, query -> news_summary",
            prompt="""
            Summarize key points and analyst opinions from multiple news articles or analyst reports based on the query.
            Combine key points into a concise summary, prioritizing consistency across sources.

            Examples:
            - Query: "What is the latest earnings news on Apple?" Contents: ["...strong iPhone sales...", "...record MacBook revenue..."] News Summary: "Apple reported strong iPhone sales and record MacBook revenue"
            - Query: "Is Tesla a good investment?" Contents: ["...Tesla's market cap is $1 trillion...", "...Tesla's EPS is $1.20..."] News Summary: "Analysts are bullish on Tesla's future prospects"

            Contents: {contents} Query: {query} News Summary:
            """,
        )

    def forward(self, query: str) -> dict:
        search_query = self.search(query=query).search_query
        search_results = news_search_tool.run(search_query)
        urls = [
            result["link"] for result in search_results if "link" in result
        ][:3]  # take top 3 results
        if not urls:
            return {"news_summary": "No relevant news found", "sources": []}
        contents = []
        sources = []
        for url in urls:
            content = browse_tool.run({"url": url, "query": query})
            if not content.startswith("Error"):
                contents.append(content)
                sources.append(url)
        if not contents:
            return {"news_summary": "No relevant news found", "sources": []}
        result = self.extract(contents=contents, query=query)
        return {"news_summary": result.news_summary, "sources": sources}
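
The JSON fallback in `QueryAnalyzer.forward` assumes the model returns a string. Depending on the model, the field could also arrive as a list, as JSON wrapped in extra text, or as a JSON value that is not a list. A more defensive parser might look like the sketch below. `parse_intents` and `ALLOWED_INTENTS` are hypothetical names, and filtering to the three intents named in the prompt is an added assumption.

```python
import json
import re

ALLOWED_INTENTS = {"stock_data", "fundamentals", "news"}


def parse_intents(raw):
    """Normalize model output into a list of known intent strings."""
    if isinstance(raw, str):
        # Pull out the first JSON array if the model wrapped it in extra text.
        match = re.search(r"\[.*?\]", raw, flags=re.DOTALL)
        try:
            parsed = json.loads(match.group(0) if match else raw)
        except json.JSONDecodeError:
            parsed = raw
    else:
        parsed = raw
    if isinstance(parsed, str):
        # Fall back to splitting a bare string such as "news, stock_data".
        parsed = [part.strip(" \"'") for part in parsed.split(",")]
    if not isinstance(parsed, (list, tuple)):
        return []
    return [i for i in parsed if isinstance(i, str) and i in ALLOWED_INTENTS]


print(parse_intents('Intents: ["news", "fundamentals"]'))
print(parse_intents("news, stock_data"))
print(parse_intents({"unexpected": "shape"}))
```

Inside `forward`, this would stand in for the `try`/`except` block, so unknown labels are dropped before the keyword matching in the graph nodes.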

In [None]:
stock_keywords = ["stock", "stock_data", "price", "trend"]
fundamentals_keywords = [
    "fundamentals",
    "ratio",
    "financial",
    "metrics",
    "statement",
    "P/E",
    "EPS",
]
news_keywords = ["news", "earnings", "analyst", "opinion", "report"]
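
The node functions below match these keywords against each intent by substring, which is case-sensitive: an intent such as `"eps"` would not match the keyword `"EPS"`. If that matters, a case-insensitive helper along these lines could replace the nested `any(...)` calls. `intent_matches` and `sample_keywords` are hypothetical names, not part of the notebook.

```python
def intent_matches(intents, keywords):
    """Return True if any keyword occurs, ignoring case, in any intent."""
    lowered = [kw.lower() for kw in keywords]
    return any(kw in intent.lower() for intent in intents for kw in lowered)


sample_keywords = ["fundamentals", "ratio", "P/E", "EPS"]
print(intent_matches(["eps"], sample_keywords))
print(intent_matches(["news"], sample_keywords))
```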

In [None]:
# Define LangGraph nodes
def query_analyzer_node(state: FinanceState) -> FinanceState:
    analyzer = QueryAnalyzer()
    result = analyzer(state["query"])
    return {"intents": result.intents}


def stock_data_node(state: FinanceState) -> FinanceState:
    if any(
        any(kw in intent for kw in stock_keywords)
        for intent in state["intents"]
    ):
        module = StockDataModule()
        result = module(state["query"])
        state["results"]["stock_data"] = result["stock_data"]
        state["citations"].extend(result["sources"])
    return state


def fundamentals_node(state: FinanceState) -> FinanceState:
    if any(
        any(kw in intent for kw in fundamentals_keywords)
        for intent in state["intents"]
    ):
        module = FundamentalsModule()
        result = module(state["query"])
        state["results"]["fundamentals_data"] = result["fundamentals_data"]
        state["citations"].extend(result["sources"])
    return state


def news_analysis_node(state: FinanceState) -> FinanceState:
    if any(
        any(kw in intent for kw in news_keywords) for intent in state["intents"]
    ):
        module = NewsAnalysisModule()
        result = module(state["query"])
        state["results"]["news_summary"] = result["news_summary"]
        state["citations"].extend(result["sources"])
    return state


def combine_results_node(state: FinanceState) -> FinanceState:
    results = state["results"]
    answer_parts = []
    if "stock_data" in results:
        answer_parts.append(f"Stock Data: {results['stock_data']}")
    if "fundamentals_data" in results:
        answer_parts.append(
            f"Fundamentals Data: {results['fundamentals_data']}"
        )
    if "news_summary" in results:
        answer_parts.append(f"News Summary: {results['news_summary']}")
    answer = (
        " ".join(answer_parts)
        if answer_parts
        else "No relevant data found for the query."
    )
    # Deduplicate citations while preserving first-seen order
    citations = list(dict.fromkeys(state["citations"]))
    return {"answer": answer, "citations": citations}

In [None]:
# Build LangGraph workflow
workflow = StateGraph(FinanceState)

# Add nodes
workflow.add_node("query_analyzer", query_analyzer_node)
workflow.add_node("stock_data", stock_data_node)
workflow.add_node("fundamentals", fundamentals_node)
workflow.add_node("news_analysis", news_analysis_node)
workflow.add_node("combine_results", combine_results_node)

# Define edges (sequential execution)
workflow.add_edge("query_analyzer", "stock_data")
workflow.add_edge("stock_data", "fundamentals")
workflow.add_edge("fundamentals", "news_analysis")
workflow.add_edge("news_analysis", "combine_results")
workflow.set_entry_point("query_analyzer")
workflow.set_finish_point("combine_results")

# Compile graph
graph = workflow.compile()

In [None]:
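# Optimize DSPy modules. The caller supplies `metric(example, prediction, trace=None)`
# and `trainsets`, a dict mapping each module name to a list of dspy.Example.
def optimize_modules(metric, trainsets):
    modules = {
        "query_analyzer": QueryAnalyzer(),
        "stock_data": StockDataModule(),
        "fundamentals": FundamentalsModule(),
        "news_analysis": NewsAnalysisModule(),
    }
    optimizer = dspy.MIPROv2(metric=metric, auto="light")
    return {
        name: optimizer.compile(module, trainset=trainsets[name])
        for name, module in modules.items()
    }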

In [None]:
# Add TruLens feedback function
from trulens.providers.openai import OpenAI

# Use GPT-4o for RAG Triad Evaluations
provider = OpenAI(model_engine="gpt-4o")

In [None]:
# Example usage
def run_pipeline(query: str) -> dict:
    state = FinanceState(
        query=query, intents=[], results={}, answer="", citations=[]
    )
    result = graph.invoke(state)
    return {"answer": result["answer"], "citations": result["citations"]}

    # for state_step in graph.stream(state):
    #     print(state_step)


# Run pipeline
if __name__ == "__main__":
    # Optimize DSPy modules (periodically)
    # optimize_modules()

    # Example queries
    queries = [
        "Is Apple a good investment?",
        # "What is the P/E ratio of Tesla?",
        # "What is the market cap of Amazon?",
        # "What is the weather today?"
    ]

    for query in queries:
        result = run_pipeline(query)
        print("\n")
        print(f"Query: {query}")
        print(f"Answer: {result['answer']}")
        print(f"Citations: {result['citations']}")
        print("\n")
        print(
            f"Evaluation: {provider.relevance_with_cot_reasons(query, result['answer'])}"
        )

Example output:

```
[][{'snippet': 'Discover the innovative world of Apple and shop everything iPhone, iPad, Apple Watch, Mac, and Apple TV, plus explore accessories, entertainment, and expert device support.', 'title': 'Apple', 'link': 'https://www.apple.com/'}, {'snippet': "Sep 9, 2025 · The iPhone 17 is here, along with a very thin iPhone Air. There are three new Apple watches to tell you how you're feeling, and a pair of AirPods Pro 3 that can translate between languages.", 'title': 'Everything Apple Announced: iPhone Air, iPhone 17, Apple Watches ...', 'link': 'https://www.wired.com/story/everything-apple-announced-iphone-air-iphone-17-apple-watches-airpods-pro-3/'}, {'snippet': 'Shop the latest Apple products, accessories and offers. Compare models, get expert shopping help, plus flexible payment and delivery options.', 'title': 'Apple Store Online', 'link': 'https://www.apple.com/store'}, {'snippet': 'Learn more about popular features and topics, and find resources that will help you with all of your Apple products.', 'title': 'Official Apple Support', 'link': 'https://support.apple.com/'}]

Query: Is Apple a good investment?
Answer: Stock Data: No relevant stock data found Fundamentals Data: Short answer: The provided content is marketing pages and a news recap about Apple’s latest product launches (iPhone 17 lineup, iPhone Air, Apple Watch Series 11/Ultra 3/SE 3, AirPods Pro 3, accessories, and carrier/education promotions). It contains no financial statements, guidance, or valuation data, so you cannot determine whether Apple is a “good investment” from these materials alone.

What the documents indicate (potential business drivers):
- Broad refresh across core hardware:
  - iPhone 17 lineup with premium pricing: iPhone 17 from $799, iPhone Air $999, 17 Pro $1,099, 17 Pro Max $1,199; availability Sept 19.
  - New designs/features (e.g., “Camera Plateau,” vapor chamber cooling, 48MP telephoto up to 8x optical-like zoom; N1 wireless; C1X modem on iPhone Air).
  - Apple Watch Series 11/Ultra 3/SE 3 with battery-life gains and new health/safety features; SE from $249, Series 11 from $399, Ultra 3 from $799.
  - AirPods Pro 3 at $249 with stronger ANC, heart-rate sensing, and live translation features; availability Sept 19.
- Promotions and ecosystem monetization:
  - Carrier deals (AT&T/T-Mobile/Verizon/Boost) that can stimulate iPhone upgrades.
  - Education promo through 9/30/2025 offering AirPods or eligible accessory with Mac/iPad purchase.
  - New accessories (cases, straps, MagSafe battery) that can raise attach rate.
  - Services trials (Apple TV+/Music) that can support Services engagement.

Why this is not enough to judge the stock:
- No revenue, margin, cash flow, buyback/dividend details, or valuation (P/E, EV/EBIT, FCF yield).
- No unit/ASP expectations, guidance, or regional demand data.
- No competitive/market-share analysis or regulatory/supply-chain risks in these materials.

What you’d still need to evaluate investment merit:
- Recent financials: revenue growth by segment (iPhone, Services, Wearables), gross/operating margins, FCF, net cash/debt, capital returns (buybacks/dividends).
- Valuation versus history/peers (e.g., multiples vs. megacap tech and hardware names).
...
Citations: ['https://www.apple.com/', 'https://www.wired.com/story/everything-apple-announced-iphone-air-iphone-17-apple-watches-airpods-pro-3/', 'https://www.apple.com/store']


Evaluation: (0.6666666666666666, {'reason': "Criteria: The response is relevant to the prompt but lacks financial analysis.\nSupporting Evidence: The response discusses Apple's product launches and potential business drivers but states that financial data is needed to determine if Apple is a good investment."})
```
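
Each evaluation above is a `(score, reasons)` tuple. When several queries are run, it can help to collect the tuples and report an average alongside the lowest-scoring cases. The sketch below assumes that tuple shape; `summarize_evaluations` is a hypothetical helper, and the sample records are made up for illustration rather than taken from an actual run.

```python
from statistics import mean


def summarize_evaluations(records, worst_n=1):
    """Summarize (query, (score, reasons)) records from a feedback function."""
    if not records:
        return {"mean_score": None, "worst": []}
    scored = [
        (query, score, reasons.get("reason", ""))
        for query, (score, reasons) in records
    ]
    # Lowest scores first, so the weakest answers are easy to inspect.
    worst = sorted(scored, key=lambda item: item[1])[:worst_n]
    return {"mean_score": mean(s for _, s, _ in scored), "worst": worst}


# Illustrative records only; real ones would come from the provider call.
records = [
    ("Is Apple a good investment?", (0.5, {"reason": "lacks financial analysis"})),
    ("What is the P/E ratio of Tesla?", (1.0, {"reason": "directly answers"})),
]
summary = summarize_evaluations(records)
print(summary["mean_score"])
print(summary["worst"][0][0])
```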