# Lesson 6: Improve Agent's GPA

In this lesson, you'll make two targeted changes to the agent:

1. Adjust the planning prompt to include explicit goals, pre-conditions, and post-conditions for each step. This helps the executor understand the sub-goals it needs to reach.

2. Add inline evals so the agent receives feedback on when to do additional research. This tells the executor whether it is reaching its sub-goals.
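
To make the second change more concrete, here is a minimal, library-free sketch of the idea behind an inline evaluation: score a node's output as soon as it is produced, then attach that score to the state so a later step can decide whether more research is needed. Everything in it (`inline_eval_sketch`, `keyword_overlap`, the `threshold` value, and the plain-string messages) is hypothetical and exists only for illustration. It is not how TruLens implements `inline_evaluation`.

```python
from functools import wraps


def inline_eval_sketch(score_fn, threshold=0.5):
    """Hypothetical stand-in for an inline evaluation decorator (not the TruLens implementation)."""

    def decorator(node):
        @wraps(node)
        def wrapper(state):
            update = node(state)
            # Score the newest message against the query that produced it.
            score = score_fn(state["agent_query"], update["messages"][-1])
            note = f"context_relevance={score:.2f}"
            if score < threshold:
                note += " (low: consider additional research)"
            # Attach the feedback so downstream steps can read it.
            update["messages"] = update["messages"] + [note]
            return update

        return wrapper

    return decorator


def keyword_overlap(query, context):
    """Toy relevance score: fraction of query words that also appear in the context."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    context_words = set(context.lower().split())
    return len(query_words & context_words) / len(query_words)


@inline_eval_sketch(keyword_overlap)
def research_node(state):
    # Illustrative fixed output; a real node would call a research agent here.
    return {"messages": ["top deals are acme and globex"]}


relevant = research_node({"agent_query": "top deals"})
irrelevant = research_node({"agent_query": "regulatory changes"})
print(relevant["messages"][-1])
print(irrelevant["messages"][-1])
```

In the lesson itself, the scoring role is played by the LLM-based `f_context_relevance` feedback function rather than a keyword heuristic.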

In [None]:
import os
from dotenv import load_dotenv
import warnings

load_dotenv(override=True)
warnings.filterwarnings("ignore")

os.environ["TRULENS_OTEL_TRACING"] = "1"

<div style="background-color:#fff6ff; padding:13px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">
<p> 💻 &nbsp; <b>To access the <code>requirements.txt</code>, <code>env.template</code>, <code>prompts.py</code>, and <code>helper.py</code> files:</b> 1) click on the <em>"File"</em> option on the top menu of the notebook and then 2) click on <em>"Open"</em>.</p>

<p> ⬇ &nbsp; <b>Download Notebooks:</b> 1) click on the <em>"File"</em> option on the top menu of the notebook and then 2) click on <em>"Download as"</em> and select <em>"Notebook (.ipynb)"</em>.</p>

</div>

## 6.1 Add inline evaluations

In [None]:
from langchain.schema import HumanMessage
from langgraph.graph import END
from langgraph.types import Command
from typing import Literal
from trulens.otel.semconv.trace import SpanAttributes
from trulens.core.otel.instrument import instrument
from helper import cortex_agent, State, f_context_relevance

from trulens.apps.langgraph.inline_evaluations import inline_evaluation

In [None]:
@inline_evaluation(f_context_relevance)
@instrument(
    span_type=SpanAttributes.SpanType.RETRIEVAL,
    attributes=lambda ret, exception, *args, **kwargs: {
        SpanAttributes.RETRIEVAL.QUERY_TEXT: args[0].get("agent_query") or None,
        SpanAttributes.RETRIEVAL.RETRIEVED_CONTEXTS: [
            ret.update["messages"][-1].content
        ] if hasattr(ret, "update") else "No tool call",
    },
)
def cortex_agents_research_node(
    state: State,
) -> Command[Literal["executor"]]:
    query = state.get("agent_query", state.get("user_query", ""))
    # Invoke the Cortex agent with the query string
    agent_response = cortex_agent.invoke({"messages":query})
    # Wrap the agent's final response in a HumanMessage attributed to this node
    new_message = HumanMessage(content=agent_response['messages'][-1].content, name="cortex_researcher")
    # Return control to the executor; the update below appends the message to the history
    goto = "executor"
    return Command(
        update={"messages": [new_message]},
        goto=goto,
    )

In [None]:
from helper import web_search_agent

@inline_evaluation(f_context_relevance)
@instrument(
    span_type=SpanAttributes.SpanType.RETRIEVAL,
    attributes=lambda ret, exception, *args, **kwargs: {
        SpanAttributes.RETRIEVAL.QUERY_TEXT: args[0].get("agent_query", ""),
        SpanAttributes.RETRIEVAL.RETRIEVED_CONTEXTS: [
            ret.update["messages"][-1].content
        ] if hasattr(ret, "update") else "No tool call",
    },
)
def web_research_node(
    state: State,
) -> Command[Literal["executor"]]:
    agent_query = state.get("agent_query")
    result = web_search_agent.invoke({"messages":agent_query})
    goto = "executor"
    # wrap in a human message, as not all providers allow
    # AI message at the last position of the input messages list
    result["messages"][-1] = HumanMessage(
        content=result["messages"][-1].content, name="web_researcher"
    )
    return Command(
        update={
            # share internal message history of research agent with other agents
            "messages": result["messages"],
        },
        goto=goto,
    )

## 6.2 Update the planning prompt

Add pre-conditions, post-conditions, and goals to each step in the agent's plan.

Adding this explicit detail helps the executor understand the goal of each step, which improves tool calling and agent decisions.
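
As an illustration, a single plan step with the added fields might look like the sketch below. The field names (`action`, `pre_conditions`, `post_conditions`, `goal`) come from this lesson's prompt patch. The values, the `REQUIRED_FIELDS` tuple, and the `missing_fields` helper are hypothetical and are not part of `helper.py` or `prompts.py`.

```python
import json

# Hypothetical plan step: field names follow the lesson, values are illustrative only.
plan_step = {
    "action": "Retrieve the three largest client deals by value",
    "pre_conditions": ["The user query has been parsed"],
    "post_conditions": ["Three deals with client names and values are available"],
    "goal": "Provide the data needed to chart deal values",
}

REQUIRED_FIELDS = ("action", "pre_conditions", "post_conditions", "goal")


def missing_fields(step):
    """Return the required fields that a plan step does not define."""
    return [field for field in REQUIRED_FIELDS if field not in step]


print(json.dumps(plan_step, indent=2))
print("Missing fields:", missing_fields(plan_step))
```

A check like `missing_fields` is one way to confirm that the planner actually emitted the new fields before the executor relies on them.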

In [None]:
import helper
import prompts
from langchain.schema import HumanMessage

def patched_plan_prompt(state):
    base = prompts.plan_prompt(state).content
    insertion = '"action": "string",\n            "pre_conditions": ["string", ...],\n            "post_conditions": ["string", ...],\n            "goal": "string",'
    base = base.replace('"action": "string",', insertion)
    return HumanMessage(content=base)

helper.plan_prompt = patched_plan_prompt
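
One caveat with this approach is that `str.replace` silently returns the text unchanged when the target substring is absent, so a change to the base prompt could disable the patch without any error. The sketch below shows the same substitution with an explicit guard. The `template` string is a hypothetical fragment standing in for the real plan prompt, and `patch_template` is an illustrative helper, not part of the course code.

```python
# Hypothetical fragment standing in for the JSON schema inside the real plan prompt.
template = '{\n    "1": {\n        "action": "string",\n    }\n}'

target = '"action": "string",'
insertion = (
    '"action": "string",\n'
    '        "pre_conditions": ["string", ...],\n'
    '        "post_conditions": ["string", ...],\n'
    '        "goal": "string",'
)


def patch_template(text: str) -> str:
    """Insert the new fields after the action field, failing loudly if it is missing."""
    if target not in text:
        raise ValueError("Plan prompt does not contain the expected action field")
    return text.replace(target, insertion)


patched = patch_template(template)
print(patched)
```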

## 6.3 Build the graph

In [None]:
from langgraph.graph import START, StateGraph
from helper import State, planner_node, executor_node, chart_node, chart_summary_node, synthesizer_node

workflow = StateGraph(State)
workflow.add_node("planner", planner_node)
workflow.add_node("executor", executor_node)
workflow.add_node("web_researcher", web_research_node)
workflow.add_node("cortex_researcher", cortex_agents_research_node)
workflow.add_node("chart_generator", chart_node)
workflow.add_node("chart_summarizer", chart_summary_node)
workflow.add_node("synthesizer", synthesizer_node)

workflow.add_edge(START, "planner")

graph = workflow.compile()

## 6.4 Create a TruLens session for logging

In [None]:
from trulens.core.session import TruSession
from trulens.core.database.connector.default import DefaultDBConnector

# Initialize the connector with a SQLite database in the current working directory
connector = DefaultDBConnector(database_url="sqlite:///default.sqlite")

# Create TruSession with the custom connector
session = TruSession(connector=connector)

## 6.5 Register the new version of the agent

<div style="background-color:#f7fff8; padding:15px; border-width:3px; border-color:#e0f0e0; border-style:solid; border-radius:6px"> 
    <p>🚨 &nbsp; In this notebook, you are directly provided with the results obtained during filming. This is to help eliminate waiting time, and to prevent potential rate limit errors that might occur in this learning environment (this learning environment is constrained, and the GPA evaluation metrics consume a significant number of tokens).</p>
</div>

Here's the code that registers the agent with TruLens:

```python
from trulens.apps.langgraph import TruGraph

from helper import f_answer_relevance, f_context_relevance, f_groundedness, f_logical_consistency, f_execution_efficiency, f_plan_adherence, f_plan_quality

tru_recorder = TruGraph(
    graph,
    app_name="Sales Data Agent",
    app_version="L6: Inline evals + sub-goals in planning prompt",
    feedbacks=[
        f_answer_relevance,
        f_context_relevance,
        f_groundedness,
        f_logical_consistency,
        f_execution_efficiency,
        f_plan_adherence,
        f_plan_quality,
    ],
)
```

## 6.6 Re-test the agent

<div style="background-color:#f7fff8; padding:15px; border-width:3px; border-color:#e0f0e0; border-style:solid; border-radius:6px"> 
    <p>🚨 &nbsp;<b>Run Results:</b> In this notebook, you are directly provided with the results obtained during filming. This is to help eliminate waiting time, and to prevent potential rate limit errors that might occur in this learning environment (this learning environment is constrained, and the GPA evaluation metrics consume a significant number of tokens).</p>
</div>

**Code for query 1:**
```python
from langchain.schema import HumanMessage

with tru_recorder as recording:
    query = "What are our top 3 client deals? Chart the deal value for each."
    print(f"Query: {query}")
    state = {
        "messages": [HumanMessage(content=query)],
        "user_query": query,
        "enabled_agents": ["cortex_researcher", "web_researcher", "chart_generator", "chart_summarizer", "synthesizer"],
    }
    graph.invoke(state)

    print("--------------------------------")
```

In [None]:
records, feedback = session.get_records_and_feedback()
print(f"Query: {records.iloc[3]['input']}\n")
print(f"Output: {records.iloc[3]['output']}\n")

**Code for query 2:**

```python
with tru_recorder as recording:
    query = "Identify our pending deals, research if they may be experiencing regulatory changes, and using the meeting notes for each customer, provide a new value proposition for each given the regulatory changes."
    print(f"Query: {query}")
    state = {
        "messages": [HumanMessage(content=query)],
        "user_query": query,
        "enabled_agents": ["cortex_researcher", "web_researcher", "chart_generator", "chart_summarizer", "synthesizer"],
    }
    graph.invoke(state)

    print("--------------------------------")
```

In [None]:
print(f"Query: {records.iloc[4]['input']}\n")
print(f"Output: {records.iloc[4]['output']}\n")

**Code for query 3:**
```python
with tru_recorder as recording:
    query = "Identify our largest client deal, then find important topics in the meeting notes with that company, and find a news article related to the important topics discussed."
    print(f"Query: {query}")
    state = {
        "messages": [HumanMessage(content=query)],
        "user_query": query,
        "enabled_agents": ["cortex_researcher", "web_researcher", "chart_generator", "chart_summarizer", "synthesizer"],
    }
    graph.invoke(state)

    print("--------------------------------")
```

In [None]:
print(f"Query: {records.iloc[5]['input']}\n")
print(f"Output: {records.iloc[5]['output']}\n")

## 6.7 Launch TruLens dashboard

Compare this version of the agent with the previous one in the dashboard to validate the changes.

**Note:** Make sure to click on the second link (not the localhost) to open the TruLens dashboard.

In [None]:
from trulens.dashboard import run_dashboard
import os
str_port = 8005
_ = run_dashboard(port=str_port)
print(os.environ['DLAI_LOCAL_URL'].format(port=str_port))

**What other improvements could be made?**
- In this course, we focused on evaluating the end-to-end agent behavior. We could also have tested each specialized agent separately to optimize its prompt and design.
- We could have added other metrics for inline evaluations.
- We could also have updated the executor's prompt.