# Talk to your App

In most examples, LLM usage is shown as a chatbot or a personal assistant.

A more powerful use of agentic AI is when agents interact with the system in the background and enrich it or take autonomous actions. This is conceptually similar to streaming analytics, with the analysis performed by agents powered by LLMs.

Note the difference in failure modes. With chatbot-style usage, the rest of the system keeps working if the AI infrastructure is down.
With agents in the backend, however, the system will suffer some downtime if the AI system is down.

_Each module typically depends on the prior modules having completed successfully._

In [17]:
import openai
import re
import httpx
import os
import rich
import json
from openai import OpenAI
from agents import Agent, ModelSettings, function_tool,Runner,AsyncOpenAI,OpenAIChatCompletionsModel
from dotenv import load_dotenv, find_dotenv
from rich.pretty import pprint
from rich import print

api_key = "dummy_key"
os.environ["OPENAI_API_KEY"] = api_key

model_name = "llama3.2:3b-instruct-fp16"
base_url = "http://localhost:11434/v1/"

# Wrap the local OpenAI-compatible endpoint so that agents can use it as their model
model = OpenAIChatCompletionsModel(
    model=model_name,
    openai_client=AsyncOpenAI(base_url=base_url, api_key=api_key),
)

print("[green] Model setup[/green]")

## Automated Error Handling

1. The process starts when Ansible job log completion data comes in (for now, we simulate this with user input).
1. The agent examines the log for errors. If there is no error, the process ends.
1. If there is an error:
    - Agent analyzes and recommends
    - Agent opens a JIRA ticket
    - Agent sends a Slack message

![Workflow](resources/images/agent_log_workflow.png)
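
The three steps above can be sketched as a small routing function. This is a minimal sketch under stated assumptions, not the notebook's implementation: `has_error` is a hypothetical keyword check standing in for the advisor agent's judgement, and the three callables passed to `handle_job_log` are stand-ins for the agent calls.

```python
from typing import Callable

# Hypothetical keyword list; a real system would rely on the advisor agent
# or on the job's exit status instead.
ERROR_MARKERS = ("error", "failed", "fatal", "unreachable", "could not")


def has_error(log_text: str) -> bool:
    """Return True when the log text contains an obvious failure marker."""
    lowered = log_text.lower()
    return any(marker in lowered for marker in ERROR_MARKERS)


def handle_job_log(
    log_text: str,
    advise: Callable[[str], str],
    open_ticket: Callable[[str], str],
    notify: Callable[[str], str],
) -> list[str]:
    """Run the workflow: stop early on success, otherwise advise, ticket, notify."""
    if not has_error(log_text):
        return []  # Step 2: no error, so the process ends
    advice = advise(log_text)  # Step 3: analyze and recommend
    return [advice, open_ticket(advice), notify(advice)]  # Ticket, then message
```

Gating on the error check first means the ticket and notification steps are never reached for a clean run, rather than relying on each agent's instructions to say "All is well".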

In [18]:
from dataclasses import dataclass
from typing import Literal

from agents import Agent, ItemHelpers, Runner, TResponseInputItem, trace


@dataclass
class Advisor:
    body: str

@dataclass
class Slacker:
    body: str

@dataclass
class JIRAer:
    body: str


advisor_agent = Agent(
    name="advisor_agent",
    instructions="You can look at the contents of an ansible log and spot the error. You will describe what the error is in a few crisp sentences so that a human can take corrective actions.",
    model = model,
    output_type=Advisor,
)

slack_agent = Agent(
    name="slack_agent",
    instructions="If there is an error captured in the input, you will always state - I have slacked the message. Also add the contents that are in the provided input. Give a made up slack link. If there is no error, then just say - All is well, there is nothing to be done.",
    model = model,
    output_type=Slacker,
)

jira_agent = Agent(
    name="jira_agent",
    instructions="If there is an error captured in the input, you will always state - I have opened a JIRA ticket. Also add the contents that are in the provided input. Give a made up JIRA Handle. If there is no error, then just say - All is well, there is nothing to be done.",
    model = model,
    output_type=JIRAer,
)



In [19]:
#msg = "all tasks successfully finished"
msg = "could not connect to the host as the password has expired. "
inputs = [{"content": msg, "role": "user"}]

#with trace("Router"):
#    story_outline_result = await Runner.run(advisor_agent,inputs)
#    #uncomment this to see the details
#    pprint(story_outline_result)
#    print("--------------------------")
#    print(story_outline_result.final_output)

with trace("Workflow"):
    print("----------Advisor Output----------")
    advisor_result = await Runner.run(advisor_agent,inputs)
    print(advisor_result.final_output.body)
    advisor_output: Advisor = advisor_result.final_output
    
    print("----------JIRA Output----------")
    jira_result = await Runner.run(jira_agent,advisor_output.body)
    print(jira_result.final_output.body)
    
    print("----------Slack Output----------")
    slack_result = await Runner.run(slack_agent,advisor_output.body)
    print(slack_result.final_output.body)

# Human-in-the-Loop
But LLMs hallucinate!

Yes. LLMs, just like traditional AI and human beings, may not always give the right answer. To handle such mistakes, we put processes in place.

In the above example, we blend in a human-in-the-loop when we open a JIRA ticket or send a Slack message. Even if the contents are not entirely accurate, we do not lose much because a person can check and correct them if needed.

Distributed systems have long used similar safeguards to deal with possible errors, for example those that arise from CAP theorem related inconsistencies.


## Extend the Debugging Agent capabilities to make it robust
We shared a debugger agent in 05-agents.ipynb. Let us make it more robust. What happens if the answer given by the debugger has obvious gaps or hallucinations? Can we get another agent to review it and flag the problems? Let us see.

### This is a clone of the last lesson

In [21]:
@function_tool
def get_dependency(service:str) ->list[str]:
    dep_service=["ProductCatalogService","CheckoutService","UserProfileService"] 
    return dep_service

did_agent = Agent(
    name="DependencyIdentifier Agent",
    instructions=(
        "An incident will be passed on.\n"
        "From that, firstly identify the affected service name only.\n"
        "Next, identify what are the service dependencies for that service.\n"
        "Just return all service names in a comma separated format like a python list[str]. Also include the affected service.\n"
        "And nothing else"
    ),
    model=model,
    tools=[get_dependency],
)

@function_tool
def get_changelog(service:list) ->list[str]:
    change_log=["ProductCatalogService changed","CheckoutService changed"]
    return change_log

change_agent = Agent(
    name="ChangeLog Agent",
    instructions=(
        "An array of service names will be passed on.\n"
        "Identify what has changed with these services and return them.\n"
        "Just return all changes in a comma separated format like a python list[str].\n"
        "Do not return duplicate changes"
    ),
    model= model,
    tools=[get_changelog],
)

@function_tool
def get_errorlog(service:list) ->list[str]:
    error_log=["ProductCatalogService is responding slowly"]
    return error_log

error_agent = Agent(
    name="Error Log Agent",
    instructions=(
        "An array of service names will be passed on. \n"
        "Note that all services may not have error messages and it is unlikely that same message appear in logs of all services. \n"
        "The error messages will have service names in the messages. \n"
        "Identify the error messages in the logs if any and corresponding service name in which the error happens"
    ),
    model=model,
    tools=[get_errorlog],
)

debugger_agent = Agent(
    name="Debugger Agent",
    instructions=(
        "You will be given:\n"
        "1. Incident details.\n"
        "2. Services that could have been root cause of the problem.\n"
        "3. Services that were changed in the time interval.\n"
        "4. Services that had errors in the logs.\n"
        "Based on the above, logically think through and conclude the most likely reason for this problem. \n"
        "Please lay down your thought process clearly that led you to the conclusion. "
    ),
    model= model
)
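
The dependency and change-log agents are asked to answer "in a comma separated format like a python list[str]", but their `final_output` is still a plain string. If downstream code needs a real list, a small parser helps. The helper below is a hedged sketch; `parse_service_list` is a hypothetical name, and it only handles the two shapes the instructions suggest (a Python-style list literal or a bare comma-separated line).

```python
import ast


def parse_service_list(raw: str) -> list[str]:
    """Turn agent output such as '["A", "B"]' or 'A, B' into a list of names."""
    text = raw.strip()
    if text.startswith("["):
        try:
            parsed = ast.literal_eval(text)
            if isinstance(parsed, list):
                return [str(item).strip() for item in parsed if str(item).strip()]
        except (ValueError, SyntaxError):
            pass  # Not a valid literal; fall through to comma-separated parsing
    cleaned = text.strip("[]")
    # Drop surrounding spaces and quotes from each piece, and skip empty pieces
    return [part.strip(" '\"") for part in cleaned.split(",") if part.strip(" '\"")]
```

`ast.literal_eval` is used instead of `eval` so that model output is never executed as code.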


### Now let us add a new agent
We add a Verification Agent whose sole job is to audit each diagnosis before you act on it. In practice this agent will:

- Read the incident summary, the list of services, and the debugger’s reasoning
- Check for mismatches or missing facts (e.g., a service name dropped or an error overlooked)
- Flag any inconsistencies or confirm “All clear”

By doing so, we get an extra safety net that catches accidental oversights or AI hallucinations. 
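
Some of these checks do not need an LLM at all. As a complement to the Verification Agent, the sketch below flags service names that appear in a diagnosis but not in the known list. It is an illustration only: `find_unknown_services` is a hypothetical helper, and it assumes service names are CamelCase words ending in `Service`, which matches the sample data in this notebook but not necessarily a real system.

```python
import re


def find_unknown_services(diagnosis: str, known_services: list[str]) -> list[str]:
    """Return service-like names in the diagnosis that are not in known_services."""
    # Assumption: service names are CamelCase words ending in "Service".
    mentioned = re.findall(r"\b[A-Z][A-Za-z0-9]*Service\b", diagnosis)
    known = set(known_services)
    unknown: list[str] = []
    for name in mentioned:
        if name not in known and name not in unknown:
            unknown.append(name)  # Keep first-seen order, without duplicates
    return unknown
```

A non-empty result is a cheap signal that the debugger may have invented a service, and it can be passed to the Verification Agent as extra evidence.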

In [22]:
verification_agent = Agent(
    name="Verification Agent",
    instructions=(
        "You’ll be given three parts:\n"
        "1) The incident description\n"
        "2) A list of services\n"
        "3) The debugger agent’s full reasoning and conclusion\n\n"
        "Check for any of these issues:\n"
        " • References to services not in the original list\n"
        " • Conclusions that contradict the provided errors/changes\n"
        " • Missing any service that clearly had errors or changes\n\n"
        "If everything is consistent, reply “Consistent”. Otherwise, list the problems."
    ),
    model=model
)


### Notice the new section added under orchestrate
It invokes the verification agent after the debugger agent is done.

In [23]:
import asyncio
async def orchestrate(input):
    # Call the intermediate agents to gather the facts
    # These all use tools heavily
    dep_result = await Runner.run(did_agent,input)
    change_result = await Runner.run(change_agent, dep_result.final_output)
    error_result = await Runner.run(error_agent, dep_result.final_output)

    services = dep_result.final_output               # e.g. ["foo","bar","baz"]
    changes  = change_result.final_output             # e.g. ["foo changed","bar changed"]
    errors   = error_result.final_output              # e.g. ["foo is responding slowly"]

    # Build a single prompt string:
    message = (
        "Incident details: " + input + "\n"
        "Affected services: " + services + "\n"
        "Changes detected: " + changes + "\n"
        "Error logs: " + errors + "\n"
        "Based on the above, logically think through and conclude the most likely reason for this problem. "
        "Please lay down your thought process clearly that led you to the conclusion."
    )
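    print("\n")
    print("Input to the Debugger Agent: ")
    print("-----------------------------")
    print(message)
    print("\n")
    # Invoke the debugger agent:
    debugger_result = await Runner.run(debugger_agent, message)
    diagnosis = debugger_result.final_output
    print("=== Debugger Thought Process & Conclusion ===")
    print(diagnosis)

    # New section: ask the verification agent to audit the diagnosis.
    # It receives the three parts named in its instructions.
    verification_input = (
        "Incident description: " + input + "\n"
        "List of services: " + services + "\n"
        "Debugger agent's reasoning and conclusion: " + diagnosis
    )
    verification_result = await Runner.run(verification_agent, verification_input)
    return verification_result.final_output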
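
The change-log and error-log lookups both depend only on the dependency result, so they could run concurrently rather than one after the other. The sketch below shows the pattern with `asyncio.gather`. `fake_run` is a hypothetical stand-in for `Runner.run` so that the example is self-contained; it does not call any model.

```python
import asyncio


async def fake_run(agent_name: str, payload: str) -> str:
    """Hypothetical stand-in for `Runner.run`; it only echoes its inputs."""
    await asyncio.sleep(0.01)  # Simulate model latency
    return f"{agent_name} saw: {payload}"


async def gather_facts(services: str) -> tuple[str, str]:
    """Run the two independent lookups concurrently and return both results."""
    changes, errors = await asyncio.gather(
        fake_run("change_agent", services),
        fake_run("error_agent", services),
    )
    return changes, errors
```

In a notebook cell this would be called as `await gather_facts(...)`; `asyncio.gather` returns results in the order the awaitables were passed, so the unpacking is safe.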

### Calling the orchestration function as in the past

In [24]:
input = "Incident: ShoppingCart response time has increased to 10 sec"
diagnosis = await orchestrate(input)
print("=============================================")
print("=== Verifier Thought Process & Conclusion ===")
print("=============================================")
print(diagnosis)

# Going into Production

We have now seen how an AI-generated output can be cross-checked. This is a widely used pattern in agentic workflows.

Another important question: how does the system learn?

1. Let us say the agentic application proposes a certain resolution to an incident.
1. The engineer verifies that it is correct.
1. Or the engineer finds that it is not correct and knows the correct solution.

How can we enhance our agentic application with this feedback?

## Learning
### Integrate Human Feedback
- Whenever the Verification Agent flags a problem, route the case to an engineer for review.
- Provide a simple thumbs-up/thumbs-down or rating interface. Feed that rating back into your store.
- Also consider a free-text field where the engineer can record what was actually done when the rating is a thumbs-down.

### Capture and store outcomes
- Capture the above data and store it in a database
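
As a minimal sketch of such a store, the example below uses SQLite from the Python standard library. The table name, column names, and helper functions are all assumptions made for illustration; any database would do.

```python
import sqlite3


def open_feedback_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the feedback database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS feedback (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            incident TEXT NOT NULL,
            diagnosis TEXT NOT NULL,
            thumbs_up INTEGER NOT NULL,
            engineer_note TEXT NOT NULL DEFAULT ''
        )
        """
    )
    return conn


def record_feedback(
    conn: sqlite3.Connection,
    incident: str,
    diagnosis: str,
    thumbs_up: bool,
    engineer_note: str = "",
) -> int:
    """Store one engineer verdict and return the id of the new row."""
    cursor = conn.execute(
        "INSERT INTO feedback (incident, diagnosis, thumbs_up, engineer_note) "
        "VALUES (?, ?, ?, ?)",
        (incident, diagnosis, int(thumbs_up), engineer_note),
    )
    conn.commit()
    return cursor.lastrowid
```

The parameterized `?` placeholders keep free-text engineer notes from being interpreted as SQL.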

### Feed the data back
- Provide this data as additional context to the debugger agent (or perhaps add another agent that looks at this data) to refine the recommendation.

### Continuously refine your agents
- Periodically pull the best-rated incident examples (and their human-approved diagnoses) to create few-shot prompts or even fine-tune a custom model.
- Update agent instructions based on common failure modes (e.g. “always double-check inventory data”).
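
Building a few-shot prompt from the best-rated examples can be as simple as the sketch below. The `RatedExample` type, the 1-5 `rating` scale, and the prompt layout are assumptions for illustration, not a prescribed format.

```python
from dataclasses import dataclass


@dataclass
class RatedExample:
    incident: str
    diagnosis: str
    rating: int  # Hypothetical 1-5 score given by the reviewing engineer


def build_few_shot_prompt(
    examples: list[RatedExample], new_incident: str, top_k: int = 2
) -> str:
    """Prepend the top_k best-rated past cases to the new incident."""
    best = sorted(examples, key=lambda ex: ex.rating, reverse=True)[:top_k]
    parts = [f"Incident: {ex.incident}\nDiagnosis: {ex.diagnosis}" for ex in best]
    # Leave the final diagnosis open for the model to complete
    parts.append(f"Incident: {new_incident}\nDiagnosis:")
    return "\n\n".join(parts)
```

The resulting string can be passed to the debugger agent in place of the bare incident text.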

Some other possible next steps:

## Build visibility and dashboards
- Surface the agents’ findings and verification results in a team dashboard that shows average time to diagnosis, verification pass rates, and automation success rates.
- Use that data to spot gaps (e.g. services that consistently fool the Debugger) and add new special-purpose agents.
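
Two of the dashboard numbers mentioned above can be computed from per-run records, as in the sketch below. `RunRecord` and its fields are hypothetical; in practice they would be filled in by the orchestration function.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunRecord:
    seconds_to_diagnosis: float
    verification_passed: bool


def summarize_runs(runs: list[RunRecord]) -> dict[str, float]:
    """Compute run count, average time to diagnosis, and verification pass rate."""
    if not runs:
        return {"runs": 0, "avg_seconds_to_diagnosis": 0.0, "verification_pass_rate": 0.0}
    passed = sum(1 for run in runs if run.verification_passed)
    return {
        "runs": len(runs),
        "avg_seconds_to_diagnosis": mean([run.seconds_to_diagnosis for run in runs]),
        "verification_pass_rate": passed / len(runs),
    }
```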

## Hook into real incident streams
- Connect the orchestration function to the monitoring/alerting system (e.g. Prometheus Alertmanager, CloudWatch Alarms, PagerDuty webhooks). That way, every time an alert fires for “shopping-cart latency”, the agents automatically kick off the dependency→change→error→debug→verify chain.
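
As a rough sketch of the first half of that hook, the function below turns a webhook payload into incident strings that could be handed to the orchestration function. The payload shape (an `alerts` list whose items carry `status`, `labels`, and `annotations`) follows the Alertmanager webhook format as commonly documented; the `service` label and the `summary` annotation are assumptions about how the alerts are labeled.

```python
def alerts_to_incidents(payload: dict) -> list[str]:
    """Turn firing alerts from an Alertmanager-style payload into incident strings."""
    incidents = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # Skip resolved alerts
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        name = labels.get("alertname", "UnknownAlert")
        service = labels.get("service", "unknown service")  # Assumed label name
        summary = annotations.get("summary", "")  # Assumed annotation name
        incidents.append(f"Incident: {name} on {service}. {summary}".strip())
    return incidents
```

Each returned string has the same shape as the hand-written incident used earlier, so it can be passed to the orchestration function unchanged.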

## Future State
- Let us imagine a future state with many agents that can gather data. In the above example there were only three:
  - discovering service dependencies,
  - looking at the change log, and
  - looking at application logs.
- Let us say there are also agents for metrics, anomaly detection, cluster health (for the cluster on which the service is running), and so on.
- In that case, the incident could go to a `Planner agent` that decides to break the troubleshooting down into steps and calls an agent for each step.
- The planner then hands the summary over to the Debugger Agent and the Verification Agent as shown above.
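
The planner idea can be sketched without any LLM to show the control flow. Everything here is hypothetical: the `GATHERERS` registry stands in for real data-gathering agents, and `plan_steps` uses a fixed rule where a real Planner agent would let the model choose the steps.

```python
from typing import Callable

# Hypothetical registry of data-gathering steps; each callable stands in for an agent.
GATHERERS: dict[str, Callable[[str], str]] = {
    "dependencies": lambda incident: "dependencies for: " + incident,
    "changes": lambda incident: "changes for: " + incident,
    "logs": lambda incident: "logs for: " + incident,
    "metrics": lambda incident: "metrics for: " + incident,
}


def plan_steps(incident: str) -> list[str]:
    """Toy planner: always gather the basics, add metrics for latency incidents."""
    steps = ["dependencies", "changes", "logs"]
    lowered = incident.lower()
    if "response time" in lowered or "latency" in lowered:
        steps.append("metrics")
    return steps


def run_plan(incident: str) -> str:
    """Execute each planned step and join the findings into one summary."""
    findings = [f"{step}: {GATHERERS[step](incident)}" for step in plan_steps(incident)]
    return "\n".join(findings)
```

The summary returned by `run_plan` plays the role of the prompt that is handed to the Debugger Agent.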