## Agentic AI - XDR Stories

<div style="background-color: #FFEBCC; border-left: 6px solid #FFA500; padding: 10px;">
<b>Warning:</b> This example requires the use of a third-party, commercial LLM (OpenAI) which charges for API use. The XDR incident data being sent to OpenAI can be substantial and will often include sensitive data such as PII. Anyone seeking to build an agent using the process described below should ensure that they are willing to pay the charges associated with use of the LLM, and have approval to send their account data to OpenAI. Alternative models and platforms are available which might be cheaper and/or allow customers to run a model locally.
</div>

### What do we mean by "Agentic AI"?

According to ChatGPT:

```
Agentic AI refers to artificial intelligence systems that operate autonomously, making decisions and taking actions to achieve specific goals without constant human intervention.
```

The key factors which distinguish an AI Agent from a non-AI agent (such as a reminder email your mobile phone provider sends you when you are running low on credit) and which distinguish it from non-agentic AI (such as a ChatGPT conversation) are:
* **Autonomous Decision-Making:** the system can make decisions on unfamiliar data without ongoing explanation and assistance from a human.
* **Independent Execution:** the system operates when required without needing a human to activate it. This might be in response to some sort of event trigger, or a regular poll.
* **Stimulus/response:** the system receives input from some external factors, and executes actions accordingly.

In this example we will build a simple agent which manages a Cato Networks customer account's XDR incident feed, by providing analyst feedback for open stories. In order to be able to do this, it will:

1. Use the Cato API to download a list of stories generated or updated during a certain timeframe.
2. Use the OpenAI API to send each story together with a suitable prompt, receiving the model's classification of the incident and recommended settings for fields such as status and verdict.
3. Format the model's response as Cato XDR analyst feedback and submit via the Cato API, checking for any errors.
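The three steps above can be sketched as a single pass over the story feed. This is only a skeleton under stated assumptions: `run_agent_once` and its three callable parameters are hypothetical names introduced here for illustration, and the stubs at the bottom stand in for the real Cato and OpenAI calls developed later in this notebook.

```python
from typing import Callable, Dict, List


def run_agent_once(
    fetch_stories: Callable[[], List[dict]],
    classify_story: Callable[[dict], dict],
    submit_feedback: Callable[[str, dict], bool],
) -> Dict[str, int]:
    """Run one pass: download stories, classify each one, submit the feedback."""
    counts = {"fetched": 0, "submitted": 0, "failed": 0}
    for story in fetch_stories():                          # step 1: Cato API
        counts["fetched"] += 1
        recommendation = classify_story(story)             # step 2: LLM
        if submit_feedback(story["id"], recommendation):   # step 3: Cato API
            counts["submitted"] += 1
        else:
            counts["failed"] += 1
    return counts


# Stub implementations so the skeleton runs without any network access.
demo_counts = run_agent_once(
    fetch_stories=lambda: [{"id": "story-1"}, {"id": "story-2"}],
    classify_story=lambda story: {"status": "Closed", "verdict": "Benign"},
    submit_feedback=lambda story_id, recommendation: story_id != "story-2",
)
print(demo_counts)
```

Passing the three steps in as callables keeps the control flow separate from the API details, which makes it easier to test the flow before any real (and billable) calls are made.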

The value proposition is clear. Even though the Cato Networks XDR feature provides a low-volume, high-value feed of curated security events, if you are a small IT team, you might not have time to manage your XDR story workbench as often or as diligently as you would like. Even if you have the resources, you might appreciate having a knowledgeable and capable LLM triage the list for you, closing out incidents which do not meet the criteria for involvement by a human analyst. You may even find over time that you place a high degree of trust in the AI's capabilities to make good quality decisions, 24x7x365.

### Cato Networks XDR

```
XDR (Extended Detection and Response) is a comprehensive cybersecurity approach that integrates and correlates data from multiple security layers to provide enhanced threat detection, investigation, and response capabilities.
 
Cato Networks' XDR platform enables both Security operations and Network operations teams to monitor the organization's network for potential security threats and network performance issues. The platform includes advanced correlation engines (called Producers) that analyze traffic data to find matches for specific indications of threat activities or network issues. When a match is found, it produces a story that can be reviewed and investigated in the Cato Management Application (CMA).
```

That's the perfectly acceptable definition I received from the Cato Chatbot. Cato's XDR is a popular feature which adds an additional tier to the value pyramid:

<div style="text-align: center;"><img src="valuevolume.png" width="600"></div>

At the bottom of the pyramid are firewall events from Cato's Triple Firewall (Internet/WAN/LAN) which can be up to 90% of a Cato customer's feed. Although these are valuable from a forensic and incident response perspective, when it comes to threat hunting the large volume outweighs the value considerably. Next up are the events produced by Cato security features such as IPS, Anti Malware, CASB, DLP, as well as connectivity events such as ZTNA user connections and CMA logins. These are a much lower volume but much higher value. Finally at the top are the XDR producer models running over everything happening in the bottom tiers, producing a low volume of extremely high value, curated stories. We're going to add a new tier on top of the pyramid, so the human analyst working the list of open stories can focus on those stories which are most worthy of attention, such as incidents where not all activity was blocked.

**Our goal is to close low-severity stories while adding additional explanatory information to more interesting stories in the XDR workbench in CMA**.


### Initialising the connection to the API
Firstly, let's import the libraries we need and set up the connection to the API. As usual, we assume that the account ID and API key are preloaded as environment variables, and we use our helper module to encapsulate the business of making an API call (see the [Getting Started](Getting%20Started.ipynb) notebook if any of this is unclear):

In [1]:
#
# Initialise the API connection
#
import datetime
import json
import os
from collections import defaultdict
from openai import OpenAI
from cato import API
C = API(os.environ["CATO_API_KEY"])
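Because the notebook relies on several environment variables (`CATO_API_KEY` and `CATO_ACCOUNT_ID` here, and `OPENAI` later on), it may be worth failing early with a clear message if one is missing. The helper below is a minimal sketch; `require_env` is a hypothetical name and is not part of the `cato` helper module.

```python
import os


def require_env(names, environ=None):
    """Return the requested variables as a dict, or raise naming the missing ones."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in names}


# Demonstration with a stand-in mapping rather than the real environment.
demo_env = {"CATO_API_KEY": "example-key", "CATO_ACCOUNT_ID": "1234"}
settings = require_env(["CATO_API_KEY", "CATO_ACCOUNT_ID"], environ=demo_env)
print(sorted(settings))
```

In the notebook itself the call would be `require_env(["CATO_API_KEY", "CATO_ACCOUNT_ID", "OPENAI"])`, which reads from the real environment.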

### The XDR Stories Query

The Cato Networks API call our agent will use to retrieve XDR stories is the [xdr.stories](https://api.catonetworks.com/documentation/#query-xdr.stories) query, which is documented in the knowledge base. XDR stories are **rich** objects with multiple levels, many different fields and potentially large arrays of incident and timeseries data. A full explanation of every single field is outside the scope of this document, and potentially of any document, due to the dynamic nature of XDR in general and the rate at which Cato introduces new fields and features. This is one of the primary advantages of using an LLM to analyse the story data:

**The AI should be able to understand the data without being told exactly what it is.**

This is possible because the Cato API provides fields which describe what they actually contain; for example, the object which contains the incident data is called, well, **incident**. We do however have to construct a GraphQL query which the agent will use to extract all these fields. When I need to construct a new query, especially where there are many potential fields, a method I've found to be useful is:

1. Copy the example query and variables from the Cato API documentation: [https://api.catonetworks.com/documentation/#query-xdr.stories](https://api.catonetworks.com/documentation/#query-xdr.stories)
2. Paste these into the Cato API Playground: [https://api.catonetworks.com/api/v1/graphql2](https://api.catonetworks.com/api/v1/graphql2)
3. Adjust the query, adding fields as suggested by the documentation and according to the online help provided by the Playground.
4. Copy the finished query and variables into your code.

This process resulted in the following query and variables:

In [2]:
#
# Query
#
query = '''query Stories($accountId: ID!, $from: Int!, $limit: Int!, $sort: [StorySortInput!], $filter: [StoryFilterInput!]!) {
  xdr(accountID: $accountId) {
    stories(
      input: {paging: {from: $from, limit: $limit}, sort: $sort, filter: $filter}
    ) {
      paging {
        from
        limit
        total
      }
      items {
        ...StoryBrief
      }
    }
  }
}

fragment StoryBrief on Story {
  id
  accountId
  accountName
  updatedAt
  createdAt
  analystName
  incident {
    analystFeedback {
      additionalInfo
      severity
      threatType {
        name
        recommendedAction
        details
      }
      threatClassification
      verdict
    }
    connectionType
    criticality
    description
    engineType
    firstSignal
    id
    indication
    lastSignal
    predictedVerdict
    predictedThreatType
    producerName
    queryName
    research
    site {
      id
      name
    }
    source
    sourceIp
    status
    storyDuration
    ticket
    user {
      id
      name
    }
    vendor
    ... on Threat {
      ...ThreatIncidentBrief
    }
    ... on ThreatPrevention {
      ...ThreatPreventionIncidentBrief
    }
    ... on AnomalyStats {
      ...AnomalyStatsIncidentBrief
    }
    ... on AnomalyEvents {
      ...AnomalyEventsIncidentBrief
    }
  }
}

fragment ThreatIncidentBrief on Threat {
  site {
    id
    name
  }
  user {
    id
    name
  }
  direction
}

fragment ThreatPreventionIncidentBrief on ThreatPrevention {
  clientClass
  deviceName
  direction
  events {
    action
    appId
    appName
    dnsProtectionCategory
    eventType
    ruleId
    scanResult
    severity
    signatureId
    threatName
    threatType
    virusName
  }
  flowsCardinality
  logonName
  macAddress
  mitres {
    id
    name
  }
  os
  riskLevel
  similarStoriesData {
    indication
    similarityPercentage
    storyId
    threatClassification
    threatTypeName
    verdict
  }
  targets {
    analysisScore
    categories
    catoPopularity
    countryOfRegistration
    creationTime
    engines
    eventData {
      action
      appId
      appName
      dnsProtectionCategory
      eventType
      ruleId
      scanResult
      severity
      signatureId
      threatName
      threatType
      virusName
    }
    infectionSource
    name
    searchHits
    threatFeeds
    threatReference
    type
  }
  threatPreventionsEvents {
    appName
    clientClass
    createdAt
    destinationCountry
    destinationGeolocation
    destinationIp
    destinationPort
    direction
    dnsResponseIP
    domain
    fileHash
    httpResponseCode
    ja3
    method
    referrer
    smbFileName
    sourceGeolocation
    sourceIp
    sourcePort
    target
    tunnelGeolocation
    url
    user
    userAgent
  }
  ticket
}

fragment AnomalyStatsIncidentBrief on AnomalyStats {
  clientClass
  deviceName
  direction
  logonName
  macAddress
  mitres {
    id
    name
  }
  os
  similarStoriesData {
    indication
    similarityPercentage
    storyId
    threatClassification
    threatTypeName
    verdict
  }
  targets {
    analysisScore
    categories
    catoPopularity
    countryOfRegistration
    creationTime
    engines
    eventData {
      action
      appId
      appName
      dnsProtectionCategory
      eventType
      ruleId
      scanResult
      severity
      signatureId
      threatName
      threatType
      virusName
    }
    infectionSource
    name
    searchHits
    threatFeeds
    threatReference
    type
  }
  ticket
}

fragment AnomalyEventsIncidentBrief on AnomalyEvents {
  clientClass
  deviceName
  direction
  logonName
  macAddress
  mitres {
    id
    name
  }
  os
  similarStoriesData {
    indication
    similarityPercentage
    storyId
    threatClassification
    threatTypeName
    verdict
  }
  targets {
    analysisScore
    categories
    catoPopularity
    countryOfRegistration
    creationTime
    engines
    eventData {
      action
      appId
      appName
      dnsProtectionCategory
      eventType
      ruleId
      scanResult
      severity
      signatureId
      threatName
      threatType
      virusName
    }
    infectionSource
    name
    searchHits
    threatFeeds
    threatReference
    type
  }
  ticket
}
'''

#
# Variables
#
variables = {
  "accountId": os.environ["CATO_ACCOUNT_ID"],
  "from": 0,
  "limit": 25,
  "filter": [
    {
      "timeFrame": {"time": "last.P1D", "timeFrameModifier": "StoryUpdate"},
      "producer": {"not_in": ["NetworkXDR", "NetworkMonitor"]}
    }
  ],
  "sort": [{"fieldName": "updatedAt", "order": "desc"}]
}

### Step 1: Downloading the XDR stories

Now that we have imported our modules and defined the Cato API query for downloading stories, the agent can call this query and store the results in a list of story objects. It needs to do this in a loop because the XDR query is limited by a batch size, and so allows the requester to paginate over multiple iterations to retrieve all eligible stories according to the input parameters. These parameters include a timeframe, and for the purposes of this exercise, we will use a timeframe of 1 day. This implies that the agent will be scheduled to only execute once every 24 hours. For some organisations this is entirely acceptable; others might prefer an agent which runs more frequently, such as once per hour, in which case the timeframe input would need to be aligned with this schedule.
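Keeping the timeframe aligned with the schedule could be handled by a small helper such as the one below. This is a sketch under an assumption: the notebook only uses `last.P1D`, and the helper assumes the API accepts other ISO 8601 durations in the same `last.<duration>` form (for example `last.PT1H`), which should be checked against the Cato API documentation. `timeframe_for_interval` is a hypothetical name.

```python
def timeframe_for_interval(hours: int) -> str:
    """Build a 'last.<ISO 8601 duration>' string covering the given number of hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    if hours % 24 == 0:
        return f"last.P{hours // 24}D"   # whole days, e.g. last.P1D
    return f"last.PT{hours}H"            # assumed form for hourly schedules


# Stand-in for the filter used in the variables above.
demo_filter = {"timeFrame": {"time": "last.P1D", "timeFrameModifier": "StoryUpdate"}}
demo_filter["timeFrame"]["time"] = timeframe_for_interval(1)
print(demo_filter["timeFrame"]["time"])
```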

After each iteration we print the number of stories retrieved in the current batch, the total number saved to our list of stories, and the total number eligible for retrieval, stopping when a batch returns zero stories:

In [3]:
#
# Define an object to hold the stories
#
stories = []


#
# Iterate over the query until we have the total
#
iteration = 0
while True:

    
    #
    # Make the query
    #
    iteration += 1
    variables["from"] = len(stories)
    success, result = C.send("Stories", variables, query)
    if not success:
        print(f'FAILURE retrieving stories: {result}')
        break
    items = result["data"]["xdr"]["stories"]["items"]
    line = f'iteration:{iteration} len(items):{len(items)} len(stories):{len(stories)} '
    line += f'total:{result["data"]["xdr"]["stories"]["paging"]["total"]}'
    print(line)


    #
    # Collect the stories
    #
    for story in items:
        stories.append(story)


    #
    # Stop if we received zero
    #
    if len(items) == 0:
        print("Received zero, stopping")
        break

iteration:1 len(items):25 len(stories):0 total:71
iteration:2 len(items):25 len(stories):25 total:71
iteration:3 len(items):21 len(stories):50 total:71
iteration:4 len(items):0 len(stories):71 total:71
Received zero, stopping
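For use outside a notebook, the pagination loop could be wrapped in a function. The sketch below is one possible shape, not the helper module's own API: `fetch_all_stories` is a hypothetical name, and `send` is assumed to be any callable that behaves like `C.send`, returning a `(success, result)` pair. It also stops as soon as the reported `total` has been reached, which avoids the final request that returns zero items, on the assumption that `paging.total` is reliable. `fake_send` exists only so that the example runs without network access.

```python
def fetch_all_stories(send, query, variables, max_pages=100):
    """Page through the stories query until every eligible story is collected."""
    stories = []
    for _ in range(max_pages):
        page_vars = {**variables, "from": len(stories)}
        success, result = send("Stories", page_vars, query)
        if not success:
            raise RuntimeError(f"Stories query failed: {result}")
        page = result["data"]["xdr"]["stories"]
        stories.extend(page["items"])
        # Stop on an empty batch, or as soon as the reported total is reached.
        if not page["items"] or len(stories) >= page["paging"]["total"]:
            break
    return stories


def fake_send(operation, variables, query, _data=tuple(range(71))):
    """Stand-in for C.send that serves 71 dummy stories in batches."""
    start, limit = variables["from"], variables["limit"]
    items = [{"id": n} for n in _data[start:start + limit]]
    paging = {"from": start, "limit": limit, "total": len(_data)}
    return True, {"data": {"xdr": {"stories": {"paging": paging, "items": items}}}}


all_stories = fetch_all_stories(fake_send, query="", variables={"from": 0, "limit": 25})
print(len(all_stories))
```

The `max_pages` cap is a safety net so that an unexpected API response cannot keep the agent looping indefinitely.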


### Step 2: Submitting the stories to the LLM

This step can be divided into two parts: defining a function to submit the story together with an appropriate system prompt, and then iterating over the stories we downloaded in step 1, submitting the story and saving the feedback.

The system prompt below does the following things:
* Provides a basic description of the goal. Note that we don't provide any information on the structure of the data to be inputted. We don't even tell the LLM that it is in JSON format.
* Describes in detail that the **response** should be in JSON format, what the JSON fields should be and, where a field comes from an enum, what the allowed enum values are.
* Includes some advice for how certain incidents should be handled, such as that low severity incidents can be closed.
* Specifies that the model provide an explanation for why it made the decisions it did, as an additional field.

What advice should be provided to the LLM depends very much on the requirements of the organisation. The below is just an example.

In [4]:
#
# Define a function to call the LLM
#
def get_llm_recommendation(story):
    openai = OpenAI(api_key=os.environ["OPENAI"])
    system_prompt = """You are an agent which acts as a security analyst, evaluating XDR incident data from a SASE provider, 
responding with recommendations for how the incident should be handled.

The response will be a JSON object in this format:
{
    "additionalInfo": "Free text for the analyst to enter additional information about the XDR story",
    "severity": "String from a fixed list of predefined values, for analyst to assign the severity of a Malicious XDR story",
    "status": "String from a fixed list of predefined values, for the current status of the XDR story",
    "threatClassification": "Brief classification of the threat no more than 5 words.",
    "threatType": {
        "details": "Free text to provide more information on the type of threat",
        "name": "Free text to provide a short, descriptive name for the threat",
        "recommendedAction": "Free text to provide suggestions for how the threat should be handled"
    },
    "verdict": "String from a fixed list of predefined values, for the analyst to assign the verdict of the XDR story",
    "explanation": "Free text field where you explain your reasoning for the severity, status and verdict"
}

Severity values are:
["High", "Medium", "Low"]

Status values are:
["Closed","Monitoring","PendingAnalysis","PendingMoreInfo"]

Verdict values are:
["Benign","Informational","Malicious","Suspicious"]

Low severity incidents should have status=Closed. 
Medium severity incidents where all actions were blocked should have status=Closed.
Incidents where all actions were blocked should have status=Closed.
Network scanning is a low severity incident.
Suspicious incidents should have status=Closed if there is no malicious activity.
    """
    user_prompt = json.dumps(story)
    response = openai.chat.completions.create(
        model = "gpt-4o",
        messages = [
            {"role":"system", "content": system_prompt},
            {"role":"user", "content": user_prompt}
        ],
        response_format={"type":"json_object"}
    )
    return json.loads(response.choices[0].message.content)
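Since the system prompt lists the allowed enum values, the same lists can be used to check the model's response before anything is sent back to the Cato API. The following is a minimal sketch; `ALLOWED_VALUES` and `find_invalid_enums` are hypothetical names, and the values are copied from the prompt above rather than read from the API schema.

```python
# Allowed values, copied from the system prompt.
ALLOWED_VALUES = {
    "severity": ("High", "Medium", "Low"),
    "status": ("Closed", "Monitoring", "PendingAnalysis", "PendingMoreInfo"),
    "verdict": ("Benign", "Informational", "Malicious", "Suspicious"),
}


def find_invalid_enums(recommendation):
    """Return {field: value} for each enum field that is missing or not allowed."""
    return {
        field: recommendation.get(field)
        for field, allowed in ALLOWED_VALUES.items()
        if recommendation.get(field) not in allowed
    }


demo_recommendation = {"severity": "Critical", "status": "Closed", "verdict": "Benign"}
print(find_invalid_enums(demo_recommendation))
```

A story whose recommendation comes back with a non-empty result here could simply be skipped, leaving it for a human analyst.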

Having defined our function including the system prompt, the agent can now loop over the stories, submitting each one, and recording the feedback. We'll print a short line of output each time to see the decision and explanation.

In [5]:
#
# Iterate over the stories, asking for an LLM recommendation for each one
#
results = {}
statuses = defaultdict(lambda:0)
for i, story in enumerate(stories):
    if story["incident"]["status"] != "Closed":

        #
        # Make the submission and get the result
        #
        result = get_llm_recommendation(story)

        #
        # Skip any where we didn't get the right fields
        #
        fields = ["status", "severity", "verdict", "explanation", "threatClassification", "threatType"]
        missing = [F for F in fields if F not in result]
        if missing:
            print(f'MISSING FIELD: {missing}')
            continue
                

        #
        # Log the result
        #
        results[story["id"]] = result            
        statuses[result["status"]] += 1
        line = f'{i+1}/{len(stories)} id:{story["id"]} status:{story["incident"]["status"]}->{result["status"]} '
        line += f'criticality:{story["incident"]["criticality"]}->{result["severity"]} '
        line += f'verdict:{story["incident"]["predictedVerdict"]}->{result["verdict"]}'
        line += f'\nexplanation:{result["explanation"]}\n'
        print(line)

#
# Print summary
#
for status, count in statuses.items():
    print(f'{status}:{count}')

1/71 id:66c08a7427f1a81d2bf5080b status:Open->PendingAnalysis criticality:3->High verdict:None->Suspicious
explanation:The connection to a low-reputation domain using a scripting language over HTTP could suggest an attempted command and control communication. This suspicious behavior, combined with the use of a scripting language, often reflects malicious intent, necessitating further investigation to determine the true nature of the activity. Given the potential for misuse, the severity is considered high and the status is set to pending analysis as more detail is required to ascertain the intent definitively.

2/71 id:66b04dd163405215b7ddd526 status:Open->PendingAnalysis criticality:3->Medium verdict:None->Suspicious
explanation:Given the nature of the behavior—HTTP communication using a scripting language to a low-popularity domain—there is a high suspicion of this being preparatory or active malware-related activity. The severity is marked as Medium due to the potential seriousness

### Step 3: Submitting the LLM responses as analyst feedback using the Cato API

Our agent will be able to submit feedback, set the severity and threat type and even close XDR incidents using the **analystFeedback** mutation, which is documented here: [https://api.catonetworks.com/documentation/#mutation-xdr.analystFeedback](https://api.catonetworks.com/documentation/#mutation-xdr.analystFeedback). This is a much smaller, simpler call than the XDR stories call the agent used to download the stories. The biggest difficulty we have is in transforming the response from the LLM to fit the Cato API's input parameters, as defined [here](https://api.catonetworks.com/documentation/#definition-AnalystFeedbackInput). However... compare those input parameters with the JSON structure we asked the LLM for in the system prompt, and what do you see... that's right, they are almost the same. By telling the LLM to provide the response using the same fields that the API mutation requires, we have made it very easy for the agent to transform the data from LLM to API.

The only real change is that instead of using the **additionalInfo** field provided by the LLM, I use instead the **explanation** field we asked it for. I found in testing that the additionalInfo was largely redundant, being very similar to the threatType data. I also thought it would be useful for a human analyst to be able to look at a story in the Cato Management Application and see not only that it has been worked by the LLM, but also see an explanation of why the LLM took the decisions which it did.

So all the agent needs to do now is define the query and loop over the saved story analysis results, submitting each one. There is a chance of errors here, if the LLM for whatever reason decided to go off-piste and provide a value for an enum field which isn't part of the enum. We'll print those but otherwise not worry about them too much - all it means is that the story will remain unmodified, so a human analyst will need to take a look at it.

In [6]:
#
# The analystFeedback mutation call
#
query = '''mutation analystFeedback($accountId: ID!, $input: AnalystFeedbackInput!) {
  xdr(accountId: $accountId) {
    analystFeedback(input: $input) {
      story {
        id
      }
    }
  }
}'''


#
# Iterate over the results, executing the feedback in a mutation call to the Cato API.
#
for story_id, result in results.items():

    #
    # Log the entry
    #
    print(f'Processing id:{story_id} status:{result["status"]}')

    #
    # Map the LLM result to the Cato API input fields
    #
    variables = {
        "accountId": os.environ["CATO_ACCOUNT_ID"],
        "input": {
            "storyId": story_id,
            "additionalInfo": f'AI:{result["explanation"]}',
            "severity": result["severity"],
            "status": result["status"],
            "threatClassification": result["threatClassification"],
            "threatType": result["threatType"],
            "verdict": result["verdict"]
        }
    }

    #
    # Make the API call, reporting any failures
    #
    success, response = C.send("analystFeedback", variables, query)
    if not success:
        print(f'FAILURE {story_id}')
        print(json.dumps(variables, indent=2))
        print(response,"\n\n")

Processing id:66c08a7427f1a81d2bf5080b status:PendingAnalysis
Processing id:66b04dd163405215b7ddd526 status:PendingAnalysis
Processing id:67c4271e28943b55b76a09c1 status:PendingAnalysis
Processing id:67c4270b28943b55b76a09b1 status:Monitoring
Processing id:67c426e028943b55b76a09a0 status:PendingAnalysis
Processing id:67c4279d28943b55b76a0a44 status:Closed
Processing id:67c4278f28943b55b76a0a32 status:PendingAnalysis
Processing id:67c46a11ddfded5aa324a789 status:Closed
Processing id:67c427c028943b55b76a0a7f status:Closed
Processing id:67c4271428943b55b76a09b9 status:Closed
Processing id:67c4275428943b55b76a09e9 status:Monitoring
Processing id:67c4273828943b55b76a09d5 status:Closed
Processing id:67c41b9628943b55b76a033b status:Closed
Processing id:6565ad05f916b2799c1790b6 status:Closed
Processing id:67c41f1528943b55b76a0668 status:PendingAnalysis
Processing id:6565ad04f916b2799c1790b4 status:Closed
Processing id:6565acf6f916b2799c179099 status:Closed
Processing id:6565acfef916b2799c1790a

### Conclusion

There you have it, all of the Python code required to create an AI Agent which can use an LLM to triage and manage a Cato customer's XDR story feed. So is it an actual agent? **No, this is not yet an agent.** This is a Jupyter Notebook which is not really suitable for independent execution. As described in the introduction, this is one of the mandatory requirements for something to be called Agentic AI. 

So what would be required to turn this into an agent? All you'd need to do is copy the code into a single Python script or module, and then have a way of scheduling that code to run 24x7 with no human intervention. Many customers would use a cloud service such as AWS Lambda or Azure Functions, but it could also be as simple as a cron job running on a Linux VM.
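As an illustration of the scheduling idea, here is a minimal sketch of a loop that runs a job at a fixed interval and keeps going after a failed run. `run_on_schedule` is a hypothetical helper, and the `sleep` parameter exists only so that the example can run instantly. With cron, AWS Lambda or Azure Functions the platform does the scheduling, and the script would only need to run the job once per invocation.

```python
import time


def run_on_schedule(job, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, waiting interval_seconds between runs.

    A failed run is reported and does not prevent later runs.
    Returns (number of runs, number of failures).
    """
    runs = 0
    failures = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as error:  # keep the agent alive after a bad run
            failures += 1
            print(f"run {runs + 1} failed: {error}")
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs, failures


# Demonstration: three runs, the second of which fails, with no real waiting.
calls = []


def demo_job():
    calls.append(len(calls) + 1)
    if len(calls) == 2:
        raise ValueError("simulated failure")


runs, failures = run_on_schedule(
    demo_job, interval_seconds=3600, max_runs=3, sleep=lambda seconds: None
)
print(runs, failures)
```

Whatever interval is chosen should match the timeframe used in the stories query, as discussed in Step 1.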

As well as coming up with a suitable platform for continuous operation, someone looking to run an agent such as this in a production environment should probably also consider:

* Tuning the system prompt to better align the recommendations with the organisation's objectives and risk profile. For example, additional examples could be provided.
* Reducing the size of the stories sent to the LLM. As mentioned, Cato XDR stories are very rich objects with potentially thousands of fields. The larger the object, the higher the token count and the higher the LLM cost. Removing some of these fields could result in significant savings without decreasing the quality of the model recommendations. For example, it might not be necessary to send all of the IoAs, timeseries values and incident fields. The company might also decide not to send very large stories to the model for analysis, and instead leave these for human analysts.
* Adding a way for the agent to log and report on its actions, to assist with ongoing QA and monitoring.
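The idea of removing fields before sending a story to the model could look something like the sketch below. `trim_story` is a hypothetical helper, and the choice of which keys to drop and how many list items to keep is an assumption that each organisation would need to tune. The field names in the demo story are taken from the query earlier in this notebook, but the values are made up.

```python
import json


def trim_story(value, drop_keys=(), max_items=20):
    """Return a copy of value without drop_keys and with long lists truncated."""
    if isinstance(value, dict):
        return {
            key: trim_story(item, drop_keys, max_items)
            for key, item in value.items()
            if key not in drop_keys
        }
    if isinstance(value, list):
        return [trim_story(item, drop_keys, max_items) for item in value[:max_items]]
    return value


# Made-up story with one droppable field and one long array.
demo_story = {
    "id": "story-1",
    "incident": {
        "description": "example",
        "similarStoriesData": [{"storyId": "a"}, {"storyId": "b"}],
        "threatPreventionsEvents": [{"sourceIp": f"10.0.0.{n}"} for n in range(200)],
    },
}
trimmed = trim_story(demo_story, drop_keys=("similarStoriesData",), max_items=5)
print(len(json.dumps(demo_story)), "->", len(json.dumps(trimmed)))
```

The character count of the serialised JSON is only a rough proxy for token count, but it is enough to compare before and after, and could also serve as a threshold for leaving very large stories to human analysts.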