DDS: Palo Alto Networks Cortex XSOAR v1.0.0 #23331
jaypatel7-crest wants to merge 15 commits into DataDog:master
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4eb02aadc0
"id": 272995163,
"name": "High integration error rate alert",
"type": "query alert",
"query": "sum(last_10m):sum:palo_alto_networks_cortex_xsoar.api_execution.count{integration_name:*, -api_response_type:successful}.as_count() / sum:palo_alto_networks_cortex_xsoar.api_execution.count{integration_name:*} >= 0.5",
Normalize denominator with as_count in error-rate query
This monitor computes error_count / total_calls but only the numerator is converted to counts (...-api_response_type:successful}.as_count()), while the denominator stays as sum:...api_execution.count{integration_name:*}. That mixes count and rate semantics, so the resulting value is not a true error ratio and can trigger noisy or incorrect alerts; use a consistent count rollup on both sides of the division.
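The arithmetic behind this concern can be sketched in a few lines. This is plain Python, not Datadog's evaluation engine. The sample numbers are hypothetical, and whether the unmodified denominator is actually read as a per-second rate depends on the metric type, so that reading is an assumption made only for illustration.

```python
WINDOW_SECONDS = 600  # matches the last_10m evaluation window in the query
THRESHOLD = 0.5       # alert threshold from the query


def error_ratio(error_count: float, total: float) -> float:
    """Return errors divided by total, or 0.0 when there were no calls."""
    return error_count / total if total else 0.0


# Hypothetical sample: 6 failed calls out of 60 in the window.
errors_in_window = 6
calls_in_window = 60

# Both sides are counts over the same window: a true error ratio (about 0.1).
consistent = error_ratio(errors_in_window, calls_in_window)

# Assumption for illustration: the denominator is read as a per-second rate.
calls_per_second = calls_in_window / WINDOW_SECONDS
mixed = error_ratio(errors_in_window, calls_per_second)

print(f"consistent={consistent:.2f} alerts={consistent >= THRESHOLD}")
print(f"mixed={mixed:.2f} alerts={mixed >= THRESHOLD}")
```

With these numbers the consistent ratio stays below the threshold, while the mixed one exceeds it by a wide margin. That is the noisy-alert failure mode the comment describes.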
Created DOCS-14046 for docs review.
This PR does not modify any files shipped with the agent. To help streamline the release process, please consider adding the
@@ -0,0 +1,70 @@
## Overview

[Palo Alto Networks Cortex XSOAR][1] is a security orchestration, automation, and unifying incident response (SOAR) platform that helps teams automate incident handling and integrate security tools to enhance SOC efficiency and reduce remediation time.
-[Palo Alto Networks Cortex XSOAR][1] is a security orchestration, automation, and unifying incident response (SOAR) platform that helps teams automate incident handling and integrate security tools to enhance SOC efficiency and reduce remediation time.
+[Palo Alto Networks Cortex XSOAR][1] is a security orchestration, automation, and unifying incident response (SOAR) platform that helps teams automate incident handling, integrate security tools, and reduce remediation time.
This integration parses and ingests the following types of logs:

- **Audit Logs**: Capture all administrative user activities within Palo Alto Networks Cortex XSOAR.
- **Incidents**: Capture key details of incidents, including severity, status, type, and ownership, to support tracking and investigation activities in Palo Alto Networks Cortex XSOAR.
-- **Incidents**: Capture key details of incidents, including severity, status, type, and ownership, to support tracking and investigation activities in Palo Alto Networks Cortex XSOAR.
+- **Incidents**: Capture incident details, including severity, status, type, and ownership, to support tracking and investigation in Palo Alto Networks Cortex XSOAR.
- **Audit Logs**: Capture all administrative user activities within Palo Alto Networks Cortex XSOAR.
- **Incidents**: Capture key details of incidents, including severity, status, type, and ownership, to support tracking and investigation activities in Palo Alto Networks Cortex XSOAR.

You can visualize detailed insights into these logs through the out-of-the-box dashboards. Additionally, ready-to-use Cloud SIEM detection rules are available to help you monitor and respond to potential security threats effectively.
-You can visualize detailed insights into these logs through the out-of-the-box dashboards. Additionally, ready-to-use Cloud SIEM detection rules are available to help you monitor and respond to potential security threats effectively.
+Visualize detailed insights into these logs with out-of-the-box dashboards. This integration also includes Cloud SIEM detection rules to help you monitor and respond to potential security threats.
- **Automation Insight Metrics**: Provide visibility into playbook, task, and command execution activity, including counts, failures, and execution duration, to help monitor automation performance, identify errors, and evaluate operational efficiency.
- **API Execution Metrics**: Provide visibility into API execution activity, including total calls and rate-limited requests, to help monitor integration usage, detect throttling events, and evaluate API performance.
- **SLA Metrics**: Provide visibility into incident response timelines, including mean time to detection, triage, containment, and resolution, along with counts of items within and outside SLA thresholds, helping monitor response performance and compliance.
-- **Automation Insight Metrics**: Provide visibility into playbook, task, and command execution activity, including counts, failures, and execution duration, to help monitor automation performance, identify errors, and evaluate operational efficiency.
-- **API Execution Metrics**: Provide visibility into API execution activity, including total calls and rate-limited requests, to help monitor integration usage, detect throttling events, and evaluate API performance.
-- **SLA Metrics**: Provide visibility into incident response timelines, including mean time to detection, triage, containment, and resolution, along with counts of items within and outside SLA thresholds, helping monitor response performance and compliance.
+- **Automation Insight Metrics**: Track playbook, task, and command execution activity, including counts, failures, and execution duration.
+- **API Execution Metrics**: Track API execution activity, including total calls and rate-limited requests.
+- **SLA Metrics**: Track incident response timelines, including mean time to detection, triage, containment, and resolution, along with counts of items within and outside SLA thresholds.
- **API Execution Metrics**: Provide visibility into API execution activity, including total calls and rate-limited requests, to help monitor integration usage, detect throttling events, and evaluate API performance.
- **SLA Metrics**: Provide visibility into incident response timelines, including mean time to detection, triage, containment, and resolution, along with counts of items within and outside SLA thresholds, helping monitor response performance and compliance.

Visualize detailed insights into these metrics through the out-of-the-box dashboards. Additionally, monitors are provided to alert you to any potential issues.
-Visualize detailed insights into these metrics through the out-of-the-box dashboards. Additionally, monitors are provided to alert you to any potential issues.
+Visualize detailed insights into these metrics with out-of-the-box dashboards. This integration also includes monitors to alert you to any potential issues.
"created_at": "2026-04-16",
"last_updated_at": "2026-04-16",
"title": "Anomalous spikes in playbook execution failure",
"description": "It monitors anomalous spikes in playbook failure rates by comparing failed vs total executions over time. Sudden increases may signal logic issues, integration failures, or dependency problems, impacting automation reliability and efficiency.",
-"description": "It monitors anomalous spikes in playbook failure rates by comparing failed vs total executions over time. Sudden increases may signal logic issues, integration failures, or dependency problems, impacting automation reliability and efficiency.",
+"description": "Monitors anomalous spikes in playbook failure rates by comparing failed executions against total executions over time. Sudden increases may signal logic issues, integration failures, or dependency problems that impact automation reliability.",
"created_at": "2026-04-17",
"last_updated_at": "2026-04-17",
"title": "Connectivity errors per integration alert",
"description": "It monitors for connectivity-related API execution errors per integration, which may indicate communication failures, network issues, or unavailable external services. Persistent errors can disrupt automation workflows and impact integration reliability in Cortex XSOAR.",
-"description": "It monitors for connectivity-related API execution errors per integration, which may indicate communication failures, network issues, or unavailable external services. Persistent errors can disrupt automation workflows and impact integration reliability in Cortex XSOAR.",
+"description": "Monitors connectivity-related API execution errors per integration, which may indicate communication failures, network issues, or unavailable external services. Persistent errors can disrupt automation workflows and reduce integration reliability.",
"created_at": "2026-04-16",
"last_updated_at": "2026-04-16",
"title": "High integration error rate alert",
"description": "It monitors for a high rate of integration errors, which may indicate failing commands, authentication problems, or service instability. Sustained error rates can disrupt automation workflows and reduce integration reliability in Cortex XSOAR.",
-"description": "It monitors for a high rate of integration errors, which may indicate failing commands, authentication problems, or service instability. Sustained error rates can disrupt automation workflows and reduce integration reliability in Cortex XSOAR.",
+"description": "Monitors the rate of integration errors, which may indicate failing commands, authentication problems, or service instability. Sustained error rates can disrupt automation workflows and reduce integration reliability.",
"name": "High integration error rate alert",
"type": "query alert",
"query": "sum(last_10m):sum:palo_alto_networks_cortex_xsoar.api_execution.count{integration_name:*, -api_response_type:successful}.as_count() / sum:palo_alto_networks_cortex_xsoar.api_execution.count{integration_name:*}.as_count() >= 0.5",
"message": "{{#is_warning}}\r\n⚠️ **Warning**: Elevated rate-limited API calls detected**.\r\n\r\n📊 Current error rate is **{{value}}**, which is above the warning threshold.\r\n\r\nPlease monitor API usage and check for potential throttling or quota limits.\r\n\r\n{{/is_warning}}\r\n\r\n{{#is_alert}}\r\n🚨 **Alert**: High integration error rate detected**.\r\n\r\n📊 Current error rate is **{{value}}**, exceeding the alert threshold.\r\n\r\nThis indicates that a significant portion of API calls are being rate-limited. Investigate API usage spikes, integration behavior, or vendor rate limits immediately.\r\n\r\n{{/is_alert}}\r\n\r\n@example@example.com",
Is this correct?
**Warning**: Elevated rate-limited API calls detected**
It was a typo. I’ve corrected it.
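The quoted line has three `**` markers, so one of them cannot pair up and the bold formatting will not render as intended. The thread does not say whether the typo was this stray marker or the "rate-limited" wording, so the sketch below checks only the markers. `unbalanced_bold_lines` is a hypothetical helper, and the corrected string is an assumed fix, not the text from the updated PR.

```python
def unbalanced_bold_lines(message: str) -> list[str]:
    """Return the lines of a Markdown message with an odd number of `**` markers."""
    flagged = []
    for line in message.splitlines():  # splitlines() also handles \r\n endings
        if line.count("**") % 2 == 1:
            flagged.append(line)
    return flagged


# Excerpt of the monitor message as quoted in the review.
original = (
    "⚠️ **Warning**: Elevated rate-limited API calls detected**.\r\n"
    "\r\n"
    "📊 Current error rate is **{{value}}**, which is above the warning threshold."
)

# Hypothetical correction (assumed wording, for illustration only).
corrected = (
    "⚠️ **Warning**: Elevated integration error rate detected.\r\n"
    "\r\n"
    "📊 Current error rate is **{{value}}**, which is above the warning threshold."
)

for line in unbalanced_bold_lines(original):
    print("unbalanced bold markers:", line)
```

The same check would also flag the `{{#is_alert}}` heading in the message, which ends with a stray `**` as well.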
"version": 2,
"created_at": "2026-04-16",
"last_updated_at": "2026-04-16",
"title": "High integration error rate alert",
Optional suggestion. For all three monitors, "alert" in the title is redundant since the monitor is already an alert.
This makes sense, I've updated the PR.
estherk15 left a comment
Thanks for making those updates!
What does this PR do?
Motivation
Review checklist (to be filled by reviewers)
- Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
- Add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged