
Optimize latest transform source queries across CDR packages #18098

Merged
maxcold merged 5 commits into elastic:main from maxcold:optimize-transform-queries
Apr 13, 2026

@maxcold (Contributor) commented Mar 27, 2026

Proposed commit message

Optimize latest transform source queries across 10 CDR packages

Add @timestamp range filter and _tier exclusion (data_cold, data_frozen) to 17 security_solution-* latest transforms. This bounds the source query to match the retention period and skips cold/frozen tier storage.

Problem

All CDR latest transforms (except CSP misconfiguration) scan the entire source index with no timestamp bound or storage tier filter. For large environments (e.g., 215M Wiz vulnerability docs), this causes:

  • Transform stuck in "indexing" state for hours/days on first checkpoint
  • Destination index grows to 100+ GB before retention kicks in
  • Unnecessary load on cold/frozen tier storage

Solution

Apply the same optimizations already proven on the cloud_security_posture misconfiguration transform (v0.3.0):

  1. @timestamp range filter — limit source query to match retention period
  2. _tier exclusion — skip data_cold and data_frozen tiers
  3. Normalize 24h retention to 26h — 2h buffer prevents empty Findings page during ingestion gaps
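
The first two items can be sketched as a small query builder. This is a minimal illustration in Python, not the transform definition shipped in the PR: the function name build_source_query and the exact bool/filter/must_not layout are assumptions. Only the @timestamp range, the _tier field, and the data_cold/data_frozen values come from the description above.

```python
import json


def build_source_query(window, excluded_tiers=("data_cold", "data_frozen")):
    """Return a bool query bounded by @timestamp that skips the given data tiers.

    Hypothetical helper: the layout below is one plausible shape for the
    source query, not a copy of the packaged transform definition.
    """
    return {
        "bool": {
            # Only documents newer than now-<window> are considered.
            "filter": [{"range": {"@timestamp": {"gte": f"now-{window}"}}}],
            # Documents stored on the excluded tiers are not searched.
            "must_not": [{"terms": {"_tier": list(excluded_tiers)}}],
        }
    }


query_26h = build_source_query("26h")
print(json.dumps(query_26h, indent=2))
```

Passing "4h" or "90d" instead of "26h" would give the bounds used by the other retention groups.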

Changes per retention group

| Retention | Packages | Range filter | Transforms |
|---|---|---|---|
| 4h | m365_defender, microsoft_defender_endpoint | now-4h | 2 |
| 24h→26h | aws (x2), prisma_cloud (x2), qualys_vmdr, wiz misconfig | now-26h | 6 |
| 26h | microsoft_defender_cloud (x2) | now-26h | 2 |
| 90d | aws inspector, aws_securityhub, google_scc (x2), rapid7_insightvm, tenable_io, wiz vuln | now-90d | 7 |
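
As a quick cross-check of the retention groups, the rows can be written down as data and summed. The names RETENTION_GROUPS and range_for are hypothetical and used only for this sketch; the values are taken from the rows above.

```python
# Each entry: (retention after this PR, range filter, number of transforms).
RETENTION_GROUPS = [
    ("4h", "now-4h", 2),
    ("26h", "now-26h", 6),  # previously 24h, bumped to 26h
    ("26h", "now-26h", 2),
    ("90d", "now-90d", 7),
]


def range_for(max_age):
    """Derive the lower bound of the range filter from the retention period."""
    return f"now-{max_age}"


total_transforms = sum(count for _, _, count in RETENTION_GROUPS)
all_aligned = all(range_for(age) == bound for age, bound, _ in RETENTION_GROUPS)

print(total_transforms, all_aligned)
```

The total agrees with the 17 transforms named in the proposed commit message, and every range filter matches its group's retention period.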

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have added an entry to each package's changelog.yml file.

How to test this PR locally

  1. Install any modified package via elastic-package install --zip
  2. Verify the transform's source query includes the @timestamp range and _tier exclusion
  3. With data in cold/frozen tiers, confirm the transform skips those documents
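
Step 2 could be automated with a small check like the one below. It is a sketch under assumptions: the query dictionary is taken to be the source.query object of the transform (for example, as returned by the get transform API), and the helper names are hypothetical. It accepts either a single clause or a list of clauses under filter and must_not.

```python
def _as_list(clause):
    """Normalize a bool clause that may be missing, a dict, or a list."""
    if clause is None:
        return []
    return clause if isinstance(clause, list) else [clause]


def has_timestamp_range(query):
    """True if the bool filter contains a lower bound on @timestamp."""
    for clause in _as_list(query.get("bool", {}).get("filter")):
        bounds = clause.get("range", {}).get("@timestamp", {})
        if "gte" in bounds or "gt" in bounds:
            return True
    return False


def excluded_tiers(query):
    """Collect the _tier values listed under must_not terms clauses."""
    tiers = set()
    for clause in _as_list(query.get("bool", {}).get("must_not")):
        tiers.update(clause.get("terms", {}).get("_tier", []))
    return tiers


def is_optimized(query):
    """True if the query has both the time bound and the cold/frozen exclusion."""
    return has_timestamp_range(query) and {"data_cold", "data_frozen"} <= excluded_tiers(query)


# Illustrative inputs only, not output copied from a real cluster.
optimized = {
    "bool": {
        "filter": [{"range": {"@timestamp": {"gte": "now-90d"}}}],
        "must_not": [{"terms": {"_tier": ["data_cold", "data_frozen"]}}],
    }
}
unbounded = {"match_all": {}}

print(is_optimized(optimized), is_optimized(unbounded))
```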

Related issues

Fixes #18370

maxcold added 2 commits March 27, 2026 12:44
Add @timestamp range filter and _tier exclusion (data_cold, data_frozen)
to 17 security_solution latest transforms. This bounds the source query
to match the retention period and skips cold/frozen tier storage, preventing
transforms from scanning entire indices in large environments.

Changes per retention group:
- 4h retention (m365_defender, microsoft_defender_endpoint): range now-4h
- 24h→26h retention (aws x2, prisma_cloud x2, qualys_vmdr, wiz misconfig):
  bumped max_age to 26h with 2h buffer, range now-26h
- 26h retention (microsoft_defender_cloud x2): range now-26h
- 90d retention (aws inspector, aws_securityhub, google_scc x2,
  rapid7_insightvm, tenable_io, wiz vuln): range now-90d
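
The reasoning behind the 2h buffer can be illustrated with a toy timeline. The schedule is an assumption made for this sketch (a daily ingestion run whose next batch arrives one hour late), and still_retained is a hypothetical helper, not part of any package.

```python
from datetime import datetime, timedelta, timezone


def still_retained(doc_time, now, max_age):
    """True while a document is no older than the retention max_age."""
    return now - doc_time <= max_age


# Assumed scenario: the last batch arrived at midnight UTC and the next
# daily batch is one hour late, so we look at the data 25 hours later.
last_ingest = datetime(2026, 3, 1, 0, 0, tzinfo=timezone.utc)
check_time = last_ingest + timedelta(hours=25)

retained_24h = still_retained(last_ingest, check_time, timedelta(hours=24))
retained_26h = still_retained(last_ingest, check_time, timedelta(hours=26))

print(retained_24h, retained_26h)
```

Under these assumptions the 24h retention has already dropped the previous batch before the late one arrives, while the 26h retention still covers it.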
@andrewkroh andrewkroh added Integration:qualys_vmdr Qualys VMDR Integration:google_scc Google Security Command Center Integration:m365_defender Microsoft Defender XDR Integration:microsoft_defender_endpoint Microsoft Defender for Endpoint Integration:rapid7_insightvm Rapid7 InsightVM Integration:aws_securityhub AWS Security Hub Integration:aws AWS Integration:tenable_io Tenable Vulnerability Management Integration:wiz Wiz Integration:microsoft_defender_cloud Microsoft Defender for Cloud Integration:prisma_cloud Palo Alto Prisma Cloud labels Mar 27, 2026

elastic-vault-github-plugin-prod bot commented Mar 27, 2026

🚀 Benchmarks report

To see the full report, comment with /test benchmark fullreport

maxcold added 3 commits April 7, 2026 13:13
The transform source queries added in e22d379 filter on @timestamp >= now-90d,
but the Docker mock service configs had timestamps from 2018-2023 which fell
outside this window during CI system tests, causing "no documents found in
preview for transform" errors.

Shift all @timestamp-mapped fields in the 6 affected packages' Docker mock
configs to March 2026 dates so they fall within the 90-day window.

Resolve version/changelog conflicts with upstream changes in aws (6.4.2-6.4.3),
m365_defender (5.12.3), and tenable_io (4.9.1). Keep our version bumps on top.
Merge tenable_io Docker config with upstream's added empty chunks/2 route while
preserving our updated timestamps.

The pubsub system test uses finding.log (separate from config.yml used by the
default test). Update eventTime values to March 2026 to fall within the
transform's now-90d filter window.
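
The fixture problem described in these commits can be reproduced with a small window check. This is a sketch only: the sample timestamps are made up for illustration (they are not values from the mock configs), in_window is a hypothetical helper, and "now" is pinned to the commit date so the result is deterministic.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=90)  # mirrors the now-90d lower bound


def in_window(timestamp, now, window=WINDOW):
    """True if an ISO 8601 timestamp is at or after now - window."""
    event_time = datetime.fromisoformat(timestamp)
    return event_time >= now - window


# Pinned reference time (the date of the fixture commits).
now = datetime(2026, 4, 7, tzinfo=timezone.utc)

old_fixture = "2023-01-15T10:00:00+00:00"      # illustrative pre-fix value
shifted_fixture = "2026-03-15T10:00:00+00:00"  # illustrative post-fix value

print(in_window(old_fixture, now), in_window(shifted_fixture, now))
```

A document dated 2023 falls outside the window and would be missing from the transform preview, while a March 2026 date is matched.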
@elasticmachine

💚 Build Succeeded

@maxcold maxcold marked this pull request as ready for review April 9, 2026 08:49
@maxcold maxcold requested review from a team as code owners April 9, 2026 08:49
@maxcold maxcold requested a review from kcreddy April 9, 2026 08:49
@andrewkroh andrewkroh added the Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations] label Apr 9, 2026
@elasticmachine

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@maxcold maxcold merged commit c3827e6 into elastic:main Apr 13, 2026
9 checks passed
@elastic-vault-github-plugin-prod

Package aws - 6.5.0 containing this change is available at https://epr.elastic.co/package/aws/6.5.0/

@elastic-vault-github-plugin-prod

Package aws_securityhub - 0.3.0 containing this change is available at https://epr.elastic.co/package/aws_securityhub/0.3.0/

@elastic-vault-github-plugin-prod

Package google_scc - 2.4.0 containing this change is available at https://epr.elastic.co/package/google_scc/2.4.0/

@elastic-vault-github-plugin-prod

Package m365_defender - 5.13.0 containing this change is available at https://epr.elastic.co/package/m365_defender/5.13.0/

@elastic-vault-github-plugin-prod

Package microsoft_defender_cloud - 3.4.0 containing this change is available at https://epr.elastic.co/package/microsoft_defender_cloud/3.4.0/

@elastic-vault-github-plugin-prod

Package microsoft_defender_endpoint - 4.6.0 containing this change is available at https://epr.elastic.co/package/microsoft_defender_endpoint/4.6.0/

@elastic-vault-github-plugin-prod

Package prisma_cloud - 4.1.0 containing this change is available at https://epr.elastic.co/package/prisma_cloud/4.1.0/

@elastic-vault-github-plugin-prod

Package qualys_vmdr - 6.18.0 containing this change is available at https://epr.elastic.co/package/qualys_vmdr/6.18.0/

@elastic-vault-github-plugin-prod

Package rapid7_insightvm - 2.8.0 containing this change is available at https://epr.elastic.co/package/rapid7_insightvm/2.8.0/

@elastic-vault-github-plugin-prod

Package tenable_io - 4.10.0 containing this change is available at https://epr.elastic.co/package/tenable_io/4.10.0/

@elastic-vault-github-plugin-prod

Package wiz - 4.2.0 containing this change is available at https://epr.elastic.co/package/wiz/4.2.0/

Labels

Integration:aws_securityhub AWS Security Hub Integration:aws AWS Integration:google_scc Google Security Command Center Integration:m365_defender Microsoft Defender XDR Integration:microsoft_defender_cloud Microsoft Defender for Cloud Integration:microsoft_defender_endpoint Microsoft Defender for Endpoint Integration:prisma_cloud Palo Alto Prisma Cloud Integration:qualys_vmdr Qualys VMDR Integration:rapid7_insightvm Rapid7 InsightVM Integration:tenable_io Tenable Vulnerability Management Integration:wiz Wiz Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations]

Development

Successfully merging this pull request may close these issues.

Optimize latest transform source queries across CDR packages

4 participants