
FIX: add support for offset and limit on listing aggregations #25943

Merged
IceS2 merged 11 commits into main from use-limit-and-offset-on-listing-agg on Feb 19, 2026

Conversation

@IceS2
Contributor

IceS2 commented Feb 17, 2026

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.


Summary by Gitar

  • Added offset and limit pagination support for aggregated queries:
    - Extended the ListParams SDK model with offset and latest (boolean) fields, enabling pagination parameters
    - Enhanced EntityTimeSeriesRepository.listLatestFromSearch() to accept limit/offset/sortField/sortType for paginated aggregations
  • Implemented a dual-aggregation pagination strategy:
    - Added bucket_sort and stats_bucket pipeline aggregations (Elasticsearch and OpenSearch) to slice result buckets and count totals accurately
    - Built parallel aggregation trees when paginating: byTerms for sliced results and byTermsCount for the exact filtered count (a sketch follows below)
  • Updated the resource layer to support pagination:
    - TestCaseResolutionStatusResource and TestCaseResultResource now pass pagination parameters to the aggregation method
    - Removed documentation stating "offset and limit are ignored"
  • Added comprehensive test coverage:
    - Integration test (IncidentPaginationIT) verifies pagination across multiple pages and edge cases (11 test cases, 5-item pages)
    - Unit tests (EntityTimeSeriesRepositoryPaginationTest and SearchAggregationTest) validate aggregation node building and bucket_sort/stats_bucket creation

This will update automatically on new commits.
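To make the dual-aggregation strategy described in the summary concrete, here is a rough sketch using the Elasticsearch Java aggregation builders. This is an illustrative reconstruction rather than the PR's actual code: the group-by field, the aggregation names, and the fixed terms size are assumptions.

    import java.util.List;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.PipelineAggregatorBuilders;
    import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.elasticsearch.search.sort.FieldSortBuilder;
    import org.elasticsearch.search.sort.SortOrder;

    class PaginatedAggregationSketch {
      // Two parallel trees: "byTerms" is sliced to the requested page with bucket_sort,
      // while "byTermsCount" feeds a sibling stats_bucket whose bucket count gives the total.
      static SearchSourceBuilder build(String groupByField, int offset, int limit) {
        TermsAggregationBuilder byTerms =
            AggregationBuilders.terms("byTerms")
                .field(groupByField)
                .size(10_000) // must hold all candidate buckets before slicing
                .subAggregation(AggregationBuilders.topHits("latest").size(1))
                .subAggregation(
                    PipelineAggregatorBuilders.bucketSort(
                            "pagination",
                            List.of(new FieldSortBuilder("_key").order(SortOrder.ASC)))
                        .from(offset)
                        .size(limit));

        TermsAggregationBuilder byTermsCount =
            AggregationBuilders.terms("byTermsCount").field(groupByField).size(10_000);

        return new SearchSourceBuilder()
            .size(0)
            .aggregation(byTerms)
            .aggregation(byTermsCount)
            // The "count" of this stats_bucket equals the number of byTermsCount buckets,
            // which is what the pagination metadata can report as the total.
            .aggregation(
                PipelineAggregatorBuilders.statsBucket("totalCount", "byTermsCount>_count"));
      }
    }

On the response side, the page would then be read from the byTerms buckets and the total from the count field of the totalCount aggregation.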

TeddyCr previously approved these changes Feb 17, 2026
@github-actions
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Comment on lines 159 to 161
    aggregationsMap.removeIf(
        aggregationMap -> aggregationMap.get("aggType").contains("bucket_selector"));
    for (int j = 0; j < aggregationsMap.size(); j++) {

💡 Bug: bucket_sort not excluded from getAggregationMetadata()

The getAggregationMetadata() method (line 159-161) removes bucket_selector from metadata but does not remove bucket_sort. Since bucket_sort is also a pipeline aggregation (not a dimension or metric), it should be excluded just like bucket_selector.

When bucket_sort appears as a leaf node in metadata, it will:

  1. Add null to the metrics list (line 171, since it has no field key)
  2. Add bucket_sort#pagination to the keys list (line 184)

This could corrupt report metadata if the aggregation tree is processed through the generic aggregation path. Currently this doesn't affect listLatestFromSearch (which doesn't use getAggregationMetadata), but it's a latent bug if the tree structure is reused elsewhere.

Suggested fix:

        aggregationsMap.removeIf(
            aggregationMap -> aggregationMap.get("aggType").contains("bucket_selector")
                || aggregationMap.get("aggType").contains("bucket_sort"));


Comment on lines 158 to 160
    // remove bucket_selector from metadata as it is a filter and neither a dimension nor a metric
    aggregationsMap.removeIf(
        aggregationMap -> aggregationMap.get("aggType").contains("bucket_selector"));

💡 Edge Case: stats_bucket also not excluded from getAggregationMetadata()

The existing finding notes that bucket_sort isn't excluded from getAggregationMetadata(). The same issue applies to stats_bucket, another pipeline aggregation added in this PR.

At line 159, only bucket_selector is removed:

aggregationsMap.removeIf(
    aggregationMap -> aggregationMap.get("aggType").contains("bucket_selector"));

If the pagination aggregation tree is ever passed through the data quality report path, both bucket_sort and stats_bucket nodes would leak into the metadata, potentially adding null to the metrics list (since neither has a "field" key).

While this isn't currently triggered (the pagination aggregation doesn't go through getAggregationMetadata()), it's a defensive fix worth making alongside the existing finding. Consider generalizing the exclusion to all pipeline aggregations.

Suggested fix:

        // remove pipeline aggregations from metadata as they are neither a dimension nor a metric
        aggregationsMap.removeIf(
            aggregationMap -> {
              String aggType = aggregationMap.get("aggType");
              return aggType.contains("bucket_selector")
                  || aggType.contains("bucket_sort")
                  || aggType.contains("stats_bucket");
            });


@gitar-bot

gitar-bot Bot commented Feb 19, 2026

🔍 CI failure analysis for 6c8f33d: maven-collate-ci failed again (5th consecutive occurrence) due to external Collate workflow failure. Cascading from flaky tests. Pagination code remains correct.

Issue

maven-collate-ci job failed - 5th consecutive occurrence of this cascading failure pattern from external Collate workflow.

Root Cause

Cascading failure from external Collate workflow encountering flaky integration tests.

Failed Job (Current Run - Commit 6c8f33d)

maven-collate-ci (64137552240): External workflow failure

Details

Job Flow:

  1. ✅ Verified PR labels
  2. ✅ Triggered Collate workflow (commit 6c8f33d)
  3. ⏳ Waited 27 minutes 43 seconds
  4. ❌ Workflow failed: conclusion=failure
  5. ❌ Job failed: ##[error]Workflow run has failed

Pattern Analysis

maven-collate-ci failures - all 5 consecutive occurrences:

  1. Commit 774bad0 (64042523505)
  2. Commit c14c6f6 (64120353941)
  3. Commit 6c8f33d (64123966800)
  4. Commit 6c8f33d (64127470908)
  5. Commit 6c8f33d (64137552240, this run)

Conclusion: Persistent pattern of external Collate workflow failures across all CI runs for this PR.

Why This Is Unrelated to the PR

  1. Wrapper job triggering external workflow
  2. Collate workflow runs same test suite in separate environment
  3. Encountering same flaky infrastructure issues
  4. PR modifies only pagination logic
  5. IncidentPaginationIT continues to pass in direct CI runs

Conclusion

Fifth consecutive maven-collate-ci cascading failure, caused by environmental flakiness in the external Collate workflow. The pagination functionality is working correctly and the PR is ready from a code perspective.

Code Review: 👍 Approved with suggestions (4 resolved / 6 findings)

Well-implemented pagination feature with solid test coverage. The dual-aggregation strategy for accurate total counts is sound. One prior finding about pipeline aggregation exclusion in metadata remains unresolved, and stats_bucket has the same issue.

💡 Edge Case: stats_bucket also not excluded from getAggregationMetadata()

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchAggregation.java:158-160

The existing finding notes that bucket_sort isn't excluded from getAggregationMetadata(). The same issue applies to stats_bucket, another pipeline aggregation added in this PR.

At line 159, only bucket_selector is removed:

aggregationsMap.removeIf(
    aggregationMap -> aggregationMap.get("aggType").contains("bucket_selector"));

If the pagination aggregation tree is ever passed through the data quality report path, both bucket_sort and stats_bucket nodes would leak into the metadata, potentially adding null to the metrics list (since neither has a "field" key).

While this isn't currently triggered (the pagination aggregation doesn't go through getAggregationMetadata()), it's a defensive fix worth making alongside the existing finding. Consider generalizing the exclusion to all pipeline aggregations.

Suggested fix
        // remove pipeline aggregations from metadata as they are neither a dimension nor a metric
        aggregationsMap.removeIf(
            aggregationMap -> {
              String aggType = aggregationMap.get("aggType");
              return aggType.contains("bucket_selector")
                  || aggType.contains("bucket_sort")
                  || aggType.contains("stats_bucket");
            });
💡 Bug: bucket_sort not excluded from getAggregationMetadata()

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchAggregation.java:159-161

The getAggregationMetadata() method (line 159-161) removes bucket_selector from metadata but does not remove bucket_sort. Since bucket_sort is also a pipeline aggregation (not a dimension or metric), it should be excluded just like bucket_selector.

When bucket_sort appears as a leaf node in metadata, it will:

  1. Add null to the metrics list (line 171, since it has no field key)
  2. Add bucket_sort#pagination to the keys list (line 184)

This could corrupt report metadata if the aggregation tree is processed through the generic aggregation path. Currently this doesn't affect listLatestFromSearch (which doesn't use getAggregationMetadata), but it's a latent bug if the tree structure is reused elsewhere.

Suggested fix
        aggregationsMap.removeIf(
            aggregationMap -> aggregationMap.get("aggType").contains("bucket_selector")
                || aggregationMap.get("aggType").contains("bucket_sort"));
✅ 4 resolved
Bug: Cardinality total count ignores bucket_selector filtering

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityTimeSeriesRepository.java:511
📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityTimeSeriesRepository.java:441-452
The cardinality aggregation counts all unique values of the groupBy field across the entire index (or query scope), but the bucket_selector pipeline aggregation filters out buckets that don't match content filters (i.e., where the latest timestamp doesn't match the matching timestamp). This means the cardinality-reported total count will be higher than the actual number of valid buckets that pass the bucket_selector filter.

For example, if there are 200 unique test cases (cardinality = 200), but only 150 match the content filters, the API will report total: 200 while only 150 results are actually pageable. This causes the UI to show more pages than actually exist, with later pages returning empty results.

The cardinality aggregation doesn't account for the bucket_selector filtering logic. To get an accurate count, you'd need a different approach — for example, running a separate query with the content filters applied, or using the terms aggregation with a large enough size to get all buckets and counting the results post-filter.
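
For illustration, here is one way the second approach could look: count the buckets that survive the selector rather than the raw cardinality. This is a hedged sketch, not the PR's code; the field names, aggregation names, and the stand-in selector script are assumptions.

    import java.util.Map;
    import org.elasticsearch.script.Script;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.PipelineAggregatorBuilders;
    import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
    import org.elasticsearch.search.builder.SearchSourceBuilder;

    class FilteredTotalCountSketch {
      static SearchSourceBuilder build(String groupByField) {
        TermsAggregationBuilder byTermsCount =
            AggregationBuilders.terms("byTermsCount")
                .field(groupByField)
                .size(10_000)
                .subAggregation(AggregationBuilders.max("latestTimestamp").field("timestamp"))
                .subAggregation(
                    PipelineAggregatorBuilders.bucketSelector(
                        "contentFilter",
                        Map.of("latest", "latestTimestamp"),
                        // Stand-in for the real content-filter script used in the results tree.
                        new Script("params.latest > 0")));

        return new SearchSourceBuilder()
            .size(0)
            .aggregation(byTermsCount)
            // bucket_selector prunes non-matching buckets before the sibling pipeline runs,
            // so this stats_bucket's "count" is the number of buckets that actually match,
            // unlike a cardinality aggregation, which ignores the selector.
            .aggregation(
                PipelineAggregatorBuilders.statsBucket(
                    "totalCount", "byTermsCount>latestTimestamp"));
      }
    }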

Bug: Terms aggregation size=100 silently truncates bucket_sort pagination

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityTimeSeriesRepository.java:467
📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchAggregation.java:38
The terms aggregation created in buildComplexAggregation() uses SearchAggregation.terms("byTerms", groupBy) which hardcodes size: 100 (see SearchAggregation.java:38). The bucket_sort pipeline aggregation operates on the buckets already returned by the parent terms aggregation, so it can only paginate within those 100 buckets.

If a user has more than 100 unique groups (e.g., 500 test cases) and requests offset=50, limit=100, the terms aggregation returns only 100 buckets, the bucket_selector filters some out, and then bucket_sort tries to skip 50 and return 100 — but there aren't enough buckets. Meanwhile, the cardinality aggregation correctly reports the true total (500), creating a mismatch between reported total and available pages.

The terms aggregation size needs to be increased to accommodate pagination. At minimum, it should be Math.min(offset + limit, MAX_AGGREGATE_SIZE) when pagination parameters are provided, or set to MAX_AGGREGATE_SIZE to ensure all buckets are available for the bucket_selector filter and subsequent bucket_sort.
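
A minimal sketch of that sizing rule, assuming pagination parameters arrive as nullable integers (the helper and its name are hypothetical; MAX_AGGREGATE_SIZE is the existing cap referenced above):

    class TermsSizeSketch {
      // Choose the terms-aggregation size so bucket_sort has enough buckets to slice.
      static int termsSizeFor(Integer offset, Integer limit, int maxAggregateSize) {
        if (offset == null || limit == null) {
          // No pagination requested: keep the current behaviour.
          return maxAggregateSize;
        }
        // bucket_sort can only slice buckets the terms aggregation returned, so the size
        // must cover at least offset + limit. When a bucket_selector also drops buckets,
        // returning maxAggregateSize unconditionally is the safer choice.
        return Math.min(offset + limit, maxAggregateSize);
      }
    }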

Quality: Swallowed exception hides failures in cardinality extraction

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityTimeSeriesRepository.java:450
The catch block at lines 450-452 silently swallows all exceptions during cardinality extraction. When pagination is in effect, falling back to entityList.size() will return the size of the current page rather than the true total, which would break pagination metadata. At minimum, the exception should be logged at WARN level so failures are observable in production.

} catch (Exception e) {
  LOG.warn("Failed to extract cardinality from aggregation response, falling back to entity list size", e);
}
Quality: precision_threshold in cardinality builder is never applied

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchAggregation.java:134
The precision_threshold parameter is set in SearchAggregation.cardinality() at line 134 (value.put("precision_threshold", "3000")), but the existing ElasticCardinalityAggregations and OpenCardinalityAggregations classes do not read or apply this parameter when building the aggregation — they only read params.get("field"). This means the precision threshold setting is dead code that may mislead maintainers into thinking it's being applied. Either remove it from the builder, or update the cardinality aggregation implementations to use it.
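
If the parameter is kept, here is a minimal sketch of how a cardinality builder could honour it, assuming a params map shaped like the one SearchAggregation.cardinality() produces (the class and method names are illustrative, not the existing ElasticCardinalityAggregations API):

    import java.util.Map;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.metrics.CardinalityAggregationBuilder;

    class CardinalitySketch {
      static CardinalityAggregationBuilder build(String name, Map<String, String> params) {
        CardinalityAggregationBuilder builder =
            AggregationBuilders.cardinality(name).field(params.get("field"));
        // Read precision_threshold if present; today the builder sets it but nothing reads it,
        // which is the dead code flagged above.
        String precision = params.get("precision_threshold");
        if (precision != null) {
          builder.precisionThreshold(Long.parseLong(precision));
        }
        return builder;
      }
    }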


@sonarqubecloud

Labels

Ingestion
safe to test (Add this label to run secure Github workflows on PRs)


4 participants