Only calculate aggregations when requested #961

Open
jazairi wants to merge 2 commits into main from use-491

Conversation

@jazairi
Contributor

@jazairi jazairi commented Apr 28, 2026

Why these changes are being introduced:

Aggregations are currently calculated even when
they are not requested in the GraphQL query. It
would be more efficient to calculate them only
when they are needed.

Relevant ticket(s):

How this addresses that need:

This adds a requested_aggregations method to
Query Type that evaluates which aggregations are
requested in the query. This information is then
used in the Aggregations and Opensearch models
to calculate only the requested aggregations.

Side effects of this change:

None.

Developer

  • All new ENV is documented in README
  • All new ENV has been added to Heroku Pipeline, Staging and Prod
  • ANDI or Wave has been run in accordance with
    our guide and
    all issues introduced by these changes have been resolved or opened as new
    issues (link to those issues in the Pull Request details above)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer

  • The commit message is clear and follows our guidelines
    (not just this pull request message)
  • There are appropriate tests covering any new functionality
  • The documentation has been updated or is unnecessary
  • The changes have been verified
  • New dependencies are appropriate or there were no changes

Requires database migrations?

NO

Includes new or updated dependencies?

NO

@qltysh

qltysh Bot commented Apr 28, 2026

❌ 3 blocking issues (4 total)

| Tool | Category | Rule | Count |
| --- | --- | --- | --- |
| rubocop | Lint | Assignment Branch Condition size for search is too high. [<6, 16, 0> 17.09/17] | 1 |
| rubocop | Style | Line is too long. [160/120] | 1 |
| rubocop | Lint | Avoid parameter lists longer than 5 parameters. [8/5] | 1 |
| qlty | Structure | Function with many parameters (count = 8): search | 1 |

Comment thread app/graphql/types/query_type.rb Outdated
Comment thread app/graphql/types/query_type.rb Outdated
Comment thread app/graphql/types/query_type.rb Outdated
Comment thread app/models/opensearch.rb Outdated
Comment thread test/models/aggregations_test.rb
Comment thread test/models/aggregations_test.rb Outdated
Comment thread test/models/aggregations_test.rb Outdated
Comment thread test/models/aggregations_test.rb Outdated
Comment thread test/models/aggregations_test.rb Outdated
Comment thread test/models/aggregations_test.rb Outdated
@mitlib mitlib temporarily deployed to timdex-api-p-use-491-usixf4tyg April 28, 2026 21:36 Inactive
@jazairi jazairi temporarily deployed to timdex-api-p-use-491-usixf4tyg April 28, 2026 21:38 Inactive

- results = Opensearch.new.search(from, query, Timdex::OSClient, highlight: highlight_requested?, index: index, fulltext: fulltext, query_mode: query_mode)
+ results = Opensearch.new.search(from, query, Timdex::OSClient, highlight: highlight_requested?, index: index,
+                                 fulltext: fulltext, query_mode: query_mode, requested_aggregations: requested_aggregations)

Line is too long. [160/120] [rubocop:Layout/LineLength]

Comment thread app/models/opensearch.rb
MAX_SIZE = 200

- def search(from, params, client, highlight: false, index: nil, fulltext: false, query_mode: 'keyword')
+ def search(from, params, client, highlight: false, index: nil, fulltext: false, query_mode: 'keyword',

Found 2 issues:

1. Function with many parameters (count = 8): search [qlty:function-parameters]
2. Avoid parameter lists longer than 5 parameters. [8/5] [rubocop:Metrics/ParameterLists]
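
One common way to address both findings is to bundle the optional settings into a single options object. The sketch below is only an illustration under stated assumptions: `SearchOptions` and `search_sketch` are hypothetical names, the first four defaults are copied from the `search` signature above, and the empty-array default for `requested_aggregations` is a guess because the new signature is truncated in the diff. It is not what this PR does.

```ruby
# Hypothetical options object; field names mirror the keyword arguments of search.
SearchOptions = Struct.new(:highlight, :index, :fulltext, :query_mode, :requested_aggregations,
                           keyword_init: true) do
  # Defaults for the first four fields come from the existing signature;
  # the requested_aggregations default is an assumption.
  def initialize(highlight: false, index: nil, fulltext: false, query_mode: 'keyword',
                 requested_aggregations: [])
    super(highlight: highlight, index: index, fulltext: fulltext, query_mode: query_mode,
          requested_aggregations: requested_aggregations)
  end
end

# Hypothetical signature with four parameters instead of eight.
def search_sketch(from, params, client, options = SearchOptions.new)
  # A real implementation would build and send the OpenSearch request here;
  # this sketch only echoes its inputs so the call shape is visible.
  { from: from, params: params, client: client, options: options.to_h }
end

opts = SearchOptions.new(fulltext: true, requested_aggregations: [:source])
result = search_sketch('0', { q: 'data' }, :fake_client, opts)
p result[:options]
```

The trade-off is that callers must construct the options object, so whether this is worth it depends on how many call sites exist.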

Copilot AI left a comment

Pull request overview

This PR aims to improve search efficiency by only computing OpenSearch aggregations when the GraphQL query actually requests them, using GraphQL field-usage analysis to determine which aggregations to include.

Changes:

  • Add requested_aggregations detection in GraphQL QueryType#search and pass it through to the OpenSearch query layer.
  • Add Aggregations.for_request to filter the aggregation definitions down to only requested ones, and update Opensearch#build_query to conditionally include aggregations.
  • Add model tests covering conditional inclusion/exclusion of aggregations and for_request filtering behavior.
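
The filtering and conditional inclusion described in these bullets can be sketched roughly as follows. This is a minimal illustration rather than the PR's code: `AggregationsSketch`, its `ALL` hash, the aggregation bodies, and `build_query_sketch` are hypothetical stand-ins, and the real `Aggregations.for_request` and `Opensearch#build_query` may be shaped differently (for example, they may validate unknown names instead of ignoring them).

```ruby
# Hypothetical stand-in for the Aggregations model; keys and bodies are illustrative.
module AggregationsSketch
  ALL = {
    content_type: { terms: { field: 'content_type' } },
    source: { terms: { field: 'source' } },
    languages: { terms: { field: 'languages.keyword' } }
  }.freeze

  # Keep only the known aggregations that were asked for.
  # Unknown names are silently ignored in this sketch (an assumption).
  def self.for_request(requested)
    ALL.slice(*Array(requested).map(&:to_sym))
  end
end

# Hypothetical query builder: attach :aggregations only when something was requested.
def build_query_sketch(query_body, requested_aggregations: [])
  request = { query: query_body }
  aggs = AggregationsSketch.for_request(requested_aggregations)
  request[:aggregations] = aggs unless aggs.empty?
  request
end

p build_query_sketch({ match_all: {} })
p build_query_sketch({ match_all: {} }, requested_aggregations: %i[source])
```

With no requested aggregations the request hash carries only the query, so the search backend has nothing extra to compute.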

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Summary per file

| File | Description |
| --- | --- |
| app/graphql/types/query_type.rb | Computes requested aggregation fields from tracer used_fields, passes them to OpenSearch, and makes bucket-collapsing nil-safe. |
| app/models/opensearch.rb | Adds requested_aggregations: param and only injects aggregations into the OpenSearch request when present. |
| app/models/aggregations.rb | Introduces for_request to select only a subset of known aggregations. |
| test/models/opensearch_test.rb | Adds tests asserting aggregations are included/excluded in built query based on requested list. |
| test/models/aggregations_test.rb | Adds tests for Aggregations.for_request selection/validation behavior. |

Comment thread app/graphql/types/query_type.rb Outdated
Comment on lines +126 to +129
def requested_aggregations
  used_fields = context[:tracers].first.log_data[:used_fields]
  used_fields.select { |field| field.start_with?('Aggregations.') }
             .map { |field| field.sub('Aggregations.', '').to_sym }
Copilot AI Apr 28, 2026

requested_aggregations is derived from used_fields entries, which (by default in graphql-ruby) are GraphQL names like Aggregations.accessToFiles / Aggregations.contentType. Calling to_sym directly on the suffix produces symbols like :accessToFiles / :contentType, which won't match the snake_case keys used by Aggregations.all (e.g., :access_to_files, :content_type). Additionally, the GraphQL field format corresponds to the OpenSearch aggregation key content_format, so with the current mapping the symbol is :format and the content_format aggregation will never be requested. Consider underscoring the extracted field name (and mapping format -> content_format) before passing it into Opensearch/Aggregations.for_request, and add or adjust tests to cover at least one camelCase field such as contentType or accessToFiles, as well as the format/content_format mapping.

Suggested change
- def requested_aggregations
-   used_fields = context[:tracers].first.log_data[:used_fields]
-   used_fields.select { |field| field.start_with?('Aggregations.') }
-              .map { |field| field.sub('Aggregations.', '').to_sym }
+ def requested_aggregation_field(field_name)
+   return :content_format if field_name == 'format'
+   field_name.underscore.to_sym
+ end
+ def requested_aggregations
+   used_fields = context[:tracers].first.log_data[:used_fields]
+   used_fields.select { |field| field.start_with?('Aggregations.') }
+              .map { |field| requested_aggregation_field(field.sub('Aggregations.', '')) }
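
To make the mismatch described in this comment concrete, here is a small self-contained sketch in plain Ruby. It assumes `used_fields` holds strings such as `Aggregations.contentType`, as the comment states. `snake_case` is a hypothetical stand-in for ActiveSupport's `String#underscore` (handling simple camelCase only) so the snippet runs without Rails, and `aggregation_key` and `normalize_requested` are illustrative names, not methods from the PR.

```ruby
# Hypothetical stand-in for ActiveSupport's String#underscore (simple camelCase only).
def snake_case(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Map a GraphQL field name to an aggregation key, special-casing format.
def aggregation_key(field_name)
  return :content_format if field_name == 'format'

  snake_case(field_name).to_sym
end

# Select the Aggregations.* entries and normalize their suffixes.
def normalize_requested(used_fields)
  prefix = 'Aggregations.'
  used_fields.select { |field| field.start_with?(prefix) }
             .map { |field| aggregation_key(field.delete_prefix(prefix)) }
end

used = ['Query.search', 'Aggregations.contentType', 'Aggregations.format', 'Aggregations.source']

# The mapping without normalization: camelCase symbols and :format pass through unchanged.
naive = used.select { |field| field.start_with?('Aggregations.') }
            .map { |field| field.sub('Aggregations.', '').to_sym }

p naive
p normalize_requested(used)
```

Only single-word names such as `source` come out the same under both mappings, which may be why a test using only such names would not surface the problem.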

Contributor Author

This seems critical, but it's also confusing to me. I'm not sure why the aggregations model has a different name for the format aggregation. I'm going to look at it tomorrow with fresh eyes to see if it makes more sense.

Contributor Author

After some discussion, we think this is related to format being a stop word in the REST API. Out of caution, I've decided to leave this mapping as-is.

Comment thread app/graphql/types/query_type.rb
jazairi added 2 commits May 4, 2026 13:39
Why these changes are being introduced:

Aggregations are currently calculated even when
they are not requested in the GraphQL query. It
would be more efficient to calculate them only
when they are needed.

Relevant ticket(s):

- [USE-491](https://mitlibraries.atlassian.net/browse/USE-491)

How this addresses that need:

This adds a `requested_aggregations` method to
Query Type that evaluates which aggregations are
requested in the query. This information is then
used in the Aggregations and Opensearch models
to calculate only the requested aggregations.

Side effects of this change:

None.