Adding support for big decimal aggregations via MSQ #18164

Merged (4 commits) on Jul 1, 2025

@cryptoe (Contributor) commented Jun 20, 2025

  • Adding support for big decimal aggregations via MSQ
  • Minor refactoring to move classes into separate packages.
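
As a rough illustration of what a big decimal sum buys over floating-point aggregation, here is a minimal sketch using plain `java.math.BigDecimal`. The class name `BigSumSketch`, the method `sumByItem`, and the sample rows are hypothetical and are not taken from this PR; the sketch also does not reproduce the extension's compressed, fixed-scale representation. It only shows exact decimal summation grouped by a key.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not code from the compressed-big-decimal extension.
final class BigSumSketch
{
  // Sums decimal strings per item name using BigDecimal, so no precision is lost.
  // Each row is {itemName, saleAmount}.
  static Map<String, BigDecimal> sumByItem(String[][] rows)
  {
    Map<String, BigDecimal> totals = new LinkedHashMap<>();
    for (String[] row : rows) {
      totals.merge(row[0], new BigDecimal(row[1]), BigDecimal::add);
    }
    return totals;
  }

  public static void main(String[] args)
  {
    String[][] rows = {
        {"apple", "0.10"},
        {"apple", "0.20"},
        {"pear", "12345678901234567890.123456789"},
        {"pear", "0.000000001"}
    };
    // Exact decimal totals per item.
    System.out.println(sumByItem(rows));
    // The same apple sum in double arithmetic picks up a rounding error.
    System.out.println(0.10 + 0.20);
  }
}
```

The double sum on the last line is the kind of rounding drift that exact decimal aggregation avoids for monetary columns such as a sale amount.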

Release note

Adding support for big decimal aggregations via MSQ

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

SELECT
TIME_PARSE(TRIM("timestamp")) AS "__time",
"itemName",
BIG_SUM("saleAmount") as amount

@kgyrtkirk (Member) commented Jun 23, 2025

I guess the descendants of CompressedBigDecimalSqlAggregatorTestBase are not run under MSQ, as it would be weird to load them in the MSQ module.

I wonder if it would be possible to either run those tests from the quidem-ut module, or just exercise some query via an iq file from the same module, to ensure that this can't be broken too easily.

Contributor

To me it would make sense to test this by either:

  • loading MSQ into the compressed-big-decimal extension as a test dependency;
  • or, loading both in quidem-ut.

@Override
public AggregatorFactory withName(String newName)
{
  return new CompressedBigDecimalMaxAggregatorFactory(newName, fieldName, size, scale, strictNumberParsing);

Contributor

Was this the fix (along with similar changes in min, sum)?

@cryptoe (Contributor, Author)

Yes, the fix was adding the withName implementation to the other aggregator factories.
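
The copy-with-new-name pattern in the snippet above can be sketched in isolation. This is a minimal, self-contained illustration under stated assumptions: `DecimalMaxFactorySketch` and `describe` are hypothetical names, and the class is a simplified stand-in rather than Druid's actual `AggregatorFactory` hierarchy. The thread does not spell out why MSQ needs `withName`, so the sketch only shows the property the implementation has to keep: the copy changes the output name and carries every other setting over.

```java
// Hypothetical, simplified stand-in for an aggregator factory (not Druid's real class).
final class DecimalMaxFactorySketch
{
  private final String name;
  private final String fieldName;
  private final int size;
  private final int scale;
  private final boolean strictNumberParsing;

  DecimalMaxFactorySketch(String name, String fieldName, int size, int scale, boolean strictNumberParsing)
  {
    this.name = name;
    this.fieldName = fieldName;
    this.size = size;
    this.scale = scale;
    this.strictNumberParsing = strictNumberParsing;
  }

  // Returns a copy that differs only in its output name; all other settings carry over.
  DecimalMaxFactorySketch withName(String newName)
  {
    return new DecimalMaxFactorySketch(newName, fieldName, size, scale, strictNumberParsing);
  }

  // Renders the full configuration so a dropped field is easy to spot.
  String describe()
  {
    return name + "(" + fieldName + ", size=" + size + ", scale=" + scale + ", strict=" + strictNumberParsing + ")";
  }

  public static void main(String[] args)
  {
    DecimalMaxFactorySketch original = new DecimalMaxFactorySketch("a0", "saleAmount", 6, 9, true);
    // prints "amount(saleAmount, size=6, scale=9, strict=true)"
    System.out.println(original.withName("amount").describe());
  }
}
```

Because the fields are final and `withName` builds a new instance, the original factory is left untouched, which keeps renaming safe when the same factory is shared across query stages.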

@@ -144,6 +144,31 @@ IngestionSpec syntax:
}
}
```

Sql based ingestion sample query:

Contributor

"SQL-based" is how we usually write this.

@capistrant added this to the 34.0.0 milestone on Jun 25, 2025
* Review comments
@github-actions bot added the labels Area - Batch Ingestion, Area - Dependencies, and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) on Jun 27, 2025
@cryptoe (Contributor, Author) commented Jun 27, 2025

@kgyrtkirk @gianm

  • Added the quidem test to both DART and MSQ.
  • Addressed review comments.

PTAL.

@kgyrtkirk (Member) left a comment

thank you @cryptoe for the updates!

@kgyrtkirk (Member)

I think you will need to add druid-compressed-bigdecimal to the quidem-ut module.

@cryptoe cryptoe merged commit 14e16cf into apache:master Jul 1, 2025
77 checks passed
Labels: Area - Batch Ingestion, Area - Dependencies, Area - Documentation, Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262)
4 participants