Add support for aggregates with an internal stype #8505

Merged
colm-mchugh merged 7 commits into citusdata:main from visridha:visridha/add-internal-agg-support
Mar 17, 2026

@visridha
Contributor

DESCRIPTION: Add support for aggregates with an internal stype

Citus has historically required that custom aggregates not have an internal stype, with exceptions only for specific built-in aggregates. This has led to a number of workarounds to get custom aggregates working, and performing well, with distributed tables.

This change removes that restriction by mirroring Postgres's use of SERIALFUNC and DESERIALFUNC to round-trip an aggregate's internal stype state between the workers and the coordinator, allowing more natural use of custom aggregates in Citus.
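
For illustration, an aggregate of the kind this change targets might be declared as below. This is a sketch, not the definition used in the PR: the name `internalsum` comes from the test query in this PR, but its definition is not shown in this conversation, so reusing Postgres's built-in support functions for `sum(int8)` is an assumption. Creating an aggregate with an `internal` stype requires superuser.

```sql
-- Assumed definition: reuses the built-in support functions behind sum(int8).
CREATE AGGREGATE internalsum(int8) (
    SFUNC        = int8_avg_accum,        -- (internal, int8) -> internal
    STYPE        = internal,
    FINALFUNC    = numeric_poly_sum,      -- internal -> numeric
    COMBINEFUNC  = int8_avg_combine,      -- merges two partial states
    SERIALFUNC   = int8_avg_serialize,    -- internal -> bytea
    DESERIALFUNC = int8_avg_deserialize,  -- (bytea, internal) -> internal
    PARALLEL     = SAFE
);

-- Confirm which support functions the catalog records for the aggregate.
SELECT aggtranstype::regtype, aggcombinefn, aggserialfn, aggdeserialfn
FROM pg_aggregate
WHERE aggfnoid::oid = 'internalsum(int8)'::regprocedure::oid;
```

The serialization pair is what lets a partial state leave one process as `bytea` and be rebuilt in another. Per the description above, the change reuses that same pair to move state from workers to the coordinator.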

@onurctirtir self-requested a review March 12, 2026 08:55
SELECT key, internalsum(val), sum(val) from aggdata group by key order by key;

DROP AGGREGATE internalsum(int8);

Contributor

@colm-mchugh Mar 13, 2026

Couple of additional test suggestions:

  1. An aggregate with an internal stype but no serial funcs. This should be caught by error handling, as it was before this PR. Can this be tested?
  2. An aggregate with an internal stype that is text serializable only (IsAggTransTypeBinarySerializable() returns false). This would exercise the serialization support added to worker_partial_agg_ffunc().

I'm not sure whether the second suggestion is gated by the first: an internal stype with an invalid serialfn should result in an error, and if so, is the change to worker_partial_agg_ffunc() redundant? In any case, the ask is to exercise the change in worker_partial_agg_ffunc() if possible.
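
The first suggestion could be exercised with an aggregate along these lines. This is a sketch under assumptions: the name `internalsum_noserial` is hypothetical, the support functions are Postgres's built-in ones for `sum(int8)`, and `aggdata` (with columns `key` and `val`, as in the test query in this PR) is assumed to be a distributed table. The exact error text is not shown in this conversation, so none is given here.

```sql
-- Hypothetical aggregate: internal stype with a combine function,
-- but no SERIALFUNC/DESERIALFUNC.
CREATE AGGREGATE internalsum_noserial(int8) (
    SFUNC       = int8_avg_accum,
    STYPE       = internal,
    FINALFUNC   = numeric_poly_sum,
    COMBINEFUNC = int8_avg_combine
);

-- Per the review comment, this case should be caught by error handling.
SELECT key, internalsum_noserial(val) FROM aggdata GROUP BY key ORDER BY key;

DROP AGGREGATE internalsum_noserial(int8);
```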

Contributor Author

@visridha Mar 13, 2026

Added a test for the aggregate without a serialfunc. As for the second suggestion, SERIAL/DESERIAL will always go through the binary agg path, since the output of a SERIALFUNC is bytea by construction, so it is not testable.
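
The `bytea` constraint can be checked against the catalog. This is a small sketch using only the standard `pg_aggregate` and `pg_proc` catalogs: it lists every aggregate that has an `internal` stype and a serialization function, along with that function's return type. Postgres requires a SERIALFUNC to return `bytea`, so each row should show `bytea` in the last column.

```sql
SELECT a.aggfnoid::regprocedure AS agg,
       a.aggserialfn            AS serialfunc,
       p.prorettype::regtype    AS serialfunc_returns
FROM pg_aggregate a
-- The inner join drops aggregates without a serialization function.
JOIN pg_proc p ON p.oid = a.aggserialfn::oid
WHERE a.aggtranstype = 'internal'::regtype::oid
ORDER BY 1;
```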

Contributor

Thanks for adding the additional tests. Given that 2) is not testable, does it make sense to remove the serial check from worker_partial_agg_ffunc()? Since these are execution functions, it would be good to keep them as lean as possible.

@sfc-gh-mslot left a comment

Just for context: one thing that has historically kept us from doing this was uncertainty about whether serialized state is actually safe to transfer over the network (e.g. that it never contains things like OIDs), since it's generally only used for IPC.

#120
#3916

I would still go ahead, because it's a very worthwhile optimization, but maybe have a GUC to enable/disable in case nasty errors come up.

Contributor

@colm-mchugh left a comment

I'm concerned that a serialization function that returns NULL on non-NULL input is not handled (see comments in CheckAndCallSerialFunc()). With no constraints on user-defined aggregates (such as strictness or non-NULL), we could be exposed here.

Contributor

@colm-mchugh left a comment

LGTM, with a couple of minor asks.

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 22.41379% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.80%. Comparing base (84ddcd9) to head (0952cd4).

❌ Your patch check has failed because the patch coverage (22.41%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project check has failed because the head coverage (44.80%) is below the target coverage (87.50%). You can increase the head coverage or adjust the target coverage.

❗ There is a different number of reports uploaded between BASE (84ddcd9) and HEAD (0952cd4).

HEAD has 91 uploads less than BASE
Flag BASE (84ddcd9) HEAD (0952cd4)
16_regress_check-pytest 1 0
16_regress_check-follower-cluster 1 0
18_regress_check-columnar-isolation 1 0
17_regress_check-tap 1 0
17_regress_check-add-backup-node 1 0
17_regress_check-follower-cluster 1 0
18_regress_check-add-backup-node 1 0
16_regress_check-enterprise-failure 1 0
17_regress_check-split 1 0
17_regress_check-operations 1 0
18_regress_check-operations 1 0
17_arbitrary_configs_5 1 0
18_regress_check-multi-1 1 0
16_18_upgrade 1 0
18_regress_check-tap 1 0
17_regress_check-columnar-isolation 1 0
18_regress_check-follower-cluster 1 0
16_regress_check-tap 1 0
16_regress_check-columnar-isolation 1 0
17_regress_check-query-generator 1 0
17_18_upgrade 1 0
16_regress_check-enterprise-isolation-logicalrep-3 1 0
16_17_upgrade 1 0
16_regress_check-add-backup-node 1 0
16_arbitrary_configs_1 1 0
17_regress_check-enterprise-isolation-logicalrep-1 1 0
17_citus_upgrade 1 0
17_regress_check-pytest 1 0
16_citus_upgrade 1 0
18_regress_check-query-generator 1 0
18_regress_check-pytest 1 0
18_regress_check-columnar 1 0
16_regress_check-query-generator 1 0
17_regress_check-enterprise-isolation-logicalrep-3 1 0
18_regress_check-enterprise-isolation-logicalrep-3 1 0
17_regress_check-enterprise-failure 1 0
17_regress_check-enterprise-isolation-logicalrep-2 1 0
18_regress_check-enterprise-failure 1 0
18_regress_check-enterprise-isolation-logicalrep-2 1 0
18_regress_check-vanilla 1 0
17_regress_check-columnar 1 0
16_regress_check-split 1 0
17_regress_check-enterprise 1 0
18_regress_check-multi-1-create-citus 1 0
16_regress_check-vanilla 1 0
16_regress_check-failure 1 0
17_regress_check-vanilla 1 0
18_regress_check-multi-mx 1 0
16_regress_check-multi-1-create-citus 1 0
16_regress_check-multi-mx 1 0
17_regress_check-multi-mx 1 0
18_regress_check-enterprise 1 0
16_regress_check-enterprise 1 0
18_regress_check-enterprise-isolation-logicalrep-1 1 0
17_regress_check-multi-1-create-citus 1 0
18_cdc_installcheck 1 0
16_regress_check-enterprise-isolation-logicalrep-2 1 0
16_regress_check-columnar 1 0
18_regress_check-split 1 0
16_regress_check-operations 1 0
16_regress_check-isolation 1 0
18_regress_check-multi 1 0
16_regress_check-enterprise-isolation 1 0
17_regress_check-failure 1 0
18_regress_check-failure 1 0
16_regress_check-enterprise-isolation-logicalrep-1 1 0
18_regress_check-enterprise-isolation 1 0
17_regress_check-enterprise-isolation 1 0
17_regress_check-isolation 1 0
18_regress_check-isolation 1 0
16_cdc_installcheck 1 0
17_cdc_installcheck 1 0
17_arbitrary_configs_0 1 0
16_arbitrary_configs_0 1 0
18_arbitrary_configs_0 1 0
16_regress_check-multi 1 0
18_arbitrary_configs_5 1 0
18_arbitrary_configs_1 1 0
16_arbitrary_configs_2 1 0
17_arbitrary_configs_2 1 0
18_arbitrary_configs_3 1 0
17_arbitrary_configs_3 1 0
16_arbitrary_configs_3 1 0
17_regress_check-multi-1 1 0
18_arbitrary_configs_4 1 0
16_regress_check-multi-1 1 0
17_arbitrary_configs_4 1 0
16_arbitrary_configs_4 1 0
17_regress_check-multi 1 0
17_arbitrary_configs_1 1 0
16_arbitrary_configs_5 1 0
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #8505       +/-   ##
===========================================
- Coverage   88.92%   44.80%   -44.12%     
===========================================
  Files         286      286               
  Lines       63129    62253      -876     
  Branches     7914     7652      -262     
===========================================
- Hits        56135    27892    -28243     
- Misses       4727    31961    +27234     
- Partials     2267     2400      +133     

@colm-mchugh merged commit 347d723 into citusdata:main Mar 17, 2026
289 of 290 checks passed

5 participants