Add collect agg function #1292

Merged
merged 1 commit into from
Feb 17, 2023

Collaborator

@acquamarin acquamarin commented Feb 15, 2023

This PR adds support for the collect aggregate function.

LIST agg function:

1. Description:

Returns a LIST containing all the values of a column.

2. Example:

MATCH (p:person) RETURN p.gender, collect(p.workedHours)
1|[[10,5],[4,5],[2]]
2|[[12,8],[1,9],[3,4,5,6,7],[1],[10,11,12,3,4,5,6,7]]

3. Implementation:

a. The aggregate operator accumulates elements in a factorizedTable, which is stored in the collectAggState.
b. During AggregateScan, the collectAggState outputs all the accumulated elements as a list and stores that list in the output valueVector.
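The two phases above can be sketched in a simplified form. This is only an illustration, not the code in this PR: `CollectState` and `collectByKey` are hypothetical names, and a `std::vector` stands in for the factorizedTable that the real collectAggState uses.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for collectAggState: it buffers the values seen
// for one group. The PR stores them in a factorizedTable instead.
struct CollectState {
    std::vector<int64_t> elements;

    // Accumulation phase: called once per input row of the group.
    void update(int64_t value) { elements.push_back(value); }

    // Scan phase: emit everything accumulated so far as a single list.
    std::vector<int64_t> finalize() const { return elements; }
};

// Groups (key, value) rows by key and collects the values of each group,
// preserving the input order within a group.
std::map<int64_t, std::vector<int64_t>> collectByKey(
    const std::vector<std::pair<int64_t, int64_t>>& rows) {
    std::map<int64_t, CollectState> states;
    for (const auto& [key, value] : rows) {
        states[key].update(value);
    }
    std::map<int64_t, std::vector<int64_t>> result;
    for (const auto& [key, state] : states) {
        result.emplace(key, state.finalize());
    }
    return result;
}
```

Keeping accumulation and list construction separate means the list is materialized only once per group, at scan time.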

4. Note:

a. We currently do not support computing hash values for ku_list_t. Therefore, queries that require the hash value of a ku_list_t are not allowed.
E.g.

MATCH (p:person) RETURN collect(distinct p.workedHours)

This query needs to compute the hash value of p.workedHours, which is not supported.
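To illustrate why DISTINCT depends on hashing, here is a minimal sketch of hash-based deduplication over lists. It is not part of this PR: `List` is a `std::vector` standing in for ku_list_t, and `ListHash` and `collectDistinct` are hypothetical names with an arbitrary hash-combining scheme. Without something like `ListHash`, the `std::unordered_set` below would not compile, which mirrors the limitation described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

// Stand-in for ku_list_t (assumption: a list of 64-bit integers).
using List = std::vector<int64_t>;

// Hypothetical hasher for lists. The combining formula is an illustrative
// choice, not the one the project would necessarily use.
struct ListHash {
    std::size_t operator()(const List& list) const {
        std::size_t seed = list.size();
        for (int64_t v : list) {
            seed ^= std::hash<int64_t>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

// Collects lists while dropping duplicates, keeping first occurrences in
// input order. The set lookup is where the hash value is needed.
std::vector<List> collectDistinct(const std::vector<List>& input) {
    std::unordered_set<List, ListHash> seen;
    std::vector<List> out;
    for (const auto& list : input) {
        if (seen.insert(list).second) {
            out.push_back(list);
        }
    }
    return out;
}
```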

b. We currently do not support storing NULL values in ku_list_t. If the elements to collect contain NULL, undefined behaviour may occur during query processing.

Contributor

@andyfengHKU andyfengHKU left a comment

Overall looks fine to me. But @ray6080 should take a look

src/function/built_in_aggregate_functions.cpp (resolved)
src/include/function/function_definition.h (resolved)
src/include/function/aggregate/collect.h (outdated, resolved)
src/include/function/aggregate/collect.h (outdated, resolved)
src/include/function/aggregate/collect.h (outdated, resolved)
src/include/function/aggregate/collect.h (outdated, resolved)
src/include/function/aggregate/collect.h (outdated, resolved)
src/include/function/aggregate/collect.h (resolved)
Contributor

@ray6080 ray6080 left a comment

Let's get this in. Once we're done with changes in buffer manager, we can revisit these designs again.

src/include/function/aggregate/collect.h (resolved)
@acquamarin acquamarin merged commit 49bb742 into master Feb 17, 2023
@acquamarin acquamarin deleted the collect-func branch February 17, 2023 01:35