Count indexes should be cleared when decremented to zero #737

Closed
alecgrieser opened this issue Sep 29, 2019 · 1 comment · Fixed by #1097

@alecgrieser
Contributor

The COMPARE_AND_CLEAR mutation type lets the user specify a value and conditionally clear a key if the key's current value equals that value. The count and possibly the sum index types should use this to remove keys that have been decremented to zero. This matters most for grouped aggregates: if the grouping key is something that increases with time, older grouping keys are never garbage collected. (For example, imagine a grouping column of, say, "day" that buckets the number of records per day. If only k days of history are kept around, at most k index keys are non-zero, but the zero-valued keys for older days are never cleared out.) Clearing them does not change query results, because both the count and sum indexes treat null (i.e., unset keys) identically to keys set to zero.
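To make the garbage problem concrete, here is a minimal sketch in Python against a toy in-memory store. `ToyStore`, `update_count`, and `surviving_keys` are hypothetical names invented for illustration, not Record Layer or FoundationDB APIs, and the 8-byte little-endian counter encoding is an assumption of the sketch. It only models the semantics described above: an atomic add followed by a conditional clear when the stored value equals zero.

```python
import struct

# Assumption: counters are stored as 8-byte little-endian signed integers.
ZERO = struct.pack("<q", 0)


class ToyStore:
    """Hypothetical in-memory stand-in for a key-value store."""

    def __init__(self):
        self.data = {}

    def add(self, key, delta):
        # Models an atomic add: a missing key is treated as zero.
        current = struct.unpack("<q", self.data.get(key, ZERO))[0]
        self.data[key] = struct.pack("<q", current + delta)

    def compare_and_clear(self, key, expected):
        # Models compare-and-clear: remove the key only if its value matches.
        if self.data.get(key) == expected:
            del self.data[key]


def update_count(store, day, delta, clear_when_zero):
    store.add(day, delta)
    if clear_when_zero:
        store.compare_and_clear(day, ZERO)


def surviving_keys(clear_when_zero, total_days=5, kept_days=2):
    store = ToyStore()
    # Insert three records for every day.
    for day in range(total_days):
        for _ in range(3):
            update_count(store, day, +1, clear_when_zero)
    # Delete all records older than the retention window.
    for day in range(total_days - kept_days):
        for _ in range(3):
            update_count(store, day, -1, clear_when_zero)
    return sorted(store.data)


print(surviving_keys(clear_when_zero=False))
print(surviving_keys(clear_when_zero=True))
```

Without the conditional clear, every day ever written keeps an index key even after its count returns to zero; with it, only the days still inside the retention window remain.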

@alecgrieser alecgrieser added the enhancement New feature or request label Sep 29, 2019
@alecgrieser alecgrieser changed the title Count indexes should use clear when decremented to zero Count indexes should be cleared when decremented to zero Sep 29, 2019
@alecgrieser
Contributor Author

One thing about sum that perhaps makes this more complicated is that a "sum" index entry can be zero for two different reasons: either all elements (or all elements in a group) have been removed, or the sum is legitimately zero (because, say, all elements are zero or positive and negative elements cancel out). We probably do want to retain zero entries in the second case, but we don't have a good way to distinguish the two.
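A small Python sketch of that ambiguity, using a plain dict as a stand-in for the index (`update_sum` and `build` are hypothetical helpers for illustration, not Record Layer code):

```python
def update_sum(index, group, delta, clear_when_zero):
    total = index.get(group, 0) + delta
    if clear_when_zero and total == 0:
        # Models clearing the key once the stored sum reaches zero.
        index.pop(group, None)
    else:
        index[group] = total


def build(clear_when_zero):
    index = {}
    # Group "mixed" still holds two live records, +5 and -5.
    update_sum(index, "mixed", 5, clear_when_zero)
    update_sum(index, "mixed", -5, clear_when_zero)
    # Group "emptied" had one record of 3 that was inserted, then deleted.
    update_sum(index, "emptied", 3, clear_when_zero)
    update_sum(index, "emptied", -3, clear_when_zero)
    return index


print(build(clear_when_zero=False))
print(build(clear_when_zero=True))
```

In both modes the two groups end up in the same state, so the index alone cannot tell a group with live records that sum to zero from a group with no records at all. Clearing removes the entry for "mixed" even though it still has records.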

It might perhaps make sense to add this as an option, set to "don't clear" by default, for two reasons:

  1. Behavior preservation
  2. Keep the same behavior for "sum" and "count" indexes

Then we encourage (through documentation) enabling the option on count indexes but not on sum indexes (though you can enable it on a sum index if you either don't care about the zero-sum groups or only have positive values in the sum).
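As a rough sketch of what such an opt-in option could look like, assuming index options are string-valued key/value pairs: the option name `clearWhenZero` and the helper `clear_when_zero_enabled` are hypothetical, chosen only for illustration.

```python
def clear_when_zero_enabled(options):
    # Hypothetical option name; absent means "don't clear", preserving old behavior.
    return options.get("clearWhenZero", "false").lower() == "true"


# Recommended setup from the comment above: opt in for count, leave sum alone.
count_index_options = {"clearWhenZero": "true"}
sum_index_options = {}

print(clear_when_zero_enabled(count_index_options))
print(clear_when_zero_enabled(sum_index_options))
```

Defaulting to "off" means existing indexes of either type keep their current on-disk behavior until someone explicitly enables the option.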

@MMcM MMcM self-assigned this Jan 11, 2021
MMcM added a commit to MMcM/fdb-record-layer that referenced this issue Jan 11, 2021
MMcM added a commit to MMcM/fdb-record-layer that referenced this issue Jan 13, 2021
…mented to zero (FoundationDB#1097)

This takes the approach suggested in the comment of making this an optional behavior defaulting to off, for the reasons stated there: it avoids incompatible changes and makes `SUM` and `COUNT` symmetrical.