-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stringFirst/stringLast crashes at aggregation time #7243
Comments
I'd be happy to fix this, and it's probably a small fix, but I'm lost as to where the appropriate place to add a unit test is. It seems like the issue here is less "a specific class's implementation has a bug" and more "a class implements the wrong design which only fails in a larger context", so it seems like I'd want to write a test that actually uses the larger context that a StringFirstAggregatorFactory is used in, instead of just being a unit test that StringFirstAggregateCombiner implements what we now think it should implement. Anyone have a good idea of where such a test could go? (An integration test seems overkill.) |
Maybe this is a question that @leventov (who added AggregateCombiners) could help with? I'm wondering where would be a good place to write a test that verifies that a specific AggregateCombiner works in the context of its AggregatoryFactory. My belief is that StringFirstAggregateCombiner (and StringLastAggregateCombiner) doesn't actually act on the right data type, but it doesn't seem like this is something that should be tested just by testing SFAC itself... |
The problem is that Related to #6039. |
I honestly couldn't figure out what the purpose of the For a short term fix (to fix the bug, without the refactors you're suggesting), is the suggestion that |
It seems to me that no, you cannot return As a short term fix you can override |
Oh I see. I was getting |
I cannot think of a better place than For your patch, it's not obligatory to add such tests, at least because many other aggregator factories currently don't have such tests either. |
Sure, I guess the problem is I can write tests that validate that the particular classes do what I wrote them to do, but I'm not sure how to write tests that validate that all the different components of aggregating work together properly with the right types, especially as I don't really understand this corner of Druid well. The difference between "does the code do what I think it does" and "does the code implement the right thing". |
I saw this message in the latest version : |
@glasser Hi David, do you know if there is any progress on this issue? We were hoping to use this feature for a use-case but have come across this bug. Otherwise, do you know if there is an undocumented workaround? |
My attention has been elsewhere but my situation remains the same -- the fix seems trivial but I'd be way more comfortable making it if there was a more integration/end to end test showing that the fix was correct, and I need advice on how to do that. |
Is there any update on this topic? It would be neat to be able to use this feature at ingestion time. It would allow us to reduce the amount of stored data and hopefully speed up our queries. |
We also encountered similar issue. We have attempted the fix proposed in the above conversation. I will try to post a PR for review soon. |
Affected Version
0.13.0-incubating (and maybe older versions)
Description
See discussion on #5789 for background.
Set up a fresh install of Druid 0.13-incubating following the quickstart.
Write this index spec:
and post it:
The log reads:
@gianm suggested in #5789 that the AggregateCombiner code was just not running at all and that it should always be acting on SerializablePairLongString values rather than Strings. I am not enough of an expert on aggregation to know if that's correct.
Note that I needed to ensure that the task spec specified rollup: true, a large enough queryGranularity so that multiple rows do get combined, and a small enough maxRowsInMemory so that multiple spill files get combined.
The text was updated successfully, but these errors were encountered: