How should Druid handle the situation when a string dimension column is queried as numeric? #4888
Comments
One current inconsistency is that with expression-based column selectors (anything that goes through Parser/Expr) the behavior is (3). See IdentifierExpr and how it handles strings that are treated as numbers. But with direct column selectors the behavior is (1). In particular, this means that e.g. a longSum aggregator behaves differently if it's [...]

Although, on the third hand, dimensions behave differently from aggregators. They act more like (3). There's some code that sets up the "proper" column selector that matches the type in the segment, groups using that selector on a per-segment basis, and then uses the user-specified [...]

IMO, making (3) consistent across the board would be good. I don't think (2) or (4) are good; they are not in the spirit of Druid being a generally "loose schema" data store. IMO (3) is preferable to (1) since it enables easy schema changes from string to numeric type and vice versa. This was the motivation for making groupBy/topN behave the way they do: that behavior was introduced along with the numeric dimension feature.
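To make the discussion concrete, here is a rough sketch of what a "(3)-style" aggregator might look like. The thread as captured here does not define options (1) to (4), so this assumes (3) means best-effort conversion of string values to numbers at query time, with unparseable values falling back to a default. The names `coerce_to_long` and `long_sum` are hypothetical and are not Druid APIs; real Druid behavior may differ in the details, e.g. null handling.

```python
def coerce_to_long(value, default=0):
    """Best-effort conversion of a column value to an integer.

    Assumption: unparseable or missing values map to `default`.
    """
    if value is None:
        return default
    if isinstance(value, int):
        return value
    try:
        # Handles integer-like strings ("42") and floats (3.9 -> 3).
        return int(value)
    except (ValueError, OverflowError):
        pass
    try:
        # Handles decimal strings ("3.7" -> 3) by truncating.
        return int(float(value))
    except (ValueError, OverflowError):
        # Non-numeric strings, NaN and infinities fall back to the default.
        return default


def long_sum(values):
    """Sum a column whose values may be strings, numbers, or None."""
    return sum(coerce_to_long(v) for v in values)
```

Under these assumptions, a column that was ingested as strings in older segments and as longs in newer ones would sum the same way in both, which is the schema-migration argument made above.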
Consistently doing (3) makes sense to me too, especially so that queries behave well across schema migrations.
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.
Thanks to stalebot, I had forgotten about this. It turns out I now have use cases where at least the aggregators need behavior (3). I created an issue describing my thoughts in #8148, and I think I'm going to make a PR soonish to make that change, since using expressions with multi-value columns is becoming tricky for the service that generates queries.
How is this behaviour chosen?