Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Backport] Fix four bugs with numeric dimension output types. #6230

Merged
merged 1 commit into from
Aug 26, 2018

Conversation

gianm
Copy link
Contributor

@gianm gianm commented Aug 25, 2018

This patch includes the following bug fixes:

  • TopNColumnSelectorStrategyFactory: Cast dimension values to the output type
    during dimExtractionScanAndAggregate instead of updateDimExtractionResults.
    This fixes a bug where, for example, grouping on doubles-cast-to-longs would
    fail to merge two doubles that should have been combined into the same long value.
  • TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns
    as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long
    would fail to merge two strings that should have been combined.
  • GroupByQuery: Cast numeric types to the expected output type before comparing them
    in compareDimsForLimitPushDown. This fixes ClassCastException in groupBy when sorting on numeric columns containing nulls #6123.
  • GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into
    the proper output type. This fixes an inconsistency between results that came
    from cache vs. not-cache: for example, Jackson sometimes deserializes integers
    as Integers and sometimes as Longs.

And the following code-cleanup changes, related to the fixes above:

  • DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType,
    and converterFromTypeToType to make it easier to handle casting operations.
  • TopN in general: Rename various "dimName" variables to "dimValue" where they
    actually represent dimension values. The old names were confusing.
  • Remove unused imports.

* Fix four bugs with numeric dimension output types.

This patch includes the following bug fixes:

- TopNColumnSelectorStrategyFactory: Cast dimension values to the output type
  during dimExtractionScanAndAggregate instead of updateDimExtractionResults.
  This fixes a bug where, for example, grouping on doubles-cast-to-longs would
  fail to merge two doubles that should have been combined into the same long value.
- TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns
  as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long
  would fail to merge two strings that should have been combined.
- GroupByQuery: Cast numeric types to the expected output type before comparing them
  in compareDimsForLimitPushDown. This fixes apache#6123.
- GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into
  the proper output type. This fixes an inconsistency between results that came
  from cache vs. not-cache: for example, Jackson sometimes deserializes integers
  as Integers and sometimes as Longs.

And the following code-cleanup changes, related to the fixes above:

- DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType,
  and converterFromTypeToType to make it easier to handle casting operations.
- TopN in general: Rename various "dimName" variables to "dimValue" where they
  actually represent dimension values. The old names were confusing.

* Remove unused imports.
@gianm gianm added the Backport label Aug 25, 2018
@gianm gianm added this to the 0.12.3 milestone Aug 25, 2018
@gianm gianm changed the title Fix four bugs with numeric dimension output types. (#6220) [Backport] Fix four bugs with numeric dimension output types. Aug 25, 2018
@fjy fjy merged commit 0b79e76 into apache:0.12.3 Aug 26, 2018
@gianm gianm deleted the backport-6220-to-0.12.3 branch August 26, 2018 23:19
leventov pushed a commit to metamx/druid that referenced this pull request Aug 31, 2018
…che#6230)

* Fix four bugs with numeric dimension output types.

This patch includes the following bug fixes:

- TopNColumnSelectorStrategyFactory: Cast dimension values to the output type
  during dimExtractionScanAndAggregate instead of updateDimExtractionResults.
  This fixes a bug where, for example, grouping on doubles-cast-to-longs would
  fail to merge two doubles that should have been combined into the same long value.
- TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns
  as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long
  would fail to merge two strings that should have been combined.
- GroupByQuery: Cast numeric types to the expected output type before comparing them
  in compareDimsForLimitPushDown. This fixes apache#6123.
- GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into
  the proper output type. This fixes an inconsistency between results that came
  from cache vs. not-cache: for example, Jackson sometimes deserializes integers
  as Integers and sometimes as Longs.

And the following code-cleanup changes, related to the fixes above:

- DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType,
  and converterFromTypeToType to make it easier to handle casting operations.
- TopN in general: Rename various "dimName" variables to "dimValue" where they
  actually represent dimension values. The old names were confusing.

* Remove unused imports.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants