Skip to content

Adding null check to earliest and latest aggs#15972

Merged
LakshSingla merged 3 commits intoapache:masterfrom
somu-imply:earliest
Mar 19, 2024
Merged

Adding null check to earliest and latest aggs#15972
LakshSingla merged 3 commits intoapache:masterfrom
somu-imply:earliest

Conversation

@somu-imply
Copy link
Contributor

@somu-imply somu-imply commented Feb 26, 2024

I have been able to replicate this case in MSQ insert only, trying to add a test case. But overall the issue is this. When data is ingested using the following

insert into mytest
select
  time_parse("time") as __time,
  cityName
from table(
  extern(
    '{"type": "s3", "prefixes": ["s3://imply-eng-datasets/qa/IngestionTest/wikipedia/files/wikiticker-2015-09-12-sampled.json.gz"]}',
    '{"type": "json"}',
    '[{"name": "time", "type": "string"}, {"name": "channel", "type": "string"}, {"name": "cityName", "type": "string"}, {"name": "comment", "type": "string"}, {"name": "countryIsoCode", "type": "string"}, {"name": "countryName", "type": "string"}, {"name": "isAnonymous", "type": "string"}, {"name": "isMinor", "type": "string"}, {"name": "isNew", "type": "string"}, {"name": "isRobot", "type": "string"}, {"name": "isUnpatrolled", "type": "string"}, {"name": "metroCode", "type": "string"}, {"name": "namespace", "type": "string"}, {"name": "page", "type": "string"}, {"name": "regionIsoCode", "type": "string"}, {"name": "regionName", "type": "string"}, {"name": "user", "type": "string"}, {"name": "delta", "type": "long"}, {"name": "added", "type": "long"}, {"name": "deleted", "type": "long"}]'
  )
)
where countryName in ('Japan')
partitioned by YEAR
clustered by __time

insert into mytest
select
  time_parse("time") as __time,
  added
from table(
  extern(
    '{"type": "s3", "prefixes": ["s3://imply-eng-datasets/qa/IngestionTest/wikipedia/files/wikiticker-2015-09-12-sampled.json.gz"]}',
    '{"type": "json"}',
    '[{"name": "time", "type": "string"}, {"name": "channel", "type": "string"}, {"name": "cityName", "type": "string"}, {"name": "comment", "type": "string"}, {"name": "countryIsoCode", "type": "string"}, {"name": "countryName", "type": "string"}, {"name": "isAnonymous", "type": "string"}, {"name": "isMinor", "type": "string"}, {"name": "isNew", "type": "string"}, {"name": "isRobot", "type": "string"}, {"name": "isUnpatrolled", "type": "string"}, {"name": "metroCode", "type": "string"}, {"name": "namespace", "type": "string"}, {"name": "page", "type": "string"}, {"name": "regionIsoCode", "type": "string"}, {"name": "regionName", "type": "string"}, {"name": "user", "type": "string"}, {"name": "delta", "type": "long"}, {"name": "added", "type": "long"}, {"name": "deleted", "type": "long"}]'
  )
)
where countryName in ('Japan')
partitioned by YEAR
clustered by __time

Running a query like this below would throw an NPE.

select earliest(added) from mytest


Error: RUNTIME_FAILURE (OPERATOR)
Cannot read field "lhs" because "inPair" is null
java.lang.NullPointerException

We missed a null check in numeric earliest and latest aggs. After adding it back, these test cases pass

Screen Shot 2024-02-26 at 2 39 00 PM

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@soumyava soumyava added Bug Area - Querying Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Feb 27, 2024
@LakshSingla
Copy link
Contributor

Can we add a test reproducing the issue? When is the inPair null?

@somu-imply
Copy link
Contributor Author

I have added a testcase with nulls to replicate the issue. Before the change the added test case would throw a NPE, which is now avoided

@LakshSingla LakshSingla merged commit c7823bc into apache:master Mar 19, 2024
@LakshSingla LakshSingla added this to the 29.0.1 milestone Mar 19, 2024
cryptoe pushed a commit to cryptoe/druid that referenced this pull request Mar 19, 2024
* Adding null check to earliest and latest aggs

* Native tests for null inPairs
cryptoe added a commit that referenced this pull request Mar 19, 2024
* Adding null check to earliest and latest aggs

* Native tests for null inPairs

Co-authored-by: Soumyava <93540295+somu-imply@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 Area - Querying Bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants