
Frames: Ensure nulls are read as default values when appropriate. #14020

Merged
LakshSingla merged 4 commits into apache:master from gianm:fix-frames-null-longs
Apr 9, 2023

Conversation

@gianm
Contributor

@gianm gianm commented Apr 4, 2023

Fixes a bug where LongFieldWriter didn't write a properly transformed zero when writing out a null. This had no meaningful effect in SQL-compatible null handling mode, because the field would get treated as a null anyway. But it does have an effect in default-value mode: it would cause Long.MIN_VALUE to get read out instead of zero.

Also adds NullHandling checks to the various frame-based column selectors, allowing reading of nullable frames by servers in default-value mode.

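The round-trip failure can be sketched in isolation. This is a minimal sketch assuming a sign-bit-flip transform of the kind frame field writers use to make longs byte-comparable; the class and method names are illustrative, not Druid's actual LongFieldWriter API:

```java
// Minimal sketch of the bug, assuming a sign-bit-flip transform that
// makes longs byte-comparable. Names are illustrative, not Druid's API.
public class LongNullRoundTrip
{
  // Transform applied on write: flip the sign bit so that unsigned
  // byte-wise comparison orders longs correctly.
  static long transform(long n)
  {
    return n ^ Long.MIN_VALUE;
  }

  // Inverse transform applied on read.
  static long detransform(long n)
  {
    return n ^ Long.MIN_VALUE;
  }

  public static void main(String[] args)
  {
    // Correct path: a null written as transform(0) reads back as 0.
    assert detransform(transform(0L)) == 0L;

    // The bug: writing a raw, untransformed zero for a null means the
    // reader's inverse transform produces Long.MIN_VALUE instead of 0.
    assert detransform(0L) == Long.MIN_VALUE;

    System.out.println("raw zero reads back as " + detransform(0L));
  }
}
```

In SQL-compatible mode the null flag masks the bad value, which is why the bug was invisible there; in default-value mode the value is read through, so Long.MIN_VALUE surfaces.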
@gianm gianm added the Bug and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Apr 4, 2023
Comment on lines +146 to +150
final String sql = "INSERT INTO foo1\n"
+ "SELECT TIME_PARSE(dim1) AS __time, dim1 as cnt\n"
+ "FROM foo\n"
+ "PARTITIONED BY DAY\n"
+ "CLUSTERED BY dim1";
Contributor

If you were to add a WHERE TIME_PARSE(dim1) IS NOT NULL to the query, do the results of the two queries become the same regardless of mode? I think they should, but I'm curious.

Contributor Author

@gianm gianm Apr 5, 2023


I just tried it, and yes, they are the same in that case. Both queries ingest the two rows with parseable timestamps, and ignore the four with unparseable timestamps.

Contributor

@LakshSingla LakshSingla left a comment


Should the null handling be done in the StringFieldWriters as well? Consider the following test case in CalciteQueryTest (permalink), which works with the native engine but not with MSQ. Also, due to the ORDER BY, the order of the results can differ when comparing nulls versus when comparing null with "":

SELECT dim1, dim2, SUM(cnt) AS thecnt
FROM druid.foo
GROUP BY dim1, dim2
HAVING SUM(cnt) = 1
ORDER BY dim2
LIMIT 4
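The ordering difference described above can be sketched with hypothetical data (not Druid's test fixtures): in SQL-compatible mode a null is a distinct value that sorts first, while in default-value mode it has already been coerced to "" and sorts together with genuine empty strings.

```java
import java.util.Arrays;
import java.util.Comparator;

// Sketch of why ORDER BY dim2 can produce different orders in the two
// null handling modes. Data and names are illustrative.
public class NullOrdering
{
  // SQL-compatible mode: null is distinct and sorts before everything.
  static String[] sortSqlCompatible(String[] values)
  {
    String[] copy = values.clone();
    Arrays.sort(copy, Comparator.nullsFirst(Comparator.naturalOrder()));
    return copy;
  }

  // Default-value mode: nulls were coerced to "" on write, so they are
  // indistinguishable from genuine empty strings when sorting.
  static String[] sortDefaultValue(String[] values)
  {
    String[] copy = values.clone();
    for (int i = 0; i < copy.length; i++) {
      if (copy[i] == null) {
        copy[i] = "";
      }
    }
    Arrays.sort(copy);
    return copy;
  }

  public static void main(String[] args)
  {
    String[] dim2 = {"a", null, "", "b"};
    System.out.println(Arrays.toString(sortSqlCompatible(dim2))); // [null, , a, b]
    System.out.println(Arrays.toString(sortDefaultValue(dim2)));  // [, , a, b]
  }
}
```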

@gianm
Contributor Author

gianm commented Apr 5, 2023

Should the null handling be done in the StringFieldWriters as well? Consider the following test case in CalciteQueryTest (permalink) which works with the native engine, however doesn't with MSQ.

@LakshSingla I didn't update the StringFieldWriter (or other writers, generally) but I did update the StringFieldReader to return null rather than "" in default-value mode. That's consistent with how selectors behave on regular segments.

However, this didn't help with the test testGroupByLimitPushDownWithHavingOnLong, at least in default-value mode. I think it's because coerce is turning the null into "" in the native path, but not the MSQ path. I have another PR I'm working on to update that, which may unlock the ability to run the test case in MSQ.

I did update the test to run with MSQ in SQL-compat mode though, by adding:

    if (NullHandling.sqlCompatible()) {
      msqCompatible();
    }

@gianm
Contributor Author

gianm commented Apr 5, 2023

The failing test has to do with branch coverage in unit tests (jdk8, sql-compat=false) / processing_modules_test. That one isn't going to pass, since the new branches only execute when sql-compat=true. So I think that means we should ignore it.

Contributor

@LakshSingla LakshSingla left a comment


  1. To disambiguate the coerce mentioned above, I think you mean the one present in NativeQueryMaker, right? If so, this will also fix a few other test cases that I was seeing fail because of a type mismatch between DOUBLE and FLOAT, so I think that would be pretty cool.

  2. Do we not require the changes in the StringFrameColumnReader for appropriate null handling as well? I see that there is something done in the getStringUtf8 method, so I might be wrong

  3. I think we should handle the nulls appropriately in the following line (permalink) as well, when the StringFieldReader has figured out that the field is a NULL_BYTE. WDYT?

  4. Unrelated to the change, while digging through that piece of code, I found the following condition (permalink)

if ((dataLength == 0 && NullHandling.replaceWithDefault()) ||
    (dataLength == 1 && memory.getByte(dataStart) == FrameWriterUtils.NULL_STRING_MARKER)) {
        return null;
}

Should it instead be:

if ((dataLength == 0 && NullHandling.replaceWithDefault()) ||
    (dataLength == 1 && memory.getByte(dataStart) == FrameWriterUtils.NULL_STRING_MARKER && NullHandling.replaceWithDefault())) {
        return null;
}

Rest of the changes LGTM 🚀

@gianm
Contributor Author

gianm commented Apr 6, 2023

@LakshSingla

To disambiguate the coerce mentioned above, I think you mean the one present in NativeQueryMaker, right? If so, this will also fix a few other test cases that I was seeing fail because of a type mismatch between DOUBLE and FLOAT, so I think that would be pretty cool.

Yes, that's the one I mean. In a future PR I'm planning to use this same logic for MSQ results too.

Do we not require the changes in the StringFrameColumnReader for appropriate null handling as well? I see that there is something done in the getStringUtf8 method, so I might be wrong

It's already there: getStringUtf8 turns empty strings to nulls if NullHandling.replaceWithDefault().

I think we should handle the nulls appropriately in the following line (permalink) as well, when the StringFieldReader has figured out that the field is a NULL_BYTE. WDYT?

There's nothing special to do there, since in default-value mode, the convention is that nulls and empty strings are both returned as nulls from selectors. So the NULL_BYTE branch is already behaving as expected.

Unrelated to the change, while digging through that piece of code, I found the following condition (permalink)

I believe it's correct as-is. It's saying that we should return null in two cases:

  • dataLength == 0 && NullHandling.replaceWithDefault(): The frame contains an empty string, and NullHandling.replaceWithDefault() is true. In this mode, by convention, selectors return null instead of empty string. (The method name is misleading; it refers to what happens later, in query results, when nulls are turned back into empty strings.)
  • dataLength == 1 && memory.getByte(dataStart) == FrameWriterUtils.NULL_STRING_MARKER: The frame contains an actual null. Behavior here is the same in both modes: a null is returned.
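The two cases can be sketched as a standalone check. A plain byte[] stands in for Druid's Memory, and the marker value below is a placeholder, not FrameWriterUtils' actual constant:

```java
// Sketch of the two null-returning cases discussed above.
// The byte[] and marker value are stand-ins, not Druid's real layout.
public class StringFieldNullCheck
{
  // Placeholder for FrameWriterUtils.NULL_STRING_MARKER.
  static final byte NULL_STRING_MARKER = (byte) 0xFF;

  static boolean readsAsNull(byte[] data, boolean replaceWithDefault)
  {
    // Case 1: an empty string. Only null in default-value mode, where
    // selectors by convention return null instead of "".
    if (data.length == 0 && replaceWithDefault) {
      return true;
    }
    // Case 2: an actual null marker. Null in both modes, so no extra
    // replaceWithDefault check is needed on this branch; this is why
    // the suggested additional condition isn't required.
    return data.length == 1 && data[0] == NULL_STRING_MARKER;
  }

  public static void main(String[] args)
  {
    assert readsAsNull(new byte[0], true);                      // "" in default-value mode
    assert !readsAsNull(new byte[0], false);                    // "" stays "" in SQL mode
    assert readsAsNull(new byte[]{NULL_STRING_MARKER}, false);  // null in either mode
    assert readsAsNull(new byte[]{NULL_STRING_MARKER}, true);
  }
}
```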

@LakshSingla LakshSingla merged commit d52bc33 into apache:master Apr 9, 2023
@LakshSingla
Contributor

Thanks for the PR! Merging since codecov failures can be ignored due to #14020 (comment)

@gianm gianm deleted the fix-frames-null-longs branch April 14, 2023 01:53
cryptoe pushed a commit to cryptoe/druid that referenced this pull request Apr 14, 2023
…ache#14020)

* Frames: Ensure nulls are read as default values when appropriate.


(cherry picked from commit d52bc33)
gianm added a commit to gianm/druid that referenced this pull request Apr 18, 2023
The main change is the new tests: we now subclass CalciteJoinQueryTest
in CalciteSelectJoinQueryMSQTest twice, once for Broadcast and once for
SortMerge.

Three supporting production changes for default-value mode:

1) InputNumberDataSource is marked as concrete, to allow leftFilter to
   be pushed down to it.

2) In default-value mode, numeric frame field readers can now return nulls.
   This is necessary when stacking joins on top of joins: nulls must be
   preserved for semantics that match broadcast joins and native queries.

3) In default-value mode, StringFieldReader.isNull returns true on empty
   strings in addition to nulls. This is more consistent with the behavior
   of the selectors, which map empty strings to null as well in that mode.

As an effect of change (2), the InsertTimeNull change from apache#14020 (to
replace null timestamps with default timestamps) is reverted. IMO, this
is fine, as either behavior is defensible, and the change from apache#14020
hasn't been released yet.
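Change (3) can be illustrated with a minimal sketch. The method mirrors the described StringFieldReader.isNull behavior, with a boolean parameter standing in for NullHandling.replaceWithDefault(); it is not Druid's actual implementation:

```java
// Illustrative sketch of change (3): in default-value mode, an empty
// string is reported as null, matching how selectors behave in that
// mode. The flag stands in for NullHandling.replaceWithDefault().
public class IsNullSketch
{
  static boolean isNull(String value, boolean replaceWithDefault)
  {
    if (value == null) {
      return true;
    }
    // Default-value mode: empty strings count as null too.
    return replaceWithDefault && value.isEmpty();
  }

  public static void main(String[] args)
  {
    assert isNull(null, false);   // a real null is null in both modes
    assert isNull(null, true);
    assert isNull("", true);      // "" is null only in default-value mode
    assert !isNull("", false);
    assert !isNull("x", true);
  }
}
```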
gianm added a commit that referenced this pull request Apr 25, 2023
* MSQ: Subclass CalciteJoinQueryTest, other supporting changes.


* Adjust tests.

* Style fix.

* Additional tests.
clintropolis pushed a commit to clintropolis/druid that referenced this pull request May 8, 2023
…ache#14020)

@clintropolis clintropolis added this to the 26.0 milestone May 8, 2023

Labels

Area - Documentation, Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262), Bug


5 participants