Null fixes on Arrow bridge #7411

Closed
@pedroerp pedroerp commented Nov 4, 2023

Summary:
Ensure null_count is always set, add support for null constants, and
enhance tests to check the bitmap values match null_count.

Differential Revision: D50997553
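The test enhancement described in the summary (checking that the validity bitmap agrees with null_count) can be sketched roughly as below. This is a minimal standalone illustration, not the code in this PR: `countNullBits` and `nullCountMatchesBitmap` are hypothetical names, the bitmap is assumed to follow the Arrow convention (least-significant bit first, a set bit means valid), the array offset is ignored, and a missing validity buffer is assumed to mean "no nulls".

```cpp
#include <cassert>
#include <cstdint>

// Counts the unset (null) bits among the first `length` entries of an
// Arrow-style validity bitmap: least-significant bit first, 1 = valid.
inline int64_t countNullBits(const uint8_t* validity, int64_t length) {
  assert(validity != nullptr && length >= 0);
  int64_t nulls = 0;
  for (int64_t i = 0; i < length; ++i) {
    const bool valid = ((validity[i / 8] >> (i % 8)) & 1) != 0;
    if (!valid) {
      ++nulls;
    }
  }
  return nulls;
}

// Hypothetical test helper: returns true when the reported null count agrees
// with the bitmap. Assumption of this sketch: a missing validity buffer means
// the array has no nulls.
inline bool nullCountMatchesBitmap(
    const uint8_t* validity,
    int64_t length,
    int64_t nullCount) {
  if (validity == nullptr) {
    return nullCount == 0;
  }
  return countNullBits(validity, length) == nullCount;
}
```

For example, with a single bitmap byte 0x05 and length 4, rows 0 and 2 are valid, so `countNullBits` returns 2 and `nullCountMatchesBitmap(bits, 4, 2)` holds.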

netlify bot commented Nov 4, 2023

Deploy Preview for meta-velox canceled.

🔨 Latest commit: 8adeb88
🔍 Latest deploy log: https://app.netlify.com/sites/meta-velox/deploys/654594eb07cecc0008bd4dbb

@facebook-github-bot added the CLA Signed label Nov 4, 2023
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D50997553

@kgpai kgpai left a comment

Looks good, though I have some questions.


// If we're only exporting a subset, create a new validity buffer.
if (rows.changed()) {
  nulls = AlignedBuffer::allocate<bool>(out.length, pool);
Contributor

nit: Should the size of this be out.length, vec.size(), or rows.end()? Are we guaranteed to have out.length >= vec.size()?

Contributor Author

@kgpai out.length is set to rows.count(), which is guaranteed not to exceed vec.size() because the selected rows are a subset of the vector's rows.


// Set null counts.
if (!rows.changed() && (vec.getNullCount() != std::nullopt)) {
  out.null_count = *vec.getNullCount();
Contributor

Maybe this is wrong, but my impression was always that the null count isn't always up to date and is best effort. Maybe we should just use countNulls.

Contributor Author

getNullCount() returns a std::optional. In many cases it's not set, but when it is set, we ensure it is correct.

Contributor

I suppose I was uncertain whether we ever unset it after setting it when some operation modifies the vector. I will take your word on this, though.
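The pattern discussed in this thread (trust the cached null count only when it is set and all rows are exported) can be sketched as below. The fallback branch is not shown in the quoted hunk, so recounting from the bitmap is an assumption of this sketch. `resolveNullCount` and `countUnsetBits` are hypothetical names, and the bitmap is assumed to be least-significant bit first with a set bit meaning valid.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Counts unset (null) bits among the first `length` entries of a validity
// bitmap stored LSB first, where a set bit means the row is valid.
inline int64_t countUnsetBits(const uint8_t* validity, int64_t length) {
  int64_t nulls = 0;
  for (int64_t i = 0; i < length; ++i) {
    if (((validity[i / 8] >> (i % 8)) & 1) == 0) {
      ++nulls;
    }
  }
  return nulls;
}

// Uses the cached count only when the full vector is exported and the cache
// is populated. Otherwise recounts from the bitmap (assumed fallback).
inline int64_t resolveNullCount(
    bool subsetExported,
    std::optional<int64_t> cachedNullCount,
    const uint8_t* validity,
    int64_t length) {
  if (!subsetExported && cachedNullCount.has_value()) {
    assert(*cachedNullCount >= 0 && *cachedNullCount <= length);
    return *cachedNullCount;
  }
  if (validity == nullptr) {
    return 0; // Assumption: no validity buffer means no nulls.
  }
  return countUnsetBits(validity, length);
}
```

With this shape the exported count is always a concrete value: the cached one when it can be trusted for the exported range, and a recount otherwise. A cached count for the whole vector says nothing about a subset, which is why the subset case skips it.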

@facebook-github-bot

This pull request has been merged in 8542bf8.

Conbench analyzed the 1 benchmark run on commit 8542bf82.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Labels: CLA Signed, fb-exported, Merged