ARROW-3550: [C++] use kUnknownNullCount for the default null_count argument #3781

bkietz commented Feb 28, 2019

The previous default for this parameter in the Array constructors was 0, which is incorrect whenever the bitmap the array is constructed with marks any slot as null.

Side change: corrected TypeTraits<NullType> to reflect that it is parameter-free and has a type_singleton.
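
To illustrate why a default of 0 can be wrong, here is a minimal standalone sketch that does not use Arrow itself. `SketchArray`, `CountNulls`, and `kSketchUnknownNullCount` are hypothetical names invented for this example, and the sentinel value -1 and the least-significant-bit-first bitmap layout are assumptions of the sketch. The idea is that an "unknown" sentinel lets the null count be computed from the bitmap on first use, whereas a hard-coded 0 is trusted as-is.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sentinel for this sketch: "null count not computed yet".
constexpr int64_t kSketchUnknownNullCount = -1;

// Counts unset bits (nulls) among the first `length` slots of a validity
// bitmap, reading bits least-significant first within each byte.
inline int64_t CountNulls(const std::vector<uint8_t>& bitmap, int64_t length) {
  int64_t nulls = 0;
  for (int64_t i = 0; i < length; ++i) {
    const uint8_t byte = bitmap[static_cast<std::size_t>(i / 8)];
    const bool valid = ((byte >> (i % 8)) & 1) != 0;
    if (!valid) ++nulls;
  }
  return nulls;
}

class SketchArray {
 public:
  // The default is the sentinel, not 0, so a caller who omits the argument
  // does not accidentally claim "this array has no nulls".
  SketchArray(int64_t length, std::vector<uint8_t> validity_bitmap,
              int64_t null_count = kSketchUnknownNullCount)
      : length_(length),
        validity_bitmap_(std::move(validity_bitmap)),
        null_count_(null_count) {}

  // Computes the count from the bitmap on first use if it is still unknown.
  // An empty bitmap is treated as "all slots valid".
  int64_t null_count() {
    if (null_count_ == kSketchUnknownNullCount) {
      null_count_ =
          validity_bitmap_.empty() ? 0 : CountNulls(validity_bitmap_, length_);
    }
    return null_count_;
  }

 private:
  int64_t length_;
  std::vector<uint8_t> validity_bitmap_;
  int64_t null_count_;
};
```

In this sketch, constructing `SketchArray(10, {0xF0, 0x03})` and calling `null_count()` counts the four unset bits, while passing an explicit 0 for the same bitmap would report 0 nulls without ever looking at the bitmap.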

@@ -325,7 +325,7 @@ class ARROW_EXPORT Array {
     } else {
       null_bitmap_data_ = NULLPTR;
     }
-    data_ = data;
+    data_ = data->Copy();

Contributor

Any reason to make the copy explicit?

Member Author

Copy() produces a deep copy, whereas data_ = data would mean that subsequently modifying the argument also modifies the array:

shared_ptr<ArrayData> somedata = ArrayData::Make(int32(), 0);
Int32Array arr(somedata);
somedata->length = 3;
arr.length(); // 3
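
The aliasing concern in the snippet above can be reproduced without Arrow. The following is a minimal sketch in which `SketchData`, `AliasingArray`, and `CopyingArray` are hypothetical stand-ins (not Arrow types), and the "copy" duplicates only the small metadata struct. It contrasts storing the caller's `std::shared_ptr` directly with storing a pointer to a fresh copy.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical stand-in for an array's metadata struct.
struct SketchData {
  int64_t length = 0;
};

// Keeps the caller's pointer: later changes made through the caller's
// pointer are visible through this object.
class AliasingArray {
 public:
  explicit AliasingArray(std::shared_ptr<SketchData> data)
      : data_(std::move(data)) {}
  int64_t length() const { return data_->length; }

 private:
  std::shared_ptr<SketchData> data_;
};

// Copies the struct at construction time: later changes made through the
// caller's pointer are not visible through this object.
class CopyingArray {
 public:
  explicit CopyingArray(const std::shared_ptr<SketchData>& data)
      : data_(std::make_shared<SketchData>(*data)) {}
  int64_t length() const { return data_->length; }

 private:
  std::shared_ptr<SketchData> data_;
};
```

If both wrappers are built from the same `std::shared_ptr<SketchData>` and the caller then sets `length = 3` through that pointer, the aliasing wrapper reports the new length and the copying wrapper keeps the old one. Which behaviour is appropriate is a design decision; the sketch only shows the difference.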

Contributor

I had forgotten that data_ is a pointer and not the ArrayData object directly.

Member

Are you sure that this is the expected behaviour? If so, we should test it.
cc @pitrou

Member

This doesn't look right to me

Member

wesm left a comment

Can we add some unit tests?

-    data->null_count = data->length;
-    data_ = data;
+    data_ = data->Copy();
+    data_->null_count = data->length;

Member

I do not think copying is correct. ArrayData is an internal API; copying should be addressed at a higher level.

Member Author

bkietz, Mar 6, 2019

I'll revert it

bkietz force-pushed the ARROW-3550-array-constructor-kUnknownNullCount branch from e18a1cd to fbe4915 on March 6, 2019 20:17

std::unique_ptr<Int32Array> arr_default_null_count(
new Int32Array(100, data, null_bitmap));
ASSERT_EQ(kUnknownNullCount, arr_default_null_count->data()->null_count);

Member Author

@wesm Is that sufficient or would you like to see similar cases for each affected array type?

Member

Yes this works for me! Thanks =)

Member

wesm left a comment

+1

wesm commented Mar 6, 2019

will wait for build to run

wesm closed this in 09466ce on Mar 7, 2019
bkietz deleted the ARROW-3550-array-constructor-kUnknownNullCount branch on April 8, 2019 14:24