[MATLAB] Add a public `Valid` property to the MATLAB `arrow.array.<Array>` classes to query Null values (i.e. validity bitmap support) #35598
Comments
take
kou added a commit that referenced this issue May 28, 2023
…row.array.<Array>` classes to query Null values (i.e. validity bitmap support) (#35655)

### Rationale for this change

Currently, the `arrow.array.<Array>` classes do not support querying the Null values (i.e. validity bitmap) on an Arrow array. Support for encoding Null values is an important part of the Arrow memory format, so the MATLAB Interface to Arrow should support it.

There are likely multiple different APIs that the MATLAB interface should have to support Null values robustly. However, to focus on incremental delivery, we can start by adding a public `Valid` property to the `arrow.array.<Array>` classes, which would return a `logical` array indicating which elements of the given array are valid (i.e. not null).

### What changes are included in this PR?

1. Added a new public property `Valid` to the `arrow.array.Array` superclass.
2. Implemented basic null value handling for `arrow.array.Float64Array` (i.e. treat `NaN` values in the input MATLAB array as null values in the corresponding `arrow.array.Float64Array`).
3. Implemented null value substitution (i.e. substitute null values with `NaN`) for `Float64Array` in the `toMATLAB` and `double` conversion methods.

Example of creating an `arrow.array.Float64Array` from a MATLAB `double` array containing `NaN` values:

```matlab
>> matlabArray = [1, 2, NaN, 4, NaN]'

matlabArray =

     1
     2
   NaN
     4
   NaN

>> arrowArray = arrow.array.Float64Array(matlabArray)

arrowArray =

[
  1,
  2,
  null,
  4,
  null
]

>> arrowArray.Valid

ans =

  5×1 logical array

   1
   1
   0
   1
   0

>> all(~isnan(matlabArray) == arrowArray.Valid)

ans =

  logical

   1
```

### Are these changes tested?

Yes, we have added the following test points for the `Valid` property of `arrow.array.Float64Array`:

1. `ValidBasic`
2. `ValidNoNulls`
3. `ValidAllNulls`
4. `ValidEmpty`

### Are there any user-facing changes?

Yes. There is now a public property `Valid` on the `arrow.array.Float64Array` class, which is a MATLAB `logical` array encoding the null values in the underlying Arrow array, where `true` indicates an element is valid (i.e. not null) and `false` indicates that an element is invalid (i.e. null).

### Future Directions

1. Implement more null value related methods like `isvalid`, `isnull`, `packagedValidityBitmap`, etc.
2. Add null value (i.e. `Valid` property) support to the rest of the `arrow.array.Array` subclasses.

### Notes

1. Thank you to @sgilmore10 for your help with this pull request!

Lead-authored-by: Kevin Gurney <kgurney@mathworks.com>
Co-authored-by: sgilmore10 <74676073+sgilmore10@users.noreply.github.com>
Co-authored-by: Kevin Gurney <kevin.p.gurney@gmail.com>
Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
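The null handling described in this commit message can be illustrated without the Arrow interface at all. The following is a minimal plain-MATLAB sketch, assuming only the behavior stated above (`NaN` in the input is treated as null, and null is converted back to `NaN`). The variable names `values` and `roundTripped` are illustrative and are not part of the `arrow.array` API, and the choice of `0` as the placeholder under null slots is an assumption made here for the example.

```matlab
% Plain-MATLAB sketch of the NaN <-> null semantics; no Arrow calls are made.
matlabArray = [1, 2, NaN, 4, NaN]';

% NaN in the input is treated as null, so validity is the complement of isnan.
valid = ~isnan(matlabArray);

% Stand-in for a values buffer. The slot under a null carries no meaning,
% so this sketch fills it with 0 (an assumption, purely for illustration).
values = matlabArray;
values(~valid) = 0;

% Converting back to MATLAB substitutes NaN wherever the element is null.
roundTripped = values;
roundTripped(~valid) = NaN;

% isequaln treats NaN values as equal, unlike isequal.
assert(isequaln(roundTripped, matlabArray));
```

Here `valid` plays the role of the `Valid` property: `true` marks an element that is not null.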
Issue resolved by pull request 35655
This was referenced Jun 12, 2023
kou added a commit that referenced this issue Jun 13, 2023
### Rationale for this change

Now that the MATLAB interface supports validity bitmaps and bit packing/unpacking (#35598), we can add support for a `BooleanArray` class. This is a follow up to the work on the `NumericArray` classes. `BooleanArray` maps to the MATLAB [`logical`](https://www.mathworks.com/help/matlab/logical-operations.html) type when calling `toMATLAB`.

### What changes are included in this PR?

1. Added a new `arrow.array.BooleanArray` class that can be converted to/from a MATLAB `logical` array.

**Example**:

```matlab
>> matlabArray = logical([true, false, true])'

matlabArray =

  3x1 logical array

   1
   0
   1

>> arrowArray = arrow.array.BooleanArray(matlabArray)

arrowArray =

[
  true,
  false,
  true
]

>> convertedArrowArray = toMATLAB(arrowArray)

convertedArrowArray =

  3x1 logical array

   1
   0
   1
```

### Are these changes tested?

Yes.

1. Added a new `tBooleanArray.m` test class which follows the existing pattern for the `NumericArray` test classes.

### Are there any user-facing changes?

Yes.

1. Added a new user-facing `arrow.array.BooleanArray` class.

### Notes

1. Thank you @sgilmore10 for your help with this pull request!

* Closes: #36040

Lead-authored-by: Kevin Gurney <kgurney@mathworks.com>
Co-authored-by: Kevin Gurney <kevin.p.gurney@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
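For readers unfamiliar with the bit packing/unpacking this commit message refers to, here is a rough plain-MATLAB sketch of the idea. It assumes the Arrow columnar layout, in which one bit is stored per element and bits are numbered from the least significant bit of each byte. It is not the interface's actual implementation, and the variable names are illustrative only.

```matlab
% Sketch: pack a MATLAB logical column into bytes, least-significant bit first,
% then unpack it again. This is an illustration, not the Arrow interface code.
logicalArray = logical([1 0 1 1 0 0 0 0 1 1])';

% Pad to a whole number of bytes; padding bits are set to false.
numBytes = ceil(numel(logicalArray) / 8);
padded = false(numBytes * 8, 1);
padded(1:numel(logicalArray)) = logicalArray;

% Each column of bits is one byte; row 1 is the least significant bit.
bits = reshape(padded, 8, numBytes);
packed = uint8((2 .^ (0:7)) * double(bits));

% Unpack bit by bit. bitget uses 1-based positions, where 1 is the
% least significant bit.
unpacked = false(numBytes * 8, 1);
for k = 1:numel(unpacked)
    byteIndex = floor((k - 1) / 8) + 1;
    bitIndex = mod(k - 1, 8) + 1;
    unpacked(k) = bitget(packed(byteIndex), bitIndex) == 1;
end
unpacked = unpacked(1:numel(logicalArray));

assert(isequal(unpacked, logicalArray));
```

The same layout applies to both a validity bitmap and the values of a Boolean array, which is why the earlier bit-packing work is a prerequisite for `BooleanArray`.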
### Describe the enhancement requested

Currently, the `arrow.array.<Array>` classes do not support querying the Null values (i.e. validity bitmap) on an Arrow array. Support for encoding Null values is an important part of the Arrow memory format, so the MATLAB Interface to Arrow should support it.

There are likely multiple different APIs that the MATLAB interface should have to support Null values robustly. However, to focus on incremental delivery, we can start by adding a public `Valid` property to the `arrow.array.<Array>` classes, which would return a `logical` array indicating which elements of the given array are valid (i.e. not null).

### Component(s)

MATLAB