[MATLAB] Add `arrow.array.ChunkedArray` class #37448
Comments
take
kou pushed a commit that referenced this issue on Sep 3, 2023
### Rationale for this change

In order to add an `arrow.tabular.Table` class to the MATLAB Interface, we first need to add a MATLAB class representing `arrow::ChunkedArray`s. This is required because an `arrow::Table` is backed by a vector of `arrow::ChunkedArray`s, and the output of its `column(int index)` method is an `arrow::ChunkedArray`.

### What changes are included in this PR?

1. Introduced a new class called `arrow.array.ChunkedArray`.
2. `arrow.array.ChunkedArray` has the following properties:
    1. `Type` - Datatype of the `arrow.array.Array`s
    2. `Length` - Sum of the `arrow.array.Array` lengths
    3. `NumChunks` - Number of `arrow.array.Array`s
3. `arrow.array.ChunkedArray` has the following methods:
    1. `chunk(index)` - Returns the `arrow.array.Array` stored at the specified index
    2. `fromArrays(array1, array2, ..., arrayN, Type=type)` - Creates a `ChunkedArray` from the arrays provided. If `Type` is provided, all arrays are expected to have the specified `Type`.

**Example Usage**

```matlab
>> a1 = arrow.array(1:100);
>> a2 = arrow.array(101:250);
>> a3 = arrow.array(251:300);

% Create a ChunkedArray from 3 Float64Arrays
>> c = arrow.array.ChunkedArray.fromArrays(a1, a2, a3)

c = 

  ChunkedArray with properties:

         Type: [1×1 arrow.type.Float64Type]
    NumChunks: 3
       Length: 300

% Extract the first chunk and compare it to a1
>> c1 = c.chunk(1);
>> tf = isequal(c1, a1)

tf = 

  logical

   1

% Create an empty ChunkedArray by providing the Type nv-pair
>> c = arrow.array.ChunkedArray.fromArrays(Type=arrow.timestamp())

c = 

  ChunkedArray with properties:

         Type: [1×1 arrow.type.TimestampType]
    NumChunks: 0
       Length: 0
```

### Are these changes tested?

Yes. I added a new test class called `tChunkedArray.m` that contains unit tests for the new class.

### Are there any user-facing changes?

Yes. Users can now create a `ChunkedArray` in the MATLAB Interface.

### Future Directions

1. In this PR, we deliberately didn't include a convenience constructor function because we're not sure if we want users to create `ChunkedArray`s themselves. We think users will mostly use `ChunkedArray` when extracting columns from `Table`s.
2. We will implement more methods on `ChunkedArray`, such as `flatten()` and `combineChunks()`.

* Closes: #37448

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
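The property and method definitions in the commit message can be illustrated with a small conceptual model. The sketch below is plain Python, not the MATLAB implementation and not the Arrow C++ API: the names `ArrayModel`, `ChunkedArrayModel`, `from_arrays`, and `type_`, and the use of a string as the type tag, are all assumptions made for illustration. It only mirrors the behavior the commit message describes: `Length` is the sum of the chunk lengths, `NumChunks` is the number of chunks, `chunk(index)` is 1-based as in the MATLAB example, and an empty chunked array needs an explicit type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayModel:
    """Hypothetical stand-in for an arrow.array.Array: a type tag plus values."""
    type: str
    values: tuple


@dataclass(frozen=True)
class ChunkedArrayModel:
    """Hypothetical model of a chunked array: an ordered tuple of same-typed chunks."""
    type: str
    chunks: tuple = ()

    @classmethod
    def from_arrays(cls, *arrays, type_=None):
        # Assumption: with no arrays there is nothing to infer the type from,
        # so the caller must supply it (as in the empty-ChunkedArray example).
        if not arrays and type_ is None:
            raise ValueError("type_ is required when no arrays are given")
        expected = type_ if type_ is not None else arrays[0].type
        # Every chunk must share the expected type.
        for position, array in enumerate(arrays, start=1):
            if array.type != expected:
                raise TypeError(
                    f"array {position} has type {array.type}, expected {expected}"
                )
        return cls(type=expected, chunks=tuple(arrays))

    @property
    def num_chunks(self):
        # NumChunks: number of underlying arrays.
        return len(self.chunks)

    @property
    def length(self):
        # Length: sum of the lengths of the underlying arrays.
        return sum(len(chunk.values) for chunk in self.chunks)

    def chunk(self, index):
        # 1-based indexing, mirroring c.chunk(1) in the MATLAB example.
        if not 1 <= index <= self.num_chunks:
            raise IndexError(f"chunk index {index} out of range 1..{self.num_chunks}")
        return self.chunks[index - 1]


a1 = ArrayModel("float64", tuple(range(1, 101)))
a2 = ArrayModel("float64", tuple(range(101, 251)))
a3 = ArrayModel("float64", tuple(range(251, 301)))

c = ChunkedArrayModel.from_arrays(a1, a2, a3)
print(c.num_chunks, c.length)  # prints "3 300"
print(c.chunk(1) == a1)        # prints "True"

empty = ChunkedArrayModel.from_arrays(type_="timestamp")
print(empty.num_chunks, empty.length)  # prints "0 0"
```

Keeping the chunks as separate arrays, rather than concatenating them, is what lets a table column be assembled from several pieces without copying; the model keeps that shape by storing the chunks untouched and computing `length` on demand.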
Closed
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue on Nov 13, 2023
…#37525)
dgreiss pushed a commit to dgreiss/arrow that referenced this issue on Feb 19, 2024
…#37525)
### Describe the enhancement requested

In order to add an `arrow.tabular.Table` class to the MATLAB Interface, we first need to add a MATLAB class representing `arrow::ChunkedArray`s. This is required because an `arrow::Table` is backed by a vector of `arrow::ChunkedArray`s, and the output of its `column(int index)` method is an `arrow::ChunkedArray`.

### Component(s)

MATLAB