[MATLAB] Add `arrow.tabular.Table` MATLAB class #37571
Comments
kevingurney added a commit that referenced this issue on Sep 7, 2023:
### Rationale for this change

Following on from #37525, which adds `arrow.array.ChunkedArray` to the MATLAB interface, this pull request adds support for a new `arrow.tabular.Table` MATLAB class.

This pull request is intended to be an initial implementation of `Table` support and does not include all methods or properties that may be useful on `arrow.tabular.Table`.

### What changes are included in this PR?

1. Added new `arrow.tabular.Table` MATLAB class.

**Properties**

* `NumRows`
* `NumColumns`
* `ColumnNames`
* `Schema`

**Methods**

* `fromArrays(<array-1>, ..., <array-N>)`
* `column(<index>)`
* `table()`
* `toMATLAB()`

**Example of `arrow.tabular.Table.fromArrays(<array-1>, ..., <array-N>)` static construction method**

```matlab
>> arrowTable = arrow.tabular.Table.fromArrays(arrow.array([1, 2, 3]), arrow.array(["A", "B", "C"]), arrow.array([true, false, true]))

arrowTable =

Column1: double
Column2: string
Column3: bool
----
Column1: [ [ 1, 2, 3 ] ]
Column2: [ [ "A", "B", "C" ] ]
Column3: [ [ true, false, true ] ]

>> matlabTable = table(arrowTable)

matlabTable =

  3×3 table

    Column1    Column2    Column3
    _______    _______    _______

       1         "A"       true
       2         "B"       false
       3         "C"       true
```

2. Added a new `arrow.table(<matlab-table>)` construction function which creates an `arrow.tabular.Table` from a MATLAB `table`.

**Example of `arrow.table(<matlab-table>)` construction function**

```matlab
>> matlabTable = table([1; 2; 3], ["A"; "B"; "C"], [true; false; true])

matlabTable =

  3×3 table

    Var1    Var2    Var3
    ____    ____    _____

     1      "A"     true
     2      "B"     false
     3      "C"     true

>> arrowTable = arrow.table(matlabTable)

arrowTable =

Var1: double
Var2: string
Var3: bool
----
Var1: [ [ 1, 2, 3 ] ]
Var2: [ [ "A", "B", "C" ] ]
Var3: [ [ true, false, true ] ]

>> arrowTable.NumRows

ans =

  int64

   3

>> arrowTable.NumColumns

ans =

  int32

   3

>> arrowTable.ColumnNames

ans =

  1×3 string array

    "Var1"    "Var2"    "Var3"

>> arrowTable.Schema

ans =

Var1: double
Var2: string
Var3: bool

>> table(arrowTable)

ans =

  3×3 table

    Var1    Var2    Var3
    ____    ____    _____

     1      "A"     true
     2      "B"     false
     3      "C"     true

>> isequal(ans, matlabTable)

ans =

  logical

   1
```

### Are these changes tested?

Yes.

1. Added a new `tTable` test class for `arrow.tabular.Table` and `arrow.table(<matlab-table>)` tests.

### Are there any user-facing changes?

Yes.

1. Users can now create `arrow.tabular.Table` objects using the `fromArrays` static construction method or the `arrow.table(<matlab-table>)` construction function.

### Future Directions

1. Create shared test infrastructure for common `RecordBatch` and `Table` MATLAB tests.
2. Implement equality check (i.e. `isequal`) for `arrow.tabular.Table` instances.
3. Add more static construction methods to `arrow.tabular.Table`. For example: `fromChunkedArrays(<chunkedArray-1>, ..., <chunkedArray-N>)` and `fromRecordBatches(<recordBatch-1>, ..., <recordBatch-N>)`.

### Notes

1. A lot of the code for `arrow.tabular.Table` is very similar to the code for `arrow.tabular.RecordBatch`. It may make sense for us to try to share more of the code using C++ templates or another approach.
2. Thank you @sgilmore10 for your help with this pull request!

* Closes: #37571

Lead-authored-by: Kevin Gurney <kgurney@mathworks.com>
Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
### Describe the enhancement requested

Now that `arrow.array.ChunkedArray` has been added to the MATLAB interface (#37525), we can implement an `arrow.tabular.Table` class.

Methods that may be useful for `arrow.tabular.Table` include:

* `fromArrays(arrays, nvpairs)`
* `fromChunkedArrays(chunkedArrays, nvpairs)`
* `fromRecordBatches(recordBatches, nvpairs)`
* `column(i)` -> get the `i`th column as an `arrow.array.ChunkedArray`
* `field(i)` -> get the `i`th field as an `arrow.type.Field`
* `toMATLAB()` -> convert to a MATLAB `table`
* `table()` -> convert to a MATLAB `table`

Properties that may be useful for `arrow.tabular.Table` include:

* `Schema`
* `NumColumns`
* `NumRows`
* `ColumnNames`
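
To make the proposed surface concrete, here is a minimal usage sketch. It assumes a build of the MATLAB interface in which `arrow.table`, `column`, `toMATLAB`, `table`, and the four properties behave as described in this issue and in the referenced commit. The variable names are illustrative only. `field(i)`, `fromChunkedArrays`, and `fromRecordBatches` are left out because this page only proposes them.

```matlab
% Sketch only: assumes arrow.table and arrow.tabular.Table are available
% and behave as described on this page.

% Start from an ordinary MATLAB table and convert it to an Arrow table.
matlabTable = table([1; 2; 3], ["A"; "B"; "C"], [true; false; true]);
arrowTable = arrow.table(matlabTable);

% Read the proposed properties.
numRows = arrowTable.NumRows;
numColumns = arrowTable.NumColumns;
columnNames = arrowTable.ColumnNames;
schema = arrowTable.Schema;

% column(i) is described as returning the i-th column as an
% arrow.array.ChunkedArray.
firstColumn = arrowTable.column(1);

% toMATLAB() and table() are both described as converting back to a
% MATLAB table, so the two results should match the original input.
viaToMATLAB = arrowTable.toMATLAB();
viaTable = table(arrowTable);
```

Comparing `viaTable` against `matlabTable` with `isequal` mirrors the round-trip check shown in the commit message for this issue.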

### Component(s)

MATLAB