GH-37571: [MATLAB] Add `arrow.tabular.Table` MATLAB class #37620
Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
2. Add `column` method tests. Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Thank you for working on this!
empty `arrow.tabular.Table`. Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
+1
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit da602af. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.
…he#37620)

### Rationale for this change

Following on from apache#37525, which adds `arrow.array.ChunkedArray` to the MATLAB interface, this pull request adds support for a new `arrow.tabular.Table` MATLAB class.

This pull request is intended to be an initial implementation of `Table` support and does not include all methods or properties that may be useful on `arrow.tabular.Table`.

### What changes are included in this PR?

1. Added new `arrow.tabular.Table` MATLAB class.

**Properties**

* `NumRows`
* `NumColumns`
* `ColumnNames`
* `Schema`

**Methods**

* `fromArrays(<array-1>, ..., <array-N>)`
* `column(<index>)`
* `table()`
* `toMATLAB()`

**Example of `arrow.tabular.Table.fromArrays(<array-1>, ..., <array-N>)` static construction method**

```matlab
>> arrowTable = arrow.tabular.Table.fromArrays(arrow.array([1, 2, 3]), arrow.array(["A", "B", "C"]), arrow.array([true, false, true]))

arrowTable =

Column1: double
Column2: string
Column3: bool
----
Column1: [ [ 1, 2, 3 ] ]
Column2: [ [ "A", "B", "C" ] ]
Column3: [ [ true, false, true ] ]

>> matlabTable = table(arrowTable)

matlabTable =

  3×3 table

    Column1    Column2    Column3
    _______    _______    _______

       1         "A"       true
       2         "B"       false
       3         "C"       true
```

2. Added a new `arrow.table(<matlab-table>)` construction function which creates an `arrow.tabular.Table` from a MATLAB `table`.

**Example of `arrow.table(<matlab-table>)` construction function**

```matlab
>> matlabTable = table([1; 2; 3], ["A"; "B"; "C"], [true; false; true])

matlabTable =

  3×3 table

    Var1    Var2    Var3
    ____    ____    _____

     1      "A"     true
     2      "B"     false
     3      "C"     true

>> arrowTable = arrow.table(matlabTable)

arrowTable =

Var1: double
Var2: string
Var3: bool
----
Var1: [ [ 1, 2, 3 ] ]
Var2: [ [ "A", "B", "C" ] ]
Var3: [ [ true, false, true ] ]

>> arrowTable.NumRows

ans =

  int64

   3

>> arrowTable.NumColumns

ans =

  int32

   3

>> arrowTable.ColumnNames

ans =

  1×3 string array

    "Var1"    "Var2"    "Var3"

>> arrowTable.Schema

ans =

Var1: double
Var2: string
Var3: bool

>> table(arrowTable)

ans =

  3×3 table

    Var1    Var2    Var3
    ____    ____    _____

     1      "A"     true
     2      "B"     false
     3      "C"     true

>> isequal(ans, matlabTable)

ans =

  logical

   1
```

### Are these changes tested?

Yes.

1. Added a new `tTable` test class for `arrow.tabular.Table` and `arrow.table(<matlab-table>)` tests.

### Are there any user-facing changes?

Yes.

1. Users can now create `arrow.tabular.Table` objects using the `fromArrays` static construction method or the `arrow.table(<matlab-table>)` construction function.

### Future Directions

1. Create shared test infrastructure for common `RecordBatch` and `Table` MATLAB tests.
2. Implement equality check (i.e. `isequal`) for `arrow.tabular.Table` instances.
3. Add more static construction methods to `arrow.tabular.Table`. For example: `fromChunkedArrays(<chunkedArray-1>, ..., <chunkedArray-N>)` and `fromRecordBatches(<recordBatch-1>, ..., <recordBatch-N>)`.

### Notes

1. A lot of the code for `arrow.tabular.Table` is very similar to the code for `arrow.tabular.RecordBatch`. It may make sense for us to try to share more of the code using C++ templates or another approach.
2. Thank you @sgilmore10 for your help with this pull request!

* Closes: apache#37571

Lead-authored-by: Kevin Gurney <kgurney@mathworks.com>
Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
### Rationale for this change

Following on from #37525, which adds `arrow.array.ChunkedArray` to the MATLAB interface, this pull request adds support for a new `arrow.tabular.Table` MATLAB class.

This pull request is intended to be an initial implementation of `Table` support and does not include all methods or properties that may be useful on `arrow.tabular.Table`.

### What changes are included in this PR?

1. Added new `arrow.tabular.Table` MATLAB class.

**Properties**

* `NumRows`
* `NumColumns`
* `ColumnNames`
* `Schema`

**Methods**

* `fromArrays(<array-1>, ..., <array-N>)`
* `column(<index>)`
* `table()`
* `toMATLAB()`

**Example of `arrow.tabular.Table.fromArrays(<array-1>, ..., <array-N>)` static construction method**

2. Added a new `arrow.table(<matlab-table>)` construction function which creates an `arrow.tabular.Table` from a MATLAB `table`.

**Example of `arrow.table(<matlab-table>)` construction function**

### Are these changes tested?

Yes.

1. Added a new `tTable` test class for `arrow.tabular.Table` and `arrow.table(<matlab-table>)` tests.

### Are there any user-facing changes?

Yes.

1. Users can now create `arrow.tabular.Table` objects using the `fromArrays` static construction method or the `arrow.table(<matlab-table>)` construction function.

### Future Directions

1. Create shared test infrastructure for common `RecordBatch` and `Table` MATLAB tests.
2. Implement equality check (i.e. `isequal`) for `arrow.tabular.Table` instances.
3. Add more static construction methods to `arrow.tabular.Table`. For example: `fromChunkedArrays(<chunkedArray-1>, ..., <chunkedArray-N>)` and `fromRecordBatches(<recordBatch-1>, ..., <recordBatch-N>)`.

### Notes

1. A lot of the code for `arrow.tabular.Table` is very similar to the code for `arrow.tabular.RecordBatch`. It may make sense for us to try to share more of the code using C++ templates or another approach.
2. Thank you @sgilmore10 for your help with this pull request!

* Closes: [MATLAB] Add `arrow.tabular.Table` MATLAB class #37571
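
The `column(<index>)` and `toMATLAB()` methods are listed in the description without an example. The following is a minimal sketch of how they might be called, not output captured from the PR. It assumes that `column` accepts a 1-based numeric index (following MATLAB convention) and that `toMATLAB()` yields a MATLAB `table`, as `table(arrowTable)` does; neither detail is stated in the description. The variable names are illustrative.

```matlab
% Build an arrow.tabular.Table from three Arrow arrays
% (same construction call as in the PR description).
arrowTable = arrow.tabular.Table.fromArrays( ...
    arrow.array([1, 2, 3]), ...
    arrow.array(["A", "B", "C"]), ...
    arrow.array([true, false, true]));

% Retrieve the second column by numeric index.
% Assumption: indices are 1-based, as is conventional in MATLAB.
secondColumn = arrowTable.column(2);

% Convert the whole Arrow Table back to a MATLAB table.
% Assumption: toMATLAB() produces a MATLAB table, like table(arrowTable).
matlabTable = arrowTable.toMATLAB();

% Inspect the results without relying on any particular display format.
disp(class(secondColumn));
disp(matlabTable);
```

Since `isequal` for `arrow.tabular.Table` is listed under Future Directions, comparing the converted MATLAB tables (as the `arrow.table` example does) is probably the safer way to check round-trips for now.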