[MATLAB] Add arrow.tabular.Table MATLAB class #37571

Closed
kevingurney opened this issue Sep 5, 2023 · 0 comments · Fixed by #37620

kevingurney commented Sep 5, 2023

Describe the enhancement requested

Now that arrow.array.ChunkedArray has been added to the MATLAB interface (#37525), we can implement an arrow.tabular.Table class.

Methods that may be useful for arrow.tabular.Table include:

  • fromArrays(arrays, nvpairs)
  • fromChunkedArrays(chunkedArrays, nvpairs)
  • fromRecordBatches(recordBatches, nvpairs)
  • column(i) -> get the ith column as an arrow.array.ChunkedArray
  • field(i) -> get the ith field as an arrow.type.Field
  • toMATLAB() -> convert to a MATLAB table
  • table() -> convert to a MATLAB table

Properties that may be useful for arrow.tabular.Table include:

  • Schema
  • NumColumns
  • NumRows
  • ColumnNames

Component(s)

MATLAB

kevingurney self-assigned this Sep 6, 2023
kevingurney added a commit that referenced this issue Sep 7, 2023
### Rationale for this change

Following on from #37525, which adds `arrow.array.ChunkedArray` to the MATLAB interface, this pull request adds support for a new `arrow.tabular.Table` MATLAB class.

This pull request is intended to be an initial implementation of `Table` support and does not include all methods or properties that may be useful on `arrow.tabular.Table`.

### What changes are included in this PR?

1. Added new `arrow.tabular.Table` MATLAB class.

**Properties**

* `NumRows`
* `NumColumns`
* `ColumnNames`
* `Schema`

**Methods**

* `fromArrays(<array-1>, ..., <array-N>)`
* `column(<index>)`
* `table()`
* `toMATLAB()`

**Example of `arrow.tabular.Table.fromArrays(<array-1>, ..., <array-N>)` static construction method**
```matlab
>> arrowTable = arrow.tabular.Table.fromArrays(arrow.array([1, 2, 3]), arrow.array(["A", "B", "C"]), arrow.array([true, false, true]))

arrowTable = 

Column1: double
Column2: string
Column3: bool
----
Column1:
  [
    [
      1,
      2,
      3
    ]
  ]
Column2:
  [
    [
      "A",
      "B",
      "C"
    ]
  ]
Column3:
  [
    [
      true,
      false,
      true
    ]
  ]

>> matlabTable = table(arrowTable)

matlabTable =

  3×3 table

    Column1    Column2    Column3
    _______    _______    _______

       1         "A"       true  
       2         "B"       false 
       3         "C"       true  
```
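
The `column(<index>)` and `toMATLAB()` methods listed above do not appear in the example. A minimal sketch of how they might be called follows. It assumes that `column` takes a 1-based numeric index and returns an `arrow.array.ChunkedArray` (as proposed in the issue), and that `toMATLAB()` yields the same MATLAB `table` as `table()`. Neither assumption is confirmed by the output shown here.

```matlab
% Build a small Arrow table from two Arrow arrays, as in the example above.
arrowTable = arrow.tabular.Table.fromArrays(arrow.array([1, 2, 3]), arrow.array(["A", "B", "C"]));

% Assumption: column(<index>) is 1-based and returns an arrow.array.ChunkedArray.
firstColumn = arrowTable.column(1);
disp(class(firstColumn))

% Both toMATLAB() and table() are listed as conversions to a MATLAB table.
t1 = toMATLAB(arrowTable);
t2 = table(arrowTable);

% Assumption: the two conversions produce equal MATLAB tables.
disp(isequal(t1, t2))
```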

2. Added a new `arrow.table(<matlab-table>)` construction function that creates an `arrow.tabular.Table` from a MATLAB `table`.

**Example of `arrow.table(<matlab-table>)` construction function**
```matlab
>> matlabTable = table([1; 2; 3], ["A"; "B"; "C"], [true; false; true])

matlabTable =

  3×3 table

    Var1    Var2    Var3 
    ____    ____    _____

     1      "A"     true 
     2      "B"     false
     3      "C"     true 

>> arrowTable = arrow.table(matlabTable)

arrowTable = 

Var1: double
Var2: string
Var3: bool
----
Var1:
  [
    [
      1,
      2,
      3
    ]
  ]
Var2:
  [
    [
      "A",
      "B",
      "C"
    ]
  ]
Var3:
  [
    [
      true,
      false,
      true
    ]
  ]

>> arrowTable.NumRows

ans =

  int64

   3

>> arrowTable.NumColumns

ans =

  int32

   3

>> arrowTable.ColumnNames

ans = 

  1×3 string array

    "Var1"    "Var2"    "Var3"

>> arrowTable.Schema

ans = 

Var1: double
Var2: string
Var3: bool

>> table(arrowTable)

ans =

  3×3 table

    Var1    Var2    Var3 
    ____    ____    _____

     1      "A"     true 
     2      "B"     false
     3      "C"     true 

>> isequal(ans, matlabTable)

ans =

  logical

   1
```

### Are these changes tested?

Yes.

1. Added a new `tTable` test class for `arrow.tabular.Table` and `arrow.table(<matlab-table>)` tests.

### Are there any user-facing changes?

Yes.

1. Users can now create `arrow.tabular.Table` objects using the `fromArrays` static construction method or the `arrow.table(<matlab-table>)` construction function.

### Future Directions

1. Create shared test infrastructure for common `RecordBatch` and `Table` MATLAB tests.
2. Implement equality check (i.e. `isequal`) for `arrow.tabular.Table` instances.
3. Add more static construction methods to `arrow.tabular.Table`. For example: `fromChunkedArrays(<chunkedArray-1>, ..., <chunkedArray-N>)` and `fromRecordBatches(<recordBatch-1>, ..., <recordBatch-N>)`.
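
Until an equality check exists for `arrow.tabular.Table`, one possible stopgap is to compare the converted MATLAB tables instead. The sketch below is only an illustration: `arrowTablesMatch` is a hypothetical helper name, not part of the MATLAB Arrow interface. It compares converted values and variable names only, so it may not distinguish tables that differ in chunk layout or in Arrow-level type details.

```matlab
% Two Arrow tables built from the same MATLAB table.
matlabTable = table([1; 2; 3], ["A"; "B"; "C"]);
a = arrow.table(matlabTable);
b = arrow.table(matlabTable);
disp(arrowTablesMatch(a, b))

function tf = arrowTablesMatch(t1, t2)
% Hypothetical helper (an assumption for illustration, not an Arrow API).
% Converts both Arrow tables to MATLAB tables and compares those.
tf = isequal(table(t1), table(t2));
end
```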

### Notes

1. Much of the code for `arrow.tabular.Table` closely mirrors the code for `arrow.tabular.RecordBatch`. It may make sense to share more of it using C++ templates or another approach.
2. Thank you @sgilmore10 for your help with this pull request!
* Closes: #37571

Lead-authored-by: Kevin Gurney <kgurney@mathworks.com>
Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
kevingurney added this to the 14.0.0 milestone Sep 7, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024