Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MATLAB] The column(index) method of arrow.tabular.RecordBatch errors if index refers to an arrow.array.Time32Array column #37340

Closed
sgilmore10 opened this issue Aug 23, 2023 · 1 comment · Fixed by #37347

Comments

@sgilmore10
Copy link
Member

sgilmore10 commented Aug 23, 2023

Describe the bug, including details regarding any error messages, version, and platform.

The column(index) method of arrow.tabular.RecordBatch errors if index refers to an arrow.array.Time32Array column.

>> string = arrow.array(["A", "B", "C"]);
>> time32 = arrow.array.Time32Array.fromMATLAB(seconds(1:3));
>> rb = arrow.tabular.RecordBatch.fromArrays(string, time32, ColumnNames=["String", "Time32"])

rb = 

String:   [
    "A",
    "B",
    "C"
  ]
Time32:   [
    00:00:01,
    00:00:02,
    00:00:03
  ]

>> time32Column = rb.column(2)

Error using  . 
Unsupported DataType: time32[s]

Error in arrow.tabular.RecordBatch/column (line 63)
            [proxyID, typeID] = obj.Proxy.getColumnByIndex(args);

column(index) is throwing this error because the case for Time32 was not added to arrow::matlab::array::proxy::wrap(arrowArray) in #37315. column(index) calls this function to create proxy objects around existing arrow arrays.existing arrow arrays. Adding the case for Time32 will resolve this bug.

Component(s)

MATLAB

@sgilmore10
Copy link
Member Author

take

@sgilmore10 sgilmore10 changed the title [MATLAB] The column(index) method of arrow.tabular.RecordBatch errors if index refers to a arrow.array.Time32Array column [MATLAB] The column(index) method of arrow.tabular.RecordBatch errors if index refers to an arrow.array.Time32Array column Aug 23, 2023
kevingurney pushed a commit that referenced this issue Aug 23, 2023
…dBatch` errors if `index` refers to an `arrow.array.Time32Array` column (#37347)

### Rationale for this change

The `column(index)` method of `arrow.tabular.RecordBatch` errors  if `index` refers to an `arrow.array.Time32Array` column.

```matlab
>> string = arrow.array(["A", "B", "C"]);
>> time32 = arrow.array.Time32Array.fromMATLAB(seconds(1:3));
>> rb = arrow.tabular.RecordBatch.fromArrays(string, time32, ColumnNames=["String", "Time32"])

rb = 

String:   [
    "A",
    "B",
    "C"
  ]
Time32:   [
    00:00:01,
    00:00:02,
    00:00:03
  ]

>> time32Column = rb.column(2)

Error using  . 
Unsupported DataType: time32[s]

Error in arrow.tabular.RecordBatch/column (line 63)
            [proxyID, typeID] = obj.Proxy.getColumnByIndex(args);
```

`column(index)` is throwing this error because the case for `Time32` was not added to `arrow::matlab::array::proxy::wrap(arrowArray)` in #37315. `column(index)` calls this function to create proxy objects around existing arrow arrays. Adding the case for `Time32` will resolve this bug.

### What changes are included in this PR?

1. Updated `arrow::Result<proxy::Array> wrap(const std::shared_ptr<arrow::Array>& array)` to handle wrapping `arrow::Time32Array` instances within `proxy::Time32Array`s.
2. Added a new test utility called `arrow.internal.test.tabular.createAllSupportedArrayTypes`, which returns two cell arrays: `arrowArrays` and `matlabData`. The `arrowArrays` cell array contains one instance of each concrete subclass of `arrow.array.Array`. The `matlabData` cell array contains the MATLAB arrays used to generate each array in `arrowArrays`. This utility can be used to create an `arrow.array.RecordBatch` that contains all the arrow array types that are supported by the MATLAB interface. 

### Are these changes tested?

Yes. Updated the `fromArrays` test cases in `tRecordBatch` to verify the `column(index)` method of `arrow.type.RecordBatch` supports extracting columns of any arrow type (supported by the MATLAB Interface).

### Are there any user-facing changes?

Yes. Fixed a bug in `arrow.tabular.RecordBatch`.

### Future Directions

1. #37345

* Closes: #37340

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
@kevingurney kevingurney added this to the 14.0.0 milestone Aug 23, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
….RecordBatch` errors if `index` refers to an `arrow.array.Time32Array` column (apache#37347)

### Rationale for this change

The `column(index)` method of `arrow.tabular.RecordBatch` errors  if `index` refers to an `arrow.array.Time32Array` column.

```matlab
>> string = arrow.array(["A", "B", "C"]);
>> time32 = arrow.array.Time32Array.fromMATLAB(seconds(1:3));
>> rb = arrow.tabular.RecordBatch.fromArrays(string, time32, ColumnNames=["String", "Time32"])

rb = 

String:   [
    "A",
    "B",
    "C"
  ]
Time32:   [
    00:00:01,
    00:00:02,
    00:00:03
  ]

>> time32Column = rb.column(2)

Error using  . 
Unsupported DataType: time32[s]

Error in arrow.tabular.RecordBatch/column (line 63)
            [proxyID, typeID] = obj.Proxy.getColumnByIndex(args);
```

`column(index)` is throwing this error because the case for `Time32` was not added to `arrow::matlab::array::proxy::wrap(arrowArray)` in apache#37315. `column(index)` calls this function to create proxy objects around existing arrow arrays. Adding the case for `Time32` will resolve this bug.

### What changes are included in this PR?

1. Updated `arrow::Result<proxy::Array> wrap(const std::shared_ptr<arrow::Array>& array)` to handle wrapping `arrow::Time32Array` instances within `proxy::Time32Array`s.
2. Added a new test utility called `arrow.internal.test.tabular.createAllSupportedArrayTypes`, which returns two cell arrays: `arrowArrays` and `matlabData`. The `arrowArrays` cell array contains one instance of each concrete subclass of `arrow.array.Array`. The `matlabData` cell array contains the MATLAB arrays used to generate each array in `arrowArrays`. This utility can be used to create an `arrow.array.RecordBatch` that contains all the arrow array types that are supported by the MATLAB interface. 

### Are these changes tested?

Yes. Updated the `fromArrays` test cases in `tRecordBatch` to verify the `column(index)` method of `arrow.type.RecordBatch` supports extracting columns of any arrow type (supported by the MATLAB Interface).

### Are there any user-facing changes?

Yes. Fixed a bug in `arrow.tabular.RecordBatch`.

### Future Directions

1. apache#37345

* Closes: apache#37340

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment