[MATLAB] Add arrow.buffer.Buffer class to the MATLAB Interface #38015

Closed
sgilmore10 opened this issue Oct 4, 2023 · 1 comment · Fixed by #38020

Comments

@sgilmore10
Member

Describe the enhancement requested

To unblock use cases that are not satisfied by the default Arrow -> MATLAB conversions (i.e. the toMATLAB() method on arrow.array.Array), we would like to expose the underlying Arrow data representation as a property on arrow.array.Array. One possible name for this property would be DataLayout, which would be an arrow.array.DataLayout object. Note that this class does not yet exist, so we would have to add it.

For example, the DataLayout property for temporal array types would return an object of the following class type:

classdef TemporalDataLayout < arrow.array.DataLayout
    properties
       Values % an arrow.array.Int32Array or an arrow.array.Int64Array
       Valid  % an arrow.buffer.Buffer 
    end
end

However, the Valid property on this class would need to be an arrow.buffer.Buffer object, which does not yet exist in the MATLAB interface. Therefore, it would be helpful to first add the arrow.buffer.Buffer class before adding the DataLayout property/class hierarchy. It's worth mentioning that adding arrow.buffer.Buffer will open up additional advanced use cases in the future.
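
As a rough illustration of why raw buffer access matters for Valid: the Arrow columnar format stores validity as a bitmap, one bit per element, least-significant bit first, where a set bit marks a valid (non-null) element. The sketch below assumes the bytes of such a bitmap are already available as a uint8 column vector (the shape that a buffer's toMATLAB method is described as returning). The names validityBytes and numElements, and their values, are hypothetical and chosen only for illustration. They are not output from the MATLAB interface.

```matlab
% Hypothetical validity bitmap bytes for a 10-element array.
% 5 = 00000101b -> elements 0 and 2 are valid
% 1 = 00000001b -> element 8 is valid
validityBytes = uint8([5; 1]);
numElements = 10;

% Zero-based element indices, as a column vector.
idx = (0:numElements-1)';

% Element i lives in byte floor(i/8), at bit mod(i,8), counted from the LSB.
% MATLAB indexing and bitget are both one-based, hence the +1.
byteIdx = floor(idx / 8) + 1;
bitIdx  = mod(idx, 8) + 1;

% Extract one bit per element and convert to a logical column vector.
valid = logical(bitget(validityBytes(byteIdx), bitIdx));
```

Here valid(1), valid(3), and valid(9) are true (zero-based elements 0, 2, and 8) and the remaining entries are false. Whether a future DataLayout class would expose the bitmap in exactly this form is an assumption.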

Component(s)

MATLAB

@sgilmore10
Member Author

take

kevingurney pushed a commit that referenced this issue Oct 10, 2023
…rface (#38020)

### Rationale for this change

To unblock use cases that are not satisfied by the default Arrow -> MATLAB conversions (i.e. the `toMATLAB()` method on `arrow.array.Array`), we would like to expose the underlying Arrow data representation as a property on `arrow.array.Array`. One possible name for this property would be `DataLayout`, which would be an `arrow.array.DataLayout` object. Note that this class does not yet exist, so we would have to add it.

For example, the `DataLayout` property for temporal array types would return an object of the following class type: 

```matlab
classdef TemporalDataLayout < arrow.array.DataLayout
    properties
       Values % an arrow.array.Int32Array or an arrow.array.Int64Array
       Valid  % an arrow.buffer.Buffer 
    end
end
```

However, the `Valid` property on this class would need to be an `arrow.buffer.Buffer` object, which does not yet exist in the MATLAB interface.  Therefore, it would be helpful to first add the `arrow.buffer.Buffer` class before adding the `DataLayout` property/class hierarchy. It's worth mentioning that adding `arrow.buffer.Buffer` will open up additional advanced use cases in the future.

### What changes are included in this PR?

Added `arrow.buffer.Buffer` MATLAB class.

*Properties of `arrow.buffer.Buffer`*
1. `NumBytes` - a scalar `int64` value representing the size of the buffer in bytes.

*Methods of `arrow.buffer.Buffer`*
1. `toMATLAB` - returns the data in the buffer as an `Nx1` `uint8` vector, where `N` is the number of bytes.
2. `fromMATLAB(data)` - Static method that creates an `arrow.buffer.Buffer` from a numeric array. 
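
For a numeric input, the byte count one would expect `NumBytes` to report is the number of elements times the width of each element. The sketch below computes that expected value using only MATLAB's built-in `typecast`, so it makes no claims about the Arrow interface beyond what is described above. `dataIn` is a hypothetical input chosen for illustration.

```matlab
% Hypothetical input: three int32 values, 4 bytes each.
dataIn = int32([10 20 30]);

% typecast reinterprets the underlying memory without changing any bits.
rawBytes = typecast(dataIn, "uint8");

% 3 elements * 4 bytes per element = 12 bytes.
expectedNumBytes = numel(rawBytes);

% Reinterpreting the bytes as int32 recovers the original values.
roundTrip = typecast(rawBytes, "int32");
assert(isequal(roundTrip, dataIn));
```

If `fromMATLAB` copies the input's bytes as described, a buffer created from `dataIn` would presumably report a `NumBytes` equal to `expectedNumBytes`. That is an assumption, not something verified here.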

**Example:**
```matlab
>> dataIn = [1 2];
>> buffer = arrow.buffer.Buffer.fromMATLAB(dataIn)

buffer = 

  Buffer with properties:

    NumBytes: 16

>> dataOut = toMATLAB(buffer)

dataOut =

  16×1 uint8 column vector

     0
     0
     0
     0
     0
     0
   240
    63
     0
     0
     0
     0
     0
     0
     0
    64

% Reinterpret bit pattern as a double array 
>> toDouble = typecast(dataOut, "double")

toDouble =

     1
     2
```
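
The byte pattern in the example is the little-endian IEEE 754 encoding: `1.0` is `0x3FF0000000000000`, so with the low byte first it appears as six zeros followed by `240` (`0xF0`) and `63` (`0x3F`). `typecast` interprets bytes in the machine's native order. The following sketch therefore checks the byte order with the built-in `computer` function before reinterpreting. It works on a hand-written copy of the bytes shown above, not on a real `arrow.buffer.Buffer`, and the byte-swapping branch is only a guard for a hypothetical big-endian host.

```matlab
% Bytes copied from the example output above (little-endian doubles 1 and 2).
bytes = uint8([0 0 0 0 0 0 240 63 0 0 0 0 0 0 0 64])';

% The third output of computer is 'L' (little-endian) or 'B' (big-endian).
[~, ~, endian] = computer;

if strcmp(endian, 'B')
    % Reverse the bytes within each 8-byte group so that the native
    % big-endian interpretation yields the same values.
    bytes = reshape(flipud(reshape(bytes, 8, [])), [], 1);
end

% Reinterpret each group of 8 bytes as one double.
values = typecast(bytes, "double");
```

In both cases `values` is the column vector `[1; 2]`, matching `toDouble` in the example.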

### Are these changes tested?

Yes. Added a new test class called `tBuffer.m`.

### Are there any user-facing changes?

Yes. Users can now create `arrow.buffer.Buffer` objects via the `fromMATLAB` static method. However, users cannot do much with this object yet. We implemented this class to facilitate adding a `DataLayout` property to `arrow.array.Array`, as described in the **Rationale for this change** section.

* Closes: #38015

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
@kevingurney kevingurney added this to the 14.0.0 milestone Oct 10, 2023
JerAguilon pushed a commit to JerAguilon/arrow that referenced this issue Oct 23, 2023
…B Interface (apache#38020)

loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…B Interface (apache#38020)

dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
…B Interface (apache#38020)
