[MATLAB] Initialize the Type property of arrow.array.Array subclasses from existing proxy ids #36652

Closed
sgilmore10 opened this issue Jul 12, 2023 · 1 comment · Fixed by #36731

Comments


sgilmore10 commented Jul 12, 2023

Describe the enhancement requested

Now that issue #36363 is closed via PR #36419, we can initialize the Type property of arrow.array.Array subclasses from existing proxy ids. Currently, we create a new proxy Type object whose underlying arrow::DataType is semantically equal to, but not the same object as, the arrow::DataType owned by the Array proxy. It would be preferable if the Type and Array proxy classes referred to the same arrow::DataType object (i.e. the same object on the heap).
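
The difference between "semantically equal" and "the same object on the heap" can be sketched in plain C++. This is only an illustration under stated assumptions: `DataType` here is a hypothetical stand-in for `arrow::DataType`, and `semantically_equal`, `same_object`, `make_equal_copy`, and `share` are made-up helper names, not part of Arrow or libmexclass.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for arrow::DataType; a real type carries far more state.
struct DataType {
    std::string name;
};

// Semantic equality: the two objects describe the same type.
bool semantically_equal(const DataType& a, const DataType& b) {
    return a.name == b.name;
}

// Identity: both handles point at the same object on the heap.
bool same_object(const std::shared_ptr<DataType>& a,
                 const std::shared_ptr<DataType>& b) {
    return a.get() == b.get();
}

// Behaviour described as current in this issue: allocate a new, equal object.
std::shared_ptr<DataType> make_equal_copy(const std::shared_ptr<DataType>& t) {
    return std::make_shared<DataType>(*t);
}

// Requested behaviour: hand out another reference to the existing object.
std::shared_ptr<DataType> share(const std::shared_ptr<DataType>& t) {
    return t;
}
```

With these helpers, a copy made by `make_equal_copy` compares equal to the original but is a distinct allocation, while the handle returned by `share` refers to the original object and only increases its reference count.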

Component(s)

MATLAB

@sgilmore10

take

kevingurney pushed a commit that referenced this issue Jul 17, 2023
…ay` subclasses from existing proxy ids (#36731)

### Rationale for this change

Now that issue #36363 is closed via PR #36419, we can initialize the `Type` property of `arrow.array.Array` subclasses from existing proxy ids. Currently, we create a new proxy `Type` object whose underlying `arrow::DataType` is semantically equal to, but not the same object as, the `arrow::DataType` owned by the `Array` proxy. It would be preferable if the `Type` and `Array` proxy classes referred to the same `arrow::DataType` object (i.e. the same object on the heap).

### What changes are included in this PR?

1. Upgraded `libmexclass` to commit [d04f88d](mathworks/libmexclass@d04f88d). In this commit, we added a static "make-like" function to `Proxy` called `create`.
2. Modified the constructors of all `Type` objects to expect a single `Proxy` object as input. This is a breaking change: clients are no longer expected to build `Type` objects via their constructors. Instead, we introduced standalone functions that clients can use to construct `Type` objects, e.g. `arrow.type.int8`, `arrow.type.string`, and `arrow.type.timestamp`. These functions create the `Proxy` objects that are passed to the `Type` constructors. Below is an example of the new workflow for creating `Type` objects.

```matlab
>> timestampType = arrow.type.timestamp(TimeUnit="second", TimeZone="America/New_York")

timestampType = 

  TimestampType with properties:

    ID: Timestamp
```
NOTE: We plan on enhancing the display to show the `TimeUnit` and `TimeZone` properties. 

3. Made `Type` a [dependent](https://www.mathworks.com/help/matlab/matlab_oop/access-methods-for-dependent-properties.html) property on `arrow.array.Array`. The `get.Type` method constructs a `Type` object on demand by making a proxy that wraps the same `arrow::DataType` object stored within the `arrow::Array`.
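
The on-demand behaviour described in item 3 can be sketched roughly as follows. This is a simplified C++ illustration, not the actual libmexclass or Arrow MATLAB code: `DataType`, `TypeProxy`, `ArrayProxy`, `ProxyRegistry`, and the `type_proxy_id` method are all hypothetical names chosen for the example. The point it shows is that each request creates a new proxy (with a new id) that wraps the existing data type rather than a copy of it.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for arrow::DataType.
struct DataType {
    std::string name;
};

// Hypothetical proxy that wraps a data type without copying it.
class TypeProxy {
public:
    explicit TypeProxy(std::shared_ptr<DataType> type) : type_(std::move(type)) {}
    const std::shared_ptr<DataType>& type() const { return type_; }

private:
    std::shared_ptr<DataType> type_;
};

// Hypothetical registry that hands out integer ids for live proxies.
class ProxyRegistry {
public:
    std::uint64_t manage(std::shared_ptr<TypeProxy> proxy) {
        const std::uint64_t id = next_id_++;
        proxies_.emplace(id, std::move(proxy));
        return id;
    }

    // Returns an empty pointer when the id is unknown.
    std::shared_ptr<TypeProxy> get(std::uint64_t id) const {
        auto it = proxies_.find(id);
        if (it == proxies_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::uint64_t next_id_ = 0;
    std::unordered_map<std::uint64_t, std::shared_ptr<TypeProxy>> proxies_;
};

// Hypothetical array proxy; it owns the data type of the array it wraps.
class ArrayProxy {
public:
    explicit ArrayProxy(std::shared_ptr<DataType> type) : type_(std::move(type)) {}

    // On-demand accessor: wrap the existing type in a new proxy and return its id.
    std::uint64_t type_proxy_id(ProxyRegistry& registry) const {
        return registry.manage(std::make_shared<TypeProxy>(type_));
    }

    const std::shared_ptr<DataType>& type() const { return type_; }

private:
    std::shared_ptr<DataType> type_;
};
```

In this sketch, calling `type_proxy_id` twice yields two different ids, yet both proxies point at the same `DataType` instance held by the array proxy, which mirrors the intent of constructing the `Type` object on demand in `get.Type`.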

### Are these changes tested?

Yes, updated existing tests.

### Are there any user-facing changes?

Yes, we added new standalone functions for creating `Type` objects. The table below maps each standalone function to the `Type` object it returns:

| Standalone Function | Output Type Object |
|----------------------|---------------------|
|`arrow.type.boolean`| `arrow.type.BooleanType`|
|`arrow.type.int8`| `arrow.type.Int8Type`|
|`arrow.type.int16`| `arrow.type.Int16Type`|
|`arrow.type.int32`| `arrow.type.Int32Type`|
|`arrow.type.int64`| `arrow.type.Int64Type`|
|`arrow.type.uint8`| `arrow.type.UInt8Type`|
|`arrow.type.uint16`| `arrow.type.UInt16Type`|
|`arrow.type.uint32`| `arrow.type.UInt32Type`|
|`arrow.type.uint64`| `arrow.type.UInt64Type`|
|`arrow.type.string`| `arrow.type.StringType`|
|`arrow.type.timestamp`| `arrow.type.TimestampType`|
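
The pattern behind these functions (a type object whose constructor accepts only a proxy, plus free functions that build the proxy for the caller) can be sketched in C++ as below. This is an analogy under stated assumptions rather than the MATLAB implementation: `TypeProxy`, `Type`, and the `type_factory` namespace are hypothetical names, and the string fields are a simplification of whatever state the real proxies hold.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical proxy carrying the information needed to describe a type.
struct TypeProxy {
    std::string id;
    std::string time_unit;  // empty unless the type is a timestamp
    std::string time_zone;  // empty unless the type is a timestamp
};

// The type object only accepts an already-built proxy.
class Type {
public:
    explicit Type(std::shared_ptr<TypeProxy> proxy) : proxy_(std::move(proxy)) {}
    const std::string& id() const { return proxy_->id; }
    const std::string& time_unit() const { return proxy_->time_unit; }
    const std::string& time_zone() const { return proxy_->time_zone; }

private:
    std::shared_ptr<TypeProxy> proxy_;
};

namespace type_factory {

// Each function hides proxy construction from the caller.
inline Type int8() {
    return Type(std::make_shared<TypeProxy>(TypeProxy{"Int8", "", ""}));
}

inline Type timestamp(std::string time_unit, std::string time_zone) {
    return Type(std::make_shared<TypeProxy>(
        TypeProxy{"Timestamp", std::move(time_unit), std::move(time_zone)}));
}

}  // namespace type_factory
```

Keeping proxy creation inside the factory functions means the same constructor can also be used internally to wrap a proxy that already exists, which is what allows a type object to be built from an existing proxy id.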

### Notes

Thanks @ kevingurney for the advice!
* Closes: #36652

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
kevingurney added this to the 14.0.0 milestone Jul 17, 2023
chelseajonesr pushed a commit to chelseajonesr/arrow that referenced this issue Jul 20, 2023
…ay.Array` subclasses from existing proxy ids (apache#36731)

R-JunmingChen pushed a commit to R-JunmingChen/arrow that referenced this issue Aug 20, 2023
…ay.Array` subclasses from existing proxy ids (apache#36731)
