GH-37391: [MATLAB] Implement the isequal() method on arrow.array.Array #37446

Merged: 20 commits, Aug 30, 2023

@sgilmore10 (Member) commented Aug 29, 2023

Rationale for this change

Currently, it's not possible to determine if two arrow.array.Array instances are equal because the isequal() method always returns false by default:

Example

>> a = arrow.array([1 2 3])

a = 

[
  1,
  2,
  3
]

% Compare a with itself. 
>> tf = isequal(a, a)

tf =

  logical

   0

What changes are included in this PR?

  1. Defined an isequal() overload on the arrow.array.Array super-class.
  2. Added a new method called isEqual() on the arrow::matlab::array::proxy::Array class.

Two arrays are considered equal in the MATLAB Interface if the following conditions are met:

  1. They have the same type
  2. They have the same length
  3. The same elements are valid
  4. Corresponding valid elements are equal.

NOTE: NaN values are not considered equal.

Example

>> a = arrow.array([1 2 3])

a = 

[
  1,
  2,
  3
]

% Compare a with itself. 
>> tf = isequal(a, a)

tf =

  logical

   1

Are these changes tested?

Yes. Added positive and negative isequal tests for all concrete subclasses of arrow.array.Array.

Are there any user-facing changes?

Yes. Users can now use isequal() with arrow.array.Array subclasses.

Future Directions

  1. Implement isequaln so that users can test array equality with NaNs being treated as equal.
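
To illustrate the difference between the two semantics, here is a minimal C++ sketch. The helper name `sketchValuesEqual` and its `nansEqual` flag are assumptions made for this example only; they are not part of the MATLAB interface or of this PR.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise comparison of two double sequences.
// nansEqual == false mirrors isequal-style semantics (NaN != NaN);
// nansEqual == true mirrors isequaln-style semantics (NaN == NaN).
bool sketchValuesEqual(const std::vector<double>& a,
                       const std::vector<double>& b,
                       bool nansEqual) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const bool bothNaN = std::isnan(a[i]) && std::isnan(b[i]);
        if (bothNaN) {
            if (!nansEqual) return false;  // isequal-style: NaNs never match
            continue;                      // isequaln-style: NaNs match
        }
        if (a[i] != b[i]) return false;
    }
    return true;
}
```

With `nansEqual` set to `false`, a sequence containing NaN is not equal to itself; with `true`, it is.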

@github-actions

⚠️ GitHub issue #7391 has been automatically assigned in GitHub to PR creator.

@github-actions

⚠️ GitHub issue #7391 has no components, please add labels for components.

@sgilmore10 changed the title from "GH-7391: [MATLAB] Implement the isequal() method on arrow.array.Array" to "GH-37391: [MATLAB] Implement the isequal() method on arrow.array.Array" on Aug 29, 2023

@kou (Member) left a comment

+1

@github-actions bot added the "awaiting merge" label and removed the "awaiting review" label on Aug 29, 2023

@kevingurney (Member) left a comment

Looks good overall! This is a nice enhancement!

matlab/src/cpp/arrow/matlab/array/proxy/array.cc (outdated, resolved)
matlab/test/arrow/array/hNumericArray.m (outdated, resolved)
matlab/test/arrow/array/hNumericArray.m (outdated, resolved)
matlab/test/arrow/array/tBooleanArray.m (outdated, resolved)
matlab/test/arrow/array/tTime32Array.m (outdated, resolved)
matlab/test/arrow/array/tTime64Array.m (outdated, resolved)
matlab/test/arrow/array/tTimestampArray.m (resolved)
@github-actions bot added the "awaiting changes" label and removed the "awaiting merge" label on Aug 30, 2023

@kevingurney (Member)

+1

@kevingurney merged commit c9012a0 into apache:main on Aug 30, 2023 (9 checks passed)
@kevingurney deleted the GH-37391 branch on August 30, 2023 15:40
@kevingurney removed the "awaiting change review" label on Aug 30, 2023
@github-actions bot added the "awaiting merge" label on Aug 30, 2023

@conbench-apache-arrow

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit c9012a0.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.

kevingurney pushed a commit that referenced this pull request Sep 7, 2023
…LAB class (#37617)

### Rationale for this change

Following on to #37474, #37446, and #37525, we should implement `isequal` for the `arrow.type.Field` MATLAB class.

### What changes are included in this PR?

1. Implemented the `isequal` method for `arrow.type.Field`

### Are these changes tested?

Yes. Added new unit tests to `tField.m`.

### Are there any user-facing changes?

Yes. Users can now call `isequal` on `arrow.type.Field`s to determine if two fields are equal.

**Example**
```matlab
>> f1 = arrow.field("A", arrow.time32(TimeUnit="Second"));
>> f2 = arrow.field("B", arrow.time32(TimeUnit="Second"));
>> f3 = arrow.field("A", arrow.time32(TimeUnit="Millisecond"));

>> isequal(f1, f1)

ans =

  logical

   1

% Name properties differ
>> isequal(f1, f2)

ans =

  logical

   0

% Type properties differ
>> isequal(f1, f3)

ans =

  logical

   0
```
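
The example above suggests that two fields are equal only when both their names and their types match. A minimal C++ sketch of that rule follows; `SketchField` and `sketchFieldsEqual` are hypothetical names, and the type is reduced to a string label purely for illustration.

```cpp
#include <string>

// Hypothetical stand-in for a field: a name plus a type label.
struct SketchField {
    std::string name;  // e.g. "A"
    std::string type;  // e.g. "time32[s]"; stand-in for a full type object
};

// Fields are equal in this sketch only if both name and type match.
bool sketchFieldsEqual(const SketchField& lhs, const SketchField& rhs) {
    return lhs.name == rhs.name && lhs.type == rhs.type;
}
```

Mirroring the MATLAB session: a field equals itself, a field with a different name is unequal, and a field with the same name but a different time unit is unequal.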

### Future Directions

1. #37568
2. #37570

* Closes: #37569

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
kevingurney pushed a commit that referenced this pull request Sep 7, 2023
… MATLAB class (#37619)

### Rationale for this change

Following on to #37474, #37446, and #37525, we should implement `isequal` for the `arrow.tabular.Schema` MATLAB class.

### What changes are included in this PR?

1. Updated `arrow.tabular.Schema` class to inherit from `matlab.mixin.Scalar`.
2. Added `isequal` method to `arrow.tabular.Schema`.

### Are these changes tested?

Yes. Added `isequal` unit tests to `tSchema.m`

### Are there any user-facing changes?

Yes. Users can now compare two `arrow.tabular.Schema` objects via `isequal`.

**Example**
```matlab
>> schema1 = arrow.schema([arrow.field("A", arrow.uint8), arrow.field("B", arrow.uint16)]);
>> schema2 = arrow.schema([arrow.field("A", arrow.uint8), arrow.field("B", arrow.uint16)]);
>> schema3 = arrow.schema([arrow.field("A", arrow.uint8)]);

>> isequal(schema1, schema2)

ans =

  logical

   1

>> isequal(schema1, schema3)

ans =

  logical

   0
```

### Future Directions
1. #37570 

* Closes: #37568

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
kevingurney pushed a commit that referenced this pull request Sep 8, 2023
…atch` MATLAB class (#37627)

### Rationale for this change

Following on to #37474, #37446, and #37525, we should implement `isequal` for the `arrow.tabular.RecordBatch` MATLAB class.

### What changes are included in this PR?

1. Implemented `isequal` method for `arrow.tabular.RecordBatch`

### Are these changes tested?

Yes. Added `isequal` unit tests to `tRecordBatch.m`.

### Are there any user-facing changes?

Yes, users can now use `isequal` to compare `arrow.tabular.RecordBatch`es. 

**Example**

```matlab
>> t1 = table(1, "A", false, VariableNames=["Number",  "String", "Logical"]);
>> t2 = table([1; 2], ["A"; "B"], [false; false], VariableNames=["Number",  "String", "Logical"]); 
>> rb1 = arrow.recordBatch(t1);
>> rb2 = arrow.recordBatch(t2);
>> rb3 = arrow.recordBatch(t1);

>> isequal(rb1, rb2)

ans =

  logical

   0

>> isequal(rb1, rb3)

ans =

  logical

   1
```

### Future Directions
1. #37628

* Closes: #37570

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
kevingurney pushed a commit that referenced this pull request Sep 8, 2023
…MATLAB class (#37629)

### Rationale for this change

Following on to #37474, #37446, #37525, and #37627, we should implement `isequal` for the `arrow.tabular.Table` MATLAB class.

### What changes are included in this PR?

1. Added a new function, `arrow.internal.tabular.isequal`, that both `arrow.tabular.RecordBatch` and `arrow.tabular.Table` can use to implement their `isequal` methods.
2. Modified `arrow.tabular.RecordBatch` to use the new `isequal` package function to implement its `isequal` method.
3. Implemented the `isequal` method for `arrow.tabular.Table` using the new `isequal` package function.
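
The design in the list above, one shared comparison routine with two thin wrappers, can be sketched in C++ as follows. Everything here is hypothetical: the struct names, `sketchTabularEqual`, and the assumption that equality means matching column names and column values in order. The actual checks performed by `arrow.internal.tabular.isequal` are not described in this PR text.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for tabular containers.
struct SketchColumn {
    std::string name;
    std::vector<double> values;
};
struct SketchRecordBatch { std::vector<SketchColumn> columns; };
struct SketchTable { std::vector<SketchColumn> columns; };

// Shared helper (assumed semantics): same number of columns, and each
// pair of corresponding columns has the same name and the same values.
template <typename Tabular>
bool sketchTabularEqual(const Tabular& a, const Tabular& b) {
    if (a.columns.size() != b.columns.size()) return false;
    for (std::size_t i = 0; i < a.columns.size(); ++i) {
        if (a.columns[i].name != b.columns[i].name) return false;
        if (a.columns[i].values != b.columns[i].values) return false;
    }
    return true;
}

// Thin wrappers: both container types delegate to the shared helper.
bool isEqual(const SketchRecordBatch& a, const SketchRecordBatch& b) {
    return sketchTabularEqual(a, b);
}
bool isEqual(const SketchTable& a, const SketchTable& b) {
    return sketchTabularEqual(a, b);
}
```

Keeping the comparison in one place means the two container types cannot drift apart in what they consider equal.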

### Are these changes tested?

Yes, added `isequal` unit tests to `tTable.m`

### Are there any user-facing changes?

Yes. Users can now compare `arrow.tabular.Table`s using `isequal`:

```matlab
>> t1 = table(1, "A", false, VariableNames=["Number",  "String", "Logical"]);
>> t2 = table([1; 2], ["A"; "B"], [false; false], VariableNames=["Number",  "String", "Logical"]); 
>> tbl1 = arrow.table(t1);
>> tbl2 = arrow.table(t2);
>> tbl3 = arrow.table(t1);

>> isequal(tbl1, tbl2)

ans =

  logical

   0

>> isequal(tbl1, tbl3)

ans =

  logical

   1
```

* Closes: #37628

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…rray.Array` (apache#37446)

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…d` MATLAB class (apache#37617)

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…chema` MATLAB class (apache#37619)

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…ecordBatch` MATLAB class (apache#37627)

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…able` MATLAB class (apache#37629)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…rray.Array` (apache#37446)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…d` MATLAB class (apache#37617)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…chema` MATLAB class (apache#37619)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…ecordBatch` MATLAB class (apache#37627)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…able` MATLAB class (apache#37629)

Development

Successfully merging this pull request may close these issues.

[MATLAB] Implement the isequal() method on arrow.array.Array
3 participants