[Java] DefaultVectorComparators does not have a comparator for BaseLargeVariableVector #25659

Closed
asfimport opened this issue Jul 29, 2020 · 3 comments · Fixed by #37887

Comments

@asfimport
Collaborator

asfimport commented Jul 29, 2020

The method `org.apache.arrow.algorithm.sort.DefaultVectorComparators#createDefaultComparator` does not handle vectors that are instances of `BaseLargeVariableWidthVector`.

It would be nice to extend this method to handle this case. It looks like it ought to be easy to define an appropriate comparator class:

class LargeVariableWidthComparator extends VectorValueComparator<BaseLargeVariableWidthVector> 
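
To illustrate the kind of comparison such a class would perform, here is a minimal, self-contained sketch. It deliberately does not use the Arrow API: `LargeVarWidthSketch`, its `of` factory, and `compareNotNull` are hypothetical names invented for this example, and the on-heap `byte[]` with a `long[]` offsets array is only an assumed stand-in for a large variable-width layout with 64-bit offsets. Values are ordered lexicographically by unsigned byte, with a shorter value sorting before a longer one that shares its prefix. Null handling is left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a large variable-width vector; NOT Arrow's API.
final class LargeVarWidthSketch {
    private final byte[] data;    // all values concatenated back to back
    private final long[] offsets; // 64-bit offsets; value i spans [offsets[i], offsets[i + 1])

    LargeVarWidthSketch(byte[] data, long[] offsets) {
        this.data = data;
        this.offsets = offsets;
    }

    // Builds the sketch from UTF-8 encoded strings (illustration only).
    static LargeVarWidthSketch of(String... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long[] offsets = new long[values.length + 1];
        for (int i = 0; i < values.length; i++) {
            byte[] bytes = values[i].getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
            offsets[i + 1] = offsets[i] + bytes.length;
        }
        return new LargeVarWidthSketch(out.toByteArray(), offsets);
    }

    // Compares the values at positions i and j, assuming neither is null.
    int compareNotNull(int i, int j) {
        long startI = offsets[i];
        long startJ = offsets[j];
        long lenI = offsets[i + 1] - startI;
        long lenJ = offsets[j + 1] - startJ;
        long common = Math.min(lenI, lenJ);
        for (long k = 0; k < common; k++) {
            // The int casts are a limitation of this on-heap sketch; a real
            // large vector would address its buffer with long indices.
            int a = data[(int) (startI + k)] & 0xFF; // treat bytes as unsigned
            int b = data[(int) (startJ + k)] & 0xFF;
            if (a != b) {
                return a - b;
            }
        }
        // Equal on the common prefix: the shorter value sorts first.
        return Long.compare(lenI, lenJ);
    }
}
```

With values built by `LargeVarWidthSketch.of("abc", "abd", "ab")`, comparing positions 0 and 1 gives a negative result and comparing positions 0 and 2 gives a positive one.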

Reporter: Steve M. Kim


Note: This issue was originally created as ARROW-9595. Please see the migration documentation for further details.

@asfimport
Collaborator Author

Liya Fan / @liyafan82:
[~chairmank] Thanks a lot for opening this issue.
BaseLargeVariableWidthVector was recently added to our code base, and sorting and comparison for it should be supported as well.

To reuse the testing framework for sort, I think it should be supported after ARROW-9554 is done.

@jduo
Member

jduo commented Sep 26, 2023

This issue is duplicated by #14665.

@jduo
Member

jduo commented Sep 26, 2023

take

jduo added a commit to jduo/arrow that referenced this issue Sep 26, 2023
…e width types

Add DefaultVectorComparators for large vector types (LargeVarCharVector
and LargeVarBinaryVector).
lidavidm pushed a commit that referenced this issue Sep 26, 2023
### Rationale for this change
Support additional vector types in DefaultVectorComparators to make arrow-algorithm easier to use.

### What changes are included in this PR?
Add DefaultVectorComparators for large vector types (LargeVarCharVector and LargeVarBinaryVector).

### Are these changes tested?
Yes.

### Are there any user-facing changes?
No.
* Closes: #25659

Authored-by: James Duong <duong.james@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
@lidavidm lidavidm added this to the 14.0.0 milestone Sep 26, 2023
etseidl pushed a commit to etseidl/arrow that referenced this issue Sep 28, 2023
…pache#37887)

JerAguilon pushed a commit to JerAguilon/arrow that referenced this issue Oct 23, 2023
…pache#37887)

dongjoon-hyun pushed a commit to apache/spark that referenced this issue Nov 4, 2023
### What changes were proposed in this pull request?
This PR upgrades Apache Arrow from 13.0.0 to 14.0.0.

### Why are the changes needed?
The Apache Arrow 14.0.0 release brings a number of enhancements and bug fixes.

In terms of bug fixes, the release addresses several critical issues that were causing failures in integration jobs with Spark ([GH-36332](apache/arrow#36332)) and problems with importing empty data arrays ([GH-37056](apache/arrow#37056)). It also optimizes the process of appending variable-length vectors ([GH-37829](apache/arrow#37829)) and includes C++ libraries for macOS AArch64 in the Java JARs ([GH-38076](apache/arrow#38076)).

The new features and improvements focus on enhancing the handling and manipulation of data. This includes the introduction of DefaultVectorComparators for large types ([GH-25659](apache/arrow#25659)), support for extended expressions in ScannerBuilder ([GH-34252](apache/arrow#34252)), and the exposure of the VectorAppender class ([GH-37246](apache/arrow#37246)).

The release also brings enhancements to the development and testing process, with the CI environment now using JDK 21 ([GH-36994](apache/arrow#36994)). In addition, the release introduces vector validation consistent with C++, ensuring consistency across different languages ([GH-37702](apache/arrow#37702)).

Furthermore, the usability of VarChar writers and binary writers has been improved with the addition of extra input methods ([GH-37705](apache/arrow#37705)), and VarCharWriter now supports writing from `Text` and `String` ([GH-37706](apache/arrow#37706)). The release also adds typed getters for StructVector, improving the ease of accessing data ([GH-37863](apache/arrow#37863)).

The full release notes are as follows:
- https://arrow.apache.org/release/14.0.0.html

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43650 from LuciferYang/arrow-14.

Lead-authored-by: yangjie01 <yangjie01@baidu.com>
Co-authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…pache#37887)

dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
…pache#37887)
