Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-38246: [JAVA] added new getTransferPair() function that takes in a Field type for Complex Type Vectors #38261

Merged
merged 9 commits into from
Oct 20, 2023

Conversation

xxlaykxx
Copy link
Contributor

@xxlaykxx xxlaykxx commented Oct 13, 2023

Rationale for this change

Additionally, a new function getTransferPair(Field, Allocator) is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made children not final - for updating field object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

What changes are included in this PR?

  • getTransferPair method for ComplexType Vectors and for BaseVariableWidthVector's

Are these changes tested?

Yes, some tests have been added to verify these changes.

Are there any user-facing changes?

Yes.

This PR includes breaking changes to public APIs.

@github-actions
Copy link

⚠️ GitHub issue #38246 has been automatically assigned in GitHub to PR creator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this class immutable. A method that makes a copy of the field with children is OK, but mutating the field isn't.

Comment on lines 666 to 670
if (field.getChildren().contains(getDataVector().getField())) {
return field;
}
field.setChildren(Collections.singletonList(getDataVector().getField()));
return field;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of mutating the field here, we can keep the check, but just create (and cache) a new field if needed. IMO, if the garbage is that much of a concern, the creator of the vector should provide the expected type up front (this only exists because a ListVector can set its child type dynamically).

@@ -134,8 +134,12 @@ public interface ValueVector extends Closeable, Iterable<ValueVector> {

TransferPair getTransferPair(String ref, BufferAllocator allocator);

TransferPair getTransferPair(Field field, BufferAllocator allocator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it's unfortunate that this makes existing callsites ambiguous...we should've renamed it in the original PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While there's no docstring for the existing function, let's add a docstring for the new ones at least.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this test changed?

Comment on lines 474 to 475
field.setChildren(children);
return field;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

caching the field is OK, but let's not mutate it.

@@ -121,9 +119,27 @@ public static LargeListVector empty(String name, BufferAllocator allocator) {
*/
public LargeListVector(String name, BufferAllocator allocator, FieldType fieldType, CallBack callBack) {
super(allocator);
this.name = name;
this.field = new Field(name, fieldType, null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this constructor delegate to the one below?

@@ -95,7 +94,23 @@ public static ListVector empty(String name, BufferAllocator allocator) {
public ListVector(String name, BufferAllocator allocator, FieldType fieldType, CallBack callBack) {
super(name, allocator, callBack);
this.validityBuffer = allocator.getEmpty();
this.fieldType = checkNotNull(fieldType);
this.field = new Field(name, fieldType, null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delegate this ctor to the one below?

this.listSize = ((ArrowType.FixedSizeList) fieldType.getType()).getListSize();
Preconditions.checkArgument(listSize >= 0, "list size must be non-negative");
this.valueCount = 0;
this.validityAllocationSizeInBytes = getValidityBufferSizeFromCount(INITIAL_VALUE_ALLOCATION);
this.field = new Field(name, fieldType, null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delegate this ctor to the one below?

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Oct 13, 2023
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Oct 14, 2023
@xxlaykxx
Copy link
Contributor Author

@lidavidm could you plz re-review?

@xxlaykxx
Copy link
Contributor Author

@lidavidm still need re-review, plz

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Oct 20, 2023
@lidavidm lidavidm merged commit a376e3c into apache:main Oct 20, 2023
15 of 17 checks passed
@lidavidm lidavidm removed the awaiting merge Awaiting merge label Oct 20, 2023
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit a376e3c.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 15 possible false positives for unstable benchmarks that are known to sometimes produce them.

xxlaykxx added a commit to dremio/arrow that referenced this pull request Oct 23, 2023
…s in a Field type for Complex Type Vectors (apache#38261) (#57)

### Rationale for this change

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### What changes are included in this PR?

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### Are these changes tested?

Yes, some tests have been added to verify these changes.
### Are there any user-facing changes?

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>

(cherry picked from commit a376e3c)
JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 23, 2023
…s in a Field type for Complex Type Vectors (apache#38261)

### Rationale for this change

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### What changes are included in this PR?

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### Are these changes tested?

Yes, some tests have been added to verify these changes.
### Are there any user-facing changes?

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 25, 2023
…s in a Field type for Complex Type Vectors (apache#38261)

### Rationale for this change

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### What changes are included in this PR?

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### Are these changes tested?

Yes, some tests have been added to verify these changes.
### Are there any user-facing changes?

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
xxlaykxx added a commit to dremio/arrow that referenced this pull request Oct 29, 2023
…s in a Field type for Complex Type Vectors (apache#38261) (#57)

### Rationale for this change

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### What changes are included in this PR?

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### Are these changes tested?

Yes, some tests have been added to verify these changes.
### Are there any user-facing changes?

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>

(cherry picked from commit a376e3c)
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…s in a Field type for Complex Type Vectors (apache#38261)

### Rationale for this change

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### What changes are included in this PR?

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### Are these changes tested?

Yes, some tests have been added to verify these changes.
### Are there any user-facing changes?

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
DenisTarasyuk pushed a commit to dremio/arrow that referenced this pull request Dec 26, 2023
…s in a Field type for Complex Type Vectors (apache#38261) (#57)

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

Yes, some tests have been added to verify these changes.

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>

(cherry picked from commit a376e3c)
DenisTarasyuk pushed a commit to dremio/arrow that referenced this pull request Jan 16, 2024
…s in a Field type for Complex Type Vectors (apache#38261) (#57)

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

Yes, some tests have been added to verify these changes.

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>

(cherry picked from commit a376e3c)
DenisTarasyuk pushed a commit to dremio/arrow that referenced this pull request Jan 17, 2024
…s in a Field type for Complex Type Vectors (apache#38261) (#57)

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

Yes, some tests have been added to verify these changes.

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>

(cherry picked from commit a376e3c)
DenisTarasyuk pushed a commit to dremio/arrow that referenced this pull request Jan 17, 2024
…s in a Field type for Complex Type Vectors (apache#38261) (#57)

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

Yes, some tests have been added to verify these changes.

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>

(cherry picked from commit a376e3c)
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…s in a Field type for Complex Type Vectors (apache#38261)

### Rationale for this change

Additionally, a new function **getTransferPair(Field, Allocator)** is introduced so that a new Field method is not constructed each time getTransferPair is called on the Vector.
Added Field.setChildren() method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### What changes are included in this PR?

- `getTransferPair` method for ComplexType Vectors and for BaseVariableWidthVector's

- added `Field.setChildren()` method and made **children** not final - for updating **field** object inside complex vector(it possible that Field will be without children on creation) - optimisation for keeping using already existing Field object.

### Are these changes tested?

Yes, some tests have been added to verify these changes.
### Are there any user-facing changes?

Yes.

**This PR includes breaking changes to public APIs.**

* Closes: apache#38246

Lead-authored-by: Ivan Chesnov <ivan.chesnov@dremio.com>
Co-authored-by: Ivan Chesnov <xxxlaykxxx@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants