[SPARK-22092] Reallocation in OffHeapColumnVector.reserveInternal corrupts struct and array data #19323

Closed
wants to merge 1 commit into from

ala
Contributor

@ala ala commented Sep 22, 2017

`OffHeapColumnVector.reserveInternal()` copies already inserted values during reallocation only if `data != null`. For vectors containing arrays or structs this is incorrect, because the `data` field is not used at all in those vectors. We need to check `nulls` instead.

Adds new tests to ColumnVectorSuite that reproduce the errors.
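
To make the failure mode concrete, here is a minimal sketch of the problem. It is not the Spark code: plain Java arrays stand in for off-heap buffers, and `ReallocSketch`, `ToyArrayVector`, `fillAndGrow`, and the `checkNulls` switch are hypothetical names introduced only for illustration. It assumes an array-typed vector that fills `lengths` and `nulls` but never allocates `data`, and compares the `data`-based check with the `nulls`-based one.

```java
import java.util.Arrays;

// Toy model only: plain Java arrays stand in for off-heap buffers, and every
// name below is hypothetical rather than taken from Spark.
final class ReallocSketch {

    /** Array-typed vector: it fills `lengths` and `nulls`, but never `data`. */
    static final class ToyArrayVector {
        private final boolean checkNulls; // false = old check, true = proposed check
        int[] data;                       // stays null for array/struct vectors
        int[] lengths;                    // per-row array lengths
        byte[] nulls;                     // per-row null flags (1 = null)
        int capacity;

        ToyArrayVector(int initialCapacity, boolean checkNulls) {
            this.checkNulls = checkNulls;
            reserve(initialCapacity);
        }

        /** Grows the buffers; assumes newCapacity >= capacity. */
        void reserve(int newCapacity) {
            // The old check looks at `data`, which is always null here, so the
            // vector appears empty and nothing is carried over.
            boolean hasOldContents = checkNulls ? (nulls != null) : (data != null);
            int oldCapacity = hasOldContents ? capacity : 0;
            lengths = growInts(lengths, oldCapacity, newCapacity);
            nulls = growBytes(nulls, oldCapacity, newCapacity);
            capacity = newCapacity;
        }

        void putArrayLength(int row, int length) { lengths[row] = length; }

        void putNull(int row) { nulls[row] = 1; }
    }

    // Mimics a realloc that copies only the first `keep` elements.
    static int[] growInts(int[] old, int keep, int newSize) {
        int[] fresh = new int[newSize];
        if (keep > 0) System.arraycopy(old, 0, fresh, 0, keep);
        return fresh;
    }

    static byte[] growBytes(byte[] old, int keep, int newSize) {
        byte[] fresh = new byte[newSize];
        if (keep > 0) System.arraycopy(old, 0, fresh, 0, keep);
        return fresh;
    }

    /** Writes two rows, then forces a reallocation from capacity 2 to 4. */
    static ToyArrayVector fillAndGrow(boolean checkNulls) {
        ToyArrayVector v = new ToyArrayVector(2, checkNulls);
        v.putArrayLength(0, 3);
        v.putNull(1);
        v.reserve(4);
        return v;
    }

    public static void main(String[] args) {
        ToyArrayVector buggy = fillAndGrow(false);
        ToyArrayVector fixed = fillAndGrow(true);
        System.out.println("data check:  lengths=" + Arrays.toString(buggy.lengths)
                + " nulls=" + Arrays.toString(buggy.nulls));
        System.out.println("nulls check: lengths=" + Arrays.toString(fixed.lengths)
                + " nulls=" + Arrays.toString(fixed.nulls));
    }
}
```

In this toy model the `data`-based check drops both the stored length and the null flag after the second `reserve`, while the `nulls`-based check keeps them, since `nulls` is allocated for every vector type.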

…rupts struct and array data

Author: Ala Luszczak <ala@databricks.com>

Closes apache#19308 from ala/vector-realloc.

(cherry picked from commit d2b2932)
Signed-off-by: Ala Luszczak <ala@databricks.com>
@ala
Contributor Author

ala commented Sep 22, 2017

@hvanhovell

@SparkQA

SparkQA commented Sep 22, 2017

Test build #82085 has started for PR 19323 at commit 6ebeb21.

@hvanhovell
Contributor

LGTM

@hvanhovell
Contributor

retest this please

@SparkQA

SparkQA commented Sep 22, 2017

Test build #82095 has finished for PR 19323 at commit 6ebeb21.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Contributor

hvanhovell commented Sep 23, 2017

merging to 2.2. Thanks!

asfgit pushed a commit that referenced this pull request Sep 23, 2017
…rupts struct and array data

Author: Ala Luszczak <ala@databricks.com>

Closes #19323 from ala/port-vector-realloc.
@hvanhovell
Contributor

@ala can you close this one? The merge script does not do that for backports.

@ala
Contributor Author

ala commented Sep 25, 2017

@hvanhovell Sure. Thanks!

@ala ala closed this Sep 25, 2017
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
…rupts struct and array data

Author: Ala Luszczak <ala@databricks.com>

Closes apache#19323 from ala/port-vector-realloc.