Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-27711][CORE] Unset InputFileBlockHolder at the end of tasks #24690

Closed
wants to merge 2 commits into from

Conversation

jose-torres
Copy link
Contributor

@jose-torres jose-torres commented May 23, 2019

What changes were proposed in this pull request?

Unset InputFileBlockHolder at the end of tasks to stop the file name from leaking over to other tasks in the same thread. This happens in particular in Pyspark because of its complex threading model.

Backport to 2.4.

How was this patch tested?

new pyspark test

Unset InputFileBlockHolder at the end of tasks to stop the file name from leaking over to other tasks in the same thread. This happens in particular in Pyspark because of its complex threading model.

new pyspark test

Closes apache#24605 from jose-torres/fix254.

Authored-by: Jose Torres <torres.joseph.f+github@gmail.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
@jose-torres
Copy link
Contributor Author

@zsxwing
Copy link
Member

zsxwing commented May 23, 2019

LGTM

1 similar comment
@jiangxb1987
Copy link
Contributor

LGTM

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SparkQA
Copy link

SparkQA commented May 23, 2019

Test build #105736 has finished for PR 24690 at commit 48de77b.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented May 24, 2019

Test build #105741 has finished for PR 24690 at commit 48de77b.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 24, 2019

Test build #105766 has finished for PR 24690 at commit 5dc8485.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merged to branch-2.4.

Thank you, @jose-torres , @zsxwing , @HyukjinKwon .

dongjoon-hyun pushed a commit that referenced this pull request May 26, 2019
## What changes were proposed in this pull request?

Unset InputFileBlockHolder at the end of tasks to stop the file name from leaking over to other tasks in the same thread. This happens in particular in Pyspark because of its complex threading model.

Backport to 2.4.

## How was this patch tested?

new pyspark test

Closes #24690 from jose-torres/fix24.

Authored-by: Jose Torres <torres.joseph.f+github@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@dongjoon-hyun
Copy link
Member

Can we have this in branch-2.3, too?

rluta pushed a commit to rluta/spark that referenced this pull request Sep 17, 2019
## What changes were proposed in this pull request?

Unset InputFileBlockHolder at the end of tasks to stop the file name from leaking over to other tasks in the same thread. This happens in particular in Pyspark because of its complex threading model.

Backport to 2.4.

## How was this patch tested?

new pyspark test

Closes apache#24690 from jose-torres/fix24.

Authored-by: Jose Torres <torres.joseph.f+github@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Sep 26, 2019
## What changes were proposed in this pull request?

Unset InputFileBlockHolder at the end of tasks to stop the file name from leaking over to other tasks in the same thread. This happens in particular in Pyspark because of its complex threading model.

Backport to 2.4.

## How was this patch tested?

new pyspark test

Closes apache#24690 from jose-torres/fix24.

Authored-by: Jose Torres <torres.joseph.f+github@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants