[SPARK-27711][CORE] Unset InputFileBlockHolder at the end of tasks #24690
Unset InputFileBlockHolder at the end of tasks to stop the file name from leaking over to other tasks in the same thread. This happens in particular in PySpark because of its complex threading model.

Tested with a new PySpark test.

Closes apache#24605 from jose-torres/fix254.
Authored-by: Jose Torres <torres.joseph.f+github@gmail.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
LGTM
LGTM
LGTM
Test build #105736 has finished for PR 24690 at commit
retest this please
Test build #105741 has finished for PR 24690 at commit
Test build #105766 has finished for PR 24690 at commit
+1, LGTM. Merged to branch-2.4.
Thank you, @jose-torres, @zsxwing, @HyukjinKwon.
## What changes were proposed in this pull request?

Unset InputFileBlockHolder at the end of tasks to stop the file name from leaking over to other tasks in the same thread. This happens in particular in PySpark because of its complex threading model.

Backport to 2.4.

## How was this patch tested?

New PySpark test.

Closes #24690 from jose-torres/fix24.
Authored-by: Jose Torres <torres.joseph.f+github@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Can we have this in |
## What changes were proposed in this pull request?
Unset InputFileBlockHolder at the end of each task so that the input file name does not leak into later tasks that run on the same thread. This happens in particular in PySpark because of its complex threading model.
Backport to 2.4.
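The leak described above is a general hazard of per-thread state on reused worker threads, and it can be illustrated outside Spark. The following is a minimal sketch, not Spark's implementation: `_holder`, `set_input_file`, `get_input_file`, `unset_input_file`, `run_task`, and the file name are all hypothetical stand-ins, and a single-worker thread pool is assumed so that the second task is guaranteed to reuse the first task's thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a per-thread holder of the current input file name.
_holder = threading.local()


def set_input_file(name):
    _holder.file_name = name


def get_input_file():
    # An empty string means "this thread is not reading any file".
    return getattr(_holder, "file_name", "")


def unset_input_file():
    _holder.file_name = ""


def run_task(file_name, cleanup):
    """Run one task and return the file name visible when the task started."""
    seen_at_start = get_input_file()
    try:
        if file_name is not None:
            set_input_file(file_name)
        return seen_at_start
    finally:
        # The idea behind the fix: reset the holder when the task ends.
        if cleanup:
            unset_input_file()


def run_two_tasks(cleanup):
    # One worker thread, so the second task reuses the first task's thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(run_task, "part-00000.parquet", cleanup).result()
        # The second task reads no file; it should see an empty name.
        return pool.submit(run_task, None, cleanup).result()


leaked = run_two_tasks(cleanup=False)
clean = run_two_tasks(cleanup=True)
print(repr(leaked), repr(clean))
```

Without the cleanup, the second task observes the first task's file name even though it never read a file; with the cleanup in a `finally` block, the holder is reset even when the task body raises.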
## How was this patch tested?
New PySpark test.