Skip to content

Conversation

@aarondav
Copy link
Contributor

Previously, the behavior was that if the parent RDD threw any exception other than IOException or FileNotFoundException (which is quite possible for Hadoop input sources), the entire Executor would crash, because the default thread a uncaught exception handler calls System.exit().

This patch avoids two related issues:

  1. Always catch exceptions in this reader thread.
  2. Don't mask readerException when Python throws an EOFError
    after worker.shutdownOutput() is called.

Previously, the behavior was that if the parent RDD threw any exception other
than IOException or FileNotFoundException (which is quite possible for Hadoop
input sources), the entire Executor would crash, because the default thread a
uncaught exception handler calls System.exit().

This patch avoids two related issues:

  1. Always catch exceptions in this reader thread.
  2. Don't mask readerException when Python throws an EOFError
     after worker.shutdownOutput() is called.
@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14338/

@aarondav
Copy link
Contributor Author

Hopefully this fix can go into 1.0, as it both obfuscates real errors while running pyspark and can kill a whole application if triggered.

@pwendell
Copy link
Contributor

Jenkins, retest this please.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14362/

@pwendell
Copy link
Contributor

This PR was timed out inside of the pyspark tests. I killed it because it was blocking a bunch of other PR's from finishing.

@pwendell
Copy link
Contributor

Might be worth running the pyspark tests locally.

@pwendell
Copy link
Contributor

Jenkins, retest this please.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14368/

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@aarondav
Copy link
Contributor Author

Whoops, turns out FileNotFoundException extends IOException. Who knew! (Also we need to get rid of that catch IOException...)

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14396/

@pwendell
Copy link
Contributor

Thanks - I merged this.

@asfgit asfgit closed this in a967b00 Apr 23, 2014
asfgit pushed a commit that referenced this pull request Apr 23, 2014
Previously, the behavior was that if the parent RDD threw any exception other than IOException or FileNotFoundException (which is quite possible for Hadoop input sources), the entire Executor would crash, because the default thread a uncaught exception handler calls System.exit().

This patch avoids two related issues:

  1. Always catch exceptions in this reader thread.
  2. Don't mask readerException when Python throws an EOFError
     after worker.shutdownOutput() is called.

Author: Aaron Davidson <aaron@databricks.com>

Closes #486 from aarondav/pyspark and squashes the following commits:

fbb11e9 [Aaron Davidson] Make sure FileNotFoundExceptions are handled same as before
b9acb3e [Aaron Davidson] SPARK-1572 Don't kill Executor if PythonRDD fails while computing parent
(cherry picked from commit a967b00)

Signed-off-by: Patrick Wendell <pwendell@gmail.com>
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
Previously, the behavior was that if the parent RDD threw any exception other than IOException or FileNotFoundException (which is quite possible for Hadoop input sources), the entire Executor would crash, because the default thread a uncaught exception handler calls System.exit().

This patch avoids two related issues:

  1. Always catch exceptions in this reader thread.
  2. Don't mask readerException when Python throws an EOFError
     after worker.shutdownOutput() is called.

Author: Aaron Davidson <aaron@databricks.com>

Closes apache#486 from aarondav/pyspark and squashes the following commits:

fbb11e9 [Aaron Davidson] Make sure FileNotFoundExceptions are handled same as before
b9acb3e [Aaron Davidson] SPARK-1572 Don't kill Executor if PythonRDD fails while computing parent
yifeih pushed a commit to yifeih/spark that referenced this pull request Mar 5, 2019
This avoids annoying issues with IDE integration and building where you have to remember incantation to run correct combination. This entails following changes

* Hadoop default changed from 2.7.4 to 2.9.1
* yarn, kubernetes, hadoop-cloud and kinesis modules are on by default

SparkR is left out since it requries a bit more invasive changes to enable by default
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants