Changed localProperties to use ThreadLocal (not DynamicVariable). #926
Because DynamicVariable is backed by an InheritableThreadLocal,
the localProperties variable can end up shared across threads;
Spark should use a ThreadLocal instead. The bug only occurs when
someone runs a Spark job in a thread and then runs many
concurrent Spark jobs in child threads.
Here's how the problem can occur:
    val sc = new SparkContext(...)
    sc.setJobDescription("Foo")
    (1 to 2).map { i =>
      new Thread() {
        override def run() {
          sc.setJobDescription("Job " + i)
          Thread.sleep(100)
          sc.makeRDD(.....)
        }
      }.start()
    }
In this code, both jobs will end up with the same description.
On the first call to setJobDescription(), SparkContext.setLocalProperty() will find that localProperties (a DynamicVariable) is null, so it will initialize it and then set the job description property to "Foo".
When each of the two new threads is created, it inherits the parent thread's value of localProperties, because DynamicVariable stores its value in an InheritableThreadLocal. By default, InheritableThreadLocal's childValue() function just returns a reference to the parent value (http://docs.oracle.com/javase/6/docs/api/java/lang/InheritableThreadLocal.html#childValue(T)), so the localProperties variable in both child threads refers to the same underlying Properties object. Each child's call to setJobDescription() therefore writes to that shared object, and both jobs run with whichever description was set last.
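
The sharing described above can be reproduced without Spark. The following is a minimal sketch, not the code in this change: LocalPropertiesDemo, runInThread, descriptionSeenByParent, and the "description" key are hypothetical names used only for illustration. It stores a Properties object in the parent thread, lets a child thread overwrite the description, and reports what the parent sees afterwards.

```scala
import java.util.Properties

// Hypothetical stand-alone demo; none of these names come from Spark.
object LocalPropertiesDemo {
  // Runs `body` on a new thread and waits for it to finish.
  private def runInThread(body: => Unit): Unit = {
    val t = new Thread(new Runnable { def run(): Unit = body })
    t.start()
    t.join()
  }

  // Sets "description" to "Foo" in the calling thread, lets a child thread
  // set it to "Job 1", and returns the value the calling thread then sees.
  def descriptionSeenByParent(holder: ThreadLocal[Properties]): String = {
    val props = new Properties()
    props.setProperty("description", "Foo") // "description" is an illustrative key
    holder.set(props)
    runInThread {
      // InheritableThreadLocal: get() returns the parent's Properties object.
      // Plain ThreadLocal: get() returns null, so the child makes its own.
      val p = Option(holder.get()).getOrElse(new Properties())
      p.setProperty("description", "Job 1")
      holder.set(p)
    }
    holder.get().getProperty("description")
  }

  def main(args: Array[String]): Unit = {
    // Child mutated the object it shares with the parent.
    println(descriptionSeenByParent(new InheritableThreadLocal[Properties])) // Job 1
    // Child had its own Properties; the parent's value is untouched.
    println(descriptionSeenByParent(new ThreadLocal[Properties]))            // Foo
  }
}
```

With a plain ThreadLocal the child starts with no value at all, so nothing set in the parent leaks into it or back out of it. An alternative, not taken in this sketch, would be to keep InheritableThreadLocal and override childValue() to return a copy of the parent's Properties.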