SPARK-2387: remove stage barrier #1328
Closed
91 commits
163302d
minor fix
f81476d
Merge branch 'master' of https://github.com/lirui-intel/spark
3124380
try to locate the point to remove the barrier
8e625c0
apply upstream hot fix
1d5d0f0
RemoveStageBarrier: support partial map outputs
c4f4054
RemoveStageBarrier: build fix
444d2d9
RemoveStageBarrier: register map outputs progressively
2df1d4e
RemoveStageBarrier: increment epoch for progressive registration
9f18dc7
RemoveStageBarrier: fix check free CPUs
7af23c0
RemoveStageBarrier: make reducers refresh map outputs less often
9a32a17
RemoveStageBarrier: start reducers earlier
9ffb208
RemoveStageBarrier: add log info
ef3b043
RemoveStageBarrier: adjust sleep interval
4213d63
RemoveStageBarrier: add a new iterator to manage partial map outputs
376230a
RemoveStageBarrier: minor fixes
efd31ef
RemoveStageBarrier: fix: reducers may fail due to very slow mappers
3cb944c
RemoveStageBarrier: add some log info
641715e
RemoveStageBarrier: stage with a bigger ID should take precedence
b0c2df2
RemoveStageBarrier: track whether map output for a shuffle is partial…
75d2744
RemoveStageBarrier: refine how we get the stage to pre-start
b7f1f84
RemoveStageBarrier: indicate the output is partial for progressive re…
be47408
add some debug info
c88014b
add a new locality level for tasks with no preferred locations
133a356
re-compute pending list when new executor is added
7d92f9a
pendingTasksWithNoPrefs should only contain tasks that really have no…
c1de426
make the delay schedule configurable
e57e081
clean up
fda0281
do some refactor
781861d
RemoveStageBarrier: fix problem with consolidated shuffle file
679813b
RemoveStageBarrier: should fail the pre-started stages if the parent …
563d743
RemoveStageBarrier: fix issue with empty shuffle blocks
2ab311e
RemoveStageBarrier: allow partial map output by default
46da965
RemoveStageBarrier: make sure the feature is enabled before we use pa…
a89c93f
RemoveStageBarrier: partialForShuffle may cause infinite loop
5cfbae8
RemoveStageBarrier: cannot only depend on epoch to determine if the l…
2df5939
RemoveStageBarrier: fix bug
6891d58
RemoveStageBarrier: adjust fetching order of CoGroupedRDD
104ebe3
merge upstream master
5dd28dc
RemoveStageBarrier: make sure pre-started stage has lower priority to…
6cdf2a3
RemoveStageBarrier: revert previous changes to CoGroupedRDD
af000f7
RemoveStageBarrier: sleep less waiting for new map outputs
8349424
RemoveStageBarrier: don't rely on epoch for updated map statuses
28679b9
RemoveStageBarrier: add a proxy to update partial map outputs periodi…
71b87e5
RemoveStageBarrier: remove verbose logs
04f17e8
RemoveStageBarrier: don't increase epoch for partial map output regis…
eafa476
RemoveStageBarrier: don't put partial outputs in cache
7547686
RemoveStageBarrier: block reducers waiting for new map outputs
ca83d19
RemoveStageBarrier: bug fix
539f1a8
RemoveStageBarrier: add API to SchedulerBackend to tell if there's fr…
6e10488
RemoveStageBarrier: refine logs
a48d592
RemoveStageBarrier: fix the way we compute free slots
a418f03
RemoveStageBarrier: when a task finishes, launch new tasks before pop…
3ced2bb
RemoveStageBarrier: make offer after successful/failed task is proper…
5b0031a
RemoveStageBarrier: handle failed task in a synchronized manner
3c52c69
RemoveStageBarrier: add temp test code to detect deadlock
0473e3b
RemoveStageBarrier: maintain support for asynchronous handling failed…
d267c9b
RemoveStageBarrier: fix previously found problem
118914b
RemoveStageBarrier: fix test code
fe63024
RemoveStageBarrier: remove temp code
a996c77
RemoveStageBarrier: add temp test code
cef517b
RemoveStageBarrier: fix shuffle map stage fail over
7d9a4a4
RemoveStageBarrier: kill running tasks when resubmit failed stages
5697b98
RemoveStageBarrier: refine temp test code
4f80b1d
RemoveStageBarrier: fix test code
39ddb9d
RemoveStageBarrier: remove temp code
8cb8e4c
RemoveStageBarrier: kill running tasks before resubmit failed stages
bc69fed
RemoveStageBarrier: add temp test code
1e1907d
RemoveStageBarrier: fix test code
930136d
RemoveStageBarrier: handle fetch failed task only if it comes from a …
b49cbdb
RemoveStageBarrier: kill tasks without interrupting the thread
6bcca9b
RemoveStageBarrier: remove test code
8fded0e
RemoveStageBarrier: use AKKA actor to access DAGScheduler's data stru…
aa2e0f2
RemoveStageBarrier: fix bug
0bbdb5d
RemoveStageBarrier: compute sorted task sets without holding a lock o…
d941899
RemoveStageBarrier: make the updater sleep a little longer if maps ar…
12b8093
RemoveStageBarrier: fix bug
c74a876
Revert "RemoveStageBarrier: fix bug"
c313fe0
RemoveStageBarrier: pre-start a stage if all of its parents' tasks ha…
033ffc0
RemoveStageBarrier: code refactor
8a08a6c
RemoveStageBarrier: add some log
a8b5d75
RemoveStageBarrier: revert change about tracking waiting tasks
f66a8eb
RemoveStageBarrier: code cleanup
1521fef
merge upstream master branch
9747d6b
RemoveStageBarrier: fix code style
8f798d8
RemoveStageBarrier: minor fix
1ab7a15
RemoveStageBarrier: minor fix
8417ffe
RemoveStageBarrier: let the reducer wake the updater
31c4634
RemoveStageBarrier: introduce a min interval to update map status
e1c374c
RemoveStageBarrier: fix bug
a503508
RemoveStageBarrier: code clean up
85a5d85
fix style
I think `ConcurrentHashMap` is better in most cases.
Will this PR be merged soon? If not, I hope this line can be merged soon because it solves a critical concurrency issue with `mapStatuses`.
@zsxwing thanks for the comments. Maybe it's better to make it ConcurrentHashMap in the base class.
I don't think this PR can be merged soon... So maybe you can open another JIRA to fix this.
Because MapOutputTrackerMaster uses TimeStampedHashMap, which is not a ConcurrentHashMap, MapOutputTracker still needs to use Map. Nevertheless, I can add a comment on MapOutputTracker.mapStatuses to mark that it should be a thread-safe map.
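To illustrate the trade-off discussed above, here is a minimal Java sketch, not Spark's actual code. The base class declares the field as the plain `Map` interface and documents that it must be thread-safe, so one subclass can back it with a `ConcurrentHashMap` and another with a different thread-safe map. All class and method names (`Tracker`, `WorkerTracker`, `MasterTracker`, `hammer`) are hypothetical, and `Collections.synchronizedMap` is only a stand-in for a non-`ConcurrentHashMap` implementation such as the `TimeStampedHashMap` mentioned in the comment.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MapStatusSketch {

    /** Hypothetical base class: it only knows the Map interface. */
    static abstract class Tracker {
        // NOTE: must be a thread-safe map; subclasses pick the implementation.
        protected final Map<Integer, Integer> mapStatuses;

        Tracker(Map<Integer, Integer> backing) {
            this.mapStatuses = backing;
        }

        /** Records one more registered map output for the given shuffle. */
        void register(int shuffleId) {
            // merge is atomic on ConcurrentHashMap and synchronized on synchronizedMap
            mapStatuses.merge(shuffleId, 1, Integer::sum);
        }

        int registered(int shuffleId) {
            return mapStatuses.getOrDefault(shuffleId, 0);
        }
    }

    /** Hypothetical subclass backed by a ConcurrentHashMap. */
    static final class WorkerTracker extends Tracker {
        WorkerTracker() {
            super(new ConcurrentHashMap<>());
        }
    }

    /** Hypothetical subclass backed by a different thread-safe map. */
    static final class MasterTracker extends Tracker {
        MasterTracker() {
            super(Collections.synchronizedMap(new HashMap<>()));
        }
    }

    /** Registers outputs from several threads at once and returns the final count. */
    static int hammer(Tracker tracker, int threads, int perThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                for (int j = 0; j < perThread; j++) {
                    tracker.register(0);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return tracker.registered(0);
    }

    public static void main(String[] args) {
        System.out.println(hammer(new WorkerTracker(), 4, 1000));
        System.out.println(hammer(new MasterTracker(), 4, 1000));
    }
}
```

With either backing map no concurrent update is lost, so the final count equals `threads * perThread`. Swapping in a bare `HashMap` would break that guarantee, which is why the comment on the field matters when the declared type is only `Map`.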