Add IndexedSet to help manage workers/jobs. #463

Closed
wants to merge 5 commits

Conversation

stephenh
Contributor

This is admittedly somewhat cute, but I like the idea of more strictly/DRYly applying the "remove the job from all xxxToJob indexes" logic, instead of having to remember to make the "xxxToJob -= job" calls by hand.
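
As a rough sketch of the idea (the class name and API here are illustrative, not the exact code in this PR), an IndexedSet keeps a primary set plus any number of secondary indexes, so a single remove(e) clears e out of every index:

    import scala.collection.mutable

    class IndexedSet[T] {
      private val elements = mutable.Set[T]()
      private val indexes = mutable.Buffer[(T => Any, mutable.Map[Any, T])]()

      // Register a secondary index keyed by `key`; returns a lookup function.
      def addIndex[K](key: T => K): K => Option[T] = {
        val map = mutable.Map[Any, T]()
        elements.foreach(e => map(key(e)) = e)
        indexes += ((key, map))
        k => map.get(k)
      }

      def add(e: T): Unit = {
        elements += e
        indexes.foreach { case (key, map) => map(key(e)) = e }
      }

      // One call removes e from the set and from every xxxToJob-style index.
      def remove(e: T): Unit = {
        elements -= e
        indexes.foreach { case (key, map) => map -= key(e) }
      }
    }

A scheduler could then keep one executor set plus idToExecutor/hostToExecutor-style lookups, and a removal could never leave a stale index entry behind.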

@mateiz
Member

mateiz commented Feb 14, 2013

This is definitely interesting. Let me think about it a bit more (haven't had a lot of time in the past few days), but it might be worth going for if we can use this throughout.

Stephen Haberman added 2 commits February 18, 2013 16:26
This also fixes a bug where a StatusUpdate message after an executor had already
been removed would result in a NoSuchElementException when updating freeCores.
Conflicts:
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
@stephenh
Contributor Author

Yesterday I had a job run into a race condition in StandaloneSchedulerBackend where freeCores was being decremented in StatusUpdate, but the executor had already been removed, so it failed with NoSuchElementException:

13/02/17 07:03:23 ERROR cluster.StandaloneSchedulerBackend$DriverActor: key not found: 13
java.util.NoSuchElementException: key not found: 13
  at scala.collection.MapLike$class.default(MapLike.scala:225)
  at scala.collection.mutable.HashMap.default(HashMap.scala:45)
  at scala.collection.MapLike$class.apply(MapLike.scala:135)
  at scala.collection.mutable.HashMap.apply(HashMap.scala:45)
  at spark.scheduler.cluster.StandaloneSchedulerBackend$DriverActor$$anonfun$receive$1.apply(StandaloneSchedulerBackend.scala:60)

I thought this would be a good excuse for more IndexedSet cuteness, so I fixed the bug by keeping just one "executors" map and ensuring the executor still exists before updating it.
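
The shape of the fix, as a hedged sketch (the type and method names below are made up for illustration, not the actual StandaloneSchedulerBackend code): look the executor up with get, and only touch freeCores if it is still registered, instead of indexing into the map and hitting its default:

    import scala.collection.mutable

    case class ExecutorInfo(host: String, var freeCores: Int)

    val executors = mutable.Map[String, ExecutorInfo]()

    // executors.get returns an Option, so a late StatusUpdate for an
    // already-removed executor is dropped instead of blowing up with
    // NoSuchElementException the way executors(executorId) would.
    def onTaskFinished(executorId: String): Unit = {
      executors.get(executorId) match {
        case Some(exec) => exec.freeCores += 1 // still registered: safe
        case None       => // already removed: ignore the stale update
      }
    }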

Also merged in master and ran the tests.

Conflicts:
	core/src/main/scala/spark/deploy/master/Master.scala
@stephenh
Contributor Author

Remerged master again, picking up your job->app/split->partition changes (nice!).

  totalCoreCount.addAndGet(cores)
- makeOffers()
+ makeOffers(e)
Contributor Author


I believe this change is right: we only have one new executor, so calling makeOffers(e) is fine, versus the previous behavior, which would re-run makeOffers() for the new executor plus all existing ones.
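
A hedged sketch of the distinction (names simplified for illustration; the real method builds resource offers and hands them to the task scheduler):

    case class Executor(id: String, freeCores: Int)

    def offer(execs: Seq[Executor]): Unit =
      execs.foreach(e => println(s"offering ${e.freeCores} cores on ${e.id}"))

    // Previous behavior on a new registration: re-offer every executor.
    def makeOffers(all: Iterable[Executor]): Unit = offer(all.toSeq)

    // New behavior: only the newly registered executor has new cores to
    // offer, so offering just for it is sufficient.
    def makeOffers(e: Executor): Unit = offer(Seq(e))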

Conflicts:
	core/src/main/scala/spark/deploy/master/Master.scala
@AmplabJenkins

Can one of the admins verify this patch?

@AmplabJenkins

I'm the Jenkins test bot for the UC Berkeley AMPLab. I've noticed your pull request and will test it once an admin authorizes me to. Thanks for your submission!


@AmplabJenkins

Thank you for your pull request. An admin will review this request soon.

@stephenh closed this Oct 8, 2019