Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only reserves resources for workers that have recently called get_work #1272

Merged
merged 1 commit into from
Oct 7, 2015

Conversation

daveFNbuck
Copy link
Contributor

This presents workers that are scheduling or being phased out from
holding on to resources that they're not ready to use yet or will never
request. To accomplish this, we only reserve resources for workers that
have recently made a get_work request. We define recently by being
within the worker_disconnect_delay. Really this should be the
wait_interval, but the scheduler doesn't have access to that value.

@Tarrasch
Copy link
Contributor

Tarrasch commented Oct 6, 2015

I'm confused. I thought you could only hold on to a resource by running a task that contains resources. No?

@daveFNbuck
Copy link
Contributor Author

No, in the get_work we also do speculative scheduling and reserve resources for higher priority tasks. That way if you have a worker with a lot of low-priority tasks, it can't hold on to all the resources just because it's the first to ask when one of its jobs finishes.

The problem that this fixes is that I was scheduling a very large backfill and it included some high priority tasks that blocked resource usage for the 30+ minutes that scheduling took.

@@ -343,6 +343,7 @@ def test_do_not_lock_resources_when_not_ready(self):
self.assertEqual('C', self.sch.get_work(worker='Y')['task_id'])

def test_lock_resources_when_one_of_multiple_workers_is_ready(self):
self.sch.get_work(worker='X')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment for these new lines why you are doing them?

Even with the context of this PR, I'm not certain. I suppose it's for making the worker "active"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, comment added

@Tarrasch
Copy link
Contributor

Tarrasch commented Oct 6, 2015

I wonder if this really is a good solution. Maybe it would be better to "just implement" that the scheduling and get_work phases happen in parallel.

Otherwise this LGTM.

@daveFNbuck
Copy link
Contributor Author

I'd still want this solution because it helps when a worker stops requesting more jobs. For example, after a SIGUSR1. I think long-term we probably want a better solution for how the reservations work, but I think that would involve basing reservations on get work calls too.

…k request

This presents workers that are scheduling or being phased out from
holding on to resources that they're not ready to use yet or will never
request. To accomplish this, we only reserve resources for workers that
have recently made a get_work request. We define recently by being
within the worker_disconnect_delay. Really this should be the
wait_interval, but the scheduler doesn't have access to that value.
@Tarrasch
Copy link
Contributor

Tarrasch commented Oct 6, 2015

Ah, the SIGUSR1 case makes sense. I'm off now but I'll review it again and potentially merge by tomorrow.

@Tarrasch
Copy link
Contributor

Tarrasch commented Oct 7, 2015

Makes sense to me. Thanks @daveFNbuck!

Tarrasch added a commit that referenced this pull request Oct 7, 2015
Only reserves resources for workers that have recently called get_work
@Tarrasch Tarrasch merged commit b4656dc into spotify:master Oct 7, 2015
@daveFNbuck daveFNbuck deleted the reserve_after_get branch June 1, 2017 23:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants