client: fix work fetch edge case to avoid idleness #2837

Merged
merged 1 commit into from Dec 17, 2018


@davidpanderson
Contributor
commented Nov 22, 2018

It may take a minute or two between deciding to fetch work
and actually getting some; you may have to try a few projects.
So it's better to start work fetch a bit before a resource instance becomes idle,
rather than waiting until it's idle.

We were already doing this, with a "lead time" of 3 minutes,
except when all the fetchable projects
have zero resource share ("backup" projects).
In that case we'd request work from backup projects only once an instance was already idle.

This change fixes that by allowing work fetch from backup projects
if an instance is within 3 minutes of going idle.
It also replaces the hardwired 3 minutes, in both places,
with a constant, WF_EST_FETCH_TIME.
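
To illustrate the difference between the old and new policy for backup projects, here is a minimal sketch. It is not the client's actual code: only the name WF_EST_FETCH_TIME comes from this change, and its value here assumes the constant is expressed in seconds. The function names and the `busy_time` vector (seconds of queued work remaining per resource instance) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Name taken from this change; 180 s assumes the 3-minute lead time
// is stored in seconds.
const double WF_EST_FETCH_TIME = 180.0;

// Hypothetical helper: true if any instance will run out of work within
// lead_time seconds. busy_time[i] is the queued work left on instance i
// (0 means the instance is idle right now).
bool instance_idle_within(const std::vector<double>& busy_time,
                          double lead_time) {
    assert(lead_time >= 0);
    return std::any_of(busy_time.begin(), busy_time.end(),
                       [lead_time](double t) { return t <= lead_time; });
}

// Old policy (sketch): fetch from backup projects only when an
// instance is already idle, i.e. a lead time of zero.
bool old_policy_fetch_from_backup(const std::vector<double>& busy_time) {
    return instance_idle_within(busy_time, 0.0);
}

// New policy (sketch): fetch from backup projects as soon as an
// instance is within WF_EST_FETCH_TIME of going idle.
bool new_policy_fetch_from_backup(const std::vector<double>& busy_time) {
    return instance_idle_within(busy_time, WF_EST_FETCH_TIME);
}
```

With one instance two minutes from idle, for example `{120.0, 600.0}`, the old check returns false and the new one returns true, which is the edge case described above.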

BTW, the reason for the old policy is that we want to avoid
situations where we fetch a big job from a backup project
when jobs from a non-backup project would have been available soon.
This change may cause that to happen (rarely)
but it's worth it to avoid idleness.

@JuhaSointusalo
Contributor

@JacobWKlein

Since you were the one with this odd setup of mostly backup projects, could you test this? The changes look OK, but I'm not all that familiar with the work fetch code, so I can't say for sure that there won't be any surprises.

You can get a prebuilt client from AppVeyor. You don't need to do full alpha release testing. Testing work fetch is enough.

@JacobWKlein

Put it in an alpha release, and I will be able to test it. I'm running long-running tasks (think 400 days to complete) that I don't want to risk losing on anything that isn't a public or alpha build.

@TheAspens
Member

I was able to test this between our QA system and production site and it looks like it works as it should. I saw it fetching work from the backup project when it wasn't able to get work from the primary and it was going to go idle.

Thanks for the fix!
