New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Outgoing/global launch limit #1038

Merged
merged 16 commits into from Dec 11, 2018

Conversation

Projects
None yet
2 participants
@scrosby
Copy link
Member

scrosby commented Nov 30, 2018

Changes proposed in this PR

  • New global rate limit. This lets us set a hard limit so that we only launch up to a given token-bucket-filter rate of jobs/second.

Why are we making these changes?

  • Better control over cook's behavior.

@scrosby scrosby added the wip label Nov 30, 2018

@scrosby scrosby requested a review from pschorf Nov 30, 2018

@scrosby scrosby force-pushed the scrosby:outgoing/global-launch-limit branch from 742dbc6 to 109ecd5 Nov 30, 2018

@scrosby scrosby force-pushed the scrosby:outgoing/global-launch-limit branch from 3b7ff94 to 70288cc Dec 3, 2018

@scrosby scrosby removed the wip label Dec 3, 2018

.travis.yml Outdated
@@ -62,6 +62,12 @@ matrix:
before_script: cd integration && ./travis/prepare_integration.sh
script: ./travis/run_integration.sh --pools=off --auth=http-basic --job-launch-rate-limit=on

- name: 'Cook Scheduler integration tests with global job launch rate limit'

This comment has been minimized.

@pschorf

pschorf Dec 4, 2018

Contributor

Is it possible to consolidate this test with the other rate limit (for instance by launching enough jobs across several users to rate limit them globally?)

This comment has been minimized.

@scrosby

scrosby Dec 4, 2018

Member

Yes, its possible, but IMO, a bad idea. That means more moving parts, a more complicated test, and a harder-to-tune test.

This comment has been minimized.

@pschorf

pschorf Dec 6, 2018

Contributor

How much more complicated? The setup/teardown time of the travis agents takes a few minutes regardless of how many tests we run, and we only have so many workers that we can run at a time. I'd like to at least try it.

This comment has been minimized.

@scrosby

scrosby Dec 6, 2018

Member

IMO, a better approach is to build one cook and launch it multiple times for different tests. Most of the build time is docker overhead. Launching is relatively fast in comparison. How about I open a JIRA to do that integration test consolidation and we plan on that approach instead?

pytest.skip("Can't test job launch rate limit without launch rate limit set.")

# Allow an environmental variable override.
name = os.getenv('COOK_LAUNCH_RATE_LIMIT_NAME')

This comment has been minimized.

@pschorf

pschorf Dec 4, 2018

Contributor

COOK_LAUNCH_RATE_LIMIT_USER_NAME?

This comment has been minimized.

@scrosby

scrosby Dec 4, 2018

Member

changed --- here and another place.

Show resolved Hide resolved scheduler/src/cook/mesos/constraints.clj
Show resolved Hide resolved scheduler/src/cook/mesos/constraints.clj Outdated
Show resolved Hide resolved scheduler/src/cook/mesos/constraints.clj Outdated
Scott Crosby
.travis.yml Outdated
@@ -62,6 +62,12 @@ matrix:
before_script: cd integration && ./travis/prepare_integration.sh
script: ./travis/run_integration.sh --pools=off --auth=http-basic --job-launch-rate-limit=on

- name: 'Cook Scheduler integration tests with global job launch rate limit'

This comment has been minimized.

@pschorf

pschorf Dec 6, 2018

Contributor

How much more complicated? The setup/teardown time of the travis agents takes a few minutes regardless of how many tests we run, and we only have so many workers that we can run at a time. I'd like to at least try it.

[]
(let [enforcing? (ratelimit/enforce? ratelimit/global-job-launch-rate-limiter)
max-tasks (ratelimit/get-token-count! ratelimit/global-job-launch-rate-limiter ratelimit/global-job-launch-rate-limiter-key)]
(if enforcing?

This comment has been minimized.

@pschorf

pschorf Dec 6, 2018

Contributor

This can just be when enforcing? so you can get rid of the nil.

This comment has been minimized.

@pschorf

pschorf Dec 11, 2018

Contributor

Bump

Scott Crosby added some commits Dec 10, 2018

@scrosby scrosby force-pushed the scrosby:outgoing/global-launch-limit branch from a2c9686 to 4d5a730 Dec 10, 2018

Scott Crosby

@scrosby scrosby force-pushed the scrosby:outgoing/global-launch-limit branch from 76892ad to fd74b19 Dec 10, 2018

Scott Crosby added some commits Dec 10, 2018

@pschorf
Copy link
Contributor

pschorf left a comment

Just small comment from a previous PR, but the new script looks good to me.

[]
(let [enforcing? (ratelimit/enforce? ratelimit/global-job-launch-rate-limiter)
max-tasks (ratelimit/get-token-count! ratelimit/global-job-launch-rate-limiter ratelimit/global-job-launch-rate-limiter-key)]
(if enforcing?

This comment has been minimized.

@pschorf

pschorf Dec 11, 2018

Contributor

Bump

@pschorf pschorf merged commit 44fc9b3 into twosigma:master Dec 11, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment