Refactor and better manage scale-testing infra #9223

Open
shyamjvs opened this Issue Aug 31, 2018 · 7 comments

Member

shyamjvs commented Aug 31, 2018

Following today's infra-team meeting, I'm creating this umbrella issue for the things we'd need to do for sig-scalability (that I can think of atm):

  • Move smaller scale projects to boskos wherever possible (short-term)
  • Move all scalability jobs to boskos (long-term, and needs more work)
  • Isolate GKE-testing from OSS ones project-wise
  • Delete deprecated projects or move them to boskos. A few I could find:
    • k8s-jkns-pr-kubemark
    • k8s-jenkins-gci-kubemark (after moving kubemark-canary to boskos)
    • k8s-jenkins-scalability (after moving 100-node scalability canary to boskos)
  • Host perf-dash.k8s.io on a CNCF project and probably change its DNS entry
  • Below are projects (roughly sorted from largest to smallest) in use by scalability jobs currently. We need to get their billing right.
    • kubernetes-scale (still being used for some GKE scale tests)
    • k8s-scale-testing (still being used for some GKE scale tests)
    • k8s-presubmit-scale
    • kubemark-scalability-testing
    • k8s-e2e-gce-scalability-1-1
    • k8s-e2e-gci-gce-scale-1-4
    • k8s-jenkins-gci-scalability
    • k8s-jenkins-scalability
    • k8s-jenkins-blocking-kubemark
    • k8s-jenkins-kubemark
    • k8s-jkns-clusterloader

cc @kubernetes/sig-testing @kubernetes/sig-scalability-misc @dims @BenTheElder @thockin @amwat


Member

shyamjvs commented Aug 31, 2018

/assign
I can drive this overall and do some of the pieces, but if someone wants to take up any of the AIs (action items), you're most welcome :)


Member

shyamjvs commented Aug 31, 2018

Some analysis on projects/jobs that can be moved to boskos:

  1. k8s-jenkins-scalability (used by scalability-canary job - uses ~100 CPU in us-central1-f)
  2. k8s-jenkins-gci-scalability (used by gce-100 job - uses ~100 CPU in us-central1-f)
  3. k8s-e2e-gce-scalability-1-1 (used by gce-100-beta job - uses ~100 CPU in us-east1-b)
  4. k8s-e2e-gci-gce-scale-1-4 (used by gce-100-stable1 and gce-100-stable2 jobs - each uses ~100 CPU in us-east1-b)
  5. k8s-jenkins-kubemark (used by the following jobs:
  6. k8s-jenkins-blocking-kubemark (used by kubemark-500 job - uses ~80 CPU)
  • So we can create a new boskos pool with all the above projects and use it for those jobs (as each of them needs about 100 CPU). A few blockers to clear before that:
    • [ ] Move CPU quota in projects 3 and 4 from us-east1-b to us-central1-f (as otherwise the pool would be inconsistent wrt which zone holds the quota) (dropped in favour of the option below)
    • Move CPU quota in projects 1, 2, 5 and 6 from us-central1-f to us-east1-b
    • Increase the quota for projects 5 and 6 from 100 CPU to a bit higher (something like 110) so that other jobs using the pool would also be able to run on it (done)
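For concreteness, the pool above could be sketched as a boskos resources entry. This is a minimal sketch only: the `scalability-project` type name is a placeholder I'm inventing here, and the exact file layout should match whatever boskos resources config test-infra already uses:

```yaml
# Sketch of a boskos pool for the ~100-CPU scalability projects above.
# The type name "scalability-project" is hypothetical.
resources:
  - type: scalability-project
    state: dirty
    names:
      - k8s-jenkins-scalability
      - k8s-jenkins-gci-scalability
      - k8s-e2e-gce-scalability-1-1
      - k8s-e2e-gci-gce-scale-1-4
      - k8s-jenkins-kubemark
      - k8s-jenkins-blocking-kubemark
```

New projects start `dirty` so the boskos janitor cleans them before first lease.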

And now, some projects/jobs that can go to boskos right away:


Member

shyamjvs commented Aug 31, 2018

@amwat @krzysied - Would you be able/interested in picking up some of the AIs above? That would be quite helpful for the CNCF migration.


Collaborator

amwat commented Aug 31, 2018

I can create the boskos pool and update the jobs accordingly.
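On the job side, switching to boskos mostly means dropping the hardcoded project and letting kubetest lease one from the pool. A rough sketch of the diff (hypothetical job name and pool type; the exact flag spelling should be double-checked against the kubetest version the jobs run):

```yaml
# Hypothetical prow job fragment: lease a project from boskos
# instead of pinning one. Pool type name is a placeholder.
- name: ci-kubernetes-e2e-gce-scale-example
  args:
    - --gcp-project-type=scalability-project  # acquire from boskos
    # previously: --gcp-project=k8s-jenkins-scalability
```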


Member

shyamjvs commented Aug 31, 2018

That would be great, thanks. Note that the blockers mentioned above need to be cleared before doing that.

Member

shyamjvs commented Aug 31, 2018

Move CPU quota in projects 3 and 4 from us-east1-b to us-central1-f (as otherwise the pool would be inconsistent wrt the zone having the quota)

Actually, now that I think of it, it's probably better (for the long run) to move the remaining projects' quota in the opposite direction (us-central1-f -> us-east1-b) because:

  • we have more slack in us-east1-b wrt resources available
  • when we later split our large monolithic presubmit project into multiple smaller ones, they would all be in us-east1-b (where we've already been granted quota currently). So it'll be easier to achieve consistency. cc @BenTheElder
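Before filing the moves, the quota currently granted to each project can be inspected with gcloud (illustrative only; the project below is one from the list above). Worth remembering that GCE grants CPU quota per region, so "us-east1-b quota" is really quota on us-east1:

```shell
# Show the quotas currently granted in us-east1 for one of the
# scalability projects (CPU quota in GCE is regional, not zonal).
gcloud compute regions describe us-east1 \
  --project=k8s-e2e-gci-gce-scale-1-4 \
  --format="json(quotas)"
```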

Member

shyamjvs commented Sep 17, 2018

/assign @amwat
(for the boskos part)
