Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

groups/groups.yaml can become difficult to maintain #460

Closed
neolit123 opened this issue Nov 10, 2019 · 24 comments
Closed

groups/groups.yaml can become difficult to maintain #460

neolit123 opened this issue Nov 10, 2019 · 24 comments
Assignees
Labels
area/groups Google Groups management, code in groups/ kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@neolit123
Copy link
Member

neolit123 commented Nov 10, 2019

i foresee that groups/groups.yaml will face the fate of k/test-infra/testgrid/config.yaml :) - i.e. one big yaml with a few owners that is difficult to edit and causes conflicts in PRs.

ideally groups should be sharded on the owner side.
can group settings be delegated on the side of group owners and perhaps only the group names can be enumerated in groups/groups.yaml ?

/kind feature
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Nov 10, 2019
@justaugustus
Copy link
Member

Strong +1 for this!

@cblecker
Copy link
Member

Maybe longterm, but right now the token to manage groups is extremely highly privileged.. I think only dims and I have access to it (the process is a manual run). So either way, we have to funnel approvals through a couple folks anyways. Sharding the config will only get us partially the way there.

@nikhita
Copy link
Member

nikhita commented Nov 11, 2019

So either way, we have to funnel approvals through a couple folks anyways.

Can we go around this by setting up the sync as a postsubmit like we do for peribolos?

@cblecker
Copy link
Member

Peribolos has a bunch of safeguards in place to prevent locking itself out from GitHub. We don't have those kinds of safeguards in place on the groups sync script.. right now, it requires a dry run, manual verification of the changes, and then a real live run.

If we had better testing and some of those safeguards in place, I'd be way more comfortable/confident with us doing this.

@thockin
Copy link
Member

thockin commented Dec 11, 2019

maybe we can break the monolith file into smaller files which are merged at run-time. At least it would be easier to read? Maybe even directory structure to separate RBAC groups from role groups from administrative groups

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 10, 2020
@justaugustus
Copy link
Member

/remove-lifecycle stale
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 10, 2020
@spiffxp
Copy link
Member

spiffxp commented May 29, 2020

#925 is a start at enforcing some conventions, though I'm not sure what "don't lock us out" tests we should have

@spiffxp
Copy link
Member

spiffxp commented May 30, 2020

I added #927 to address @cblecker's particular concern above

Longer term I agree sharding the config would be cool.

I personally think we need more conventions documented and enforced before we take that next step.

#928 should reduce some of the maintenance burden

@spiffxp
Copy link
Member

spiffxp commented May 30, 2020

Oh, one other note to address #460 (comment)

There are four humans who directly have the org admin role, in the event that the @kubernetes.io groups disappear or stop working

@nikhita
Copy link
Member

nikhita commented Jul 4, 2020

Created #996

/assign

@spiffxp spiffxp added this to Needs Triage in sig-k8s-infra Jan 21, 2021
@spiffxp
Copy link
Member

spiffxp commented Jan 28, 2021

I recently updated slack-infra configs to shard out sig-testing resources. I really liked the approach of restrictions.yaml at root defining restrictions for what was allowed to be defined in sub-directories. I think modifying the groups reconciler to use the same approach would address my concerns about enforcing conventions/policies.

references:

@spiffxp spiffxp moved this from Needs Triage to Backlog (existing infra) in sig-k8s-infra Feb 19, 2021
@spiffxp
Copy link
Member

spiffxp commented May 27, 2021

So I'm not sure whether to repurpose this issue or close it in favor of a new one. But to refresh on the definition of done....

I'm not going to allow approvers in groups/sig-foo/OWNERS until we have sufficient checks or enforcement in place to ensure that:

  • someone can't quietly escalate privileges
  • someone can't create random official-sounding groups without sufficient review/oversight

The former can probably handled by properly sharding groups, and enumerating which use cases don't need all the extra paranoia.

The latter could even be as simple as groups/whatever_test.go with a hardcoded list of groups that you've got to update every time you add a new group.

If someone wants to join the pool of approvers in groups/OWNERS, I'll allow it if they're going to work on implementing the above.

@justaugustus
Copy link
Member

cc: @kubernetes/release-engineering, if this piques your interest.

@spiffxp -- Happy to jump in on groups/OWNERS to share the load.

It'd also be motivation to jump back into this refactor I was noodling on: uwu-tools/ggreconcile#1

@spiffxp
Copy link
Member

spiffxp commented Jul 16, 2021

Jumping in on OWNERS doesn't happen unless I see demonstrated commitment to implement the enforcement I described. A good demonstration would be PR's to improve the state of things, or authoring and driving consensus on a proposal and the necessary work items such that they meet the criteria of a help-wanted issue.

@nikhita
Copy link
Member

nikhita commented Jul 28, 2021

A year too late for #996 (comment) .... created #2401

@spiffxp
Copy link
Member

spiffxp commented Jul 29, 2021

#2414 follows on from @nikhita's work

I'm realizing we have very, very little test coverage of how this thing actually works. I would feel more comfortable with more test coverage, and think it'd be a good help-wanted issue. But I don't think it's a complete blocker to proceeding with approver delegation.

I think what remains is:

  • verify groups we definitely want root oversight over
  • decide if there are any other policies to encode in tests
  • start delegating approvers

@spiffxp
Copy link
Member

spiffxp commented Aug 17, 2021

/milestone v1.23

@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Aug 17, 2021
@spiffxp
Copy link
Member

spiffxp commented Aug 17, 2021

/remove-priority important-longterm
/priority important-soon
One other thought I had which maybe is just a nice to have: support loading in from arbitrarily named files. Same as we do for prowjobs. This allows contributors to organize however they see fit, instead of sticking to one massive file and sharding solely on approval boundaries.

I could see this being useful if, for example, we end up migrating mailing lists

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Aug 17, 2021
@spiffxp
Copy link
Member

spiffxp commented Sep 8, 2021

/area groups

@spiffxp
Copy link
Member

spiffxp commented Sep 9, 2021

Looking to push this forward to done now that we have tooling able to support delegated approval. I've been going through and sharding out groups to sig-owned subdirectories, creating them as needed. What remains are the trickier corner cases, updating docs, and updating owners files.

Proposed path forward

@spiffxp spiffxp moved this from Backlog (existing infra) to In Progress in sig-k8s-infra Sep 9, 2021
@spiffxp
Copy link
Member

spiffxp commented Sep 9, 2021

/assign

@spiffxp
Copy link
Member

spiffxp commented Sep 15, 2021

/close
Followup issues opened for remaining groups in root

Announcement: https://groups.google.com/g/kubernetes-dev/c/DhPel8J8QiA

@k8s-ci-robot
Copy link
Contributor

@spiffxp: Closing this issue.

In response to this:

/close
Followup issues opened for remaining groups in root

Announcement: https://groups.google.com/g/kubernetes-dev/c/DhPel8J8QiA

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sig-k8s-infra automation moved this from In Progress to Done Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/groups Google Groups management, code in groups/ kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
sig-k8s-infra
  
Done
Development

No branches or pull requests

8 participants