authz/github: repo-centric perms sync from team/org perms caches #24328
internal/authz/github/cache.go
Outdated
// Repositories that entities associated with this group have access to.
//
// This should ONLY be populated on a USER-centric sync, but may be appended to if
// already populated.
Repositories []extsvc.RepoID
// Users associated with this group.
//
// This should ONLY be populated on a REPO-centric sync, but may be appended to if
// already populated.
Users []extsvc.AccountID
- cache values might get very large
- there might be a better way to do this, but I'm opting to focus on getting the functionality right and improving the implementation in a separate pass
…github-repo-perms-caching
group.Repositories = nil
group.Users = nil
Modifying a passed-in object in place within a method call in order to abandon it feels too magic in the long term, especially when the modification is not what performs the cache invalidation (i.e. we are not saving an empty object but deleting the key entirely).
Would an improved docstring help, or renaming this to `invalidateGroup`?
Couldn't we just stop using the `group` object at all the call sites? I mean, `cachedGroups` is and should ideally be purely for cache management.
Using `group` makes dealing with this a lot more convenient, I think, and helps reduce a lot of repetition and things that can go wrong (e.g. forgetting to set Repos and Users, or creating a new Group when invalidating a cache).
> `cachedGroups` is and should ideally be purely for cache management.
What I'm thinking here is that ensuring the cache values used in the code are valid also falls under pure cache management 🤔
Let's improve the docstring then; I'm not sure how we could improve it elsewhere off the top of my head.
How does 471682e look?
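To make the trade-off in this thread concrete, here is a minimal sketch of the pattern under discussion: an invalidation helper that deletes the cache key and also clears the fields on the passed-in group. The types `groupsCache` and `cachedGroup`, the map-backed store, and the key format are assumptions for illustration only; just the name `invalidateGroup` and the `Repositories`/`Users` fields come from the conversation, and the actual change in 471682e may look different.

```go
package main

import "fmt"

// cachedGroup is a hypothetical, simplified stand-in for the cached group value.
type cachedGroup struct {
	Org          string
	Team         string
	Repositories []string
	Users        []string
}

// key builds an illustrative cache key; the real key format is not shown in this thread.
func (g *cachedGroup) key() string { return g.Org + "/" + g.Team }

// groupsCache is a hypothetical map-backed cache used only for this sketch.
type groupsCache struct {
	entries map[string]cachedGroup
}

// setGroup stores a copy of g under its key.
func (c *groupsCache) setGroup(g cachedGroup) { c.entries[g.key()] = g }

// invalidateGroup deletes the cache entry for g (it does not store an empty
// value) and clears the cached fields on g in place, so the caller cannot
// keep reading stale repositories or users from the object it passed in.
func (c *groupsCache) invalidateGroup(g *cachedGroup) {
	delete(c.entries, g.key())
	g.Repositories = nil
	g.Users = nil
}

func main() {
	cache := &groupsCache{entries: map[string]cachedGroup{}}
	g := cachedGroup{Org: "acme", Team: "backend", Users: []string{"alice"}}
	cache.setGroup(g)

	cache.invalidateGroup(&g)

	_, stillCached := cache.entries[g.key()]
	fmt.Println(stillCached, len(g.Users), len(g.Repositories))
}
```

Documenting the in-place clearing in the docstring, as agreed above, is what keeps this side effect from surprising callers.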
internal/authz/github/github.go
Outdated
	return userIDs, nil
}

// Check if repo belongs in an org
Could we also factor the rest of the body into a helper method/function?
74b953d starts this, and also makes the code mirror the user perms syncing a bit more
…github-repo-perms-caching
0dc737d to e4136b4
d9f2c29 to 69b10f2
Notifying subscribers in CODENOTIFY files for diff 82d260b...5af46f9.
d127b34 to 68f4b3e
internal/authz/github/github.go
Outdated
// Just use cache if available and not invalidated
addUserToRepoPerms(group.Users...)
// Perform partial cache update to repositories iff non-empty
if len(group.Repositories) > 0 {
Pretty sure this `if` is redundant; it is OK to loop over a slice with zero elements.
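As a quick illustration of the point about zero-length slices, the sketch below ranges over a nil slice and an empty slice; in Go both simply produce zero iterations. The helper name `countVisited` is made up for this example and is not part of the PR.

```go
package main

import "fmt"

// countVisited ranges over ids and reports how many iterations ran.
func countVisited(ids []string) int {
	visited := 0
	for range ids {
		visited++
	}
	return visited
}

func main() {
	var nilSlice []string    // nil slice, length 0
	emptySlice := []string{} // non-nil slice, length 0

	fmt.Println(countVisited(nilSlice))           // 0
	fmt.Println(countVisited(emptySlice))         // 0
	fmt.Println(countVisited([]string{"a", "b"})) // 2
}
```

A length guard around a loop is therefore only needed when the guarded block does more than iterate, for example when it also writes back to the cache.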
I had to change this around a bit: after testing (7a90087) I realized this wasn't behaving correctly. The code now looks like this:
sourcegraph/internal/authz/github/github.go
Lines 178 to 191 in 555d34f
// If this is a partial cache, add self to group
if len(group.Users) > 0 {
	hasUser := false
	for _, user := range group.Users {
		if user == accountID {
			hasUser = true
			break
		}
	}
	if !hasUser {
		group.Users = append(group.Users, accountID)
		p.groupsCache.setGroup(group)
	}
}
Since the whole block is unnecessary if the field being checked is empty, the check is still useful in denoting that.
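For readers who want to try the partial-cache update in isolation, here is a self-contained sketch of the same logic as a helper. `AccountID`, `group`, and `addToPartialCache` are simplified, hypothetical stand-ins (the real code uses `extsvc.AccountID` and writes back via `p.groupsCache.setGroup`); the helper returns whether the group changed so the caller can decide to store it.

```go
package main

import "fmt"

// AccountID stands in for extsvc.AccountID (assumed here to be string-like).
type AccountID string

// group is a simplified, hypothetical stand-in for the cached group value.
type group struct {
	Users []AccountID
}

// addToPartialCache appends accountID to g.Users only when the Users list is
// already populated (a partial cache) and does not yet contain accountID.
// It reports whether g was modified, so the caller knows to write it back.
func addToPartialCache(g *group, accountID AccountID) bool {
	if len(g.Users) == 0 {
		return false // users were never cached for this group; leave it unpopulated
	}
	for _, user := range g.Users {
		if user == accountID {
			return false // already present, nothing to update
		}
	}
	g.Users = append(g.Users, accountID)
	return true
}

func main() {
	partial := &group{Users: []AccountID{"alice"}}
	fmt.Println(addToPartialCache(partial, "bob"), partial.Users)

	unpopulated := &group{}
	fmt.Println(addToPartialCache(unpopulated, "bob"), len(unpopulated.Users))
}
```

Here the length check carries meaning: an empty `Users` list signals "not cached yet", so appending a single user would make an incomplete list look like a populated cache.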
…github-repo-perms-caching
) Implements caching of groups permissions for repo-centric permissions sync. Follows up on the GitHub user-centric perms sync caching introduced in #23978.

Repo sync now lists direct collaborators to a repo before querying for the organization and teams for users with access to this repo, and caches them as groups.

Also adds `allowGroupsPermissionsSync` to the GitHub OAuth provider config, which requests the additional `read:org` scope required to enable `authorization.groupsCacheTTL` in GitHub code hosts.

Co-authored-by: ᴜɴᴋɴᴡᴏɴ <joe@sourcegraph.com>
Co-authored-by: Geoffrey Gilmore <geoffrey@sourcegraph.com>
… enabled (#24561) allowGroupsPermissionsSync, introduced in #24328, is actually a prerequisite to enabling groupsCacheTTL. This change checks if allowGroupsPermissionsSync is enabled, and if not, forcibly sets groupsCacheTTL to nil and reports a warning. This is a breaking change, but it is currently flagged as an experimental feature and opt-in only, so will stick to a changelog item.
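The prerequisite check described in that follow-up can be sketched roughly as below. This is not the code from #24561: `authzConfig`, `validateGroupsCache`, the warning text, and the use of `*int` for the TTL are all assumptions made for illustration.

```go
package main

import "fmt"

// authzConfig is a hypothetical, simplified view of the two settings involved.
type authzConfig struct {
	AllowGroupsPermissionsSync bool
	GroupsCacheTTL             *int // nil means groups caching is disabled
}

// validateGroupsCache returns the config to use plus a warning (empty if none).
// If a TTL is set but allowGroupsPermissionsSync is not enabled, the TTL is
// reset to nil so the cacheless implementation is used.
func validateGroupsCache(cfg authzConfig) (authzConfig, string) {
	if cfg.GroupsCacheTTL != nil && !cfg.AllowGroupsPermissionsSync {
		cfg.GroupsCacheTTL = nil
		return cfg, "groupsCacheTTL requires allowGroupsPermissionsSync; groups caching has been disabled"
	}
	return cfg, ""
}

func main() {
	ttl := 72
	cfg, warning := validateGroupsCache(authzConfig{GroupsCacheTTL: &ttl})
	fmt.Println(cfg.GroupsCacheTTL == nil, warning != "")
}
```

This sketch only warns when a TTL was actually configured, which is one possible reading of the description above.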
Implements caching of groups permissions for repo-centric permissions sync. Follows up on the GitHub user-centric perms sync caching introduced in #23978 - see that PR for more information.
Repo sync now lists direct collaborators to a repo before querying for the organization and teams for users with access to this repo, and caches them as groups.
Group caches now include both repositories and users. In general, the expectation is that:
- user-centric sync populates `Repositories` in the cache
- repo-centric sync populates `Users` in the cache
Token-wise, verified that all this needs is `repo` and `read:org`.
To review
authz/github/github.go
internal/extsvc/github/v3.go
enterprise/cmd/repo-updater/internal/authz/integration_test.go
users * repos < 5000 * 100 - in most smaller cases it is better to use the cacheless implementation. An argument for making this opt-in only.
Subsequent patches
(if we cut a patch release with this, we must include the following as well)