authz/github: repo-centric perms sync from team/org perms caches (#24328)

Implements caching of group permissions for repo-centric permissions sync, following up on the GitHub user-centric perms sync caching introduced in #23978. Repo sync now lists a repo's direct collaborators, then queries for the organization and teams whose users have access to the repo and caches them as groups.

Also adds `allowGroupsPermissionsSync` to the GitHub OAuth provider config, which requests the additional `read:org` scope required to enable `authorization.groupsCacheTTL` in GitHub code hosts.

Co-authored-by: ᴜɴᴋɴᴡᴏɴ <joe@sourcegraph.com>
Co-authored-by: Geoffrey Gilmore <geoffrey@sourcegraph.com>
3 people committed Aug 31, 2021
1 parent 4e0053f commit fa91928
Showing 30 changed files with 6,303 additions and 642 deletions.
8 changes: 4 additions & 4 deletions CHANGELOG.md
@@ -15,11 +15,11 @@ All notable changes to Sourcegraph are documented in this file.

### Added

--
+- The required authentication scopes required to enable caching behaviour for GitHub repository permissions can now be requested via `allowGroupsPermissionsSync` in GitHub `auth.providers`. [#24328](https://github.com/sourcegraph/sourcegraph/pull/24328)

### Changed

--
+- Caching behaviour for GitHub repository permissions enabled via the `authorization.groupsCacheTTL` field in the code host config can now leverage additional caching of team and organization permissions for repository permissions syncing (on top of the caching for user permissions syncing introduced in 3.31). [#24328](https://github.com/sourcegraph/sourcegraph/pull/24328)

### Fixed

@@ -45,6 +45,7 @@ All notable changes to Sourcegraph are documented in this file.
- Batch Changes changesets can now be [set to published when previewing new or updated batch changes](https://docs.sourcegraph.com/batch_changes/how-tos/publishing_changesets#within-the-ui). [#22912](https://github.com/sourcegraph/sourcegraph/issues/22912)
- Added Python3 to server and gitserver images to enable git-p4 support. [#24204](https://github.com/sourcegraph/sourcegraph/pull/24204)
- Code Insights drill-down filters now allow filtering insights data on the dashboard page using repo: filters. [#23186](https://github.com/sourcegraph/sourcegraph/issues/23186)
+- GitHub repository permissions can now leverage caching of team and organization permissions for user permissions syncing. Caching behaviour can be enabled via the `authorization.groupsCacheTTL` field in the code host config. This can significantly reduce the amount of time it takes to perform a full permissions sync due to reduced instances of being rate limited by the code host. [#23978](https://github.com/sourcegraph/sourcegraph/pull/23978)

### Changed

@@ -72,11 +73,10 @@ All notable changes to Sourcegraph are documented in this file.
- The `sourcegraph-frontend.Role` in Kubernetes deployments was updated to permit statefulsets access in the Kubernetes API. This is needed to better support stable service discovery for stateful sets during deployments, which isn't currently possible by using service endpoints. [#3670](https://github.com/sourcegraph/deploy-sourcegraph/pull/3670) [#23889](https://github.com/sourcegraph/sourcegraph/pull/23889)
- For Docker-Compose and Kubernetes users, the built-in main Postgres and codeintel databases have switched to an alpine Docker image. This requires re-indexing the entire database. This process can take up to a few hours on systems with large datasets. [#23697](https://github.com/sourcegraph/sourcegraph/pull/23697)
- Results are now streamed from searcher by default, improving memory usage and latency for large, unindexed searches. [#23754](https://github.com/sourcegraph/sourcegraph/pull/23754)
-- [`deploy-sourcegraph` overlays](https://docs.sourcegraph.com/admin/install/kubernetes/configure#overlays) now use `resources:` instead of the [deprecated `bases:` field] -(https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/bases/) for referencing Kustomize bases. [deploy-sourcegraph#3606](https://github.com/sourcegraph/deploy-sourcegraph/pull/3606)
+- [`deploy-sourcegraph` overlays](https://docs.sourcegraph.com/admin/install/kubernetes/configure#overlays) now use `resources:` instead of the [deprecated `bases:` field](https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/bases/) for referencing Kustomize bases. [deploy-sourcegraph#3606](https://github.com/sourcegraph/deploy-sourcegraph/pull/3606)
- The `deploy-sourcegraph-docker` Pure Docker deployment scripts and configuration has been moved to the `./pure-docker` subdirectory. [deploy-sourcegraph-docker#454](https://github.com/sourcegraph/deploy-sourcegraph-docker/pull/454)
- In Kubernetes deployments, setting the `SRC_GIT_SERVERS` environment variable explicitly is no longer needed. Addresses of the gitserver pods will be discovered automatically and in the same numerical order as with the static list. Unset the env var in your `frontend.Deployment.yaml` to make use of this feature. [#24094](https://github.com/sourcegraph/sourcegraph/pull/24094)
- The consistent hashing scheme used to distribute repositories across indexed-search replicas has changed to improve distribution and reduce load discrepancies. In the next upgrade, indexed-search pods will re-index the majority of repositories since the repo to replica assignments will change. This can take a few hours in large instances, but searches should succeed during that time since a replica will only delete a repo once it has been indexed in the new replica that owns it. You can monitor this process in the Zoekt Index Server Grafana dashboard - the "assigned" repos in "Total number of repos" will spike and then reduce until it becomes the same as "indexed". As a fail-safe, the old consistent hashing scheme can be enabled by setting the `SRC_ENDPOINTS_CONSISTENT_HASH` env var to `consistent(crc32ieee)` in the `sourcegraph-frontend` deployment. [#23921](https://github.com/sourcegraph/sourcegraph/pull/23921)
-- GitHub repository permissions can now leverage caching of team and organization permissions for user permissions syncing. Caching behaviour can be enabled via the `authorization.groupsCacheTTL` field in the code host config. This can significantly reduce the amount of time it takes to perform a full permissions sync due to reduced instances of being rate limited by the code host. [#23978](https://github.com/sourcegraph/sourcegraph/pull/23978)
- In Kubernetes deployments an emptyDir (`/dev/shm`) is now mounted in the `pgsql` deployment to allow Postgres to access more than 64KB shared memory. This value should be configured to match the `shared_buffers` value in your Postgres configuration. [deploy-sourcegraph#3784](https://github.com/sourcegraph/deploy-sourcegraph/pull/3784/)

### Fixed
2 changes: 1 addition & 1 deletion doc/admin/external_service/github.md
@@ -40,7 +40,7 @@ No token scopes are required if you only want to sync public repositories and do
- `read:org` to use the `"allowOrgs"` setting [with a GitHub authentication provider](../auth/index.md#github) and `groupsCacheTTL` for [permissions caching](../repo/permissions.md#permissions-caching).
- `repo`, `read:org`, `user:email`, and `read:discussion` to use [batch changes](../../batch_changes/index.md) with GitHub repositories. See "[Code host interactions in batch changes](../../batch_changes/explanations/permissions_in_batch_changes.md#code-host-interactions-in-batch-changes)" for details.

->NOTE: If you plan to use repository permissions with background syncing, an access token that has admin access to all private repositories is required. It is because only admin can list all collaborators of a repository.
+> NOTE: If you plan to use repository permissions with [background permissions syncing](../repo/permissions.md#background-permissions-syncing), an access token that has admin access to all private repositories is required. It is because only admin can list all collaborators of a repository.
## GitHub.com rate limits

4 changes: 2 additions & 2 deletions doc/admin/repo/permissions.md
@@ -51,7 +51,7 @@ The events we consume are:

<span class="badge badge-experimental">Experimental</span> <span class="badge badge-note">Sourcegraph 3.31+</span>

-For GitHub providers, Sourcegraph can leverage caching of GitHub [team](https://docs.github.com/en/organizations/managing-access-to-your-organizations-repositories/managing-team-access-to-an-organization-repository) and [organization](https://docs.github.com/en/organizations/managing-access-to-your-organizations-repositories/repository-permission-levels-for-an-organization) permissions - [learn more](#permissions-caching).
+For GitHub providers, Sourcegraph can leverage caching of GitHub [team](https://docs.github.com/en/organizations/managing-access-to-your-organizations-repositories/managing-team-access-to-an-organization-repository) and [organization](https://docs.github.com/en/organizations/managing-access-to-your-organizations-repositories/repository-permission-levels-for-an-organization) permissions - [learn more about permissions caching](#permissions-caching).

Caching behaviour can be enabled via the `authorization.groupsCacheTTL` field:

@@ -69,7 +69,7 @@ We currently recommend a default of `72` (hours, or 3 days) for the `groupsCache

Caches can also be [manually invalidated](#permissions-caching) if necessary.

-Note the token associated with the external service must have `org:read` or `user` scope in order to read the repo permissions and cache them - [learn more](../external_service/github.md#github-api-token-and-access).
+> NOTE: The token associated with the external service must have `repo` and `read:org` scope in order to read the repo, orgs, and teams permissions and cache them - [learn more](../external_service/github.md#github-api-token-and-access).
<br />

@@ -114,7 +114,7 @@ func requestedScopes(p *schema.GitHubAuthProvider, extraScopes []string) []strin
 	}

 	// Needs extra scope to check organization membership
-	if len(p.AllowOrgs) > 0 {
+	if len(p.AllowOrgs) > 0 || p.AllowGroupsPermissionsSync {
 		scopes = append(scopes, "read:org")
 	}

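To make the effect of the changed condition in `requestedScopes` concrete, here is a small self-contained sketch. It is an approximation under stated assumptions: `provider` is a simplified, hypothetical stand-in for `schema.GitHubAuthProvider` with only the two fields visible in the hunk, `withOrgScope` is a hypothetical helper that isolates the one branch shown, and the `user:email` base scope is only illustrative.

```go
package main

import "fmt"

// provider is a simplified, hypothetical stand-in for schema.GitHubAuthProvider;
// only the two fields referenced in the hunk above are modelled.
type provider struct {
	AllowOrgs                  []string
	AllowGroupsPermissionsSync bool
}

// withOrgScope mirrors the updated condition: "read:org" is requested when
// organizations are restricted OR when groups permissions sync is allowed.
// It copies the input so the caller's slice is never modified.
func withOrgScope(p provider, scopes []string) []string {
	out := append([]string{}, scopes...)
	if len(p.AllowOrgs) > 0 || p.AllowGroupsPermissionsSync {
		out = append(out, "read:org")
	}
	return out
}

func main() {
	base := []string{"user:email"} // illustrative base scope, an assumption

	fmt.Println(withOrgScope(provider{}, base))
	fmt.Println(withOrgScope(provider{AllowOrgs: []string{"acme"}}, base))
	fmt.Println(withOrgScope(provider{AllowGroupsPermissionsSync: true}, base))
}
```

Before this change only the `AllowOrgs` case would have added `read:org`; the third call shows the newly covered case.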
