A GitHub CLI extension for auditing GitHub repositories to highlight data that cannot be automatically migrated using GitHub's migration tools
You can use this tool to identify data that will be lost, or which you'll need to migrate manually, when migrating data away from:
- GitHub.com, including GitHub Enterprise Cloud (GHEC)
- GitHub Enterprise Cloud with data residency
- GitHub Enterprise Server (GHES) v3.11 onwards
to another GitHub target.
The results from audits with this tool are non-exhaustive - they won't identify every possible kind of data loss from a migration - but the goal is to get you 95% of the way there.
- Actions secrets
- Actions variables
- Code Scanning alerts
- Code Scanning analyses
- Code Scanning default setup
- Codespaces secrets
- Custom properties
- Dependabot alerts
- Dependabot secrets
- Deploy keys
- Deployments
- Discussions
- Environments
- Forks
- Git LFS objects
- Git submodules
- Packages
- Pages custom domain configuration
- Pinned issues
- Release assets which are too large
- Rulesets
- Secret Scanning alerts
- Self-hosted runners
- Stars
- Tag protection rules
- Watchers
- Webhooks
- Workflow runs
It will also warn you if your repository is large (over 1GB), and may be slow or difficult to migrate.
Make sure you've got the GitHub CLI installed. If you haven't, you can install it by following the instructions here.
Once gh
is ready and available on your machine, you can install this extension by running gh extension install timrogers/gh-migration-audit
.
You can check that the extension is installed and working by running gh migration-audit --help
.
gh migration-audit audit-repo
will audit a single GitHub repository and output a CSV file with information about data that cannot be migrated automatically.
You'll need an access token token with appropriate permissions. The extension won't use your existing login session for the GitHub CLI. You'll need a classic token with the repo
scope, authorized for SSO if applicable.
You can audit a repo like this:
gh migration-audit audit-repo \
# A GitHub access token with the permissions described above. This can also be configured using the `GITHUB_TOKEN` environment variable.
--access-token GITHUB_TOKEN \
# The login of the user or organization that owns the repository
--owner monalisa \
# The name of the repository
--repo octocat \
# OPTIONAL: The path to write the audit result CSV to. Defaults to the specified owner and repo, followed by the current date and time, e.g. `monalisa_octocat_1698925405325.csv
--output-path octocat.csv \
# OPTIONAL: The base URL of the GitHub API, if you're running an audit against a GitHub product other than GitHub.com. For GitHub Enterprise Server, this will be something like `https://github.acme.inc/api/v3`. For GitHub Enterprise Cloud with data residency, this will be `https://api.acme.ghe.com`, replacing `acme` with your own tenant.
--base-url https://github.acme.inc/api/v3 \
# OPTIONAL: The URL of an HTTP(S) proxy to use for requests to the GitHub API (e.g. `http://localhost:3128`). This can also be set using the PROXY_URL environment variable.
--proxy-url https://10.0.0.1:3128 \
# OPTIONAL: Whether to emit detailed, verbose logs
--verbose \
# OPTIONAL: Disable anonymous telemetry that gives the maintainers of this tool basic information about real-world usage
--disable-telemetry \
# OPTIONAL: Skip automatic check for updates to this tool
--skip-update-check
The tool will audit your repo, and then write a CSV file to the --output-path
with the results.
gh migration-audit audit-all
will audit all GitHub repositories owned by a specific organization or user, and output a CSV file with information about data that cannot be migrated automatically.
You can run the command like this:
gh migration-audit audit-all \
# A GitHub access token with the permissions described above. This can also be configured using the `GITHUB_TOKEN` environment variable.
--access-token GITHUB_TOKEN \
# The login of the user or organization that owns the repositories.
--owner monalisa \
# The type of the owner of the repositories - either `user` or `organization`.
--owner-type user \
# OPTIONAL: The path to write the audit result CSV to. Defaults to the specified owner followed by the current date and time, e.g. `monalisa_1698925405325.csv`.
--output-path octocat.csv \
# OPTIONAL: The base URL of the GitHub API, if you're running an audit against a GitHub product other than GitHub.com. For GitHub Enterprise Server, this will be something like `https://github.acme.inc/api/v3`. For GitHub Enterprise Cloud with data residency, this will be `https://api.acme.ghe.com`, replacing `acme` with your own tenant.
--base-url https://github.acme.inc/api/v3 \
# OPTIONAL: The URL of an HTTP(S) proxy to use for requests to the GitHub API (e.g. `http://localhost:3128`). This can also be set using the PROXY_URL environment variable.
--proxy-url https://10.0.0.1:3128 \
# OPTIONAL: Whether to emit detailed, verbose logs
--verbose \
# OPTIONAL: Disable anonymous telemetry that gives the maintainers of this tool basic information about real-world usage
--disable-telemetry \
# OPTIONAL: Skip archived repositories when auditing all repositories owned by the specified user or organization
--skip-archived \
# OPTIONAL: Skip automatic check for updates to this tool
--skip-update-check
The tool will audit all of the repos, and then write a CSV file to the --output-path
with the results.
gh migration-audit audit-repos
will audit a specific list of repos specified in a CSV file, and output a CSV file with information about data that cannot be migrated automatically.
To use this command, you'll need to start by preparing your input CSV. It should have exactly two columns, owner
and name
. On each row, owner
should have the name of the user or organization that owns the repo, and name
should have the name of the repo, e.g.:
owner,name
monalisa,octocat
timrogers,gh-migration-audit
You can run the command like this:
gh migration-audit audit-repos \
# A GitHub access token with the permissions described above. This can also be configured using the `GITHUB_TOKEN` environment variable.
--access-token GITHUB_TOKEN \
# The path to a input CSV file with a list of repos to audit. The file should have a header row with the columns `owner` and `name`, followed by a series of rows.
--input-path input.csv \
# OPTIONAL: The path to write the audit result CSV to. Defaults to the "repos" followed by the current date and time, e.g. `repos_1698925405325.csv`.
--output-path octocat.csv \
# OPTIONAL: The base URL of the GitHub API, if you're running an audit against a GitHub product other than GitHub.com. For GitHub Enterprise Server, this will be something like `https://github.acme.inc/api/v3`. For GitHub Enterprise Cloud with data residency, this will be `https://api.acme.ghe.com`, replacing `acme` with your own tenant.
--base-url https://github.acme.inc/api/v3 \
# OPTIONAL: The URL of an HTTP(S) proxy to use for requests to the GitHub API (e.g. `http://localhost:3128`). This can also be set using the PROXY_URL environment variable.
--proxy-url https://10.0.0.1:3128 \
# OPTIONAL: Whether to emit detailed, verbose logs
--verbose \
# OPTIONAL: Disable anonymous telemetry that gives the maintainers of this tool basic information about real-world usage
--disable-telemetry \
# OPTIONAL: Skip automatic check for updates to this tool
--skip-update-check
The tool will audit all of the repos, and then write a CSV file to the --output-path
with the results.
For the commands audit-all, audit-repos and audit-repo, this tool supports two types of authentication: token
which expects a Personal Access Token or installation
which uses a GitHub App installation for authentication. The tool will automatically determine the authentication type based on the provided parameters.
- If a
--access-token
orGITHUB_TOKEN
is provided it is assumed that the token approach should be taken. - If a
--app-installation-id
orGITHUB_APP_INSTALLATION_ID
is provided it is assumed that the installation approach should be taken. This approach also requires that an--app-id
and also--private-key
be provided in addition to the--app-installation-id
.
By default, the extension expects an access token with appropriate scopes to be provided with the --access-token
argument or GITHUB_ACCESS_TOKEN
environment variable.
gh migration-audit audit-repo \
# A GitHub access token with the permissions described above. This can also be configured using the `GITHUB_TOKEN` environment variable.
--access-token GITHUB_TOKEN \
# The login of the user or organization that owns the repository
--owner monalisa \
# The name of the repository
--repo octocat
Instead of token authentication, installation authentication can be used by providing values for --app-id
, --app-installation-id
, and either --private-key
or --private-key-file
. Use of environment variables is also supported by providing GITHUB_APP_ID
, GITHUB_APP_INSTALLATION_ID
, and either GITHUB_APP_PRIVATE_KEY
or GITHUB_APP_PRIVATE_KEY_FILE
.
Installation Authentication uses a GitHub App
to access resources on behalf of a user in an organization and can provide longer-lived integration. A personal access token is great for short-lived scripts but if you have a large organization you are trying to audit you should consider using a GitHub App instead. Additionally, a personal access token is associated with a user and can cause the process to fail if the user no longer has access to the resources you are auditing. A GitHub App is not dependent on a user.
gh migration-audit audit-repo \
# The GitHub app ID
--app-id GITHUB_APP_ID \
# The private key of the GitHub App. Alternatively, use --private-key-file if you have a .pem file.
--private-key GITHUB_APP_PRIVATE_KEY \
# OR
# The private key of the GitHub App. For example, path to a *.pem file you downloaded from the about page of the GitHub App.
--private-key-file /path/to/private-key-file.pem \
# The GitHub app installation ID
--app-installation-id GITHUB_APP_INSTALLATION_ID \
# The login of the user or organization that owns the repository
--owner monalisa \
# The name of the repository
--repo octocat
This tool works with GitHub Enterprise Server, and supports all GitHub Enterprise Server versions currently supported by GitHub. GitHub Enterprise Server versions are supported for approximately a year from release.
At the time of writing, the earliest supported version of GitHub Enterprise Server is 3.7. When running against GitHub Enterprise Server, the tool will check that the version is supported.
You may be able to run audits against an older GitHub Enterprise Server version with an older version of gh-migration-audit
:
- To run audits against GitHub Enterprise Server 3.6, use v0.4.1 or earlier
If you identify a piece of data that isn't automatically migrated which isn't detected by this tool, you can write an auditor to generate a migration warning. Here's what you need to do:
- Figure out how to detect the data that can't be migrated using GitHub's REST or GraphQL API
- If the data is returned on the
Repository
object in the GraphQL API, and that GraphQL query is supported across all supported GitHub Enterprise Server versions (>= 3.6), update the GraphQL query insrc/repositories.ts
to fetch that data, and then update theGraphqlRepository
type insrc/types.ts
in line with the change - Create a TypeScript file in
src/auditors
for your auditor, with a sensible name, copying an existing auditor.repository-packages.ts
is a good one to use as a template for a simple auditor using GraphQL data. For a more complex example with GitHub Enterprise Server version checks, logging, REST API requests and error handling, seerepository-dependabot-alerts.ts
. - Update your new auditor to check for unmigratable data and return a warning if there is data which can't be migrated. You may be able to use GraphQL data from step 2, or you can make a fresh API request from your auditor using the
octokit
,owner
andrepo
passed to the function. - Write a unit test for your auditor in
test/auditors
, using an existing one as a template. - Update
DEFAULT_AUDITORS
insrc/repository-auditor.ts
, importing and adding the auditor you created - Create a pull request with your changes, including evidence of your functional testing to make sure the auditor works on. If possible, please also test with the oldest supported GitHub Enterprise Server version (v3.7).
This extension includes basic, anonymous telemetry to give the maintainers information about real-world usage. This data is limited to:
- The number of executions of each command
- The versions of GitHub Enterprise Server being used
- The versions of the extension currently in use
You can disable all telemetry by specifying the --disable-telemetry
argument.