Microsoft's Open Source Management Portal
This Node.js application is a part of the suite of services provided by the Open Source Programs Office at Microsoft to enable large-scale GitHub management and open source business review experiences.
Key features center around opinionated takes on at-scale management, with an emphasis on relentless automation and delegation:
- Linking: the concept of associating a GitHub identity with an authenticated identity in another provider, for example an Azure Active Directory user
- Self-service GitHub organization join: one-click GitHub organization joining for authorized users
- Cross-organization functionality: consolidated + transparent views across a set of managed GitHub organizations including people, repos, teams
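As a rough illustration of the linking concept in the list above, a link can be pictured as a small record that pairs a GitHub account with a corporate identity. The `Link` shape, its field names, and the `findLinkByGitHubId` helper are hypothetical and are not the portal's actual entity model.

```typescript
// Hypothetical shape of a link; the field names are assumptions for illustration.
interface Link {
  githubId: string; // GitHub user ID, stored as a string
  githubLogin: string; // GitHub username at the time of linking
  corporateId: string; // e.g. an Azure Active Directory object ID
  corporateUsername: string; // e.g. the corporate sign-in name
}

// Look up a link by GitHub user ID in an in-memory list.
function findLinkByGitHubId(links: Link[], githubId: string): Link | undefined {
  return links.find((link) => link.githubId === githubId);
}

const exampleLinks: Link[] = [
  {
    githubId: "1001",
    githubLogin: "octocat",
    corporateId: "aad-object-1",
    corporateUsername: "octo@example.com",
  },
];

const linked = findLinkByGitHubId(exampleLinks, "1001");
const unlinked = findLinkByGitHubId(exampleLinks, "9999");
console.log(linked?.corporateUsername, unlinked);
```

A real deployment would keep these records in the configured link store rather than in memory.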
An introduction to this project is available in this 2015 post by Jeff Wilcox: https://jeffwilcox.blog/2015/11/azure-on-github/
Improvements made a few years back include:
- The portal works best as a GitHub App instead of the older GitHub OAuth app model
- The app can be installed as multiple parallel apps (user-facing, operations, background jobs, and data) to ensure that key user experiences continue to function even if a background job or other task exhausts available REST API resources
- When combined with a near-realtime webhook feed, the app tracks updates for views in a database instead of through REST API caches.
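The benefit of installing parallel apps can be sketched as separate API budgets per purpose: exhausting one budget leaves the others untouched. The `AppPurpose` type, the `AppBudgets` class, and the numbers below are hypothetical and only model the idea; they are not how the portal tracks rate limits.

```typescript
// The four purposes named above; the type and class names are hypothetical.
type AppPurpose = "user-facing" | "operations" | "background-jobs" | "data";

interface AppBudget {
  remaining: number; // REST API calls left in the current rate-limit window
}

class AppBudgets {
  private budgets = new Map<AppPurpose, AppBudget>();

  constructor(initial: Record<AppPurpose, number>) {
    for (const purpose of Object.keys(initial) as AppPurpose[]) {
      this.budgets.set(purpose, { remaining: initial[purpose] });
    }
  }

  // Spend one call from the budget of the given purpose only.
  // Returns false when that purpose's budget is exhausted.
  trySpend(purpose: AppPurpose): boolean {
    const budget = this.budgets.get(purpose);
    if (!budget || budget.remaining <= 0) {
      return false;
    }
    budget.remaining -= 1;
    return true;
  }
}

const budgets = new AppBudgets({
  "user-facing": 2,
  operations: 2,
  "background-jobs": 1,
  data: 2,
});

budgets.trySpend("background-jobs"); // uses the only background call
const backgroundBlocked = !budgets.trySpend("background-jobs"); // now exhausted
const userStillWorks = budgets.trySpend("user-facing"); // separate budget
console.log({ backgroundBlocked, userStillWorks });
```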
While the default application experience is a server-rendered old-school site, at Microsoft a modern React front-end sits on top of this backend that just serves REST API requests.
We'd love to eventually open source the React front-end; while there are some cool React server-side-and-frontend frameworks like Next.js, we have chosen not to take such a dependency. It feels overly complicated to have the React client in this open repository right now, and would likely be a sidecar project (separate repo) when we do get that ready.
- Node.js LTS (v16+)
Native promises and async/await are being introduced into the codebase. Older callback-based code using libraries such as async or the promise library q has been removed.
- At least one of your own GitHub organizations
- Bring your own cache system (built-in providers for Redis, Cosmos DB, and Azure storage)
- Azure Active Directory, or hack your own Passport provider in (and help us extend the concept to be more generic and useful for Google auth, Okta, etc.)
- Data storage for links, etc.: either Azure Storage or Postgres
Firehose + query cache webhook processing
While the original portal works fine for very small GitHub presences, it was designed around the idea that the cache would fill, while respecting the GitHub REST API by using Conditional Requests, and being very eventually consistent.
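The conditional request pattern mentioned above can be sketched with two pure functions: one that adds an `If-None-Match` header when an ETag is cached, and one that reuses the cached body on a `304 Not Modified` response. The `CachedEntry` and `HttpResponse` shapes and both function names are assumptions made for this sketch, not the portal's cache implementation.

```typescript
// Hypothetical cache entry and response shapes for illustration.
interface CachedEntry {
  etag: string;
  body: string;
}

interface HttpResponse {
  status: number; // 200 or 304 in this sketch
  etag?: string; // value of the ETag response header, if any
  body?: string; // absent on 304 Not Modified
}

// Build request headers: send If-None-Match only when an ETag is cached.
function conditionalHeaders(cached?: CachedEntry): Record<string, string> {
  return cached ? { "If-None-Match": cached.etag } : {};
}

// Merge a response into the cache: keep the cached entry on 304,
// replace it on 200, and reject anything else.
function applyResponse(
  cached: CachedEntry | undefined,
  res: HttpResponse
): CachedEntry {
  if (res.status === 304) {
    if (!cached) {
      throw new Error("304 received without a cached entry");
    }
    return cached;
  }
  if (res.status === 200 && res.etag !== undefined && res.body !== undefined) {
    return { etag: res.etag, body: res.body };
  }
  throw new Error(`Unexpected response status ${res.status}`);
}

// First request fills the cache; the second is answered with 304.
const firstEntry = applyResponse(undefined, {
  status: 200,
  etag: '"abc"',
  body: "[1,2]",
});
const secondHeaders = conditionalHeaders(firstEntry);
const secondEntry = applyResponse(firstEntry, { status: 304 });
console.log(secondHeaders, secondEntry.body);
```

Keeping the logic free of network calls makes it easy to test; a real client would pass these headers to its HTTP library of choice.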
However, the REST API v3 (non-GraphQL client) maximum size for a page of results is 100 entries, which ... is very painful if you have tens of thousands of anything.
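To put the page-size limit in numbers, the following sketch computes how many REST requests a full listing needs. The `pagesNeeded` helper is hypothetical; only the 100-entry maximum comes from the text above.

```typescript
const MAX_PAGE_SIZE = 100; // REST API v3 maximum entries per page

// Number of paged requests needed to list `totalEntries` items.
function pagesNeeded(
  totalEntries: number,
  pageSize: number = MAX_PAGE_SIZE
): number {
  if (pageSize < 1 || pageSize > MAX_PAGE_SIZE) {
    throw new RangeError(`pageSize must be between 1 and ${MAX_PAGE_SIZE}`);
  }
  return Math.ceil(totalEntries / pageSize);
}

console.log(pagesNeeded(25000)); // 250 requests for 25,000 entries
```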
The "firehose" is designed to be run either within the app itself, or as a secondary app processing results. At Microsoft, we use a service bus to process webhook events from GitHub, since we have a robust webhook ingestion mechanism elsewhere. The firehose runs as a daemon that pulls off the queue and works to keep the "query cache" primed with newer information than the REST API may have in some cases.
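The queue-draining behavior described above can be sketched as a loop that applies each event to an in-memory cache. The `MembershipEvent` shape is a simplified, hypothetical stand-in for real webhook payloads, and the array and `Map` stand in for the service bus and the query cache database.

```typescript
// Simplified, hypothetical event shape; real GitHub webhook payloads are richer.
interface MembershipEvent {
  action: "added" | "removed";
  teamId: string;
  userId: string;
}

// Hypothetical query cache: team ID -> set of member user IDs.
type QueryCache = Map<string, Set<string>>;

// Apply a single event to the cache.
function applyEvent(cache: QueryCache, event: MembershipEvent): void {
  const members = cache.get(event.teamId) ?? new Set<string>();
  if (event.action === "added") {
    members.add(event.userId);
  } else {
    members.delete(event.userId);
  }
  cache.set(event.teamId, members);
}

// Drain everything currently on the queue, oldest first.
// Returns the number of events processed.
function drainQueue(queue: MembershipEvent[], cache: QueryCache): number {
  let processed = 0;
  let event = queue.shift();
  while (event !== undefined) {
    applyEvent(cache, event);
    processed += 1;
    event = queue.shift();
  }
  return processed;
}

const queryCache: QueryCache = new Map();
const eventQueue: MembershipEvent[] = [
  { action: "added", teamId: "team-1", userId: "alice" },
  { action: "added", teamId: "team-1", userId: "bob" },
  { action: "removed", teamId: "team-1", userId: "alice" },
];

const processedCount = drainQueue(eventQueue, queryCache);
const teamMembers = Array.from(queryCache.get("team-1") ?? []);
console.log(processedCount, teamMembers);
```

A long-running daemon would repeat the drain step whenever new messages arrive instead of running it once.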
What this improves:
- The user views of the orgs, repos, teams they are added to and have access to
- Cross-organization views and querying
The firehose and query cache are not used for important or auth-style scenarios:
- Query cache is not used to make permission decisions
- Query cache is not used to authorize access to administrative functions
We did at one point design the idea of having a /webhook endpoint and validating the webhook signatures before processing hooks for simple app hosting, but it's slightly broken right now.
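For readers unfamiliar with webhook signature validation, here is a minimal sketch of the general technique, assuming GitHub's documented `X-Hub-Signature-256` header format (`sha256=` followed by a hex HMAC-SHA256 of the raw request body). The function names are hypothetical, and this is not the portal's /webhook code.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute a "sha256=<hex>" signature over the raw body with a shared secret.
function signPayload(rawBody: string, secret: string): string {
  const digest = createHmac("sha256", secret)
    .update(rawBody, "utf8")
    .digest("hex");
  return "sha256=" + digest;
}

// Verify a received signature header against the raw request body.
function isValidSignature(
  rawBody: string,
  secret: string,
  signatureHeader: string
): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), "utf8");
  const received = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return (
    expected.length === received.length && timingSafeEqual(expected, received)
  );
}

const exampleBody = '{"action":"added"}';
const exampleSecret = "example-secret"; // placeholder, not a real secret
const exampleHeader = signPayload(exampleBody, exampleSecret);

console.log(isValidSignature(exampleBody, exampleSecret, exampleHeader));
```

The check must run against the exact bytes received, before any JSON parsing, since re-serializing the payload can change the digest.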
Dev prep, build, deploy
Install Node packages
Make sure to include dev dependencies.
The default assets package is a super ancient Bootstrap and jQuery app that in theory provides basic skin for the site, favicons, graphics, etc. However, it's ... really, really, really old. Microsoft discards the default-assets-package, using a different set of assets, so you've been mildly warned.
The main module of the defined default-assets-package should resolve to the path to serve content from. Since the default version uses Grunt to build the assets, it returns the /public directory, which is the output (built) location for Grunt.
    npm install
    cd default-assets-package
    npm install
npm run build
You need to rebuild the default-assets-package if you change something. See Static Site Assets.
Building the Docker image
$ docker build .
The easiest way to run the portal is with the docker-compose setup, which also boots up the Postgres and Redis components. The docker-compose setup depends on two environment files (.docker.env and .secrets.env) and one JSON file (env-orgs.json).
Make sure to copy the .secrets.env.example and env-orgs.json.example files and provide the configuration values.
    cp .secrets.env.example .secrets.env
    cp env-orgs.json.example env-orgs.json
    # provide configuration values for .secrets.env and env-orgs.json
    docker-compose up
If you want to run everything on your local machine (including Redis and Postgres), you can use the following approach.
    # ensure redis and postgres are running on localhost
    source .secrets.env
    source .local.env
    npm run start
If the docker image doesn't start you can debug the image using an interactive shell session. This allows you to browse the folders, update the files to test things and run the portal.
    $ docker run --rm -it --env-file .secrets.env --env-file .docker.env --entrypoint /bin/sh opensource-portal
    /usr/src/repos $ ls
    app.js      data      lib                      package.json         tsconfig.tsbuildinfo  webhooks
    app.js.map  entities  localEnvironment.js      routes               user
    bin         features  localEnvironment.js.map  test                 utils.js
    business    github    middleware               transitional.js      utils.js.map
    config      jobs      node_modules             transitional.js.map  views
    /usr/src/repos $ npm run start-in-container
This project has very few tests, and we aspire to make better use of Jest.
Work to be done
- Support more interesting cloud and data providers
- Support other authentication technologies
- More tests
- Ship the front-end UI to the world as open source
- Continuing to refactor out Microsoft-specific things when possible
Please see the docs/ sub-folder, including docs/index.md.
Please see docs/configuration.md
Please see docs/jobs.md
Please see the docs/api.md file for information about the current API.
Bare minimum local development environment
If you place an environment file named .env in the directory above your cloned repo (to prevent committing secrets to your repo by accident or in your editor), you can configure the following bare-minimum working set to use the app.
The central operations token is a personal access token belonging to an org owner of the GitHub org(s) being managed.
    DEBUG_ALLOW_HTTP=1
    GITHUB_CENTRAL_OPERATIONS_TOKEN=a github token for the app
    GITHUB_ORGANIZATIONS_FILE=../../env-orgs.json
    GITHUB_CLIENT_ID=your client id
    GITHUB_CLIENT_SECRET=your client secret
    GITHUB_CALLBACK_URL=http://localhost:3000/auth/github/callback
    AAD_CLIENT_ID=your corporate app id
    AAD_REDIRECT_URL=http://localhost:3000/auth/azure/callback
    AAD_CLIENT_SECRET=a secret for the corporate app
    AAD_TENANT_ID=your tenant id
    AAD_ISSUER=https://sts.windows.net/your tenant id/
In this mode memory providers are used, including a mocked Redis client. Note that this does mean that a large GitHub organization configured with memory providers could become a token use nightmare, as each new execution of the app without a Redis Cache behind the scenes is going to have 100% cache misses for GitHub metadata. Consider configuring a development or local Redis server to keep cached data around.
The built-in Redis mock will likely be removed when we move to the next major semver of the Node Redis library.
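Before starting the app with the minimum configuration listed above, it can help to check that nothing is missing. The sketch below is a hypothetical pre-flight check, not part of the portal; it assumes every variable in the list except the `DEBUG_ALLOW_HTTP` flag is required.

```typescript
// Variable names are taken from the minimum set listed above.
// Treating all of them as required is an assumption of this sketch.
const REQUIRED_VARIABLES: string[] = [
  "GITHUB_CENTRAL_OPERATIONS_TOKEN",
  "GITHUB_ORGANIZATIONS_FILE",
  "GITHUB_CLIENT_ID",
  "GITHUB_CLIENT_SECRET",
  "GITHUB_CALLBACK_URL",
  "AAD_CLIENT_ID",
  "AAD_REDIRECT_URL",
  "AAD_CLIENT_SECRET",
  "AAD_TENANT_ID",
  "AAD_ISSUER",
];

// Return the names of required variables that are missing or blank.
function missingVariables(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARIABLES.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with only one variable set: the other nine are reported.
const partialEnv = { GITHUB_CLIENT_ID: "your client id" };
console.log(missingVariables(partialEnv));
```

In a Node.js process the same function can be called with `process.env` after the .env file has been loaded.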
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.