Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flag for manual review design doc #6138

Merged
merged 28 commits into from Jul 24, 2020
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
02e3f8c
Create flag-for-manual-review.md
bkjohnson Feb 20, 2020
cdd178c
Add some more overview info
bkjohnson Feb 20, 2020
375099d
Add some specifics
bkjohnson Feb 20, 2020
0c1f5dc
Add some more details
bkjohnson Feb 25, 2020
56b9b14
Remove some remplate info
bkjohnson Feb 25, 2020
8656751
Merge branch 'master' into bkjohnson/flag-for-manual-review-doc
bkjohnson Mar 2, 2020
c7164ad
Add more info on what triggers a manual review
bkjohnson Mar 2, 2020
a1b7693
Add a caveat
bkjohnson Mar 2, 2020
afe6eef
Move into design-docs folder
bkjohnson Mar 2, 2020
c8a2b88
Fix typo
bkjohnson Mar 3, 2020
51eccd3
Remove harsh-sounding non-goal
bkjohnson Mar 3, 2020
50e096f
Be more succinct about review turnaround
bkjohnson Mar 3, 2020
72a8d42
Apply suggestions from code review
bkjohnson Mar 4, 2020
46a043b
Update security concerns section
bkjohnson Mar 4, 2020
52934c3
Add detail on how to handle the token securely
bkjohnson Mar 4, 2020
0f4922b
Add more info on environment variables
bkjohnson Mar 4, 2020
4091768
Use file extension in link
bkjohnson Mar 4, 2020
d609d45
Move design doc out of design-docs folder
bkjohnson Mar 4, 2020
8618817
Fix typo in environment variables link
bkjohnson Mar 4, 2020
c021a72
Add detail on discovering when process is ignored
bkjohnson Mar 4, 2020
5015873
Add repo and file description to code location
bkjohnson Mar 4, 2020
0e05484
Be more specific about testing plan
bkjohnson Mar 4, 2020
8225b5c
Add a bit of background on codeowners changes
bkjohnson Mar 4, 2020
de85478
Add detail on how the Github API is used
bkjohnson Mar 4, 2020
510190b
Fix ref case
bkjohnson Mar 4, 2020
03324b8
Apply suggestions from code review
bkjohnson Mar 5, 2020
c41f349
Remove manual-review-triggers doc
bkjohnson Jul 24, 2020
11ae901
Mark as approved
bkjohnson Jul 24, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion platform/engineering/code_review_guidelines.md
Expand Up @@ -15,7 +15,7 @@ Information on how VSP uses code owners can be found [here](code-owners.md).
## The pull request review lifecycle in brief

1. For initial review by your project team, create a [Draft Pull Request](https://github.blog/2019-02-14-introducing-draft-pull-requests/)
2. If your PR triggers any [additional automated checks](./manual-review-triggers), a bot will leave a comment and request a manual review from the **frontend-review-group**
2. If your PR triggers any [additional automated checks](./manual-review-triggers.md), a bot will leave a comment and request a manual review from the **frontend-review-group**
- the frontend-review-group completes reviews within one business day
3. When all review comments have been resolved, the PR can be merged into the master branch for deployment.

Expand Down
Expand Up @@ -30,6 +30,8 @@ From the perspective of a VFS engineer, the current review process involves:
The final approval from the review group is often not helpful, and it slows the process down.
This frustration often leads to VFS teams ignoring this process and they will directly ask for a VSP review before their PR is ready.

These problems have led to an effort to shift some review responsibility away from the frontend-review-group by implementing [code owners changes](./codeowners.md).

### High Level Design

The plan is to use a CI process to run a script each time a change gets pushed to a PR.
Expand All @@ -39,24 +41,40 @@ This script will look for anything that should trigger a manual review by the VS

### Detailed Design

We will use a script that can be run from a job in CircleCI.
We will use a nodeJS script that can be run from a job in CircleCI.
bkjohnson marked this conversation as resolved.
Show resolved Hide resolved
Any Github API calls will be made through the [Octokit](https://octokit.github.io/rest.js) package which uses a RESTful API.

The script will:

- Diff the current PR branch against master
- Make a pass which will mark any additions with the filename and position in the diff
- `octokit.pulls.get(...)` will be used with a [custom media type](https://developer.github.com/v3/pulls/#custom-media-types) format of `diff`
- Loop through the diff output and mark any additions with the filename and position in the diff
- Search the processed diff for occurrences of anything that should warrant a manual review
- This can be done with a regular expression
- Remove any items from the list of offense additions that have already been commented on by the review bot
- This is a two step process:
- Use `octokit.paginate()` to get a list of all comments on a PR and filter out comments that are outdated or weren't made by the bot
- Filter out any additions from the processed & filtered diff if there is already a current bot comment on it
- If there are any offenses which haven't been commented on, leave a comment and request a review from the *frontend-review-group*
- Use `octokit.pulls.createReviewRequest()` to request a review from the frontent-review-group
- Use `octokit.pulls.createReview()` to leave a bot review with a comment on any addition that matched the pattern which hasn't been commented on

This will rely on the Github API as well as some environment variables provided by Circle CI.
Various environment variables will be used in order to allow the script to be easily configurable from CircleCI - see [Debugging](#debugging) for more information.

### Code Location

This code will live in the [vets-website repo](https://github.com/department-of-veterans-affairs/vets-website) in the following directories:

- `script/pr-check.js`
- The script responsible for performing the automated checks & leaving comments
- `.circleci/config.yml`
- Responsible for adding jobs and assigning appropriate environment variables

### Testing Plan
I will run the script locally in addition to having Circle run it.

There are no automated tests planned for the initial release - all testing will be manual.
Once this process is released and teams provide feedback we will iterate and add unit tests.
The script will be written in a way that should make it easier to test.

### Logging
The script will log to the console:
Expand All @@ -67,7 +85,7 @@ The script will log to the console:

### Debugging

If a user sets up the proper environment variables locally, the script can be run in a local environment and the logging output can be viewed. Otherwise we will rely on viewing CircleCI logs.
If a user sets up the [proper environment variables](./manual-review-triggers.md#required-environment-variables) locally, the script can be run in a local environment and the logging output can be viewed. Otherwise we will rely on viewing CircleCI logs.

### Caveats

Expand All @@ -76,7 +94,11 @@ No other context is included.

### Security Concerns

We should make sure that the Github bot auth token is kept secure and it should not be associated with an individual's account. The sole communicators in the system are CircleCI and Github. This isn't really a security concern, but more of a reliability one: if one of those services goes down we do not have a fallback plan.
We should make sure that the Github bot auth token is kept secure and it should not be associated with an individual's account.
The current strategy for storing this token is as an [environment variable in CircleCI](https://ui.circleci.com/settings/project/github/department-of-veterans-affairs/vets-website/environment-variables) called `GITHUB_TOKEN`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the CircleCI docs:

Note: secret masking will only prevent the value of the environment variable from appearing in your build output. The value of the environment variable is still accessible to users debugging builds with SSH.

who has SSH access to these builds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that anyone who has an SSH public key on Github will be able to connect. I re-ran a job with SSH and this is what I see:

image

So anytime we deliberately kick off a job with SSH access, we should be able to see whoever connects to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyone with an SSH public key? So, that sounds to me like the GitHub bot auth token can be accessed by anyone?

Copy link
Contributor Author

@bkjohnson bkjohnson Mar 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a more specific answer in the docs:

In order to SSH into a CircleCI build, the username must be one which has access to the project being built!

When I try to go to the project website in a private window it blocks me and forces me to authenticate with Github or Bitbucket.

I created a fake github account to test this and was able to login and view the project in Circle, but I can't even look at the project settings. I get errors when I try to follow the project, and I can't see the list of branches either. The option to trigger an SSH build is disabled as well:

image

After seeing all of this, I think that project access in circle requires an account to be a part of the Github organization that the repo is associated with. I've looked around at various settings to try to get a more definitive answer, but I'm not an organization admin in Circle so there are certain things I can't see. I wonder if this page would be helpful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://discuss.circleci.com/t/disable-ssh/31057 has a couple key things:

Any user with write capabilities will also have SSH access. [...] any user with write permissions could theoretically also expose secrets via commands issues through the config.yml file.

Related question to investigate: how often do we hand out write permission on vets-website/vets-api, or do we generally keep people's access to read?

A good first step is to ensure your deploy keys only have read access to your repos.

This made me wonder, what kinds of access does our bot token have? Can we keep that as minimal as possible? (read-only + posting comments?).

Other possible idea: it sounds like contexts might be useful here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know who gives write access to people?

Our bot token has the public_repo scope. I believe that this is the least permissions that we can give it in order for it to do its job, but it does include write access for our public repos.

Limits access to public repositories. That includes read/write access to code, commit statuses, repository projects, collaborators, and deployment statuses for public repositories and organizations.

Here is a relevant Slack thread.

Contexts are something that I haven't explored deeply. I'm interested to see how we could use them, but I don't have permissions to manage security groups so I'm a bit limited.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know who gives write access to people?

VA has an entire GitHub admin group. you can also talk to @ricetj about longer-term strategy for ownership on these repos, or (more immediately helpful) see if you can track down an admin for vets-website and vets-api to see how we've handed out write access so far. it's possible that most people just have read access and that this isn't a concern, but write access is typically needed for managing issues, which might mean we've handed it out more broadly.

Our bot token has the public_repo scope. I believe that this is the least permissions that we can give it in order for it to do its job, but it does include write access for our public repos.

hmm, this set of information is making me more and more concerned 😬

what does the bot need from public_repo that it can't get from (no scope)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW- Write access is needed by a GitHub team for codeowners to trigger so every VFS member has write access to the repo.

All of the FE group on VSP has admin access.

Generally everyone has write access.

If it has to be reset for any reason it should be either updated directly in CircleCI or transmitted securely using keybase or 1Password to someone who has permission to update the environment variable.

The sole communicators in the system are CircleCI and Github. This isn't really a security concern, but more of a reliability one: if one of those services goes down we do not have a fallback plan.

### Privacy Concerns

Expand All @@ -86,7 +108,17 @@ None. This will be static code analysis of a public repo. No user data will be

With this planned implementation we don't have a way of forcing a VFS team to wait for a manual review - all we do is _ask_ them to wait. It's possible that some teams won't respect this request and will merge a PR that has code that requries a manual review.

It is difficult to estimate the liklihood of this risk. These changes will be communicated to the teams, and if they disregard the process then we will be forced to implement stricter rules. Here are some possible actions we could take, either by themselves or in combination:
It is difficult to estimate the liklihood of this risk. In order to discover when VFS teams are ignoring the manual review requests, we can perform [a Github Search](https://github.com/department-of-veterans-affairs/vets-website/pulls?q=is%3Apr+commenter%3Ava-vfs-bot+is%3Amerged+) with some special parameters:

> `is:pr commenter:va-vfs-bot is:merged`

This will show us any merged PRs that the `va-vfs-bot` has commented on, and from there we can look at each PR to see if a review was left by the frontend-review-group before it was merged.
To help with the search we can also add a date parameter if we only want to look at PRs merged after a certain date:

> `merged:>=YYYY-MM-DD`

If we discover that teams are disregarding the manual review requests then we will implement stricter rules.
Here are some possible actions we could take, either by themselves or in combination:

- Have the bot leave a review that "requests changes" rather than a simple comment.
- If the script leaves a comment for manual review, have the process exit with a failing status and make it block the build.
Expand Down
40 changes: 33 additions & 7 deletions platform/engineering/manual-review-triggers.md
Expand Up @@ -14,17 +14,43 @@ Any of the following items if found in a PR diff will prompt a manual review

### Sentry calls

We want to ensure that anything logged to Sentry will be useful (don't log an entire request response if all we care
about is an error code) and will not contain PII.
We review logs to Sentry to ensure that any new logs will be useful and will not contain PII.

Examples:
- Don't log an entire request response if all that's important is an error code
- Don't log user-identifying information such as names, emails, or user IDs

### Disabling ESLint

We have various ESLint rules in place to help improve code quality.
There are situations where it makes sense to disable certain rules, but those will have
to be evaluated on a case-by-case basis.
Disabling eslint rules will be evaluated on a case-by-case basis.
bkjohnson marked this conversation as resolved.
Show resolved Hide resolved

### Adding icons

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool


We use fontawesome as a dependency which encourages icons to be added with the `<i>` tag.
Sometimes an icon is used purely as decoration, but other times it is used to convey meaning to the user.
We want to ensure that if an icon is being used semantically, those semantics are also conveyed to a screen reader.
We use Font Awesome as a dependency, which uses the `<i>` tag for adding icons. Sometimes an icon is used purely as decoration, but other times it is used to convey meaning to the user.

We review to ensure that whenever an icon is being used semantically, those semantics are also conveyed to a screen reader.


## Required Environment variables

The script relies on some environment variables.

### Provided by Circle by default
- `CIRCLE_PROJECT_REPONAME`
- `CIRCLE_PROJECT_USERNAME`
- `CIRCLE_PULL_REQUEST` (link to the PR -used to get PR number)

### Configured in the Circle UI
- `BOT_NAME`
- This isn't sensitive, but it made sense to pair it with the auth token
- `GITHUB_TOKEN`
- An Oauth token used to make calls to the Github API

### YAML config
- `CODE_PATTERN`
- Regex pattern which will trigger a review comment if found
- `LINE_COMMENT`
- Review comment for an individual line comment
- `OVERALL_REVIEW_COMMENT`
- Review comment for the whole review