Add predicates for human reviews #151

adityasaky · 2023-03-13T19:25:40Z

This is still a WIP. Crev needs a major update but I took a pass at defining a generic human review attestation type followed by system specific ones that extend the generic predicate.

Fixes #77

spec/predicates/human-review/human-review.md

spec/predicates/human-review/vcs.md

spec/predicates/human-review/crev.md

TomHennen · 2023-03-24T19:12:16Z

spec/predicates/human-review/vcs.md

+```json
+{
+    "_type": "https://in-toto.io/Statement/v1",
+    "subject": [


It would probably be good to discuss what the subject is (that it should point to some code in a VCS), and whether the review covers only the referenced change or the entire codebase. Maybe that could be another field in the predicate?

That's a great point, I considered supporting a range of commits that comprise a changeset. I'll take a pass at tweaking it.

Is it too complex to have the subject point to the tip and include starting commit for the range in the predicate?

I think git & commit ranges are hard to reason about. Perhaps:

The predicate could contain multiple subjects, one per gitCommit (if it's attesting to commit level info)

If attesting to the entirety of a repo it could use gitTree (and maybe the head commit)?

What if we leaned into ITE-4 to make the subject of a GitHub-specific attestation the pull request itself? I think it could work, but we'd then have to identify how we link the commits in the PR to the commit used as a source for the build, etc.

We've been thinking pretty hard about this on our end, but it really comes down to what you're trying to do.

My guess is that people will want to look at a specific commit/merge and ask if it was reviewed. The attestation should probably provide a link to the review (ITE-4 can probably help there, but maybe it can just be a URI?). Then I suspect that some users will want to perform that same check for each commit in the log.

If that's true then I think the subject should be the commit where the PR was merged. IMO we'd only want the attestation for the final commit because any review or approval probably applies to the sum of the changes and not the individual commits in the PR.

That means that for a branch to be considered reviewed you want each commit the branch pointed to to have a review that applies to it, but you don't necessarily care about the intermediate commits since the branch never pointed to them individually and you likely don't want people to use those commits directly anyways.

What goals do the users of this predicate have in mind? For example: "How do I know if all the code in this repo is safe?", "How can I identify which things haven't been reviewed so I can take a closer look at them?", "I'd like to make sure I trust the folks that provided all reviews on this repo, so I need to know who they are", "I'd like to make sure the code in this repo was reviewed for problem X", or "I'd like to have confidence that, at some point, the code in this repo was evaluated for problems X, Y, Z, but I don't need to ensure that each individual commit meets that bar".

I think I agree. For a v0.1.0, I suggest we leave it as applying to one or more commits, up to the user. In automated systems, I foresee it being issued at the tip of the feature branch, and I think this works from the perspective of "Was this change reviewed? Give me a CR issued for this PR matching the tip of the PR's branch".

As we always allow multiple subjects, we should have no trouble supporting more granular use cases such as a per-commit scenario.
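A minimal sketch of the multi-subject option discussed above, assuming each reviewed commit is expressed as a subject carrying a `gitCommit` digest. The helper name `review_statement` is hypothetical, and the predicate type URI and empty predicate body are placeholders rather than anything the spec has settled on.

```python
import json

STATEMENT_TYPE = "https://in-toto.io/Statement/v1"


def review_statement(commit_ids, predicate_type, predicate):
    """Build an in-toto Statement with one subject per reviewed commit."""
    if not commit_ids:
        raise ValueError("at least one commit is required")
    return {
        "_type": STATEMENT_TYPE,
        # One subject per commit: a single entry covers the "tip of the
        # feature branch" case, several entries cover per-commit reviews.
        "subject": [{"digest": {"gitCommit": c}} for c in commit_ids],
        "predicateType": predicate_type,
        "predicate": predicate,
    }


statement = review_statement(
    ["330500b54433de4f6f9575676b67738b98ba5e54"],
    "https://example.com/human-review/vcs/v0.1",  # placeholder URI (assumption)
    {},  # predicate body omitted in this sketch
)
print(json.dumps(statement, indent=2))
```

Passing a single commit ID models the "issued at the tip of the feature branch" case; passing several models the more granular per-commit scenario without changing the statement layout.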

TomHennen · 2023-03-30T12:56:20Z

Have we given any thought to how this interacts with #124 and this testifysec attestation? cc @colek42

adityasaky · 2023-05-26T15:28:43Z

Have we given any thought to how this interacts with #124 and #124 (comment)? cc @colek42

I think they're complementary. The review predicate can apply to sources recorded in links in a prior step, say from a git clone, and we can extend that with other notions about the repository using testifysec's git attestor.

marcelamelara

Thank you for getting this started, @adityasaky ! I have a general question, and apologies if this has already been addressed: what's the reasoning for needing three separate predicates for this? The human review and VCS review predicate are practically identical and frankly don't capture much information. I really like the Crev format, and wonder if we can either a) simply use the Crev format in the other use cases, or b) extract the most important properties a consumer wants out of a human review attestation, and develop a single format that will be compatible with all three use cases (Crev, generic, and VCS). If we go with option b, I suspect the predicate will end up looking a lot like Crev, possibly with a few extra fields.

protos/in_toto_attestation/predicates/human_review/crev/v0/crev.proto

spec/predicates/human-review-crev.md

spec/predicates/human-review-vcs.md

spec/predicates/human-review.md

marcelamelara · 2023-06-16T22:36:04Z

CC'ing @pdxjohnny as this in-toto predicate/topic may be of interest re: code ingestion validation.

protos/in_toto_attestation/predicates/human_review/v0/human_review.proto

protos/in_toto_attestation/predicates/human_review/vcs/v0/vcs.proto

spec/predicates/human-review-vcs.md

spec/predicates/human-review.md


spec/predicates/human-review-crev.md

TomHennen · 2023-07-17T13:35:07Z

spec/predicates/human-review-crev.md

+
+Type URI: (tentative) https://in-toto.io/attestation/human-review/crev/v0.1
+
+Version: Do we use Crev's versioning? Currently -1? What about their URI?


Do we have a resolution for this? If we don't know maybe just stick with v0.1?

I'll reach out to the crev developers again.

I've added a note for now. The crev developers are open to us reimplementing crev without necessarily being backwards compatible. IMO we should start in a place of compatibility (or close-to-compatibility) and see where we go from there. For now, I've added a note about crev's versioning and updated to using in-toto specific URIs.

spec/predicates/human-review-vcs.md


TomHennen

Just two minor changes.

I think that's it besides resolving the question of including the version in the type?

protos/in_toto_attestation/predicates/human_review/crev/v0/crev.proto

protos/in_toto_attestation/predicates/human_review/vcs/v0/vcs.proto


adityasaky · 2023-08-21T14:51:46Z

I've added a new field to the VCS one called target that records the base branch and its state. This is, IMO, valuable when judging a changeset as it matters where it's being submitted to and when.

zachariahcox · 2023-09-03T16:07:26Z

spec/predicates/human-review-vcs.md

+
+`reviewers` _array of ResourceDescriptor_, _required_
+
+Indicates uniquely the identity of one or more reviewers. MUST NOT be used over


The source of the attestation is always the pull request product, right?

Can you elaborate on source? Do you mean the subject? The subject of the attestation identifies the tip of the PR feature branch.

zachariahcox · 2023-09-03T16:12:37Z

spec/predicates/human-review-vcs.md

+                "gitCommit": "330500b54433de4f6f9575676b67738b98ba5e54",
+            },
+        },
+        "reviewLink": "https://github.com/in-toto/in-toto/pull/503#pullrequestreview-1341209941",


This spec doesn't describe the actual change set that was reviewed, and therefore places a lot of the load on this reviewLink.

For a given PR, the diffs reviewed and voted upon by each reviewer regularly change (unless rules are enforced to dismiss "stale" reviews).

Mechanically, this review link is referencing the critical "provenance" data maintained by the PR product (GitHub in this case).
Pull request data is saved as evidence justifying the existence of the new repo version.

PR data is not quite as nice as signed attestation statements like this though.
PR data can change over time, either by design (e.g., by rerunning checks on the same refs) or maliciously if an attacker gains access to the DB.

The idea is to infer the diff from the subject (which identifies the tip of the reviewed topic branch) and the base recorded in the predicate. I leaned away from including the full diff inline due to length considerations. reviewLink is included for now for completeness: it points to any information that isn't considered vital to the review process for the changeset but may still be necessary.

You make a good point about the iterative nature of reviews, and that's where a tighter alignment with GitHub mechanics would help answer questions like when the attestation is generated and signed. If an attestation is generated at merge time, for example, it captures the final state, which is probably what we care about?

TODO: target needs to be expanded to handle current tip of the base branch and the merge base for diff inference.
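The diff inference described in this reply could be sketched roughly as follows. This assumes the predicate records the merge base as a ResourceDescriptor with a `gitCommit` digest under a `mergeBase` key; the function name `diff_range` and both commit IDs in the example are placeholders.

```python
def diff_range(statement):
    """Return the revision range a verifier could pass to `git diff`."""
    # The subject identifies the tip of the reviewed topic branch.
    tip = statement["subject"][0]["digest"]["gitCommit"]
    # The predicate records the base the changeset is compared against.
    base = statement["predicate"]["mergeBase"]["digest"]["gitCommit"]
    return f"{base}..{tip}"


example = {
    "subject": [{"digest": {"gitCommit": "b" * 40}}],  # placeholder tip
    "predicate": {
        "mergeBase": {"digest": {"gitCommit": "a" * 40}},  # placeholder base
    },
}
print("git diff " + diff_range(example))
```

Because the range is derived entirely from signed fields, a verifier would not need to trust mutable pull request data to reconstruct what was reviewed.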

Signed-off-by: Aditya Sirish <aditya@saky.in>

marcelamelara · 2023-09-15T21:46:20Z

@adityasaky Can you help clarify this? As I re-review these predicates, I have a few questions and one major concern around Crev specifically.

First, I really like the intent of the project and would definitely like to support it in in-toto, and I noticed that they actually support more SW supply chain info beyond the source code review proofs that this predicate introduces. Specifically, they also seem to support package-level reviews as well as "meta-reviews" that they call trust proofs. Do you intend to add predicates for these other types of Crev claims at a later point in time? AFAIU, the package review proof format is similar, if not identical to the code review proof format. So, I wonder if it would be worth creating a separate, more encompassing Crev "dependency review" predicate that can cover both types of dependency reviews?

Second, my major concern. The project does not seem to have an authoritative spec anywhere, i.e., I found a spec for a review proof in a few different places: main repo vs cargo-crev. According to the main Crev project repo, the Rust implementation is the best-maintained, most up-to-date version of Crev, but I have no idea if the other implementations are truly consistent. I am still in support of having a predicate for Crev, but I would feel a lot more comfortable if we can be unambiguously clear on our end about which specific version of the Crev review proof formats the predicate follows.

EDIT: I realized at the end of my review that this predicate does already seem to cover the package reviews as well, though this isn't always clear in the text of the predicate spec. I'd suggest being very clear about this in the predicate spec. My other questions and concerns still stand.

marcelamelara

Thanks for these changes @adityasaky ! This review only covers the Crev predicate, and I'll come back to review the VCS predicate.

protos/README.md

spec/predicates/README.md

marcelamelara · 2023-09-15T22:03:50Z

spec/predicates/human-review-crev.md

+Identifies the reviewer. This has some meaning for crev's trust proliferation
+aspects, but the identity of the reviewer can also be mapped based on in-toto's
+functionary handling. `idType` is used to determine the contents of `reviewer`.
+The `url` is a reference to the reviewer's crev-proofs repository.


A question and a suggestion. Q: What are possible idType values, and who determines these? Rec: I'd add in here that this field is intended to correspond to the from field in the Code Review Proofs format when the idType matches the ID for Crev.

There's a bit of a disconnect here, because the crev code review type includes `from` but the package review type is missing it. The same goes for `date`. I suspect `date` may be omitted because a package-wide review is meant as an ongoing document with updates as new versions / advisories emerge. On the other hand, I'm not sure about the `from` field. I wonder if in the in-toto context we want to drop it and lean entirely on the signature. That takes us to a broader conversation about how in-toto supports crev, though.

spec/predicates/human-review-crev.md

marcelamelara · 2023-09-15T22:11:02Z

spec/predicates/human-review-crev.md

+Optional field with any other comments a reviewer may have about the
+dependency.


Which consumers, if any, of this predicate need to be able to parse/care about the contents of the comment field? This would be helpful info.

marcelamelara

I really like where this is going. @adityasaky I mostly have clarifying comments for the VCS review predicate.

marcelamelara · 2023-09-18T20:18:43Z

spec/predicates/human-review-vcs.md

+
+Version: 0.1.0
+
+## Purpose


Given the Crev dependency review predicate is available as well, I think it would be helpful to add a short comment here about the complementary relationship between that and this predicate.

marcelamelara · 2023-09-18T20:20:50Z

spec/predicates/human-review-vcs.md

+GitHub allows repository owners to create protection rules that require a
+threshold of approvals before a pull request can be merged. This predicate can
+be used to express meeting such a policy. In this case, GitHub would issue a
+signed attestation identifying all the reviewers for a pull request. The


The idea is that this sort of attestation wouldn't be generated until the threshold is met, right?

Suggested change

signed attestation identifying all the reviewers for a pull request. The

signed attestation identifying all accepting reviewers for a pull request. The

Also, this may be getting too much into the weeds.... the current GitHub rule for requiring approvals before a PR can be merged reads "When enabled, pull requests targeting a matching branch require a number of approvals and no changes requested before they can be merged" (emphasis mine).

I've personally experienced this: even though I had the threshold number of needed approvals, a third reviewer had requested changes in the past and not re-reviewed the PR. It wasn't until their review was either dismissed or updated that the PR was able to be merged. This may be a GH-specific rule, but what is the expectation for generating this attestation in this sort of case? The current text only describes how the attestation subject is chosen wrt a PR merge rule, but it may be worth clarifying the exact point or operation upon which this attestation should be generated (e.g., after the approved PR has been merged).

These are great questions and have (somewhat) come up in the SLSA Source track doc as well. In my mind, it comes down to when the attestation is created and signed, plus what it's associated with. I authored this take on the VCS review predicate thinking it'd be generated at merge time. But this obfuscates cases where stale approvals aren't dismissed (which I think is the more common configuration). @zachariahcox has made this point in the Source track doc if I'm not mistaken, and his take on a similar predicate associates each individual review of the changeset with the commit at the tip of the topic branch. Settling when the predicate is generated will help us decide what its contents must be and how they are presented, IMO.
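To make the timing question concrete, here is a small sketch of one possible rule: each review is tied to the commit at the tip of the topic branch when it was submitted, approvals against an older tip are treated as stale, and an outstanding change request blocks the merge. The `Review` type, the state strings, and the short commit labels are all hypothetical; this is not GitHub's implementation and not part of the predicate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    reviewer: str
    commit: str  # tip of the topic branch when the review was submitted
    state: str   # "approved" or "changes_requested"


def mergeable(reviews, tip, threshold):
    """Decide whether an attestation could be issued for `tip`.

    `reviews` is assumed to be in chronological order, so a later review
    by the same reviewer replaces their earlier one.
    """
    latest = {}
    for review in reviews:
        latest[review.reviewer] = review
    # An outstanding change request blocks, even if the threshold is met.
    if any(r.state == "changes_requested" for r in latest.values()):
        return False
    # Only approvals made against the current tip count as fresh.
    fresh = [r for r in latest.values()
             if r.state == "approved" and r.commit == tip]
    return len(fresh) >= threshold


reviews = [
    Review("carol", "c1", "changes_requested"),
    Review("alice", "c2", "approved"),
    Review("bob", "c2", "approved"),
]
print(mergeable(reviews, tip="c2", threshold=2))  # False: carol still blocks
```

Under this rule the attestation would only be generated once the blocking review is updated or dismissed, which matches the scenario described in the comment above.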

spec/predicates/human-review-vcs.md

marcelamelara · 2023-09-18T20:37:01Z

spec/predicates/human-review-vcs.md

+reviewers, the attestation is re-created for each new review and the time
+reflects that of the latest review.


What happens to the old attestations that are superseded by the more recent review attestations? I'm wondering if we should add the ability to link to prior attestations related to the same subject PR to maintain full visibility into the review history.

Related: #151 (comment)

I think it depends on the intended policies. I'm not sure we have policies / use cases that require full review history, but I'm not opposed to including the information. This is currently written with the admittedly simpler "k of n reviewers approve the changes" rule in mind.

For a v0.1, I think the k out of n assumption is actually fine. I would just make it explicit, if it isn't stated anywhere yet.
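The k-of-n assumption could be checked on the consumer side roughly like this. The sketch assumes each entry in `reviewers` is a ResourceDescriptor whose `name` carries the reviewer identity; that mapping, the function name `meets_threshold`, and the example identities are assumptions for illustration only.

```python
def meets_threshold(predicate, trusted_reviewers, k):
    """Return True if at least k distinct trusted reviewers are listed."""
    listed = {r.get("name") for r in predicate.get("reviewers", [])}
    return len(listed & set(trusted_reviewers)) >= k


predicate = {
    "reviewers": [
        {"name": "alice@example.com"},    # hypothetical identities
        {"name": "bob@example.com"},
        {"name": "mallory@example.com"},  # listed, but not trusted below
    ],
}
trusted = ["alice@example.com", "bob@example.com", "carol@example.com"]
print(meets_threshold(predicate, trusted, k=2))
```

Using a set means a reviewer listed twice is only counted once, which keeps the check aligned with "k distinct reviewers out of n".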


marcelamelara

Only some minor nits, but LGTM otherwise. Once the final outstanding comments are resolved, I think this is ready for a v0.1 release.

marcelamelara · 2023-10-10T15:28:51Z

spec/predicates/human-review-crev.md


 Indicates time of review creation. `timestamp` in the original crev
 specification.

 `thoroughness` _enum_, _required_

 Describes how thorough the reviewer was. Must be set to one of `low`, `medium`,
-or `high`.
+`high`, or `none`.


Semantically, what does a none value tell a consumer of the attestation? Could this be interpreted as an "endorsement" of a particular package?
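For reference, a consumer-side check of the four allowed values in the updated text might look like the sketch below. It only validates the field; it deliberately assigns no meaning to `none`, since that is the open question here. The names `Thoroughness` and `parse_thoroughness` are hypothetical.

```python
from enum import Enum


class Thoroughness(Enum):
    NONE = "none"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def parse_thoroughness(predicate):
    """Validate the required `thoroughness` field of a crev review predicate."""
    value = predicate.get("thoroughness")
    try:
        return Thoroughness(value)
    except ValueError:
        # Covers both a missing field and a value outside the enum.
        raise ValueError(f"invalid or missing thoroughness: {value!r}") from None


print(parse_thoroughness({"thoroughness": "none"}))
```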

marcelamelara · 2023-10-10T15:31:23Z

spec/predicates/human-review-crev.md

-the review, understanding of the source code, and a final rating.
+crev enables social review of popular open source software dependencies. A crev
+review includes information such as the thoroughness of the review,
+understanding of the source code, and a final rating. The ratings for these


Being very nitpicky here, though I think this conveys the expectations more clearly.

Suggested change

understanding of the source code, and a final rating. The ratings for these

understanding of a package's source code, and a final rating. The ratings for these

marcelamelara · 2023-10-10T15:34:22Z

spec/predicates/human-review-crev.md

-dependency, and can be performed by one or more of several actors in the supply
-chain. The developer importing a new dependency can perform the review or a
-dedicated security team can be tasked with it.
+process of reviewing and verifying the source code of a particular dependency,


Nit:

Suggested change

process of reviewing and verifying the source code of a particular dependency,

process of reviewing and verifying the source code of a particular open-source package,

jkjell · 2023-10-12T02:46:20Z

spec/predicates/human-review-vcs.md

+    "predicate": {
+        "reviewers": ["<ResourceDescriptor>", ...],
+        "targetTip": "<ResourceDescriptor>",
+        "mergeBase": "<ResourceDescriptor>",
+        "reviewLink": "<URI OF REVIEW>",
+        "reviewTime": "<TIMESTAMP>",
+        "annotations": {...}


Perhaps this falls under the annotations (or could better fit into the CREV predicate), but I wonder if there's some quantitative metric that would be useful to include here. For example, the number of lines of code added/removed/modified. I've seen some projects require additional reviewers for larger feature work, and fewer reviewers for minor bug fixes. It may be hard to craft a policy around a metric like that, but most other options I can think of become VCS-specific pretty quickly (things like commit messages that link to specific issues with a label or tag for [feat] or [bug]).
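A size-based policy of this kind could be sketched as follows, assuming the attestation producer records a line count in `annotations`. The `linesChanged` key, the 50-line limit, and the reviewer counts are invented for illustration; nothing in the predicate defines them.

```python
SMALL_CHANGE_LIMIT = 50  # hypothetical cut-off between "small" and "large"


def required_reviewers(lines_changed):
    """Small changes need one reviewer, larger ones need two."""
    return 1 if lines_changed <= SMALL_CHANGE_LIMIT else 2


def satisfies_size_policy(predicate):
    """Check the reviewer count against the size recorded in annotations."""
    lines = predicate.get("annotations", {}).get("linesChanged")
    # Fail closed: without a recorded size, treat the change as large.
    needed = 2 if lines is None else required_reviewers(int(lines))
    return len(predicate.get("reviewers", [])) >= needed


small_fix = {
    "reviewers": [{"name": "alice@example.com"}],  # hypothetical identity
    "annotations": {"linesChanged": 12},           # hypothetical annotation
}
print(satisfies_size_policy(small_fix))
```

Keeping the metric in `annotations` leaves the core predicate VCS-agnostic, at the cost of every verifier having to agree on the annotation key out of band.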

marcelamelara · 2023-12-12T17:36:26Z

@adityasaky Where have we landed on this? At the last attestation maintainers' meeting, we decided to close this issue if it's stalled while we wait for the SLSA Source Track to progress.

adityasaky force-pushed the human-review-attestations branch 5 times, most recently from b05e4dc to 2af4cac Compare March 16, 2023 19:55

adityasaky marked this pull request as ready for review March 16, 2023 19:55

adityasaky force-pushed the human-review-attestations branch from 2af4cac to 7b5ce18 Compare March 16, 2023 19:57

adityasaky mentioned this pull request Mar 16, 2023

Capture proposed new predicates from repo #155

Open

TomHennen requested changes Mar 24, 2023


adityasaky force-pushed the human-review-attestations branch from 7b5ce18 to eb8e2cb Compare May 25, 2023 17:15

adityasaky requested a review from a team as a code owner May 25, 2023 17:15

adityasaky requested a review from TomHennen May 25, 2023 17:29

adityasaky force-pushed the human-review-attestations branch 3 times, most recently from e51c218 to 8971878 Compare May 26, 2023 17:46

marcelamelara reviewed Jun 15, 2023


protos/in_toto_attestation/predicates/human_review/crev/v0/crev.proto (outdated)

protos/in_toto_attestation/predicates/human_review/crev/v0/crev.proto (outdated)

spec/predicates/human-review-crev.md (outdated)

marcelamelara reviewed Jun 15, 2023


spec/predicates/human-review-vcs.md (outdated)

spec/predicates/human-review.md (outdated)

adityasaky force-pushed the human-review-attestations branch from 8971878 to f705b61 Compare June 19, 2023 19:52

pxp928 reviewed Jul 14, 2023


adityasaky added 3 commits July 14, 2023 12:34

Add predicates for human reviews

e1db1d9

Signed-off-by: Aditya Sirish <aditya@saky.in>

Review updates

a6edb01

* Use RD for VCS review type reviewers * Show source vs binary subject for crev predicate Signed-off-by: Aditya Sirish <aditya@saky.in>

Remove generic review type

9f98b89

Signed-off-by: Aditya Sirish <aditya@saky.in>

adityasaky force-pushed the human-review-attestations branch from 9b967fb to 9f98b89 Compare July 14, 2023 16:35

TomHennen requested changes Jul 17, 2023


(WIP) Address comments

10614e7

Signed-off-by: Aditya Sirish <aditya@saky.in>

adityasaky requested review from TomHennen and marcelamelara July 28, 2023 19:36

adityasaky requested a review from pxp928 July 28, 2023 19:36

TomHennen requested changes Jul 28, 2023


protos/in_toto_attestation/predicates/human_review/crev/v0/crev.proto (outdated)

protos/in_toto_attestation/predicates/human_review/vcs/v0/vcs.proto (outdated)

adityasaky added 2 commits August 21, 2023 10:45

(WIP) Add target data to VCS reviews

c29b165

Signed-off-by: Aditya Sirish <aditya@saky.in>

(WIP) Fix field names, note versioning

0475f64

Signed-off-by: Aditya Sirish <aditya@saky.in>

adityasaky requested a review from TomHennen August 21, 2023 14:52

adityasaky mentioned this pull request Aug 31, 2023

Attestation for policy verification #277

Open

zachariahcox reviewed Sep 3, 2023


Add mergeBase field to VCS predicate

d39f850

Signed-off-by: Aditya Sirish <aditya@saky.in>

marcelamelara requested changes Sep 15, 2023


marcelamelara reviewed Sep 18, 2023


adityasaky added 2 commits September 19, 2023 12:10

(WIP) Address Marcela's comments on crev

545dc62

Signed-off-by: Aditya Sirish <aditya@saky.in>

(WIP) Address Marcela's comment on VCS predicate

8f6a088

Signed-off-by: Aditya Sirish <aditya@saky.in>

marcelamelara approved these changes Oct 10, 2023


jkjell reviewed Oct 12, 2023


marcelamelara mentioned this pull request May 3, 2024

Source track attestation claims slsa-framework/slsa#1042

Open


