Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How do we compute commit trustworthiness? #9

Open
Ekleog opened this issue Oct 6, 2018 · 5 comments
Open

How do we compute commit trustworthiness? #9

Ekleog opened this issue Oct 6, 2018 · 5 comments
Labels
needs-feedback Needs more people to give their opinion

Comments

@Ekleog
Copy link
Contributor

Ekleog commented Oct 6, 2018

When verifying a commit, we have a list of reviews that all have:

  • A trust level associated to the key of the signer
  • A context-understanding, a diff-understanding, and a thoroughness level
  • A result (that can be taken from result-otherwise, see RFC)

(I don't think priority should be taken into account here… actually, see #8)

How do we compute, from this list of (trust in reviewer, work done by reviewer, result) triplets, the binary result “do I trust this commit”?

Currently my thinking would be:
For each review,

  1. Take the minimum between context-understanding, diff-understanding and thoroughness, and consider it as the level of work the reviewer did for their review. Convert to integer between 1 and 3 (higher is more work), hereafter named work
  2. Take the trust in the reviewer, convert it to integer between 1 and 3 (higher is more trust), name trust
  3. Multiply (todo: just thinking of that randomly, there are most likely much better options) work with trust. This gives a value between 1 and 9 (higher is better) that represents the confidence the user has towards this reviewer's review, hereafter named confidence
  4. Take the result.
    • If it is !, then end result is !
    • If it is -, then remove confidence from the confidence accumulator
    • If it is +, then add confidence to the confidence accumulator
    • If it is 0, then ignore review

Then, if the confidence accumulator exceeds the configured required confidence level, then accept the review.

Drawbacks:

  • It's likely hard to understand. There is no easy translation of “require 2 reviews from this contributor set”, because each review has its own self-trust level. This can be mitigated by having a “simple” tool just default to setting all review parameters to the maximum, setting all trust to 1, and requiring confidence level of 4… but it can be problematic if not everyone uses this tool, because some commits may actually require more than 2 reviews (because a reviewer would not use max. parameters for their review)
  • It doesn't allow to set a different confidence level to different files (eg. for nixpkgs, requiring changes to the linux kernel to have more confidence than a change to random-package-used-by-1-person)

If you have any idea of how to handle these drawbacks… I'd be glad to hear them :)

/cc @oxij @lrvick

This was referenced Oct 6, 2018
@oxij
Copy link
Member

oxij commented Oct 8, 2018

I feel like those "thoroughness" and other fields should be treated like a "priority" field. IMHO, the only fields that should matter are "result" fields. Hence, "no !, more + than - reviews" seems GTM, assuming your own signatures in your local review-branch can override decisions. I probably should change the ordering of those fields in the syntax and examples to make that clear, that's a new TODO.

@lrvick
Copy link

lrvick commented Oct 8, 2018

Overriding is not safe since a malicious server could always replay old signatures for a commit. Once you have signed on a given commit, there is no safe way to issuing a new signature/review safely for that commit.

Personally I don't think anyone should sign anything until they are ready to bless it to move forward in the respective workflow for a given repo.

A malicious server can always drop negative reviews to get bad code shopped but has no motivation to drop positive ones except to create a DoS.

@oxij
Copy link
Member

oxij commented Oct 8, 2018 via email

@Ekleog
Copy link
Contributor Author

Ekleog commented Oct 9, 2018

@oxij Well, the “no !, more + than - reviews” has the issue that it makes it impossible to trust some people more than others (eg. I'd like to trust nixpkgs committers a bit, but not as much as I'd trust people I personally trust, so there'd need to be 2 reviews by nixpkgs committers for me to consider it OK, and only 1 by a person I personally trust).

Overall I think the scheme should also support use cases like the one given by @lrvick in #5 (comment):

been signed by an engineer, 2 CI servers, 1 reviewer, 1 member of release engineering team OR signed by 2 members of the production engineering team (emergency hotfix)

This would still require something a bit more complex than the thing I proposed, though.

Also, I'm not sure what the point of the low settings to thoroughness, context-understanding and diff-understanding is, then, if not taking them into account for review checking: if I set the three to low, then I'd hope people trust (by default) my review less than when I set all of them to high, because otherwise it means I'll just not sign anything with low understanding.

@Ekleog
Copy link
Contributor Author

Ekleog commented Oct 12, 2018

OK, so new proposal heavily based on both of your suggestions:

  • The configuration includes a list of categories. Each category has a list of fingerprints associated to it, with a weight level
  • The user can then require a certain “difficulty level” from each category. All requirements must be fulfilled for the commit to be considered as reviewed. The difficulty level is computing by taking the sum of the weight of the fingerprints having done + reviews and substracting to it the sum of the weight of the fingerprints having done - reviews
  • Any ! by any known fingerprint is treated as a !

What do you think about it? I think this fits the bill for our three use cases (except for @lrvick's for which support is incomplete):

  • @oxij's use case is just having all fingerprints in the same category with weight 1, and a difficulty level of 1
  • @lrvick's use case is categories “CI servers” with difficulty of 2, “reviewer” with difficulty of 1 and "release team" with difficulty of 1. The issue is with the “OR 2 members of the production engineering team” for emergency hotfix… honestly, I don't really see how to handle that without introducing arbitrary boolean expressions in the mix, which sounds too complex. If someone has a better idea…
  • My use case is a single category with eg difficulty level 5, and highly-trusted keys with weight 3 and less-trusted keys with weight 1

Does someone see an idea to handle @lrvick's use case better without introducing too much complexity? Disjunction-of-conjunctions might be a possibility, but…

@Ekleog Ekleog added the needs-feedback Needs more people to give their opinion label Oct 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-feedback Needs more people to give their opinion
Projects
None yet
Development

No branches or pull requests

3 participants