
"cargo vet certify" github bot workflow #330

Open
Gankra opened this issue Sep 15, 2022 · 8 comments

Gankra commented Sep 15, 2022

"cargo vet certify, but all with bots/webapps in github so devs don't have to know/use the cli"

An MVP for a bot-based UX should be based on github, either on top of dependabot or a different bot. The design should be extensible enough to port to other infra like gitlab or mozilla's stuff, but supporting those is not in scope for the first version. Of course we will keep that future in mind and not wildly hardcode github stuff. Use json messages, standard auth protocols, etc. where it makes sense.

Basic workflow sketch:

  • cargo-vet fails in ci and spits out json describing the failure (already done)
  • this triggers some sort of bot event to have it consume that json and post a comment on the PR saying audits need to be done and linking to a webapp
  • the webapp has a ui for displaying what audits need to be done, and links to the URLs that cargo vet diff and friends would use
  • the webapp has a ui for specifying you completed the audit and for what criteria, displaying the EULA text for those criteria (and whatever else certify supports)
  • the webapp has authentication integration with github so when you push certify buttons in the app it does the thing over on github
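The failure-report half of this flow can be sketched today. Below is a minimal sketch of the bot step that turns cargo-vet's JSON failure report into a PR comment; the schema (`failures`, `suggest.suggest_by_criteria`) follows the sample output shown later in this issue, but `render_pr_comment` and the comment layout are hypothetical, not part of cargo-vet:

```python
import json


def render_pr_comment(report_json: str) -> str:
    """Turn cargo-vet's JSON failure report into a markdown PR comment.

    The input schema (`failures`, `suggest.suggest_by_criteria`) follows
    the sample output shown later in this issue; the comment format
    itself is a hypothetical sketch of what a bot might post.
    """
    report = json.loads(report_json)
    lines = ["cargo vet failed; the following audits are needed:", ""]
    for failure in report.get("failures", []):
        criteria = ", ".join(failure["missing_criteria"])
        lines.append(f"- `{failure['name']}:{failure['version']}` missing [{criteria}]")
    for criteria, entries in report.get("suggest", {}).get("suggest_by_criteria", {}).items():
        lines.append("")
        lines.append(f"Suggested audits for {criteria}:")
        for entry in entries:
            diff = entry.get("suggested_diff") or {}
            if diff.get("from"):
                # A prior version is audited, so a diff audit suffices.
                cmd = f"cargo vet diff {entry['name']} {diff['from']} {diff['to']}"
            else:
                # No audited baseline: inspect the whole package.
                cmd = f"cargo vet inspect {entry['name']} {diff.get('to', '')}".rstrip()
            lines.append(f"- `{cmd}` (used by {entry['notable_parents']})")
    return "\n".join(lines)
```

A CI job could pipe `cargo vet --output-format=json` into this and post the result with something like `gh pr comment`, with a link to the webapp appended.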

unclear details:

  • do the certifies appear on the existing PR, open new PRs, or push to main immediately?
  • are the certify commits expected to be done by the bot account or the auditor's account?
  • should there be one commit per audit, or a single "all the audits" commit
    • if the latter, are audits batched up and the user says "all done" in the app, or do they get submitted to github as each one is performed (editing the mega commit)?

Gankra commented Sep 15, 2022

#329 is an initial prototype of this, but without any authentication logic, so it just has the user copy base64 strings of json blobs around to shepherd data between the app and github.

Gankra changed the title from "github bot workflow" to ""cargo vet certify" github bot workflow" Sep 15, 2022

Gankra commented Sep 15, 2022

how to test the current impl's UI:

git fetch origin pull/329/head:nika
git checkout nika
cd certify-gui; npm install; npm run dev

then go to the localhost URL that npm run dev prints.

Which should show the certify UI (screenshot omitted here).


Gankra commented Sep 15, 2022

To generate new errors for cargo-vet to report/handle, there is a mock project inside the tests.

First, install whatever code for cargo-vet you currently have:

> cargo install --path ./

...
   Compiling cargo-vet v0.3.0 (C:\Users\ninte\dev\cargo-vet)
    Finished release [optimized] target(s) in 1m 25s
   Replacing C:\Users\ninte\.cargo\bin\cargo-vet.exe
    Replaced package `cargo-vet v0.1.0 (C:\Users\ninte\dev\cargo-vet)` with `cargo-vet v0.3.0 (C:\Users\ninte\dev\cargo-vet)` (executable `cargo-vet.exe`)

Then go to the test-project and update a package:

> cd tests/test-project/
> cargo update -p clap
    Updating crates.io index
    Updating clap v3.1.8 -> v3.2.21
      Adding clap_lex v0.2.4

Now cargo vet should fail:

> cargo vet
Vetting Failed!

4 unvetted dependencies:
  clap:3.2.21 missing ["safe-to-deploy"]
  atty:0.2.14 likely missing ["safe-to-deploy"]
  clap_lex:0.2.4 likely missing ["safe-to-deploy"]
  hermit-abi:0.1.19 likely missing ["safe-to-deploy"]

recommended audits for safe-to-deploy:
    cargo vet diff clap 3.1.8 3.2.21     (used by test-project)  (176 files changed, 13437 insertions(+), 10196 deletions(-))
    cargo vet inspect atty 0.2.14        (used by clap)          (508 lines)
    cargo vet inspect clap_lex 0.2.4     (used by clap)          (860 lines)
    cargo vet inspect hermit-abi 0.1.19  (used by atty)          (938 lines)

estimated audit backlog: 25939 lines

You can get this report in json form with --output-format=json:

cargo vet --output-format=json
{
  "certify_web_gui_data": null,
  "conclusion": "fail (vetting)",
  "failures": [
    {
      "missing_criteria": [
        "safe-to-deploy"
      ],
      "name": "atty",
      "version": "0.2.14"
    },
    ...
  ],
  "suggest": {
    "suggest_by_criteria": {
      "safe-to-deploy": [
        {
          "name": "clap",
          "notable_parents": "test-project",
          "suggested_criteria": [
            "safe-to-deploy"
          ],
          "suggested_diff": {
            "diffstat": {
              "count": 23633,
              "raw": " 176 files changed, 13437 insertions(+), 10196 deletions(-)\n"
            },
            "from": "3.1.8",
            "to": "3.2.21"
          }
        },
        ...

With the changes in nika's PR, you can now pass --certify-web-gui to make it generate the giant base64 blob that her UI takes in the url (I think it's just a compressed version of the rest of the json):

cargo vet --output-format=json --certify-web-gui
{
  "certify_web_gui_data": "Z3ppcDsfiwgAAAAAAAD/lVVNbxtHDP0rgz3VgbSokh5aHwIkAYrmkEvQW9doqB2uNPAsZzsfUjaB/3sfZyVZdg5JTrI15OMj+R71temjyxwdNbf/fG2ERm5um0QDr3NYxyLNqnHj5B3btzNCLk+WJx9mvFKxALDN3aqxnIA2ZRcEIH/vXTJ9pMymJzFbfIRxcp7tygB4ZUisyZyQbIIYMj705M0xxPuUSUFMiMZJJ32QHINHpqGSw7g8Hl3eh5JNKnGKLjnZoYAk/q+w9JxWeOj3htJtJy/MRyarEUA8omH901ImM8QwmsSS8N2B9blIZE9KaqKYkwmDyXs2A4inGWTHVvHeCzh6rzgpDPlIsSZHBoXB7UrUF82zfHA915x3QYT7WjsH1Mmx1OaFszZtWOwUnORUoz+4VBJr+aUssFMoEZ2ZX7jdtRjtPOXQlxjR72xGJ0C+QW7zsPpmk5d1PVnmj+zu6Lw3EjI2gSXY0jNWlaCYUBI+Ud/l2RyKF460dV7/Q3tTjVW4Ti4T4s9TSOj4SftOppJBu5M3SifEZDRWS0Ys08UlYeI4hDii+FDAyIed6xFwcHw874glI3rhDcCPhC/jSp8wHhQ7h7OEstsrqELN+JqSKnCrclKgLe/p4LBPAGPLIKtz7GQLid6nKtySaFe3M4UjmBVvMNkAwbTmTySSnGFp6/kxWtGXuaqwEeHX2L23FxWpLQzlTP09x4W1TgLuqUDgPJK4qahEKxqslN2o5K5I4wHaSVwXkFBLUTF9H1yuOAi78s2R5rqA9xaE/Ly6ahqesiCbTqOCBMUu1qVx6zBxmHta9u7O0uzk0S2Gekg23dSd7ilaFrXxjhwc9NiLGlpB1cEOecBVf2sjnZA9cMxOlfOsydOuzV9YwkF3ncIIzzlomSsmiEM4o8tQWgvX1qOAi0JJL4QiCOUSL7u5Sk24Ll4rVleDuEq1Bn3CRjh9wk1gBIShE/22mukUW6epOqj8kjnuHW7RjtUjWY+CupHtMtzFztviAIYXnAEYWoeJfNioZ1si+U5G6mNIN6vvCOvEG1qGr5YKS/mRGfesUt2GA+Z6Pv3Pbsb5Kjw9Ft/ciFQwWXMOvsofypcv308/RT3cnX5C0vUPUO9pajTDZ/wwNa/aTfu7Wb82r9qX7cuNQgv0Rv7d44/X81N3d91RzvMV3K9A2fz20yhK6l/Pn58h/TzQvipyDdc8gdq0mz9+EOvu4X8s59hquQcAAA==",
  "conclusion": "fail (vetting)",
  "failures": [
    {
      "missing_criteria": [
        "safe-to-deploy"
      ],
      "name": "atty",
      "version": "0.2.14"
    },
    ...


Gankra commented Sep 16, 2022

Correction: the current base64 blob is actually a new JSON format designed for this use case. In particular, the current JSON output is missing some information the GUI wants, like a listing of every supported criteria you can audit for and their description strings / X-implies-Y relationships. If we move away from the base64 format, I suggest we either unconditionally emit this information in the JSON output, or make it something you can toggle on with a flag.
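For reference, the blob above appears to be standard base64 of a short codec marker (`gzip;`) followed by gzip-compressed JSON; that's inferred from the sample blob's leading bytes, not from a documented format. A sketch of a decoder under that assumption (the `decode_gui_blob` name is made up):

```python
import base64
import gzip
import json


def decode_gui_blob(blob: str) -> dict:
    """Decode a certify_web_gui_data blob, assuming the format is
    base64(b"gzip;" + gzip(json)). The "gzip;" marker presumably leaves
    room for other codecs later; this is inferred from the sample blob
    in this thread, not from a documented format."""
    raw = base64.b64decode(blob)
    # Split at the first ";": everything before it names the codec,
    # everything after is the compressed payload.
    marker, _, body = raw.partition(b";")
    if marker != b"gzip":
        raise ValueError(f"unknown codec marker: {marker!r}")
    return json.loads(gzip.decompress(body))
```

Round-tripping a payload through the same encoding (compress, prefix, base64) confirms the decoder's shape.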


Gankra commented Sep 19, 2022

@mystor had you actually worked out what the bot scripting was going to look like, or were other people on the hook for that step?

I'm mentally sketching out what I want a cleaned-up version of this API to look like. I see two reasonable choices, but which is better depends on how a bot wants to interface with things (will consult with my coworkers on this, but gonna sketch out the options from the cli's perspective to get them out there). (Also, the names here are kinda strawman.)

  1. Add --output-format=json-full which includes all the "extra" information about the repo like the defined criteria that a gui/bot would want to have available (probably some subset of the current stuff you're base64ing).

  2. Add cargo vet metadata which prints out just that info as its own json object.

I aesthetically like 2 because it feels very orthogonal and generalizable. My expectation is that 1 is a lot nicer for bots because one json object unconditionally dumped sounds a lot easier than 2 objects, with the second needing to be manually requested on error.
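Either way, the criteria metadata a bot would consume could look like this strawman (the top-level "criteria" map and its "description"/"implies" field names are invented for illustration, not cargo-vet's actual output today):

```python
def criteria_index(report: dict) -> dict:
    """Extract {criteria_name: (description, implied_criteria)} from a
    report carrying a top-level "criteria" map.

    The field names ("description", "implies") are a strawman schema for
    the proposed --output-format=json-full / `cargo vet metadata`
    output, not something cargo-vet emits today."""
    return {
        name: (entry.get("description", ""), entry.get("implies", []))
        for name, entry in report.get("criteria", {}).items()
    }
```

A GUI would use the descriptions as the EULA-style text to display at certify time, and the implies edges to pre-check weaker criteria.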

As for getting info back in, I'm not a huge fan of the --from-gui approach (take a json blob of commands from stdin). We already have plumbing commands on cargo-vet and ideally any automation on top of cargo-vet would just use that. That said I'm willing to believe that this format is somehow more desirable (or that you will inevitably just write a script that does --from-gui so maybe it's best to just build it in and save everyone some time).


Gankra commented Sep 19, 2022

Actually, maybe https://docs.rs/twelf/0.7.0/twelf/ can let us have our cake and eat it too on the --from-gui side of things.


Gankra commented Sep 19, 2022

(we already have some amount of hand-rolled twelf-like logic and it sucks shit and I want to destroy my awful child)

cargo-vet/src/main.rs

Lines 350 to 388 in ddae3ab

//////////////////////////////////////////////////////
// Parse out our own configuration
//////////////////////////////////////////////////////
let default_config = MetaConfigInstance {
    version: Some(1),
    store: Some(StoreInfo {
        path: Some(
            metadata
                .workspace_root
                .join(storage::DEFAULT_STORE)
                .into_std_path_buf(),
        ),
    }),
};
// FIXME: what is `store.path` relative to here?
let workspace_metacfg = metadata
    .workspace_metadata
    .get(WORKSPACE_VET_CONFIG)
    .map(|cfg| {
        // ERRORS: immediate fatal diagnostic
        MetaConfigInstance::deserialize(cfg)
            .into_diagnostic()
            .wrap_err("Workspace had [{WORKSPACE_VET_CONFIG}] but it was malformed")
    })
    .transpose()?;
// FIXME: what is `store.path` relative to here?
let package_metacfg = metadata
    .root_package()
    .and_then(|r| r.metadata.get(PACKAGE_VET_CONFIG))
    .map(|cfg| {
        // ERRORS: immediate fatal diagnostic
        MetaConfigInstance::deserialize(cfg)
            .into_diagnostic()
            .wrap_err("Root package had [{PACKAGE_VET_CONFIG}] but it was malformed")
    })
    .transpose()?;
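The hand-rolled part here is layering config sources with a fixed precedence (default config, then workspace metadata, then root-package metadata), which is the kind of thing twelf-style crates generalize. A language-agnostic sketch of that merge, in Python for brevity; this mirrors the general shape of the layering, not cargo-vet's actual merge logic:

```python
def layer_configs(*layers: dict) -> dict:
    """Merge config layers left to right; later layers win per key,
    and nested dicts are merged recursively rather than replaced.

    A sketch of the precedence pattern in the excerpt above (default,
    then workspace metadata, then root-package metadata), not
    cargo-vet's actual merge logic."""
    merged: dict = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = layer_configs(merged[key], value)
            else:
                merged[key] = value
    return merged
```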


mystor commented Sep 21, 2022

unclear details:

* do the certifies appear on the existing PR, open new PRs, or push to main immediately?

I don't think we'd really want this workflow to assume that main ever fails vet, so I expect it'd only run on existing PRs, and either update the PR or open a new PR into that PR's branch (though that's probably a github thing and might have bad ergonomics?) so that vet can be passing in the PR before merging into main.

* are the certify commits expected to be done by the bot account or the auditor's account?

I don't think it matters a ton either way. The auditor should definitely be credited in the who for the audit, but I think it'd be fine if the certify commits themselves came from the bot.

* should there be one commit per audit, or a single "all the audits" commit

I think a single "all the audits" commit would be fine.

  * if the latter, are audits batched up and the user says "all done" in the app, or do they get submitted to github as each one is performed (editing the mega commit)?

I would expect that they're batched up and applied in a single "all done", but I suppose doing something more live could be a reasonable approach? I worry a bit about wasting a ton of CI time with that though, as we'd be re-running CI for each new audit performed, and it wouldn't be passing.

@mystor had you actually worked out what the bot scripting was going to look like, or were other people on the hook for that step?

I hadn't quite gotten to that step in figuring this out. I was originally working on getting a basic framework working with the UI and a mechanism for applying the changes, and was then going to figure out how that would integrate with a bot-based workflow.

I'm mentally sketching out what I want a cleaned up version of this API to look like, I see two reasonable choices but which is better depends on how a bot wants to interface with things (will consult with my coworkers on this, but gonna sketch out the options from the cli's perspective to get them out there). (Also names here are kinda strawman.)

1. Add `--output-format=json-full` which includes all the "extra" information about the repo like the defined criteria that a gui/bot would want to have available (probably some subset of the current stuff you're base64ing).

2. Add `cargo vet metadata` which prints out just that info as its own json object.

I aesthetically like 2 because it feels very orthogonal and generalizable. My expectation is that 1 is a lot nicer for bots because one json object unconditionally dumped sounds a lot easier than 2 objects, with the second needing to be manually requested on error.

One of the reasons I added a new system was to simplify the amount of data being sent into the app as much as possible, as I was planning to communicate it on the client side using the URL. The original idea bholley proposed in #293 was to use a wasm blob of cargo-vet itself within the web client so that the blob could be kept simple and minimal.

As for getting info back in, I'm not a huge fan of the --from-gui approach (take a json blob of commands from stdin). We already have plumbing commands on cargo-vet and ideally any automation on top of cargo-vet would just use that. That said I'm willing to believe that this format is somehow more desirable (or that you will inevitably just write a script that does --from-gui so maybe it's best to just build it in and save everyone some time).

The decision to do it this way with --from-gui, rather than have the bot handle the communication with the web UI, was largely because of the discussion in #293, though I'm not super attached to it.
