@bhearsum
Contributor

@bhearsum bhearsum commented Oct 24, 2024

This is part 1 of the plan from this document. The latest version of this has been tested on dev scriptworkers in https://treeherder.mozilla.org/jobs?repo=try&revision=3856381ba7e5d539896bc5dca31aa7a4df43c7ed. There are a handful of failures that we can ignore:

  • MAR signing tasks fail (but they get past the actual autograph communication part) due to not having the real dep MAR key on autograph stage. This will be fixed as part of https://mozilla-hub.atlassian.net/browse/AUT-339, but ought not to block this PR.
  • linux64 signing failed due to a bad keyid, which I fixed before rerunning.
  • part of mac signing failed because I did not configure stage_autograph_langpack on the dev mac signing worker. I could, theoretically, do this and rerun, but I honestly don't think it's worth the effort. All of the non-ja-JP-mac signing worked fine, because we don't do langpacks for those locales. I have no reason to believe this will fail in fake-prod or prod.
  • focus signing failed due to a bad secret. I fixed this and reran.

@bhearsum bhearsum force-pushed the autograph-gcp-stage branch from 057b3e1 to fd3008c Compare October 24, 2024 13:15
@bhearsum
Contributor Author

This PR is still in draft while we wait for all of the necessary keys to be populated in autograph stage. (I'll probably also do some sanity checking with scriptworker dev instances before having this reviewed and landed.)

@bhearsum bhearsum force-pushed the autograph-gcp-stage branch 5 times, most recently from 13807e0 to 9d5fb66 Compare October 29, 2024 00:05
@bhearsum bhearsum force-pushed the autograph-gcp-stage branch 10 times, most recently from 3e3a933 to fd4de3a Compare November 19, 2024 01:50
@bhearsum bhearsum force-pushed the autograph-gcp-stage branch 2 times, most recently from 5026ac2 to 82e5660 Compare November 26, 2024 00:24
@bhearsum bhearsum changed the title AUT-293: migrate releng signers to use GCP stage AUT-293: add stage formats for all dev/fake-prod signingscript formats Nov 26, 2024
@bhearsum bhearsum force-pushed the autograph-gcp-stage branch 2 times, most recently from 14a51ba to 3a041b4 Compare November 26, 2024 15:30
@bhearsum bhearsum marked this pull request as ready for review November 26, 2024 16:08
@bhearsum bhearsum requested a review from jcristau November 26, 2024 16:14
Contributor

@jcristau jcristau left a comment

Oops, I only looked at the first commit initially.

@bhearsum bhearsum force-pushed the autograph-gcp-stage branch from 456ff73 to d3eee3e Compare November 28, 2024 21:05
@bhearsum bhearsum requested a review from jcristau November 28, 2024 21:41
@bhearsum
Contributor Author

https://treeherder.mozilla.org/jobs?repo=try&revision=09c0318c363c2043d61db910ba924de5c2a68e7c&searchStr=sign has my latest tests with this patch + up-to-date sops + cloudops-infra. I've also prepared the signingscript-fake-prod branch in the sops repo with the necessary secrets changes we'll need for fake-prod before this lands, as well as some tests to ensure that dev + fake-prod entries (where both exist) have the same autograph secrets.
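The parity check described above (dev and fake-prod entries sharing the same autograph secrets wherever both exist) could look roughly like the sketch below. The data layout, the helper name `mismatched_formats`, the `stage_autograph_example` format, and the placeholder credentials are all assumptions made for illustration; the real tests in the sops repo may be structured quite differently.

```python
def mismatched_formats(dev, fake_prod):
    """Return the formats defined in both environments whose secrets differ.

    Both arguments are hypothetical mappings of format name -> secret fields.
    Formats that exist in only one environment are ignored on purpose.
    """
    shared = dev.keys() & fake_prod.keys()
    return sorted(fmt for fmt in shared if dev[fmt] != fake_prod[fmt])


# Placeholder data only; these are not real credentials or key ids.
dev_secrets = {
    "stage_autograph_langpack": {"user": "user-a", "keyid": "key-1"},
    "stage_autograph_example": {"user": "user-b", "keyid": "key-2"},
}
fake_prod_secrets = {
    "stage_autograph_langpack": {"user": "user-a", "keyid": "key-1"},
}

# An empty list means every shared format has identical secrets.
problems = mismatched_formats(dev_secrets, fake_prod_secrets)
```

Comparing only the shared keys matches the "where both exist" wording: a format configured in just one environment is not treated as a mismatch.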

@bhearsum
Contributor Author

I'm going to fix up the things you mentioned. I also want to note here that I'm looking at a strange failure in this linux l10n signing job. In just this one chunk, we ended up with:

aiohttp.client_exceptions.ClientConnectorDNSError: Cannot connect to host autograph-external.prod.autograph.services.mozaws.net:443 ssl:default [Name or service not known]

Normally I'd write this off as network gremlins, but the fact that it's trying to connect to autograph prod is very surprising, as it's configured to use only stage formats. I've pushed an additional change with some logging improvements to help diagnose this, and to see whether all of the chunks are in fact doing some signing against prod.

@bhearsum bhearsum force-pushed the autograph-gcp-stage branch 2 times, most recently from a763a35 to 67a2eef Compare November 29, 2024 20:51
@bhearsum
Contributor Author

> I'm going to fix up the things you mentioned. I also want to note here that I'm looking at a strange failure in this linux l10n signing job. In just this one chunk, we ended up with:
>
> aiohttp.client_exceptions.ClientConnectorDNSError: Cannot connect to host autograph-external.prod.autograph.services.mozaws.net:443 ssl:default [Name or service not known]
>
> Normally I'd write this off as network gremlins, but the fact that it's trying to connect to autograph prod is very surprising, as it's configured to use only stage formats. I've pushed an additional change with some logging improvements to help diagnose this, and to see whether all of the chunks are in fact doing some signing against prod.

This turned out to be another case of a format being hardcoded deep in the bowels of signingscript. I've fixed it, and found a bunch more along the way. I've also added some logging on the success cases. I'm kicking off some widespread testing again, and I'll verify that a) things continue to work, and b) everything is actually using stage when it should.

Regardless, this will need review again, as the changes I've made are not trivial.
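As a rough illustration of the "is everything actually using stage" check mentioned above, the sketch below flags any contacted host that does not look like a stage host. The helper name `non_stage_hosts`, the `.stage.` marker, and the `autograph.stage.example.test` URL are assumptions for illustration only; the prod hostname is the one from the error quoted earlier. How the URLs are collected (for example, from the new success-case logging) is left out.

```python
from urllib.parse import urlsplit


def non_stage_hosts(urls, stage_marker=".stage."):
    """Return the sorted, de-duplicated hosts that do not contain stage_marker.

    stage_marker is an assumed naming convention, not a documented one.
    """
    hosts = {urlsplit(url).hostname for url in urls}
    return sorted(h for h in hosts if h and stage_marker not in h)


# Hypothetical list of URLs that a signing task contacted.
contacted = [
    "https://autograph.stage.example.test/sign/file",
    "https://autograph-external.prod.autograph.services.mozaws.net:443/sign/file",
    "https://autograph.stage.example.test/sign/hash",
]

unexpected = non_stage_hosts(contacted)
```

A non-empty result would point at a format that is still resolved against a non-stage server somewhere in the call stack.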

@bhearsum bhearsum force-pushed the autograph-gcp-stage branch 5 times, most recently from 02420a5 to 1b79253 Compare December 2, 2024 15:56
@bhearsum
Contributor Author

bhearsum commented Dec 2, 2024

My latest try push is still finishing up, but it has already done every type of signing AFAICT, and I'm quite confident it will turn out fine. (The scattered failures are DNS resolution issues, which I've reported to the Autograph team.)

@bhearsum bhearsum requested a review from jcristau December 2, 2024 17:16
Contributor

@jcristau jcristau left a comment

LGTM, just a couple of logging nits.

fmt = None
base_filename = os.path.basename(filename)
if base_filename not in _WIDEVINE_BLESSED_FILENAMES and base_filename not in _WIDEVINE_NONBLESSED_FILENAMES:
    log.debug(f"{filename} is not a widevine signing file")
Contributor

That is going to be quite noisy...

Contributor Author

Yeah, that's fair. It's probably not necessary to add this in the end; we already log the other path.

log.debug("{} is already signed! Skipping...".format(filename))
blessed = True

log.debug("Found {} to sign {}".format(filename, blessed))
Contributor

"Found firefox/firefox-bin to sign False" is a bit of an odd message

Contributor Author

You're right, this could be improved.
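One possible rewording, shown only as a sketch: label the boolean so the message reads unambiguously. The helper name `describe_widevine_candidate` is hypothetical, and this is not necessarily the wording that landed.

```python
import logging

log = logging.getLogger(__name__)


def describe_widevine_candidate(filename, blessed):
    """Build a log message that names the flag instead of appending a bare bool."""
    return f"Found {filename} to sign (blessed={blessed})"


message = describe_widevine_candidate("firefox/firefox-bin", False)
log.debug(message)
```

With the example from the review, this produces "Found firefox/firefox-bin to sign (blessed=False)".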

@jcristau
Contributor

jcristau commented Dec 3, 2024

Another failure on the try push is https://firefox-ci-tc.services.mozilla.com/tasks/FIpZBU_ZSUqWZ5KbKncx1g, where iscript doesn't know about stage_autograph_langpack; probably something to handle (or not) separately from this PR.

We need this to easily allow dev and fake-prod scriptworkers to opt into testing against Autograph stage. Rather than duplicating all of the formats, we add a simple fallback to the non-stage_ version when selecting the signing function. (Note that we must pass through the original format to allow `get_autograph_config` to find the correct server configuration deeper down the stack.)
These variants don't necessarily use the same certs as the autograph prod versions, but they are similar enough that they allow us to verify that Autograph works from a functional point of view.
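A minimal sketch of the fallback described in the commit message above, assuming stage formats are named by prefixing the regular format with `stage_`. The registry `SIGNING_FUNCTIONS`, the helper `get_signing_function`, and the two stand-in signers are hypothetical names for illustration and do not reproduce signingscript's actual code.

```python
STAGE_PREFIX = "stage_"  # assumption: stage formats are "stage_<format>"


def sign_langpack(fmt):
    # Stand-in signer; a real one would use fmt to pick the server config.
    return f"langpack signer called with {fmt}"


def sign_file(fmt):
    # Stand-in default signer for formats without a dedicated function.
    return f"default signer called with {fmt}"


# Hypothetical registry that only lists the non-stage format names.
SIGNING_FUNCTIONS = {"autograph_langpack": sign_langpack}


def get_signing_function(fmt):
    """Look up fmt exactly, then fall back to its non-stage_ name."""
    for candidate in (fmt, fmt.removeprefix(STAGE_PREFIX)):
        if candidate in SIGNING_FUNCTIONS:
            return SIGNING_FUNCTIONS[candidate]
    return sign_file


fmt = "stage_autograph_langpack"
# The original format, not the stripped one, is what gets passed on.
result = get_signing_function(fmt)(fmt)
```

Only the function lookup uses the stripped name. The signer still receives `stage_autograph_langpack`, which is what would let a later configuration lookup select the stage server rather than prod.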
@bhearsum
Contributor Author

bhearsum commented Dec 3, 2024

> Another failure on the try push is https://firefox-ci-tc.services.mozilla.com/tasks/FIpZBU_ZSUqWZ5KbKncx1g, where iscript doesn't know about stage_autograph_langpack; probably something to handle (or not) separately from this PR.

Yeah. I did some best-effort testing of stage with iscript, but that configuration and set-up is a whole other beast that I'm not prepared to tackle at the moment. (I did do enough testing and set-up to verify that iscript machines can route to the new autograph envs, and that the other types of signing they do with autograph work.)

@bhearsum bhearsum force-pushed the autograph-gcp-stage branch from 1b2393a to eba63fc Compare December 3, 2024 13:39
@bhearsum bhearsum merged commit 9ada102 into mozilla-releng:master Dec 3, 2024
9 checks passed