Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SA: Implement schema and methods for (account, hostname) pausing #7490

Merged
merged 26 commits into from
Jun 17, 2024

Conversation

beautifulentropy
Copy link
Member

@beautifulentropy beautifulentropy commented May 16, 2024

Add the storage implementation for our new (account, hostname) pair pausing feature.

  • Add schema and model for for the new paused table
  • Add SA service methods for interacting with the paused table

Part of #7406
Part of #7475

@beautifulentropy beautifulentropy marked this pull request as ready for review May 16, 2024 15:42
@beautifulentropy beautifulentropy requested a review from a team as a code owner May 16, 2024 15:42
Copy link
Contributor

@beautifulentropy, this PR appears to contain configuration and/or SQL schema changes. Please ensure that a corresponding deployment ticket has been filed with the new values.

@beautifulentropy beautifulentropy marked this pull request as draft May 16, 2024 16:53
sa/model.go Outdated Show resolved Hide resolved
@beautifulentropy beautifulentropy marked this pull request as ready for review May 16, 2024 17:45
@letsencrypt letsencrypt deleted a comment from github-actions bot May 16, 2024
Copy link
Contributor

@aarongable aarongable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few high-level suggestions to simplify the new gRPC API methods:

  • Pause and Repause can be the same method, since I don't think the RA needs to know that a pair has been previously paused. The SA can determine which case it is in inside a transaction, and increment the relevant metric.
  • CheckPair[s]Paused can be a single method, no need for separate singular/plural implementations (when would we be calling the singular one?). And rather than taking a single identifier type and many values, it should take a repeated message Identifier { string type = 1; string value = 2 }. Let the SA handle the need for multiple separate db queries if necessary; no need to expose that deficiency of our database schema to the RA.
  • I'm not clear on why we need both UnpausePair and UnpauseAccount, since the current design doc only calls for ever unpausing whole accounts at a time.

Doing all three of these would reduce the new API surface to just three methods: CheckPairsPaused, PausePair, UnpauseAccount.

There's one more method which might be necessary: GetPausedIdentifiersForAccount, to be used by the self-service unpausing page, to populate it with a (truncated) list of identifiers that will be unpaused.

sa/model.go Outdated Show resolved Hide resolved
@beautifulentropy
Copy link
Member Author

beautifulentropy commented May 21, 2024

  • Pause and Repause can be the same method, since I don't think the RA needs to know that a pair has been previously paused. The SA can determine which case it is in inside a transaction, and increment the relevant metric.

On its face I think this is a good idea. However, it was implemented this way because I had always assumed that these observations would be confined to the RA. I'm a little uncomfortable with the idea of emitting metrics dependant on business logic inside of the storage layer.

  • CheckPair[s]Paused can be a single method, no need for separate singular/plural implementations (when would we be calling the singular one?). And rather than taking a single identifier type and many values, it should take a repeated message Identifier { string type = 1; string value = 2 }. Let the SA handle the need for multiple separate db queries if necessary; no need to expose that deficiency of our database schema to the RA.

Fair. I'll fix this.

  • I'm not clear on why we need both UnpausePair and UnpauseAccount, since the current design doc only calls for ever unpausing whole accounts at a time.

You're right, UnpausePair was an oversight on my part.

There's one more method which might be necessary: GetPausedIdentifiersForAccount, to be used by the self-service unpausing page, to populate it with a (truncated) list of identifiers that will be unpaused.

Thanks, I'll add this.

@aarongable
Copy link
Contributor

On its face I think this is a good idea. However, it was implemented this way because I had always assumed that these observations would be confined to the RA. I'm a little uncomfortable with the idea of emitting metrics dependant on business logic inside of the storage layer.

I totally agree with this line of thinking. But for me it's a balancing act. I think emitting the "repaused" metric from the SA is slightly unfortunate, because it does feel like an RA sort of thing to be measuring. But I think that exposing twice as many methods from the SA and having strict calling conventions for those methods that will fail if the RA is ever wrong (or races against another RA!) is significantly more unfortunate. Requiring every potential SA client to have identical "call CheckPairsPaused, then depending on its answer, either call Pause or Repause, and gracefully handle the error in case the situation has changed between those two calls" logic feels much worse than allowing SA clients to just say "hey, pause this one" and letting the SA handle the intricacies of making that idempotent inside a database transaction.

@beautifulentropy
Copy link
Member Author

On its face I think this is a good idea. However, it was implemented this way because I had always assumed that these observations would be confined to the RA. I'm a little uncomfortable with the idea of emitting metrics dependant on business logic inside of the storage layer.

I totally agree with this line of thinking. But for me it's a balancing act. I think emitting the "repaused" metric from the SA is slightly unfortunate, because it does feel like an RA sort of thing to be measuring. But I think that exposing twice as many methods from the SA and having strict calling conventions for those methods that will fail if the RA is ever wrong (or races against another RA!) is significantly more unfortunate. Requiring every potential SA client to have identical "call CheckPairsPaused, then depending on its answer, either call Pause or Repause, and gracefully handle the error in case the situation has changed between those two calls" logic feels much worse than allowing SA clients to just say "hey, pause this one" and letting the SA handle the intricacies of making that idempotent inside a database transaction.

I agree and as long as I've got support from you I'm happy to implement this as described above.

@pgporada pgporada self-requested a review May 22, 2024 14:19
sa/model.go Outdated Show resolved Hide resolved
sa/model.go Outdated Show resolved Hide resolved
sa/model.go Outdated Show resolved Hide resolved
sa/model.go Outdated Show resolved Hide resolved
sa/model.go Outdated Show resolved Hide resolved
sa/proto/sa.proto Outdated Show resolved Hide resolved
sa/proto/sa.proto Outdated Show resolved Hide resolved
sa/model.go Outdated Show resolved Hide resolved
sa/saro.go Outdated Show resolved Hide resolved
sa/model.go Outdated Show resolved Hide resolved
aarongable
aarongable previously approved these changes Jun 13, 2024
sa/sa.go Outdated Show resolved Hide resolved
sa/sa.go Outdated Show resolved Hide resolved
sa/db-next/boulder_sa/20240514000000_Paused.sql Outdated Show resolved Hide resolved
sa/saro.go Show resolved Hide resolved
sa/db-users/boulder_sa.sql Show resolved Hide resolved
sa/sa_test.go Show resolved Hide resolved
sa/sa.go Outdated Show resolved Hide resolved
sa/sa.go Show resolved Hide resolved
sa/saro.go Outdated Show resolved Hide resolved
pgporada
pgporada previously approved these changes Jun 14, 2024
Copy link
Contributor

@aarongable aarongable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-LGTM, with a pile of optional nits :)

sa/model.go Show resolved Hide resolved
sa/model.go Show resolved Hide resolved
sa/model.go Show resolved Hide resolved
sa/sa.go Show resolved Hide resolved
sa/model.go Show resolved Hide resolved
sa/sa.go Show resolved Hide resolved
sa/sa_test.go Show resolved Hide resolved
sa/sa_test.go Show resolved Hide resolved
@beautifulentropy beautifulentropy merged commit 594cb13 into main Jun 17, 2024
12 checks passed
@beautifulentropy beautifulentropy deleted the fail3pause-part-1 branch June 17, 2024 14:18
beautifulentropy added a commit that referenced this pull request Jun 17, 2024
beautifulentropy added a commit that referenced this pull request Jun 17, 2024
beautifulentropy added a commit that referenced this pull request Jun 17, 2024
beautifulentropy added a commit that referenced this pull request Jun 20, 2024
@beautifulentropy
Copy link
Member Author

Tracking this change in SRE issue: IN-10533

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants