SA: Implement schema and methods for (account, hostname) pausing #7490
@beautifulentropy, this PR appears to contain configuration and/or SQL schema changes. Please ensure that a corresponding deployment ticket has been filed with the new values.
A few high-level suggestions to simplify the new gRPC API methods:
- Pause and Repause can be the same method, since I don't think the RA needs to know that a pair has been previously paused. The SA can determine which case it is in inside a transaction, and increment the relevant metric.
- CheckPair[s]Paused can be a single method, no need for separate singular/plural implementations (when would we be calling the singular one?). And rather than taking a single identifier type and many values, it should take a repeated `message Identifier { string type = 1; string value = 2 }`. Let the SA handle the need for multiple separate db queries if necessary; no need to expose that deficiency of our database schema to the RA.
- I'm not clear on why we need both UnpausePair and UnpauseAccount, since the current design doc only calls for ever unpausing whole accounts at a time.
Doing all three of these would reduce the new API surface to just three methods: CheckPairsPaused, PausePair, UnpauseAccount.
There's one more method which might be necessary: GetPausedIdentifiersForAccount, to be used by the self-service unpausing page, to populate it with a (truncated) list of identifiers that will be unpaused.
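For illustration only, the reduced API surface suggested above could be sketched in Go against an in-memory map. This is not the SA implementation from this PR: `pauseStore`, `newPauseStore`, the `limit` parameter, and every signature here are assumptions made for the sketch, and the real methods would be gRPC handlers backed by the database.

```go
package main

import (
	"fmt"
	"sort"
)

// Identifier mirrors the proposed message: a type (such as "dns") and a value.
type Identifier struct {
	Type  string
	Value string
}

// pauseStore is a hypothetical in-memory stand-in for the SA.
type pauseStore struct {
	paused map[int64]map[Identifier]bool
}

func newPauseStore() *pauseStore {
	return &pauseStore{paused: make(map[int64]map[Identifier]bool)}
}

// PausePair pauses a single (account, identifier) pair.
func (s *pauseStore) PausePair(regID int64, id Identifier) {
	if s.paused[regID] == nil {
		s.paused[regID] = make(map[Identifier]bool)
	}
	s.paused[regID][id] = true
}

// CheckPairsPaused takes any number of identifiers and returns the ones
// that are paused for the account, so the caller never needs a singular variant.
func (s *pauseStore) CheckPairsPaused(regID int64, ids []Identifier) []Identifier {
	var out []Identifier
	for _, id := range ids {
		if s.paused[regID][id] {
			out = append(out, id)
		}
	}
	return out
}

// UnpauseAccount unpauses every identifier for the account and reports how many.
func (s *pauseStore) UnpauseAccount(regID int64) int {
	n := len(s.paused[regID])
	delete(s.paused, regID)
	return n
}

// GetPausedIdentifiersForAccount returns at most limit paused identifiers,
// sorted so that the truncated list is stable.
func (s *pauseStore) GetPausedIdentifiersForAccount(regID int64, limit int) []Identifier {
	ids := make([]Identifier, 0, len(s.paused[regID]))
	for id := range s.paused[regID] {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		if ids[i].Type != ids[j].Type {
			return ids[i].Type < ids[j].Type
		}
		return ids[i].Value < ids[j].Value
	})
	if len(ids) > limit {
		ids = ids[:limit]
	}
	return ids
}

func main() {
	s := newPauseStore()
	s.PausePair(1, Identifier{Type: "dns", Value: "a.example"})
	s.PausePair(1, Identifier{Type: "dns", Value: "b.example"})

	query := []Identifier{
		{Type: "dns", Value: "a.example"},
		{Type: "dns", Value: "c.example"},
	}
	fmt.Println(s.CheckPairsPaused(1, query))
	fmt.Println(s.GetPausedIdentifiersForAccount(1, 1))
	fmt.Println(s.UnpauseAccount(1))
}
```

The point of the sketch is the shape of the calls: one check method that accepts a list of typed identifiers, one pause method, one account-wide unpause, and a truncated listing for the self-service page.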
On its face I think this is a good idea. However, it was implemented this way because I had always assumed that these observations would be confined to the RA. I'm a little uncomfortable with the idea of emitting metrics dependent on business logic inside of the storage layer.
Fair. I'll fix this.
You're right, UnpausePair was an oversight on my part.
Thanks, I'll add this.
I totally agree with this line of thinking. But for me it's a balancing act. I think emitting the "repaused" metric from the SA is slightly unfortunate, because it does feel like an RA sort of thing to be measuring. But I think that exposing twice as many methods from the SA and having strict calling conventions for those methods that will fail if the RA is ever wrong (or races against another RA!) is significantly more unfortunate. Requiring every potential SA client to have identical "call CheckPairsPaused, then depending on its answer, either call Pause or Repause, and gracefully handle the error in case the situation has changed between those two calls" logic feels much worse than allowing SA clients to just say "hey, pause this one" and letting the SA handle the intricacies of making that idempotent inside a database transaction.
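A minimal sketch of the "hey, pause this one" shape, assuming a mutex as a stand-in for the database transaction. The names `store`, `pairKey`, `pauseResult`, and the `counts` map are hypothetical and only illustrate how a single idempotent method can decide between the pause and repause cases itself and record the matching metric; the actual SA code would make that decision inside a SQL transaction.

```go
package main

import (
	"fmt"
	"sync"
)

// pairKey identifies one (account, identifier value) pair.
type pairKey struct {
	RegID      int64
	Identifier string
}

// pauseResult reports which case PausePair found itself in.
type pauseResult int

const (
	Paused        pauseResult = iota // pair had never been paused
	Repaused                         // pair was paused before, then unpaused
	AlreadyPaused                    // pair is currently paused; nothing changes
)

// store is a hypothetical stand-in for the SA. The mutex plays the role
// of the database transaction: the check and the write happen atomically.
type store struct {
	mu     sync.Mutex
	rows   map[pairKey]bool    // true = currently paused, false = unpaused
	counts map[pauseResult]int // stand-in for per-case metrics
}

func newStore() *store {
	return &store{rows: make(map[pairKey]bool), counts: make(map[pauseResult]int)}
}

// PausePair is idempotent: callers never need to check state first.
func (s *store) PausePair(k pairKey) pauseResult {
	s.mu.Lock()
	defer s.mu.Unlock()

	active, seen := s.rows[k]
	var res pauseResult
	switch {
	case !seen:
		res = Paused
	case !active:
		res = Repaused
	default:
		res = AlreadyPaused
	}
	s.rows[k] = true
	s.counts[res]++
	return res
}

// UnpauseAccount marks every pair for the account as unpaused.
func (s *store) UnpauseAccount(regID int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for k := range s.rows {
		if k.RegID == regID {
			s.rows[k] = false
		}
	}
}

func main() {
	s := newStore()
	k := pairKey{RegID: 1, Identifier: "example.com"}

	// Many racing callers all issue the same request; none of them
	// needs check-then-act logic or error handling for a lost race.
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.PausePair(k)
		}()
	}
	wg.Wait()

	fmt.Println("paused:", s.counts[Paused], "already paused:", s.counts[AlreadyPaused])
}
```

With this shape, two callers racing to pause the same pair cannot both take the first-pause branch, which is the failure mode the separate Pause and Repause methods would push onto every client.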
I agree and as long as I've got support from you I'm happy to implement this as described above.
Re-LGTM, with a pile of optional nits :)
Tracking this change in SRE issue: IN-10533
Add the storage implementation for our new (account, hostname) pair pausing feature.
Part of #7406
Part of #7475