New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve renewal rate limiting #2832
Conversation
As described in Boulder issue #2800 the implementation of the SA's `countCertificates` function meant that the renewal exemption for the Certificates Per Domain rate limit was difficult to work with. To maximize allotted certificates clients were required to perform all new issuances first, followed by the "free" renewals. This arrangement was difficult to coordinate. In this commit `countCertificates` is updated such that renewals are excluded from the count reliably. To do so the SA takes the serials it finds for a given domain from the issuedNames table and cross references it with the FQDN sets it can find for the associated serials. With the FQDN sets a second query is done to find all the non-renewal FQDN sets for the serials, giving a count of the total non-renewal issuances to use for rate limiting.
s/it with/them with/ |
sa/sa.go
Outdated
@@ -434,19 +434,18 @@ func (ssa *SQLStorageAuthority) countCertificatesByExactName(domain string, earl | |||
} | |||
|
|||
// countCertificates returns, for a single domain, the count of | |||
// certificates issued in the given time range for that domain using the | |||
// non-renewal certificates issued in the given time range for that domain using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change this to "non-renewal certificate issuances," which I think parses a bit better.
sa/sa.go
Outdated
// provided query, assumed to be either `countCertificatesExactSelect` or | ||
// `countCertificatesSelect`. | ||
// `countCertificatesSelect`. Renewals of certificates issued within the same | ||
// window are considered "free" and not returned as part of this count. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"and are not counted" instead of "not returned as part of this count."
sa/sa.go
Outdated
@@ -1031,6 +1046,101 @@ func (ssa *SQLStorageAuthority) CountFQDNSets(ctx context.Context, window time.D | |||
return count, err | |||
} | |||
|
|||
// getFQDNSetsBySerials returns a slice of []byte `setHash` entries from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do type setHash []byte
so we can use types to clearly express the relationship between what this function returns and what the other function takes.
Also, this comment mostly documents the return type and parameters (which can be deduced from the code). Instead, it's better to talk more about the "why." E.g. "finds the setHashes corresponding to a set of certificate serials so we can check whether any previous certificate was issued with the same set of names."
sa/sa.go
Outdated
qmarks[i] = "?" | ||
} | ||
query := "SELECT setHash FROM fqdnSets " + | ||
"WHERE serial IN (" + strings.Join(qmarks, ",") + ") " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trailing space in string.
sa/sa.go
Outdated
query, | ||
params...) | ||
|
||
if err != nil && err != sql.ErrNoRows { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ErrNoRows should be considered an error here because it's an internal consistency violation. If the serial existed in issuedNames, it should exist here.
sa/sa.go
Outdated
} | ||
|
||
// getNewIssuancesByFQDNSet returns a count of new issuances (renewals are not | ||
// included) for a given slice of fqdnSets that were issued after the earliest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/that were issued after/that occurred after/
sa/sa.go
Outdated
|
||
processedSetHashes := make(map[string]struct{}) | ||
issuanceCount := 0 | ||
// Loop through each set hash result, counting issuance per unique set hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*issuances
sa/sa.go
Outdated
key := string(result.SetHash) | ||
// Skip set hashes that we have already processed - we only care about the | ||
// first issuance | ||
if _, exists := processedSetHashes[key]; exists { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: if we use a map[string]bool
instead, this becomes much simpler:
if processedSetHashes[key] {
In general, I encourage maps of bools for sets rather than maps of empty struct unless the set is very large, since the readability is typically more important than size in memory. I acknowledge that my advice on this may have changed during the course of the project. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great!
sa/sa.go
Outdated
_, err := ssa.dbMap.Select( | ||
&fqdnSets, | ||
query, | ||
params...) | ||
|
||
if err != nil && err != sql.ErrNoRows { | ||
// NOTE(@cpu): We don't explicitly check if `err != sql.ErrNoRows` here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but even better would be to use the same number of lines to explicitly check for ErrNoRows, and log an error saying something about "internal consistency violation" like above. Leans a little more in the "self-documenting code" direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved (I pushed the commit but it hasn't shown up yet. Maybe some GH lag?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There it is: Fixed in 76e30d1
sa/sa.go
Outdated
// to exist here. Otherwise an internal consistency violation occured and | ||
// needs to be audit logged | ||
if err != nil { | ||
err := fmt.Errorf("getFQDNSetsBySerials returned no rows - internal consistency violation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will mask other non sql.ErrNoRows
errors, should probably be a special case.
for i, setHash := range fqdnSets { | ||
// We have to cast the setHash back to []byte here since the sql package | ||
// isn't able to convert `sa.setHash` for the parameter value itself | ||
params[i] = []byte(setHash) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: probably doesn't make sense for this since it's a single use but we could add a converter case to sa/type-converter.go
so that gorp knows how to natively use the setHash
type without having to cast it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. Agreed that its probably not worth it here, but good to know for the future.
sa/sa.go
Outdated
// were visible within our search window | ||
fqdnSets, err := ssa.getFQDNSetsBySerials(serials) | ||
if err != nil { | ||
return -1, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually use -1
here? Given our style is generally not to use trapdoor returns for cases where we fail to check the returned error I'd probably rather just use 0
, but ¯_(ツ)_/¯.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was following the convention that the function was using before I touched it: https://github.com/letsencrypt/boulder/blob/master/sa/sa.go#L469:L471
// should loudly complain | ||
if err == sql.ErrNoRows || len(results) == 0 { | ||
ssa.log.AuditErr(fmt.Sprintf("Found no results from fqdnSets for setHashes known to exist: %#v", fqdnSets)) | ||
return 0, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per above comment you're also not consistently returning -1
so I'm not sure I see the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - in that case I'll just update the whole thing to use 0
. I agree that a -1
sentinel value is un-gophery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small comments, otherwise looks good.
@rolandshoemaker I believe this is ready for another 🔍 Thanks! |
As described in Boulder issue #2800 the implementation of the SA's
countCertificates
function meant that the renewal exemption for theCertificates Per Domain rate limit was difficult to work with. To
maximize allotted certificates clients were required to perform all new
issuances first, followed by the "free" renewals. This arrangement was
difficult to coordinate.
In this PR
countCertificates
is updated such that renewals areexcluded from the count reliably. To do so the SA takes the serials it
finds for a given domain from the issuedNames table and cross references
them with the FQDN sets it can find for the associated serials. With the
FQDN sets a second query is done to find all the non-renewal FQDN sets
for the serials, giving a count of the total non-renewal issuances to
use for rate limiting.
Resolves #2800