This repository has been archived by the owner on May 29, 2023. It is now read-only.

Create roots analyzer #40

Merged
merged 1 commit into from
Jan 29, 2020

Conversation

RJPercival
Contributor

@RJPercival RJPercival commented Sep 24, 2019

Creates incident reports when the root certificates returned by a CT Log's get-roots endpoint change.

@RJPercival RJPercival added the enhancement New feature or request label Sep 24, 2019
@googlebot googlebot added the cla: yes https://cla.developers.google.com/ label Sep 24, 2019
@codecov-io

codecov-io commented Sep 24, 2019

Codecov Report

Merging #40 into master will increase coverage by 0.86%.
The diff coverage is 70.96%.


@@            Coverage Diff             @@
##           master      #40      +/-   ##
==========================================
+ Coverage   64.94%   65.81%   +0.86%     
==========================================
  Files           8        9       +1     
  Lines         368      430      +62     
==========================================
+ Hits          239      283      +44     
- Misses        115      126      +11     
- Partials       14       21       +7
Impacted Files Coverage Δ
rootsanalyzer/rootsanalyzer.go 70.96% <70.96%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6671dc5...e49d05f. Read the comment docs.

@RJPercival
Contributor Author

There's probably a better name than "rootsanalyzer" for this - maybe "rootswatcher" or "rootsmonitor"?

@RJPercival
Contributor Author

@taknira, will you have time to review this soon?

@taknira
Contributor

taknira commented Oct 21, 2019

Yup, on it :)

Contributor

@taknira taknira left a comment

Have looked at everything except the tests so far - have some comments (mostly tiny things or thinking out loud) to be going on with :)

storage/storage.go Outdated Show resolved Hide resolved
storage/storage.go Show resolved Hide resolved
storage/storage.go Outdated Show resolved Hide resolved
@@ -41,3 +41,19 @@ type STHWriter interface {
type RootsWriter interface {
	WriteRoots(ctx context.Context, l *ctlog.Log, roots []*x509.Certificate, receivedAt time.Time) error
}

// RootSetID uniquely identifies a specific set of certificates, regardless of their order.
type RootSetID string
Contributor

You had a bunch of reasons for making this a string instead of []byte - what were they again?

Contributor Author

As a string, it can be used as the key in a map. This is helpful since that's exactly what it is - a key for referring to a set of certificates. If I make it a []byte, I have to cast it to string whenever I use it with a map. It is also trivially comparable with other RootSetIDs as a string. However, there aren't any significant obstacles to making it a []byte, so I'd be fine with changing it if you've got strong reasons for doing so. The only one I can think of is that it's not really a printable string, so []byte is semantically more accurate.
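To illustrate the trade-off described above, here is a minimal sketch in Go. `RootSetID` matches the type in the diff; `rawID`, `countSeen` and `countSeenRaw` are hypothetical names used only for this comparison and are not part of the PR.

```go
package main

import (
	"bytes"
	"fmt"
)

// RootSetID mirrors the type under review: an opaque identifier stored as a string.
type RootSetID string

// rawID is a hypothetical []byte-based alternative, shown only for comparison.
type rawID []byte

// countSeen uses the string-based ID directly as a map key.
func countSeen(ids []RootSetID) map[RootSetID]int {
	seen := make(map[RootSetID]int)
	for _, id := range ids {
		seen[id]++
	}
	return seen
}

// countSeenRaw must convert each []byte ID to a string before using it as a key,
// because slices are not comparable and cannot be map keys.
func countSeenRaw(ids []rawID) map[string]int {
	seen := make(map[string]int)
	for _, id := range ids {
		seen[string(id)]++
	}
	return seen
}

func main() {
	// The ID need not be printable text; arbitrary bytes are fine in a Go string.
	id := RootSetID("\x01\x02\x03")
	seen := countSeen([]RootSetID{id, "\x01\x02\x03"})
	fmt.Println(seen[id], id == RootSetID("\x01\x02\x03"))

	// The []byte form needs bytes.Equal for comparison.
	a, b := rawID{1, 2, 3}, rawID{1, 2, 3}
	fmt.Println(countSeenRaw([]rawID{a, b})[string(a)], bytes.Equal(a, b))
}
```

Either representation holds the same bytes; the difference is only in how much conversion the call sites need.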

Contributor

I'm relaxed, I've been going back and forth on this in my head. It looks like the only map place it's used atm is for testing, and so that shouldn't be the sole reason for it to be a string, but I also don't have strong reasons why it should be []byte, so whichever is fine. I think in my head we'd be calculating it by doing a bunch of hashing, so that felt bytey, but then having the ID as []byte might cause problems if we want to use it as part of a primary key in a DB, depending on the db type. So author's choice - I leave it up to you :)

storage/testonly/roots.go Outdated Show resolved Hide resolved
glog.Errorf("rootsanalyzer: %s", err)
return
}
addedCerts, removedCerts := diffRootSets(oldRoots, newRoots)
Contributor

A thought, that can be done in a different PR if you agree with it - it might be useful to also store the diffs alongside the root sets or RootSetIDs or something, so that changes are stored somewhere that's not just in the incident DB? Just a thought. Depends on your db schema of course, but having a list of RootSetIDs in the order that we saw them (or something), and storing any change (or maybe a representation of the change if the change could be too hefty) between that one and the previous one might be useful? WDYT?

Contributor

@Mercurrent once you're familiar with this stuff it'd be good to get your take on this too!

Contributor Author

That's something I planned to have in the design doc, but haven't actually run into a need for so far. @Mercurrent may wish to add this when it becomes necessary.

@@ -41,3 +41,19 @@ type STHWriter interface {
type RootsWriter interface {
	WriteRoots(ctx context.Context, l *ctlog.Log, roots []*x509.Certificate, receivedAt time.Time) error
Contributor

Does WriteRoots need a comment, or a future tweak, to note that it should either create a RootSetID or take one as an argument, now that this concept has been introduced? It may be that (again, in a diff PR) the Roots Getter sorts and de-dupes the set of roots it gets, and calculates the RootSetID before calling WriteRoots, or something.... This is thinking about where the interface line should be drawn - I'm pro making as much of the useful functionality sit above the interface line as possible, at least in cases where we have a clear idea of how it should work. WDYT?

Contributor Author

I was letting the RootsWriter implementation handle calculating the RootSetID, since it's arguably implementation-specific. The RootSetID isn't defined by Monologue to be anything but an opaque, deterministic identifier. I could make WriteRoots() return the RootSetID, which makes this a bit more obvious?

rootsanalyzer/rootsanalyzer.go Show resolved Hide resolved
rootsanalyzer/rootsanalyzer.go Show resolved Hide resolved
rootsanalyzer/rootsanalyzer.go Outdated Show resolved Hide resolved
Contributor

@taknira taknira left a comment

Aaaand looked at the tests :)

rootsanalyzer/rootsanalyzer_test.go Outdated Show resolved Hide resolved
rootsanalyzer/rootsanalyzer_test.go Show resolved Hide resolved
{{ end }}{{ end }}{{ if gt (len .RemovedCerts) 0 }}
Certificates removed ({{ len .RemovedCerts }}):
{{ range .RemovedCerts }}{{ .Subject }}
{{ end }}{{ end }}`))
Contributor

If only the subject of the certs will be going into the incident reports, it's almost definitely worth storing the complete cert diff data alongside the RootSetID or something (as mentioned in another comment), just in case there end up being roots with the same subject field, but different keys, or something? It'd be good to have the full diff info somewhere I think?

Contributor Author

I could include the full diff in here; I can imagine that being useful.

Contributor Author

For now, I've just included the SHA256 hash of the cert, which makes it clear if something about the cert has changed (if it's in both the added and removed sections). That hash can be thrown into https://crt.sh to get the full cert details. I think the full diff could be deferred for a future PR.

Contributor

Yup, SHA256 hash should do the trick for now :)

rootsanalyzer/rootsanalyzer.go Outdated Show resolved Hide resolved
rootsanalyzer/rootsanalyzer.go Outdated Show resolved Hide resolved
storage/testonly/doc.go Outdated Show resolved Hide resolved
@taknira
Contributor

taknira commented Jan 29, 2020

There are just a couple of small things left to tweak (logStr, testonly doc comment location) so approving so you can go ahead and merge once those are done.

Creates incident reports when the root certificates returned by a CT Log's
get-roots endpoint change.
@RJPercival RJPercival merged commit 0d4f685 into google:master Jan 29, 2020
Minimum Viable Monitor automation moved this from In progress to Done Jan 29, 2020
@RJPercival RJPercival deleted the roots_analyzer branch January 29, 2020 17:46
Labels
cla: yes https://cla.developers.google.com/ enhancement New feature or request

5 participants