Add README.md clarifying TLS registry purpose and processes #28434
tls/README.md
Outdated
## Registry purpose

The registry is used to collect all TLS artifacts used in OpenShift, certificate key pairs and CA bundles alike.
For simplicity this document will use "certificate" for both certificate, key or CA bundle.

Suggested change:
- For simplicity this document will use "certificate" for both certificate, key or CA bundle.
+ For simplicity this document will use "pki artifact" for both certificate, key or CA bundle.
? It's confusing to read certificate where it's actually a bundle.
I really wanted to get away with this and just use "cert" throughout the text :/
Not sure if "pki artifact" is better, maybe "tls artifact"?
tls/README.md
Outdated

In order to ensure certificates are following a set of defined standards, we need a way to collect
all certificates used in OpenShift. This is done via "[sig-arch][Late] collect certificate data" test.
The test produces a json in `openshift-e2e-test/artifacts/rawTLSInfo`:

Suggested change:
- The test produces a json in `openshift-e2e-test/artifacts/rawTLSInfo`:
+ The test produces a json in `openshift-e2e-test/artifacts/rawTLSInfo/<put-the-pattern-of-filename-here>json`:
tls/README.md
Outdated

This stores the following info:
* all secrets / configmaps which are certificates along with their metadata
* parses them and stores information

Suggested change:
- * parses them and stores information
+ * parses them and stores metadata
tls/README.md
Outdated

Recently we've added `openshift.io/owning-component: foobar` annotation to (almost) all certificates,
so that issues related to this certificate would be routed to a proper location. For instance,
certificates created by `service-ca` are managed by network team regardless of the repo these are

Suggested change:
- certificates created by `service-ca` are managed by network team regardless of the repo these are
+ certificates created by `service-ca` are managed by service-ca jira component regardless of the repo these are
The ownership by jira components, not teams, is extremely intentional. Teams can move and dissolve, but we're good about making sure every jira component has an owner.
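For readers following along, here is a minimal sketch of how the `openshift.io/owning-component` annotation discussed above could be read from an object's annotation map. The `owningComponent` helper and the `"Unknown"` fallback value are assumptions made for illustration; they are not code from this PR.

```go
package main

import "fmt"

// Annotation key quoted in the README text under review.
const owningComponentAnnotation = "openshift.io/owning-component"

// unknownOwner is a hypothetical placeholder for artifacts without an owner.
const unknownOwner = "Unknown"

// owningComponent returns the Jira component named in the annotation,
// or the placeholder when the annotation is missing or empty.
func owningComponent(annotations map[string]string) string {
	// Reading from a nil map is safe in Go and yields the empty string.
	if owner := annotations[owningComponentAnnotation]; owner != "" {
		return owner
	}
	return unknownOwner
}

func main() {
	annotated := map[string]string{owningComponentAnnotation: "service-ca"}
	fmt.Println(owningComponent(annotated))
	fmt.Println(owningComponent(nil))
}
```

Keying the lookup on a component name rather than a team name matches the point made in the comment: components keep an owner even when teams change.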
tls/README.md
Outdated
Apart from reports the registry is checked for requirement violations.
`make update` creates `tls/violations/ownership/ownership-violations.json` listing certificate locations
which don't have ownership annotation set. This file is meant to be "remove-only", meaning adding
new entries is prohibited. This is enforced by using a separate `OWNERS` file for this directory.

Suggested change:
- new entries is prohibited. This is enforced by using a separate `OWNERS` file for this directory.
+ new entries is prohibited. This is enforced using an e2e test and by using a separate `OWNERS` file for this directory.
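The "remove-only" rule can be pictured as a set comparison between the committed violations and a freshly generated list. The sketch below is only an illustration of that idea: `newViolations` and the string-based location keys are hypothetical and do not reflect how the e2e test in this PR is implemented.

```go
package main

import (
	"fmt"
	"sort"
)

// newViolations returns the entries present in current but absent from
// baseline. Entries that disappeared from current are never reported,
// which is what makes the baseline "remove-only".
func newViolations(baseline, current []string) []string {
	known := make(map[string]struct{}, len(baseline))
	for _, v := range baseline {
		known[v] = struct{}{}
	}
	var added []string
	for _, v := range current {
		if _, ok := known[v]; !ok {
			added = append(added, v)
		}
	}
	sort.Strings(added)
	return added
}

func main() {
	// Hypothetical namespace/name locations, for illustration only.
	baseline := []string{"ns-a/secret-1", "ns-b/configmap-2"}
	current := []string{"ns-b/configmap-2", "ns-c/secret-3"}
	if added := newViolations(baseline, current); len(added) > 0 {
		fmt.Println("new violations:", added)
	}
}
```

A check of this shape lets fixes land freely (entries drop out of the list) while any newly introduced unannotated artifact is flagged.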
tls/README.md
Outdated

Reports and violations mechanism can be extended to add new requirements. In order to add a new
certificate metadata requirements developers would need to:
* implement `Requirement` interface from `pkg/cmd/update-tls-artifacts/generate-owners/tlsmetadata/tlsmetadatainterfaces/types.go`:
add courtesy link
certificate metadata requirements developers would need to:
* implement `Requirement` interface from `pkg/cmd/update-tls-artifacts/generate-owners/tlsmetadata/tlsmetadatainterfaces/types.go`:
```golang
type Requirement interface {
```
followup consideration: we could streamline further by having annotation requirements indicate their annotations.
tls/README.md
Outdated

Markdown report can also be customized:
```golang
func generateOwnershipMarkdown(pkiInfo *certgraphapi.PKIRegistryInfo) ([]byte, error) {
```
indicate method and provide link. I think once we generalize the annotation based requirements most people won't need this.
5842e02 to e45da70
tls/README.md
Outdated
## Registry purpose

The registry is used to collect all TLS artifacts used in OpenShift, certificate key pairs, and CA bundles alike.
For simplicity, this document will use "certificate" for both certificate, key, or CA bundle.
I'll accept "tls artifact" if you like. But calling it a certificate is incorrect
tls/README.md
Outdated

Recently we've added the `openshift.io/owning-component: foobar` annotation to (almost) all certificates
so that issues related to this certificate would be routed to a proper location. For instance,
problems with `service-ca` certificates are filed in Jira for "Networking / cluster-network-operator" component
The service-ca is owned by the service-ca component, not the networking component
tls/README.md
Outdated

Along with the "collect tls artifacts" test the e2e test ensures that cluster certificates don't add
new certificates violating requirements via the "all registered tls artifacts must have no metadata violation regressions" test.
This test would collect the TLS registry from the cluster and generate new violation files from it. If this new file

Suggested change:
- This test would collect the TLS registry from the cluster and generate new violation files from it. If this new file
+ This test collects the TLS registry from the cluster and generates new violation files from it. If this new file

And similarly throughout. Not future ("would"), but present tense.
tls/README.md
Outdated
from PRs that add new certificates without required metadata across the entire platform. Another
test would verify that metadata for actual certificates matches metadata for known certificate locations.

Once a baseline set of metadata is determined the tests would move from flaking to properly failing.
I plan to enforce ASAP. I think we have sufficient data to do so
This will break a lot of optional operators (i.e. OpenShift GitOps or SRIOV operator), so we need a way of finding all jobs first
> This will break a lot of optional operators (i.e. OpenShift GitOps or SRIOV operator), so we need a way of finding all jobs first

do you have a link to such a job so we can see the result?
Hmm, I was wrong - I noticed the test didn't pass on a cluster with pipelines+gitops operators installed, but that's probably due to service serving CAs, which are accounted for now.
However, Hypershift jobs would certainly fail, but it's relatively easy to fix.
Just to be safe we should switch flake to fail in a separate PR so that TRT would revert just the small part
> Just to be safe we should switch flake to fail in a separate PR so that TRT would revert just the small part

Yes. I also identified 21c9d0d which I'm prepared to sacrifice for now to get the majority enforcing.
cadc31f to a26f447
Job Failure Risk Analysis for sha: a26f447
tls/README.md
Outdated

Recently we've added the `openshift.io/owning-component: foobar` annotation to (almost) all TLS artifacts
so that issues related to this TLS artifact would be routed to a proper location. For instance,
problems with `service-ca` TLS artifacts are filed in Jira for "Networking / cluster-network-operator" component

Suggested change:
- problems with `service-ca` TLS artifacts are filed in Jira for "Networking / cluster-network-operator" component
+ problems with `service-ca` TLS artifacts are filed in Jira for "service-ca" component
or perhaps you intended to point to the proxy-cas?
Yeah, I meant proxy CAs
tls/README.md
Outdated

## Certificate metadata

TLS artifact contents may however be insufficient - i.e. it's not clear which OpenShift is responsible

Suggested change:
- TLS artifact contents may however be insufficient - i.e. it's not clear which OpenShift is responsible
+ TLS artifact contents may however be insufficient - i.e. it's not clear which product component is responsible
tls/README.md
Outdated
```golang
type OwnerRequirement struct {
	name string
}

func NewOwnerRequirement() tlsmetadatainterfaces.Requirement {
	return OwnerRequirement{
		name: "ownership",
	}
}

// Return requirement name for filenames. Must be unique
func (o OwnerRequirement) GetName() string {
	return o.name
}

// Boilerplate function which processes TLS registry and produces RequirementResult or throws an error
func (o OwnerRequirement) InspectRequirement(rawData []*certgraphapi.PKIList) (tlsmetadatainterfaces.RequirementResult, error) {
	pkiInfo, err := tlsmetadatainterfaces.ProcessByLocation(rawData)
	if err != nil {
		return nil, fmt.Errorf("transforming raw data %v: %w", o.GetName(), err)
	}

	ownershipJSONBytes, err := json.MarshalIndent(pkiInfo, "", " ")
	if err != nil {
		return nil, fmt.Errorf("failure marshalling %v.json: %w", o.GetName(), err)
	}
	markdown, err := generateOwnershipMarkdown(pkiInfo)
	if err != nil {
		return nil, fmt.Errorf("failure marshalling %v.md: %w", o.GetName(), err)
	}
	violations := generateViolationJSON(pkiInfo)
	violationJSONBytes, err := json.MarshalIndent(violations, "", " ")
	if err != nil {
		return nil, fmt.Errorf("failure marshalling %v-violations.json: %w", o.GetName(), err)
	}

	return tlsmetadata.NewRequirementResult(
		o.GetName(),
		ownershipJSONBytes,
		markdown,
		violationJSONBytes)
}
```

This interface calls two internal functions:
```golang
func generateViolationJSON(pkiInfo *certgraphapi.PKIRegistryInfo) *certgraphapi.PKIRegistryInfo {
	ret := &certgraphapi.PKIRegistryInfo{}

	// Iterate over certificate-key pairs (==secrets)
	for i := range pkiInfo.CertKeyPairs {
		curr := pkiInfo.CertKeyPairs[i]
		owner := curr.CertKeyInfo.OwningJiraComponent
		// Include in the violations list if has no owner set or owner set to Unknown
		if len(owner) == 0 || owner == tlsmetadata.UnknownOwner {
			ret.CertKeyPairs = append(ret.CertKeyPairs, curr)
		}
	}

	// Iterate over CA bundles (==configmaps)
	for i := range pkiInfo.CertificateAuthorityBundles {
		curr := pkiInfo.CertificateAuthorityBundles[i]
		owner := curr.CABundleInfo.OwningJiraComponent
		// Include in the violations list if has no owner set or owner set to Unknown
		if len(owner) == 0 || owner == tlsmetadata.UnknownOwner {
			ret.CertificateAuthorityBundles = append(ret.CertificateAuthorityBundles, curr)
		}
	}

	return ret
}
```
Suggested change: replace the `OwnerRequirement` example quoted above (from `type OwnerRequirement struct` through the end of `generateViolationJSON`) with:
```golang
const annotationName string = "certificates.openshift.io/supports-offline-hostname-change"

type SupportsOfflineHostnameChange struct{}

func NewSupportsOfflineHostnameChange() tlsmetadatainterfaces.Requirement {
	md := tlsmetadatainterfaces.NewMarkdown("")
	md.Text("Offline hostname change is an SNO feature driven using tool (provide link here) while a cluster is not running.")
	return tlsmetadatainterfaces.NewAnnotationRequirement(
		// requirement name
		"offline-hostname-change",
		// cert or configmap annotation
		annotationName,
		"Supports Offline Hostname Change",
		string(md.ExactBytes()),
	)
}
```
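To show why an annotation-based requirement can be so much shorter than the hand-written `OwnerRequirement`, here is a sketch of the kind of generic check such a helper might run: flag every artifact that lacks the given annotation. The `artifact` type and `missingAnnotation` function are hypothetical and are not the actual `tlsmetadatainterfaces.NewAnnotationRequirement` implementation, which is not shown in this conversation.

```go
package main

import "fmt"

// artifact is a hypothetical, simplified stand-in for a registry entry
// (a secret or configmap location plus its annotations).
type artifact struct {
	Location    string
	Annotations map[string]string
}

// missingAnnotation returns the locations of artifacts that do not carry
// the given annotation key at all.
func missingAnnotation(artifacts []artifact, annotation string) []string {
	var violations []string
	for _, a := range artifacts {
		if _, ok := a.Annotations[annotation]; !ok {
			violations = append(violations, a.Location)
		}
	}
	return violations
}

func main() {
	const annotation = "certificates.openshift.io/supports-offline-hostname-change"
	artifacts := []artifact{
		{Location: "ns-a/secret-1", Annotations: map[string]string{annotation: "true"}},
		{Location: "ns-b/configmap-2"},
	}
	for _, location := range missingAnnotation(artifacts, annotation) {
		fmt.Println("missing annotation:", location)
	}
}
```

With the check parameterized by annotation key, a new requirement only has to supply a name, an annotation, and a description, which is the shape of the suggested `NewSupportsOfflineHostnameChange` constructor.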
a26f447 to 33164f8
Job Failure Risk Analysis for sha: 33164f8
33164f8 to c17a17c
a few typos to tidy up, but we can do that as a followup
/lgtm
/retest-required
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: deads2k, vrutkovs
Job Failure Risk Analysis for sha: c17a17c
/retest-required
@vrutkovs: The following tests failed
Merged commit 4d2a9d3 into openshift:master
[ART PR BUILD NOTIFIER] This PR has been included in build openshift-enterprise-tests-container-v4.16.0-202401051232.p0.g4d2a9d3.assembly.stream for distgit openshift-enterprise-tests.
/cc @deads2k