TLS/Cert validity reporting #161

Closed
11 tasks
andrewvc opened this issue Apr 1, 2020 · 7 comments · Fixed by elastic/kibana#64059
Labels
enhancement New feature or request test-plan test-plan-ok Indicates an issue has been tested for release v7.8.0

Comments

@andrewvc
Contributor

andrewvc commented Apr 1, 2020

Design Issue: #159

Personas / User Stories
As someone responsible for managing TLS/SSL Certificates
I want to be able to view a table of all current TLS certificates where I can determine which are in danger of becoming invalid due to either expiration or advanced age (e.g. the age limit Safari enforces)
So that my services that rely on these certificates remain available

ACs:

  • Unique page in Uptime that shows a table
  • Regular EUI Table that contains following columns:
    • status + reason (reason being the error state that is most urgent, either "expires soon" or "approaching age limit"),
    • Common Name (plus x more indicator),
    • the monitor names that contain endpoints/domains with that certificate
    • expiration date/time (col name "Valid until")
    • age of the certificate in days
    • Issuer Name,
    • current status (i.e. OK, warning, alert),
    • Fingerprints Column (Showing empty buttons for SHA-1 and SHA-256 and a copy icon, shows a tooltip with the full SHA on hover. Screenshot further down in the comments.)
  • Alert state column rules match triggers for sending an actual alert
  • Additional data for heartbeat to capture: SHA fingerprint, all domains covered by the certificate, valid from, valid to, issuing authority
  • Search: case-insensitive wildcard matching on monitor name, monitor ID, issuer.distinguished_name, subject.distinguished_name, and common name
  • Settings page adds:
    • "Certificate expiration warning threshold": default: 30 days
    • "Certificate age warning threshold" : default: 365 days
    • Descriptions for both fields say: "Change the threshold for displaying and alerting on certificate errors. Note, this will affect any configured alerts"
  • Pagination size should be remembered in local storage, as in the overview page and ping history
  • All columns are sortable
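
As a rough illustration of how the two thresholds above could drive the "reason" column, here is a minimal TypeScript sketch. All names (`evaluateCert`, `CertThresholds`, and so on) are hypothetical, and the choice to treat "expires soon" as more urgent than "approaching age limit" is an assumption; the ACs do not spell out the precedence or the warning/alert split.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical shape for the two settings described in the ACs.
interface CertThresholds {
  expirationDays: number; // "Certificate expiration warning threshold"
  ageDays: number; // "Certificate age warning threshold"
}

// Defaults taken from the ACs: 30 days and 365 days.
const DEFAULT_THRESHOLDS: CertThresholds = { expirationDays: 30, ageDays: 365 };

type CertReason = 'expires soon' | 'approaching age limit';

interface CertEvaluation {
  ageDays: number; // whole days since not_before
  daysUntilExpiry: number; // whole days until not_after (negative once expired)
  reason: CertReason | null; // null when neither threshold is crossed
}

function evaluateCert(
  notBefore: string,
  notAfter: string,
  now: Date,
  thresholds: CertThresholds = DEFAULT_THRESHOLDS
): CertEvaluation {
  const ageDays = Math.floor((now.getTime() - Date.parse(notBefore)) / DAY_MS);
  const daysUntilExpiry = Math.floor((Date.parse(notAfter) - now.getTime()) / DAY_MS);

  let reason: CertReason | null = null;
  // Assumption: expiration is checked first because it is treated as more urgent.
  if (daysUntilExpiry <= thresholds.expirationDays) {
    reason = 'expires soon';
  } else if (ageDays >= thresholds.ageDays) {
    reason = 'approaching age limit';
  }
  return { ageDays, daysUntilExpiry, reason };
}
```

Passing `now` explicitly keeps the function deterministic, which should make it easier to reuse the same rule for both the table column and the alert trigger.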
@andrewvc andrewvc added enhancement New feature or request [zube]: Ready labels Apr 1, 2020
@katrin-freihofner
Contributor

Design issue

@katrin-freihofner
Contributor

Figma link
Prototype link

andrewvc added a commit to andrewvc/beats that referenced this issue Apr 13, 2020
This patch adds additional [ECS
fields](https://www.elastic.co/guide/en/ecs/current/ecs-tls.html).

Sample output of the `tls.*` fields with this patch is below. Note the
somewhat strange nesting of data in `issuer` and `subject`. This is per
the ECS spec, but a bit awkward. We may want to break this data out into
the more specific ECS `x509` type in the future. For UI work we are likely
fine to parse this on the client and display the CN section in most
cases.

```json
{
  "version": "1.2",
  "version_protocol": "tls",
  "cipher": "ECDHE-RSA-AES-128-GCM-SHA256",
  "server": {
    "subject": "CN=r2.shared.global.fastly.net,O=Fastly\\, Inc.,L=San Francisco,ST=California,C=US",
    "hash": {
      "sha1": "b7b4b89ef0d0caf39d223736f0fdbb03c7b426f1",
      "sha256": "12b00d04db0db8caa302bfde043e88f95baceb91e86ac143e93830b4bbec726d"
    },
    "not_before": "2019-08-16T01:40:25.000Z",
    "not_after": "2020-07-16T03:15:39.000Z",
    "issuer": "CN=GlobalSign CloudSSL CA - SHA256 - G3,O=GlobalSign nv-sa,C=BE"
  },
  "certificate_not_valid_before": "2019-08-16T01:40:25.000Z",
  "certificate_not_valid_after": "2020-07-16T03:15:39.000Z",
  "established": true,
  "rtt": {
    "handshake": {
      "us": 42491
    }
  }
}
```

Work goes towards elastic/uptime#161
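
For the client-side parsing mentioned in the commit message above, a minimal TypeScript sketch of pulling the CN out of a distinguished name might look like the following. The function names are hypothetical, and this is not a full RFC 4514 parser: it only handles backslash-escaped characters (such as the `\,` in `Fastly\, Inc.`) and ignores hex escapes and multi-valued RDNs.

```typescript
// Split a distinguished name on commas that are not escaped with a backslash.
function splitDistinguishedName(dn: string): string[] {
  const parts: string[] = [];
  let current = '';
  for (let i = 0; i < dn.length; i++) {
    const ch = dn[i];
    if (ch === '\\' && i + 1 < dn.length) {
      // Keep the escaped character and drop the backslash itself.
      current += dn[i + 1];
      i++;
    } else if (ch === ',') {
      parts.push(current.trim());
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.trim() !== '') {
    parts.push(current.trim());
  }
  return parts;
}

// Return the value of the first CN attribute, or undefined if there is none.
function extractCommonName(dn: string): string | undefined {
  for (const part of splitDistinguishedName(dn)) {
    const eq = part.indexOf('=');
    if (eq > 0 && part.slice(0, eq).trim().toUpperCase() === 'CN') {
      return part.slice(eq + 1).trim();
    }
  }
  return undefined;
}

// Usage with the subject from the sample output (after JSON decoding,
// the string contains a single backslash before the comma).
const subject =
  'CN=r2.shared.global.fastly.net,O=Fastly\\, Inc.,L=San Francisco,ST=California,C=US';
console.log(extractCommonName(subject));
```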
@mdelapenya

I've some questions about the ACs:

> Unique page in Uptime that shows a table

I'd like to know if there is a strong business requirement to limit the number of tables in Uptime. If not, I would not add this criterion to the list, or at least not to the spec. If it is a business requirement, I'd like to understand the importance of tables here. Could you please help me here? 🙏

> Regular EUI Table that contains following columns:

I see this as an implementation detail, because the table could also be rendered as a JSON object in an API request. Please correct me if I'm wrong.

> Alert state column rules match triggers for sending an actual alert

Is there a place to look up the rules defining the matches between certificate states and alert states? I'd like to create examples for this use case.

> Search: Wildcards on monitor name, monitor ID, issuer.distinguished_name, subject.distinguished_name, common name

As a minor change, I'd use real field names (as they are named in the index) instead of English names, to avoid typos.

> Settings "Certificate expiration warning threshold": default: 30 days

How are the settings initialised? Does it happen on Kibana's startup? On heartbeat creating the index? I'd like to define the preconditions that enable this use case, which is why I need to understand how the settings are populated for the first time.

> Descriptions for both fields say: "Change the threshold for displaying and alerting on certificate errors. Note, this will affect any configured alerts"

I see this requirement is hidden in a description text, when it should have its own AC: "Changing the threshold for alerting on certificate errors will affect any configured alerts"

@andrewvc
Contributor Author

@mdelapenya It sounds like the goal is to have feature files describe just the business use case, but not any of the design details. So far we're using the issue here to capture the output of our discussions, which includes a mix of product decisions and design decisions.

With regard to the feature files, would it be fair to say they capture user stories, without particular notes about implementation? I agree we could do a better job using user stories to plan features, so that would make sense, but I'd like to hear more from you here.

@mdelapenya

Yes, I agree we need both: technical (design, implementation, ...) and product details. Maybe we can have, in the same issue, two lists of ACs: one for product criteria and one for technical criteria. It would simply mean splitting the above list into two, here in the issue. Does it make sense?

About the feature files, yes, they should capture user stories, with no implementation detail (no clicking, no typing, no browsing). Instead, we should use product verbs (search for certificates, configure a certificate, create an alert, expire a certificate...).

With the great work in the description of this issue it would be very easy to translate into feature files.

@katrin-freihofner
Contributor

katrin-freihofner commented Apr 22, 2020

The copy SHA action could look like this:
Screenshot 2020-04-22 at 18 20 59

I'm going to update the issue description accordingly

andrewvc added a commit to elastic/beats that referenced this issue Apr 27, 2020
Work in support of elastic/uptime#161

This patch adds additional ECS [TLS](https://www.elastic.co/guide/en/ecs/current/ecs-tls.html) and [x509](elastic/ecs#762) fields. Note that we are blocked on the x509 fields which are not yet merged into ECS.

Sample output of the `tls.*` fields with this patch is below. Note the somewhat strange nesting of data in `issuer` and `subject`. This is per the ECS spec, but a bit awkward. We may want to break this data out into the more specific ECS `x509` type in the future. For UI work we are likely fine to parse this on the client and display the CN section in most cases. I did break out the CN into its own field in `x509.subject/issuer.common_name`. However, if we do want to aggregate on issuer in the future it's good to have the full distinguished name to do that on.

This PR also refactors some `libbeat` code around parsing TLS versions and adds test coverage there as well.

```json
{
	"tls": {
		"certificate_not_valid_after": "2020-07-16T03:15:39Z",
		"certificate_not_valid_before": "2019-08-16T01:40:25Z",
		"server": {
			"hash": {
				"sha1": "b7b4b89ef0d0caf39d223736f0fdbb03c7b426f1",
				"sha256": "12b00d04db0db8caa302bfde043e88f95baceb91e86ac143e93830b4bbec726d"
			},
			"x509": {
				"issuer": {
					"common_name": "GlobalSign CloudSSL CA - SHA256 - G3",
					"distinguished_name": "CN=GlobalSign CloudSSL CA - SHA256 - G3,O=GlobalSign nv-sa,C=BE"
				},
				"not_after": "2020-07-16T03:15:39Z",
				"not_before": "2019-08-16T01:40:25Z",
				"public_key_algorithm": "RSA",
				"public_key_size": 2048,
				"serial_number": "26610543540289562361990401194",
				"signature_algorithm": "SHA256-RSA",
				"subject": {
					"common_name": "r2.shared.global.fastly.net",
					"distinguished_name": "CN=r2.shared.global.fastly.net,O=Fastly\\, Inc.,L=San Francisco,ST=California,C=US"
				}
			}
		}
	}
}
```
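
To illustrate how the certificates table might consume these fields, here is a minimal TypeScript sketch that types the relevant subset of the sample document and maps it to a table row. The interface and function names (`TlsFields`, `CertRow`, `toCertRow`, `countByIssuer`) are hypothetical and not part of Heartbeat or Kibana; only the field names come from the sample output above.

```typescript
interface X509Name {
  common_name: string;
  distinguished_name: string;
}

// Subset of the `tls` object from the sample output that the table needs.
interface TlsFields {
  server: {
    hash: { sha1: string; sha256: string };
    x509: {
      issuer: X509Name;
      subject: X509Name;
      not_before: string;
      not_after: string;
    };
  };
}

// Hypothetical row shape for the certificates table.
interface CertRow {
  commonName: string;
  issuer: string;
  issuerDistinguishedName: string;
  validUntil: string;
  sha1: string;
  sha256: string;
}

function toCertRow(tls: TlsFields): CertRow {
  const { hash, x509 } = tls.server;
  return {
    commonName: x509.subject.common_name,
    issuer: x509.issuer.common_name,
    issuerDistinguishedName: x509.issuer.distinguished_name,
    validUntil: x509.not_after,
    sha1: hash.sha1,
    sha256: hash.sha256,
  };
}

// Count certificates per issuer, keyed on the full distinguished name
// so that issuers sharing a common name are not merged.
function countByIssuer(rows: CertRow[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const row of rows) {
    counts[row.issuerDistinguishedName] = (counts[row.issuerDistinguishedName] || 0) + 1;
  }
  return counts;
}
```

Keying the count on `distinguished_name` rather than `common_name` follows the reasoning in the commit message that the full DN is the safer field to aggregate on.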

## How to test this PR locally

Run against TLS/Non-TLS endpoints
andrewvc added a commit to andrewvc/beats that referenced this issue Apr 27, 2020
(cherry picked from commit eb2dc26)
andrewvc added a commit to elastic/beats that referenced this issue Apr 29, 2020
#18029)

* [Heartbeat] Add Additional ECS tls.* fields (#17687)

(cherry picked from commit eb2dc26)

* Use non-wildcard field for text

* Remove wildcard type
@justinkambic justinkambic self-assigned this May 6, 2020
@justinkambic

Tested this for 7.8.0 test plan and it LGTM. Commenting in lieu of a label.

@andrewvc andrewvc added the test-plan-ok Indicates an issue has been tested for release label May 13, 2020
@zube zube bot removed the [zube]: Done label May 26, 2020
5 participants