IoTEdge Check: Check expired production certs #5699

Merged 9 commits on Oct 22, 2021

@and-rewsmith commented Oct 15, 2021

The iotedge check for the device CA cert was added in this PR a long time ago:
https://github.com/Azure/iotedge/pull/1559/files#diff-5b05b8525be1ca907a0bc313fd71312d95a964b5f60d1903aa0fc2992a247027R1339

That PR started from the existing warning about using self-signed certs and added cert expiry information to it. We should also be checking cert expiry in general, though, which is what I have changed in this PR.

I have tested this locally, and the check errors when the device CA cert is expired. There are currently no tests for this check.

Azure IoT Edge PR checklist:

This checklist is used to make sure that common guidelines for a pull request are followed.

General Guidelines and Best Practices

  • I have read the contribution guidelines.
  • Title of the pull request is clear and informative.
  • Description of the pull request includes a concise summary of the enhancement or bug fix.

Testing Guidelines

  • Pull request includes test coverage for the included changes.
  • Description of the pull request includes
    • concise summary of tests added/modified
    • local testing done.

Draft PRs

  • Open the PR in Draft mode if it is:
    • Work in progress or not intended to be merged.
    • Encountering multiple pipeline failures and working on fixes.

Note: We use the kodiakhq bot to merge PRs once the necessary checks and approvals are in place. When it merges a PR, kodiakhq converts the PR title to the commit title, the PR description to the commit description, and squashes all the commits in the PR into a single commit. The net effect is that the entire PR becomes a single commit. Please follow the best practices mentioned here for the PR title and description.

let certificate_info = CertificateValidity::parse(
    "Device CA certificate".to_owned(),
    check.device_ca_cert_path.clone().unwrap(),
)?;
Contributor Author

This parse call is implemented here:
https://github.com/Azure/iotedge/blob/release/1.1/edgelet/iotedge/src/check/checks/identity_certificate_expiry.rs#L85

Are there any concerns that the parse will fail for certain production certificates? The relevant call the parse function uses is stack_from_pem, so as long as the file is PEM-encoded it should be fine, I think.
https://docs.rs/openssl/0.10.23/openssl/x509/struct.X509.html

Member

There is no concern.
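
For readers unfamiliar with PEM bundles: stack_from_pem reads every certificate block in a PEM file. As a rough, dependency-free illustration of what "it is a PEM" means at the text level, the sketch below counts well-formed certificate blocks. `count_pem_certificates` is a hypothetical helper, not part of iotedge or the openssl crate, and it performs none of the Base64 or X.509 validation that OpenSSL does.

```rust
const BEGIN: &str = "-----BEGIN CERTIFICATE-----";
const END: &str = "-----END CERTIFICATE-----";

/// Hypothetical helper: counts complete certificate blocks in PEM text.
/// Returns `None` when BEGIN/END markers are unbalanced or nested.
fn count_pem_certificates(pem: &str) -> Option<usize> {
    let mut count = 0;
    let mut inside = false;
    for line in pem.lines().map(str::trim) {
        match (line, inside) {
            // A BEGIN marker outside a block opens a new certificate.
            (BEGIN, false) => inside = true,
            // An END marker inside a block closes it.
            (END, true) => {
                inside = false;
                count += 1;
            }
            // A nested BEGIN or a stray END means the bundle is malformed.
            (BEGIN, true) | (END, false) => return None,
            // Base64 payload or text outside a block: ignored in this sketch.
            _ => {}
        }
    }
    if inside {
        None
    } else {
        Some(count)
    }
}
```

A file holding a device CA cert plus its chain would yield a count above one, which is the case a stack-style parse handles.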

} else if not_after < now {
    Ok(CheckResult::Failed(
        Context::new(format!(
            "The device CA cert has expired at {}. Renew the certificate to restore functionality.",
Contributor Author

This is the critical bit I have added. Let me know if I should change the wording.

Contributor

@eustacea is there any recommended industry best practice on how far in advance someone should renew their production certificates before they expire? If so then I'm wondering whether it would be helpful to include a warning check to indicate the production certificate will expire in X days. Also if you have any feedback on wording.

@and-rewsmith I suggest we continue to reference our docs on production best practices for certificates. If someone hits this error then it seems like they'd benefit from the reminder. E.g. The device CA cert expired at {}. Renew the certificate to restore functionality. See https://aka.ms/iotedge-prod-checklist-certs for best practices.

Contributor Author

I added a case for production certs expiring within the next 6 months, but can change the time range upon response from @eustacea.

Contributor

@and-rewsmith I'm fine with the new language. I did some brief research and don't believe there is a one-size-fits-all answer to how far in advance customers want to be alerted to production certificate expiry. For example, the docs for Key Vault talk about setting an alert based on either a percentage of lifetime remaining or the number of days to expiry. Where I've seen default values mentioned elsewhere, they tend to be 90 days or less (see, for example, DigiCert). I suggest we go with 90 days as the warning threshold.


Some industries might have regulations on cert expiry, but I'm not aware of a universal recommendation beyond adapting expiry to the deployment- or solution-specific security profile (e.g. from threat modeling). Since we don't have the solution-specific detail, would it be possible to specify this value through a config.toml parameter? A default threshold of 90 days sounds reasonable to me.
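
The threshold discussed in this thread could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code merged in this PR: `ExpiryStatus`, `classify_expiry`, and `DEFAULT_WARN_DAYS` are hypothetical names, and the optional `warn_days` argument only mirrors the config.toml suggestion above rather than any existing iotedge setting.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical default warning window, taken from the 90-day suggestion in this thread.
const DEFAULT_WARN_DAYS: u64 = 90;
const SECS_PER_DAY: u64 = 24 * 60 * 60;

/// Hypothetical classification of a certificate's remaining lifetime.
#[derive(Debug, PartialEq)]
enum ExpiryStatus {
    /// `not_after` is already in the past.
    Expired,
    /// Still valid, but expires within the warning window (whole days left).
    ExpiringSoon { days_left: u64 },
    /// Valid beyond the warning window.
    Valid,
}

/// Classifies a certificate by comparing its `not_after` time with `now`.
/// `warn_days` stands in for an operator-supplied override; `None` uses the default.
fn classify_expiry(
    not_after: SystemTime,
    now: SystemTime,
    warn_days: Option<u64>,
) -> ExpiryStatus {
    let warn_window =
        Duration::from_secs(warn_days.unwrap_or(DEFAULT_WARN_DAYS) * SECS_PER_DAY);
    match not_after.duration_since(now) {
        // `duration_since` returns an error when `not_after` is earlier than `now`.
        Err(_) => ExpiryStatus::Expired,
        Ok(remaining) if remaining < warn_window => ExpiryStatus::ExpiringSoon {
            days_left: remaining.as_secs() / SECS_PER_DAY,
        },
        Ok(_) => ExpiryStatus::Valid,
    }
}
```

Passing `Some(180)` would reproduce the six-month window tried earlier in the thread, while `None` applies the 90-day default.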

@and-rewsmith changed the title from "debug check" to "IoTEdge Check: Enhance certificate check to check expired production certs" Oct 18, 2021
@and-rewsmith changed the title from "IoTEdge Check: Enhance certificate check to check expired production certs" to "IoTEdge Check: Check expired production certs" Oct 18, 2021
@and-rewsmith marked this pull request as ready for review October 18, 2021 22:38
@arsing (Member) previously approved these changes Oct 18, 2021 and left a comment:

Code change looks fine. The error's wording should be checked by Venkat or Micah.

@varunpuranik previously approved these changes Oct 19, 2021
@and-rewsmith dismissed stale reviews from varunpuranik and arsing via e60a0f2 October 20, 2021 20:29

if autogenerated_certs {
    if not_after < now {
        CheckResult::Failed(
Contributor Author

Expired autogenerated certs used to be a warning, but I made it a failure.
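
To make the resulting severity rules easier to see at a glance, here is a small sketch of how the outcomes might line up after this change. The `CheckResult` enum is a simplified stand-in for the real type, `device_ca_result` is a hypothetical helper, the message strings are illustrative, and the assumption that unexpired autogenerated certs still produce a warning comes from the PR description rather than from the diff shown here.

```rust
/// Simplified stand-in for the check outcome; the real type in iotedge differs.
#[derive(Debug, PartialEq)]
enum CheckResult {
    Ok,
    Warning(String),
    Failed(String),
}

/// Hypothetical helper mapping certificate origin and expiry to an outcome.
fn device_ca_result(autogenerated_certs: bool, expired: bool) -> CheckResult {
    match (autogenerated_certs, expired) {
        // Expired autogenerated certs used to warn; they now fail.
        (true, true) => CheckResult::Failed(
            "The autogenerated device CA cert has expired.".to_owned(),
        ),
        // Expired production certs fail as well; this is the new case in the PR.
        (false, true) => CheckResult::Failed(
            "The device CA cert has expired. Renew the certificate to restore functionality."
                .to_owned(),
        ),
        // Assumption: unexpired autogenerated certs keep the existing
        // "not for production" warning.
        (true, false) => CheckResult::Warning(
            "The device is using autogenerated certs, which are not intended for production."
                .to_owned(),
        ),
        // Unexpired production certs pass in this simplified view.
        (false, false) => CheckResult::Ok,
    }
}
```

The near-expiry warning for production certs discussed earlier is left out here to keep the table to the two inputs visible in the snippet above.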

@nimanch previously approved these changes Oct 20, 2021
@kodiakhq bot merged commit 1baad81 into Azure:release/1.1 Oct 22, 2021