Skip to content

Telcodocs 266: Implementing Network Bound Disk Encryption #36403

Merged
mikemckiernan merged 1 commit intoopenshift:mainfrom
StephenJamesSmith:TELCODOCS-266
Oct 6, 2021
Merged

Telcodocs 266: Implementing Network Bound Disk Encryption #36403
mikemckiernan merged 1 commit intoopenshift:mainfrom
StephenJamesSmith:TELCODOCS-266

Conversation

@StephenJamesSmith
Copy link
Contributor

@StephenJamesSmith StephenJamesSmith commented Sep 15, 2021

Applicable for openshift-enterprise 4.9.

Jira: TELCODOCS-266

This PR supersedes PR #35722. Title change, section change. Has already been reviewed by OCP Peer Squad.

Docs Preview: https://deploy-preview-36403--osdocs.netlify.app/openshift-enterprise/latest/security/network_bound_disk_encryption/nbde-about-disk-encryption-technology.html

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 15, 2021
@netlify
Copy link

netlify bot commented Sep 15, 2021

✔️ Deploy Preview for osdocs ready!

🔨 Explore the source changes: 18a8066

🔍 Inspect the deploy log: https://app.netlify.com/sites/osdocs/deploys/615c6cd6cfb51300095c57c5

😎 Browse the preview: https://deploy-preview-36403--osdocs.netlify.app

@StephenJamesSmith
Copy link
Contributor Author

@adellape I closed PR #35722 because of some conflicts. All of your changes are in both of the PRs. Please Ack here.

@lack Removed ZTP topic. Please review and Ack.

@juphoff Please review and QE Ack.

_topic_map.yml Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filename still has ztp- but this document doesn't have any connection to ZTP.

Why is this? Can we remove ztp- from all these filenames?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, ztp is a prefix I use in my filenames for easy tracking in my repo. It does not affect the content of the document at all and is not seen by customers..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @StephenJamesSmith this is a valid concern. The filename is visible as the URL. For example, this file will be displayed in the URL as ztp-nbde-implementation-guide.html. This has implications for SEO and general navigation and findability. For example, I would be worried if a CNV doc was named telco-.

Is it possible to remove ztp and just use nbde if you want to be able to find your files easily?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the latest refactor, this doesn't seem to match a filename at all? Is that going to result in a broken link?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lack Removed "ztp"
from filenames, IDs, and topic comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be:

value: /dev/disk/by-partlabel/root

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change made.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be \ in here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a copy&paste issue. The Google doc where this originated had line breaks and used \ at the end for continuations. This might be improperly formatted now.

Copy link
Contributor Author

@StephenJamesSmith StephenJamesSmith Sep 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the / characters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
...

We aren't using UTF-8 characters in the docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file was commented out of the assembly, so it doesn't appear in the final version. I removed it completely from the assembly.

Copy link
Member

@lack lack Sep 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text here:

List the current Tang server keys, showing the advertised and unadvertised keys:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Made this a step.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text:

List the current Tang server keys to verify the unadvertised keys are no longer present:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Made this a step.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is output, not command; should it be in its own box?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put it in a new codeblock.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text:

Decrypt the test file created earlier, to verify decryption against the old keys fails:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text:

Query the Tang server for the current advertised key thumbprints:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is output, should it be in its own box?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put it in a new codeblock.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word plaintext is the expected output, not part of the command.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text:

Verify that the encryption succeeded and the file can be decrypted to produce the same string "plaintext":

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text:

Check the currently advertised key thumbprint:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text:

Enter the Tang server key directory:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text:

List the current Tang server keys:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add text:

List the current Tang server keys to verify the old keys are no longer advertised (they are now hidden files), and new keys are present:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be .Example output

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plaintext is the output, not part of the command

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Member

@lack lack Sep 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be run as root, so # and not $ Or use sudo:

$ sudo yum install tang

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added sudo

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be run as root so # and not $, or use sudo:

$sudo dnf install tang

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added sudo

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The oc line is command, the rest is output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The oc line is command, the rest is output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Comment on lines 19 to 48
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is made up of 2 commands and 2 output blocks in a single block. It should be split to match the other commands in the document. I'd suggest something like this:

Example of an encryption and decryption attempt with a bad thumbprint:

$ echo "okay" | clevis encrypt tang \
  '{"url":"http://tangserver02:7500","thp":"badtumbprint"}' | \
  clevis decrypt

Example output:

Unable to fetch advertisement: 'http://tangserver02:7500/adv/badthumbprint'!

Example of an encryption and decryption attempt with a good thumbprint:

$ echo "okay" | clevis encrypt tang \
  '{"url":"http://tangserver03:7500","thp":"goodthumbprint"}' | \
  clevis decrypt

Example output
okay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 1st line is the command, the rest of this block is the output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@lack
Copy link
Member

lack commented Sep 23, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 23, 2021
@StephenJamesSmith
Copy link
Contributor Author

@vikram-redhat Merge now???

Copy link

@mikemckiernan mikemckiernan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One biggie comment--I suggest a reorg to meet mod docs guidelines. Please let me know what I can clarify.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The verification steps do not use a .Procedure block title. Please remove this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion is to remove the second sentence because it falls under the ISG guidance, "Claims and recommendations" > "Future product releases":

Avoid claims that are related to the content or timing of a future product or release.

No meaning is lost if you remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Removed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first sentence is awkward. This module does not indicate what procedure to attempt before attempting this discouraged method. Consider checking with tech support to determine if they would prefer keeping this task in a KCS.

If you need to keep this information in the product documentation, maybe something like the following:

"If you are unable to recover network connectivity manually, consider the following steps. Be aware that these steps are discouraged if other methods to recover network connectivity are available."...

At the very least, "highly not recommended" can be replaced with "discouraged."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice rewording. Changed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice use of text after the output to explain the example output briefly.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mod docs guidelines for a procedure module indicate to use a single bullet for a one-step procedure. My sugg:

  • You can install...following commands:

** Install the Tang server by using the yum command:
...

** Install the Tang server by using the dnf command:
...

Though, A) bummer that we can't just refer customers to the RHEL 8 procedure and B) strange that those instructions do not show dnf.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes made.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as the preceding file--please indicate in the title and first para that this procedure is related to the rekeying procedure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the conversational approach to entertaining what our customer "thinks," but unless our customer already has experience with the technology (and likely does not need product documentation), our customer might prefer learning how to determine if the error is temporary. Maybe...

To determine if the error condition from rekeyting the Tang servers is temporary, perform the following procedure. Temporary...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This might be me, but I don't understand how the customer uses the pod restart policy to restart the pod. The YAML for the tang-rekey daemon set indicates the restart policy is Always. Again, it might just be me, but I am unsure how to use the information from this step.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This originally wasn't written as a procedure, and that may be where some of the oddness comes in.

Really in the case of these temporary error conditions the adminstator should either keep waiting until things succeed (because the DaemonSet will continue to retry until it succeeds), or the administrator should delete the daemonset and not try again until the temporary error condition (network outage / server offline) has been resolved...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lack @mikemckiernan Do we need to add this info to the topic? Something like,
"Generally, when these types of temporary error conditions occur, you can wait until the daemon set succeeds in resolving the error or you can delete the daemon set and not try again until the temporary error condition has been resolved."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 @StephenJamesSmith, I think a customer would appreciate reading the expectations and when to retry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sugg: single-node clusters

I'm unsure if the uncertainty is mine alone...

  • In the intro para, does "server" refer to a Tang server or a cluster node?
  • I have the same uncertainty for the first bullet. Also, is there something that customers can do to ensure that something does not reboot? Or is it "do not reboot <Tang server|cluster node> until..."?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! We've been trying to keep the language consistent (where "server" always means "the tang server" and "node" means "The Openshift SNO node"), and this is a departure from that otherwise consistent language.

First paragraph should should be "and a node reboots".

First bullet should be "If any nodes are still online", and maybe we should change the 2nd sentence since it seems odd to end it with "where these is only one node" (which would be better for consistency) because we already said it's "single node"?

Next bullet should also be changed to "node will remain offline"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made all changes. Deleted "where these is only one node".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stephen, in the mod docs guidelines, there is the suggestion that each assembly documents a user story.

I'm sure there's some flexibility in that guideline, but this assembly pushes it too far. Several user stories are covered in this assembly. To meet the mod docs guidelines, break this into several assemblies and look at the "File Integrity Operator" section of the topic map as an example for how to group the assemblies under a TOC node like "Network-Bound Disk Encryption (NBDE)". You have enough information to justify it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an afterthought on this comment. I failed to mention to use sentence case for this title and in the _topic_map.yml file. This is covered in the Assembly/Module titles and section headings section of the doc guidelines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikemckiernan
I'll have a look at the guidelines, but please know that this was developed under just one user story - TELCODOCS-266. Is it a really large story? Yup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikemckiernan I've restructured this section. Plz review and let me know.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 1, 2021
@StephenJamesSmith StephenJamesSmith force-pushed the TELCODOCS-266 branch 8 times, most recently from 0855de0 to ae24503 Compare October 3, 2021 14:30
Copy link

@mikemckiernan mikemckiernan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stephen, the whole topic is much more approachable after splitting it up a bit. That comes with some (possibly unwelcome) suggestions for more revision. I hope you don't mind too much--after the restructuring, it became clearer.

_topic_map.yml Outdated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Friendly reminder that I thought you mentioned the need to control distros. If you do, then add a Distros: key.

_topic_map.yml Outdated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Purely an opinion, I don't think you need "considerations" at the ends of lines 779 and 783.

I can't recall titling guidelines to support my opinion though. Up to you.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 17-20, consider a definition list

TPM2::
Binds the disk...
Tang::
Binds the disk...

Clearly optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to definition list

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This topic does end up sticking a bit like a sore thumb:

  • Can you write a procedure to configure logging to a central logging destination?
  • How does the Tang server log, by default? If that is in the RHEL docs, then, you can add an .Additional resources block title and add link markup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found anything on Tang server logging. However, when it is pulled into ZTP in v4.10, I can check to see if/how ZTP would handle this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might read as a pathetic request, but for parity with the "Recovering keys for a Tang server", my sugg:

  • Title as "Backing up keys for a Tang server"
  • Make it a one-bullet procedure, along the lines of "Back up the contents of the /usr/libexec/tangd-keygen directory."
  • The other sentences can precede the .Procedure block title.
  • Eventually, be prepared for a customers to request some advice about how to perform the backup. Changing to a procedure module might make that adaptation later a little simpler. Maybe.

One content item: The Generating a new Tang server key procedure shows a /usr/libexec/tangd-keygen /var/db/tang command in step 5. I don't know the material, but either line 8 is an executable and not the directory, or the procedure could be incorrect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entertain moving "Tang server location planning" after "include::modules/nbde-using-tang-servers-for-disk-encryption.adoc[leveloffset=+1]"

Then, if there's no better location for "Tang server sizing requirements" move it in this location, after "Tang server location planning."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assembly metadata section of the docs guidelines say that the context need to be unique for each assembly. "about-nbde" is prolly OK.

I don't plan to point that out for the other assemblies.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you feel strongly, the +3 on lines 17 to 21 could be +2 and that'll make them part of the TOC at the top of the page.

Copy link
Contributor Author

@StephenJamesSmith StephenJamesSmith Oct 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be better. Done.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My $0.02 is that the conceptual information from the modules on lines 11 to 15 don't have a lot to do with the installation process. See if there is a topic in the "About" page where they fit better.

Conversely, try putting "Installation scenarios" ahead of the first include so that customers are given a few considerations for planning and then read the installation procedure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved "Installation scenarios" ahead of the first include

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this file is a duplicate of nbde-tang-server-installation-considerations.adoc. Pick the one that you prefer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing nbde-installing-a-tang-server.adoc

@StephenJamesSmith StephenJamesSmith force-pushed the TELCODOCS-266 branch 2 times, most recently from e1e239c to eb67388 Compare October 5, 2021 14:52
@StephenJamesSmith
Copy link
Contributor Author

StephenJamesSmith commented Oct 5, 2021

@mikemckiernan Please review your latest changes. Are we ready to merge?

Copy link

@mikemckiernan mikemckiernan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

I appreciate all the time and effort it took to revise. I find it more approachable.

@mikemckiernan mikemckiernan merged commit ec2df0b into openshift:main Oct 6, 2021
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 6, 2021
@mikemckiernan
Copy link

/cherrypick enterprise-4.9

@openshift-cherrypick-robot

@mikemckiernan: new pull request created: #37142

Details

In response to this:

/cherrypick enterprise-4.9

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

branch/enterprise-4.9 lgtm Indicates that a PR is ready to be merged. peer-review-done Signifies that the peer review team has reviewed this PR size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants