Telcodocs 266: Implementing Network Bound Disk Encryption #36403
mikemckiernan merged 1 commit into openshift:main
✔️ Deploy Preview for osdocs ready!
🔨 Explore the source changes: 18a8066
🔍 Inspect the deploy log: https://app.netlify.com/sites/osdocs/deploys/615c6cd6cfb51300095c57c5
😎 Browse the preview: https://deploy-preview-36403--osdocs.netlify.app
3c20529 to 1115053
_topic_map.yml
Outdated
There was a problem hiding this comment.
The filename still has ztp- but this document doesn't have any connection to ZTP.
Why is this? Can we remove ztp- from all these filenames?
There was a problem hiding this comment.
Again, ztp is a prefix I use in my filenames for easy tracking in my repo. It does not affect the content of the document at all and is not seen by customers.
There was a problem hiding this comment.
Hi @StephenJamesSmith this is a valid concern. The filename is visible as the URL. For example, this file will be displayed in the URL as ztp-nbde-implementation-guide.html. This has implications for SEO and general navigation and findability. For example, I would be worried if a CNV doc was named telco-.
Is it possible to remove ztp and just use nbde if you want to be able to find your files easily?
There was a problem hiding this comment.
With the latest refactor, this doesn't seem to match a filename at all? Is that going to result in a broken link?
There was a problem hiding this comment.
@lack Removed "ztp" from filenames, IDs, and topic comments.
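For anyone making a similar change, a prefix like this could be stripped from file names in one pass. This is only a minimal sketch: it works in a throwaway directory, the second file name is hypothetical, and in a real repository you would likely use `git mv` and also update the matching IDs and include lines.

```shell
# Work in a throwaway directory so the sketch does not touch a real repository.
workdir=$(mktemp -d)
touch "$workdir/ztp-nbde-implementation-guide.adoc" "$workdir/ztp-nbde-example.adoc"

# Strip the leading "ztp-" from each matching file name.
for path in "$workdir"/ztp-*.adoc; do
  name=$(basename "$path")
  mv "$path" "$workdir/${name#ztp-}"
done

ls "$workdir"
```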
There was a problem hiding this comment.
This should be:
value: /dev/disk/by-partlabel/root
There was a problem hiding this comment.
change made.
There was a problem hiding this comment.
Should there be \ in here?
There was a problem hiding this comment.
This might be a copy&paste issue. The Google doc where this originated had line breaks and used \ at the end for continuations. This might be improperly formatted now.
There was a problem hiding this comment.
Removed the \ characters.
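Some background on why the stray characters matter: in a shell, a backslash continues a command only when it is the last character on a line. Once the line breaks are lost, the same backslash escapes the character that follows it instead. A minimal sketch with made-up strings (not taken from the reviewed file):

```shell
# A trailing backslash joins the next line to the current command.
joined=$(printf '%s %s' \
  "one" \
  "two")
echo "$joined"

# A backslash that is not at the end of a line escapes the next character.
# Here it escapes the space, so the shell passes a single argument.
count_args() { echo "$#"; }
escaped=$(count_args one\ two)
echo "$escaped"
```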
There was a problem hiding this comment.
Suggested change: replace "…" with "...".
We aren't using UTF-8 characters in the docs.
There was a problem hiding this comment.
This file was commented out of the assembly, so it doesn't appear in the final version. I removed it completely from the assembly.
1115053 to 3045dd0
There was a problem hiding this comment.
Add text here:
List the current Tang server keys, showing the advertised and unadvertised keys:
There was a problem hiding this comment.
Done. Made this a step.
There was a problem hiding this comment.
Add text:
List the current Tang server keys to verify the unadvertised keys are no longer present:
There was a problem hiding this comment.
Done. Made this a step.
There was a problem hiding this comment.
This line is output, not command; should it be in its own box?
There was a problem hiding this comment.
Put it in a new codeblock.
There was a problem hiding this comment.
Add text:
Decrypt the test file created earlier, to verify decryption against the old keys fails:
There was a problem hiding this comment.
Add text:
Query the Tang server for the current advertised key thumbprints:
There was a problem hiding this comment.
This is output, should it be in its own box?
There was a problem hiding this comment.
Put it in a new codeblock.
There was a problem hiding this comment.
The word plaintext is the expected output, not part of the command.
There was a problem hiding this comment.
Add text:
Verify that the encryption succeeded and the file can be decrypted to produce the same string "plaintext":
There was a problem hiding this comment.
Add text:
Check the currently advertised key thumbprint:
There was a problem hiding this comment.
Add text:
Enter the Tang server key directory:
There was a problem hiding this comment.
Add text:
List the current Tang server keys:
There was a problem hiding this comment.
Add text:
List the current Tang server keys to verify the old keys are no longer advertised (they are now hidden files), and new keys are present:
There was a problem hiding this comment.
plaintext is the output, not part of the command
There was a problem hiding this comment.
This needs to be run as root, so use # and not $, or use sudo:
$ sudo yum install tang
There was a problem hiding this comment.
Added sudo
There was a problem hiding this comment.
This needs to be run as root, so use # and not $, or use sudo:
$ sudo dnf install tang
There was a problem hiding this comment.
Added sudo
There was a problem hiding this comment.
The oc line is command, the rest is output.
There was a problem hiding this comment.
The oc line is command, the rest is output.
There was a problem hiding this comment.
This section is made up of 2 commands and 2 output blocks in a single block. It should be split to match the other commands in the document. I'd suggest something like this:
Example of an encryption and decryption attempt with a bad thumbprint:
$ echo "okay" | clevis encrypt tang \
  '{"url":"http://tangserver02:7500","thp":"badthumbprint"}' | \
  clevis decrypt
Example output:
Unable to fetch advertisement: 'http://tangserver02:7500/adv/badthumbprint'!
Example of an encryption and decryption attempt with a good thumbprint:
$ echo "okay" | clevis encrypt tang \
  '{"url":"http://tangserver03:7500","thp":"goodthumbprint"}' | \
  clevis decrypt
Example output:
okay
There was a problem hiding this comment.
This 1st line is the command, the rest of this block is the output.
23f5493 to 2a3ae19
/lgtm
@vikram-redhat Merge now???
mikemckiernan left a comment
There was a problem hiding this comment.
One biggie comment--I suggest a reorg to meet mod docs guidelines. Please let me know what I can clarify.
There was a problem hiding this comment.
The verification steps do not use a .Procedure block title. Please remove this line.
There was a problem hiding this comment.
My suggestion is to remove the second sentence because it falls under the ISG guidance, "Claims and recommendations" > "Future product releases":
Avoid claims that are related to the content or timing of a future product or release.
No meaning is lost if you remove it.
There was a problem hiding this comment.
Agreed. Removed.
There was a problem hiding this comment.
The first sentence is awkward. This module does not indicate what procedure to attempt before attempting this discouraged method. Consider checking with tech support to determine if they would prefer keeping this task in a KCS.
If you need to keep this information in the product documentation, maybe something like the following:
"If you are unable to recover network connectivity manually, consider the following steps. Be aware that these steps are discouraged if other methods to recover network connectivity are available."...
At the very least, "highly not recommended" can be replaced with "discouraged."
There was a problem hiding this comment.
Nice rewording. Changed.
There was a problem hiding this comment.
Nice use of text after the output to explain the example output briefly.
There was a problem hiding this comment.
The mod docs guidelines for a procedure module indicate to use a single bullet for a one-step procedure. My sugg:
- You can install...following commands:
** Install the Tang server by using the yum command:
...
** Install the Tang server by using the dnf command:
...
Though, A) bummer that we can't just refer customers to the RHEL 8 procedure and B) strange that those instructions do not show dnf.
There was a problem hiding this comment.
Changes made.
There was a problem hiding this comment.
Same comments as the preceding file--please indicate in the title and first para that this procedure is related to the rekeying procedure.
There was a problem hiding this comment.
I understand the conversational approach to entertaining what our customer "thinks," but unless our customer already has experience with the technology (and likely does not need product documentation), our customer might prefer learning how to determine if the error is temporary. Maybe...
To determine if the error condition from rekeying the Tang servers is temporary, perform the following procedure. Temporary...
There was a problem hiding this comment.
nit: This might be me, but I don't understand how the customer uses the pod restart policy to restart the pod. The YAML for the tang-rekey daemon set indicates the restart policy is Always. Again, it might just be me, but I am unsure how to use the information from this step.
There was a problem hiding this comment.
This originally wasn't written as a procedure, and that may be where some of the oddness comes in.
Really, in the case of these temporary error conditions, the administrator should either keep waiting until things succeed (because the DaemonSet will continue to retry until it succeeds), or the administrator should delete the daemon set and not try again until the temporary error condition (network outage / server offline) has been resolved...
There was a problem hiding this comment.
@lack @mikemckiernan Do we need to add this info to the topic? Something like,
"Generally, when these types of temporary error conditions occur, you can wait until the daemon set succeeds in resolving the error or you can delete the daemon set and not try again until the temporary error condition has been resolved."
There was a problem hiding this comment.
+1 @StephenJamesSmith, I think a customer would appreciate reading the expectations and when to retry.
There was a problem hiding this comment.
Added this.
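The two options discussed above (keep waiting, or delete the daemon set and retry later) could look roughly like the following. This is a hedged sketch, not the documented procedure: the daemon set name `tang-rekey` comes from this thread, the namespace is an assumption, and the `run` helper is a hypothetical dry-run wrapper that only prints the commands unless `DRY_RUN=0` is set.

```shell
# Assumed names: tang-rekey is mentioned in this thread; the namespace is a guess.
DAEMONSET="tang-rekey"
NAMESPACE="openshift-machine-config-operator"

# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Option 1: keep waiting, and check on the daemon set and its pods.
run oc get daemonset "$DAEMONSET" -n "$NAMESPACE"
run oc get pods -n "$NAMESPACE"

# Option 2: stop retrying by deleting the daemon set until the outage is resolved.
run oc delete daemonset "$DAEMONSET" -n "$NAMESPACE"
```

Because the daemon set keeps retrying on its own, option 1 needs no action beyond checking status; option 2 is only for when the outage is expected to last.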
There was a problem hiding this comment.
sugg: single-node clusters
I'm unsure if the uncertainty is mine alone...
- In the intro para, does "server" refer to a Tang server or a cluster node?
- I have the same uncertainty for the first bullet. Also, is there something that customers can do to ensure that something does not reboot? Or is it "do not reboot <Tang server|cluster node> until..."?
There was a problem hiding this comment.
Good catch! We've been trying to keep the language consistent (where "server" always means "the Tang server" and "node" means "the OpenShift SNO node"), and this is a departure from that otherwise consistent language.
First paragraph should be "and a node reboots".
First bullet should be "If any nodes are still online", and maybe we should change the 2nd sentence since it seems odd to end it with "where these is only one node" (which would be better for consistency) because we already said it's "single node"?
Next bullet should also be changed to "node will remain offline"
There was a problem hiding this comment.
Made all changes. Deleted "where these is only one node".
There was a problem hiding this comment.
Stephen, in the mod docs guidelines, there is the suggestion that each assembly documents a user story.
I'm sure there's some flexibility in that guideline, but this assembly pushes it too far. Several user stories are covered in this assembly. To meet the mod docs guidelines, break this into several assemblies and look at the "File Integrity Operator" section of the topic map as an example for how to group the assemblies under a TOC node like "Network-Bound Disk Encryption (NBDE)". You have enough information to justify it.
There was a problem hiding this comment.
I had an afterthought on this comment. I failed to mention to use sentence case for this title and in the _topic_map.yml file. This is covered in the Assembly/Module titles and section headings section of the doc guidelines.
There was a problem hiding this comment.
@mikemckiernan
I'll have a look at the guidelines, but please know that this was developed under just one user story - TELCODOCS-266. Is it a really large story? Yup.
There was a problem hiding this comment.
@mikemckiernan I've restructured this section. Plz review and let me know.
2a3ae19 to 1636727
0855de0 to ae24503
mikemckiernan left a comment
There was a problem hiding this comment.
Stephen, the whole topic is much more approachable after splitting it up a bit. That comes with some (possibly unwelcome) suggestions for more revision. I hope you don't mind too much--after the restructuring, it became clearer.
_topic_map.yml
Outdated
There was a problem hiding this comment.
Friendly reminder that I thought you mentioned the need to control distros. If you do, then add a Distros: key.
_topic_map.yml
Outdated
There was a problem hiding this comment.
Purely an opinion, I don't think you need "considerations" at the ends of lines 779 and 783.
I can't recall titling guidelines to support my opinion though. Up to you.
There was a problem hiding this comment.
For 17-20, consider a definition list
TPM2::
Binds the disk...
Tang::
Binds the disk...
Clearly optional.
There was a problem hiding this comment.
Changed to definition list
There was a problem hiding this comment.
This topic does end up sticking out a bit like a sore thumb:
- Can you write a procedure to configure logging to a central logging destination?
- How does the Tang server log, by default? If that is in the RHEL docs, then you can add an .Additional resources block title and add link markup.
There was a problem hiding this comment.
I haven't found anything on Tang server logging. However, when it is pulled into ZTP in v4.10, I can check to see if/how ZTP would handle this.
There was a problem hiding this comment.
This might read as a pathetic request, but for parity with the "Recovering keys for a Tang server", my sugg:
- Title as "Backing up keys for a Tang server"
- Make it a one-bullet procedure, along the lines of "Back up the contents of the /usr/libexec/tangd-keygen directory."
- The other sentences can precede the .Procedure block title.
- Eventually, be prepared for customers to request some advice about how to perform the backup. Changing to a procedure module might make that adaptation later a little simpler. Maybe.
One content item: The Generating a new Tang server key procedure shows a /usr/libexec/tangd-keygen /var/db/tang command in step 5. I don't know the material, but either line 8 is an executable and not the directory, or the procedure could be incorrect.
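One way to settle the executable-versus-directory question is to test the path directly on a Tang server. The sketch below is illustrative only: `classify_path` is a hypothetical helper, and it is demonstrated on throwaway paths rather than on a real Tang installation.

```shell
# Classify a path as a directory, an executable file, or something else.
classify_path() {
  if [ -d "$1" ]; then
    echo "directory"
  elif [ -f "$1" ] && [ -x "$1" ]; then
    echo "executable"
  else
    echo "other"
  fi
}

# Demonstrate on throwaway paths rather than on a real Tang server.
demo=$(mktemp -d)
printf '#!/bin/sh\n' > "$demo/fake-keygen"
chmod +x "$demo/fake-keygen"

classify_path "$demo"
classify_path "$demo/fake-keygen"
```

On a Tang server, running `classify_path /usr/libexec/tangd-keygen` and `classify_path /var/db/tang` would show which of the two paths is the directory that needs backing up.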
There was a problem hiding this comment.
Entertain moving "Tang server location planning" after "include::modules/nbde-using-tang-servers-for-disk-encryption.adoc[leveloffset=+1]"
Then, if there's no better location for "Tang server sizing requirements" move it in this location, after "Tang server location planning."
There was a problem hiding this comment.
The assembly metadata section of the docs guidelines says that the context needs to be unique for each assembly. "about-nbde" is prolly OK.
I don't plan to point that out for the other assemblies.
There was a problem hiding this comment.
Unless you feel strongly, the +3 on lines 17 to 21 could be +2 and that'll make them part of the TOC at the top of the page.
There was a problem hiding this comment.
That would be better. Done.
There was a problem hiding this comment.
My $0.02 is that the conceptual information from the modules on lines 11 to 15 doesn't have a lot to do with the installation process. See if there is a topic in the "About" page where they fit better.
Conversely, try putting "Installation scenarios" ahead of the first include so that customers are given a few considerations for planning and then read the installation procedure.
There was a problem hiding this comment.
Moved "Installation scenarios" ahead of the first include
There was a problem hiding this comment.
I think this file is a duplicate of nbde-tang-server-installation-considerations.adoc. Pick the one that you prefer.
There was a problem hiding this comment.
Removing nbde-installing-a-tang-server.adoc
e1e239c to eb67388
eb67388 to 18a8066
@mikemckiernan Please review your latest changes. Are we ready to merge?
mikemckiernan left a comment
There was a problem hiding this comment.
/lgtm
I appreciate all the time and effort it took to revise. I find it more approachable.
/cherrypick enterprise-4.9
@mikemckiernan: new pull request created: #37142
Applicable for openshift-enterprise 4.9.
Jira: TELCODOCS-266
This PR supersedes PR #35722. Title change, section change. Has already been reviewed by OCP Peer Squad.
Docs Preview: https://deploy-preview-36403--osdocs.netlify.app/openshift-enterprise/latest/security/network_bound_disk_encryption/nbde-about-disk-encryption-technology.html