ci: use markdownlint to enforce mkdocs compatibility

mkdocs uses a markdown renderer that is hardcoded to 4 spaces per tab
when detecting indentation levels, including for ordered and unordered
lists. Since we cannot easily change the renderer, begin using a
markdown linter in CI that fails if official docs do not adhere to the
spacing rules.

As a starting point, the markdownlint config does not enable the default
set of checks, which could overwhelm attempts to fix the resulting
failures. Instead, it focuses on list-indentation rules and a few other
highly useful checks.

markdownlint also has some gaps that allow common Rook doc issues to
pass, particularly around code blocks that appear inside list items.
Prettier is a code formatter that can be run in `--check` mode and --
when configured to use a tab size of 4 -- catches these kinds of doc
errors.
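
As an illustration of the gap (a made-up snippet, not taken from the Rook docs), a fenced code block inside a list item only stays attached to that item when it is indented a full 4 spaces:

````markdown
1. First step of a procedure.

    ```console
    echo "indented 4 spaces, so mkdocs renders this inside step 1"
    ```

2. Second step. With only 2 or 3 spaces of indentation, mkdocs would end the list and render the block outside of it.
````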

However, Prettier is very opinionated and requires some formatting
changes that are less than ideal. For example, unordered lists are also
forcibly indented to 4 spaces. Additionally, because of Prettier's
formatting, we must now use the below format for callouts, including the
blank line after the callout marker:

```
!!! note

    callout text
```

These drawbacks are minor compared to the ability to ensure, with high
accuracy, that our `Documentation/` markdown files will render without
bugs.

Signed-off-by: Blaine Gardner <blaine.gardner@ibm.com>
BlaineEXE committed Apr 26, 2024
1 parent 381556a commit eca9010
Showing 80 changed files with 2,668 additions and 2,359 deletions.
19 changes: 19 additions & 0 deletions .github/workflows/docs-check.yml
@@ -36,6 +36,25 @@ jobs:
        with:
          python-version: 3.9

      # use markdownlint to check for lint issues
      - uses: DavidAnson/markdownlint-cli2-action@v16
        if: always()
        with:
          globs: 'Documentation/**/*.md,!Documentation/Helm-Charts'
          config: ".markdownlint-cli2.cjs"
          separator: ","

      # markdownlint still allows multi-paragraph list items to be aligned to fewer tab-spaces than
      # will properly render with mkdocs. use 'prettier' to enforce that all markdown is aligned
      # to a 4-space boundary
      - name: Prettify code
        if: always()
        uses: creyD/prettier_action@v4.3
        with:
          dry: true
          # --no-error-on-unmatched-pattern -- don't error on symlinks
          prettier_options: --no-error-on-unmatched-pattern --check **/*.md

      - name: Check helm-docs
        run: make check-helm-docs
      - name: Check docs
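
For reference, the same checks can be approximated locally before pushing; the commands below are not part of this change and assume `npx` is available, mirroring the action inputs above:

```console
# markdownlint, using the same config and globs as the CI action
npx markdownlint-cli2 --config .markdownlint-cli2.cjs "Documentation/**/*.md" "!Documentation/Helm-Charts"

# prettier in check-only mode, mirroring the action's 'dry' run
npx prettier --no-error-on-unmatched-pattern --check "**/*.md"
```
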
39 changes: 39 additions & 0 deletions .markdownlint-cli2.cjs
@@ -0,0 +1,39 @@
module.exports = {
  "config": {
    "default": false, // no default rules enabled
    "extends": null, // no default rules enabled
    "list-indent": true, // all list items must be indented at the same level
    "ul-indent": {
      "indent": 4, // mkdocs requires 4 spaces for tabs
    },
    "no-hard-tabs": {
      "spaces_per_tab": 4, // mkdocs requires 4 spaces for tabs
    },
    "ol-prefix": {
      // require fully-numbered lists. this rule helps ensure that code blocks in between ordered
      // list items (which require surrounding blank lines) don't break lists
      "style": "ordered",
    },
    "blanks-around-lists": true, // mkdocs requires blank lines around lists
    "blanks-around-fences": { // mkdocs requires blank lines around code blocks (fences)
      "list_items": true, // ... including in lists
    },
    "fenced-code-language": {
      // enforce that code blocks have a language specified
      // this helps ensure rendering is as intended, and it helps doc-wide searches for code blocks
      "language_only": true,
    },
    "no-duplicate-heading": true, // do not allow duplicate headings
    "link-fragments": true, // validate links to headings within a doc
    "single-trailing-newline": true, // require a single trailing newline in docs
    "no-multiple-blanks": {
      "maximum": 1, // prettier does this, but markdownlint provides better feedback in CI
    },

    // custom rule for rook, defined in the file referenced below
    "mkdocs-prettier-admonitions": true,
  },
  "customRules": [
    "./tests/scripts/markdownlint-check-callouts.js",
  ],
};
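
The custom `mkdocs-prettier-admonitions` rule is implemented in `tests/scripts/markdownlint-check-callouts.js`, which is not shown in this diff. As a rough, hypothetical sketch only -- assuming the standard markdownlint custom-rule API, not the actual Rook implementation -- such a rule could look like:

```js
// Hypothetical sketch: the real rule lives in tests/scripts/markdownlint-check-callouts.js
// and may differ. This version checks that '!!! note'-style admonition markers are
// followed by a blank line before the indented callout body.
module.exports = {
  names: ["mkdocs-prettier-admonitions"],
  description: "MkDocs admonitions must be followed by a blank line before the indented body",
  tags: ["mkdocs"],
  function: (params, onError) => {
    params.lines.forEach((line, index) => {
      // match callout markers such as '!!! note' or '!!! warning "Title"'
      if (/^!!!\s+\S+/.test(line)) {
        const next = params.lines[index + 1];
        if (next !== undefined && next.trim() !== "") {
          onError({
            lineNumber: index + 1, // markdownlint line numbers are 1-based
            detail: "expected a blank line after the admonition marker",
            context: line,
          });
        }
      }
    });
  },
};
```
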
6 changes: 6 additions & 0 deletions .prettierignore
@@ -0,0 +1,6 @@
# use prettier to format docs that are rendered with mkdocs
# i.e., ONLY format the Documentation directory
/*
/Documentation/Helm-Charts/
!Documentation/
!README.md
6 changes: 6 additions & 0 deletions .prettierrc
@@ -0,0 +1,6 @@
{
  "tabWidth": 4,
  "useTabs": false,
  "embeddedLanguageFormatting": "off",
  "parser": "markdown"
}
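
With `tabWidth` set to 4, Prettier rewrites nested list items onto 4-space indentation boundaries, which is what mkdocs expects. A small illustrative snippet (not taken from the docs) of the resulting layout:

```markdown
- parent item
    - nested item, indented to 4 spaces
        - doubly nested item, indented to 8 spaces
```
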
3 changes: 2 additions & 1 deletion .vscode/extensions.json
@@ -7,7 +7,8 @@
"redhat.vscode-yaml",
"yzhang.markdown-all-in-one",
"timonwong.shellcheck",
"ms-python.black-formatter"
"ms-python.black-formatter",
"esbenp.prettier-vscode"
],
// List of extensions recommended by VS Code that should not be recommended for users of this workspace.
"unwantedRecommendations": []
11 changes: 11 additions & 0 deletions .vscode/settings.json
@@ -16,4 +16,15 @@
"editor.defaultFormatter": "ms-python.black-formatter",
"editor.formatOnSave": true
},
"[markdown]": {
"editor.tabSize": 4,
"editor.detectIndentation": false,
"editor.defaultFormatter": "esbenp.prettier-vscode",
},
"markdownlint.lintWorkspaceGlobs": [
"Documentation/**/*.md",
"!Documentation/Helm-Charts",
],
"markdown.extension.list.indentationSize": "inherit",
"prettier.enable": true,
}
104 changes: 57 additions & 47 deletions Documentation/CRDs/Block-Storage/ceph-block-pool-crd.md
@@ -11,6 +11,7 @@ Rook allows creation and customization of storage pools through the custom resou
For optimal performance, while also adding redundancy, this sample will configure Ceph to make three full copies of the data on multiple nodes.

!!! note

This sample requires *at least 1 OSD per node*, with each OSD located on *3 different nodes*.

Each OSD must be located on a different node, because the [`failureDomain`](ceph-block-pool-crd.md#spec) is set to `host` and the `replicated.size` is set to `3`.
@@ -53,13 +54,15 @@ spec:
```

!!! important

The device classes `primaryDeviceClass` and `secondaryDeviceClass` must have at least one OSD associated with them or else the pool creation will fail.

### Erasure Coded

This sample will lower the overall storage capacity requirement, while also adding redundancy by using [erasure coding](#erasure-coding).

!!! note

This sample requires *at least 3 bluestore OSDs*.

The OSDs can be located on a single Ceph node or spread across multiple nodes, because the [`failureDomain`](ceph-block-pool-crd.md#spec) is set to `osd` and the `erasureCoded` chunk settings require at least 3 different OSDs (2 `dataChunks` + 1 `codingChunks`).
@@ -139,7 +142,7 @@ See the official rbd mirror documentation on [how to add a bootstrap peer](https

Imagine the following topology with datacenters containing racks and then hosts:

```text
```console
.
├── datacenter-1
│ ├── rack-1
@@ -177,61 +180,68 @@ spec:

### Metadata

* `name`: The name of the pool to create.
* `namespace`: The namespace of the Rook cluster where the pool is created.
- `name`: The name of the pool to create.
- `namespace`: The namespace of the Rook cluster where the pool is created.

### Spec

* `replicated`: Settings for a replicated pool. If specified, `erasureCoded` settings must not be specified.
* `size`: The desired number of copies to make of the data in the pool.
* `requireSafeReplicaSize`: set to false if you want to create a pool with size 1, setting pool size 1 could lead to data loss without recovery. Make sure you are *ABSOLUTELY CERTAIN* that is what you want.
* `replicasPerFailureDomain`: Sets up the number of replicas to place in a given failure domain. For instance, if the failure domain is a datacenter (cluster is
stretched) then you will have 2 replicas per datacenter where each replica ends up on a different host. This gives you a total of 4 replicas and for this, the `size` must be set to 4. The default is 1.
* `subFailureDomain`: Name of the CRUSH bucket representing a sub-failure domain. In a stretched configuration this option represent the "last" bucket where replicas will end up being written. Imagine the cluster is stretched across two datacenters, you can then have 2 copies per datacenter and each copy on a different CRUSH bucket. The default is "host".
* `erasureCoded`: Settings for an erasure-coded pool. If specified, `replicated` settings must not be specified. See below for more details on [erasure coding](#erasure-coding).
* `dataChunks`: Number of chunks to divide the original object into
* `codingChunks`: Number of coding chunks to generate
* `failureDomain`: The failure domain across which the data will be spread. This can be set to a value of either `osd` or `host`, with `host` being the default setting. A failure domain can also be set to a different type (e.g. `rack`), if the OSDs are created on nodes with the supported [topology labels](../Cluster/ceph-cluster-crd.md#osd-topology). If the `failureDomain` is changed on the pool, the operator will create a new CRUSH rule and update the pool.
- `replicated`: Settings for a replicated pool. If specified, `erasureCoded` settings must not be specified.
- `size`: The desired number of copies to make of the data in the pool.
- `requireSafeReplicaSize`: set to false if you want to create a pool with size 1, setting pool size 1 could lead to data loss without recovery. Make sure you are **ABSOLUTELY CERTAIN** that is what you want.
- `replicasPerFailureDomain`: Sets up the number of replicas to place in a given failure domain. For instance, if the failure domain is a datacenter (cluster is
stretched) then you will have 2 replicas per datacenter where each replica ends up on a different host. This gives you a total of 4 replicas and for this, the `size` must be set to 4. The default is 1.
- `subFailureDomain`: Name of the CRUSH bucket representing a sub-failure domain. In a stretched configuration this option represent the "last" bucket where replicas will end up being written. Imagine the cluster is stretched across two datacenters, you can then have 2 copies per datacenter and each copy on a different CRUSH bucket. The default is "host".
- `erasureCoded`: Settings for an erasure-coded pool. If specified, `replicated` settings must not be specified. See below for more details on [erasure coding](#erasure-coding).
- `dataChunks`: Number of chunks to divide the original object into
- `codingChunks`: Number of coding chunks to generate
- `failureDomain`: The failure domain across which the data will be spread. This can be set to a value of either `osd` or `host`, with `host` being the default setting. A failure domain can also be set to a different type (e.g. `rack`), if the OSDs are created on nodes with the supported [topology labels](../Cluster/ceph-cluster-crd.md#osd-topology). If the `failureDomain` is changed on the pool, the operator will create a new CRUSH rule and update the pool.
If a `replicated` pool of size `3` is configured and the `failureDomain` is set to `host`, all three copies of the replicated data will be placed on OSDs located on `3` different Ceph hosts. This case is guaranteed to tolerate a failure of two hosts without a loss of data. Similarly, a failure domain set to `osd`, can tolerate a loss of two OSD devices.

If erasure coding is used, the data and coding chunks are spread across the configured failure domain.

!!! caution

Neither Rook, nor Ceph, prevent the creation of a cluster where the replicated data (or Erasure Coded chunks) can be written safely. By design, Ceph will delay checking for suitable OSDs until a write request is made and this write can hang if there are not sufficient OSDs to satisfy the request.
* `deviceClass`: Sets up the CRUSH rule for the pool to distribute data only on the specified device class. If left empty or unspecified, the pool will use the cluster's default CRUSH root, which usually distributes data over all OSDs, regardless of their class. If `deviceClass` is specified on any pool, ensure that it is added to every pool in the cluster, otherwise Ceph will warn about pools with overlapping roots.
* `crushRoot`: The root in the crush map to be used by the pool. If left empty or unspecified, the default root will be used. Creating a crush hierarchy for the OSDs currently requires the Rook toolbox to run the Ceph tools described [here](http://docs.ceph.com/docs/master/rados/operations/crush-map/#modifying-the-crush-map).
* `enableRBDStats`: Enables collecting RBD per-image IO statistics by enabling dynamic OSD performance counters. Defaults to false. For more info see the [ceph documentation](https://docs.ceph.com/docs/master/mgr/prometheus/#rbd-io-statistics).
* `name`: The name of Ceph pools is based on the `metadata.name` of the CephBlockPool CR. Some built-in Ceph pools
require names that are incompatible with K8s resource names. These special pools can be configured
by setting this `name` to override the name of the Ceph pool that is created instead of using the `metadata.name` for the pool.
Only the following pool names are supported: `.nfs`, `.mgr`, and `.rgw.root`. See the example
[builtin mgr pool](https://github.com/rook/rook/blob/master/deploy/examples/pool-builtin-mgr.yaml).
* `application`: The type of application set on the pool. By default, Ceph pools for CephBlockPools will be `rbd`,
CephObjectStore pools will be `rgw`, and CephFilesystem pools will be `cephfs`.

* `parameters`: Sets any [parameters](https://docs.ceph.com/docs/master/rados/operations/pools/#set-pool-values) listed to the given pool
* `target_size_ratio:` gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool, for more info see the [ceph documentation](https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size)
* `compression_mode`: Sets up the pool for inline compression when using a Bluestore OSD. If left unspecified does not setup any compression mode for the pool. Values supported are the same as Bluestore inline compression [modes](https://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#inline-compression), such as `none`, `passive`, `aggressive`, and `force`.

* `mirroring`: Sets up mirroring of the pool
* `enabled`: whether mirroring is enabled on that pool (default: false)
* `mode`: mirroring mode to run, possible values are "pool" or "image" (required). Refer to the [mirroring modes Ceph documentation](https://docs.ceph.com/docs/master/rbd/rbd-mirroring/#enable-mirroring) for more details.
* `snapshotSchedules`: schedule(s) snapshot at the **pool** level. One or more schedules are supported.
* `interval`: frequency of the snapshots. The interval can be specified in days, hours, or minutes using d, h, m suffix respectively.
* `startTime`: optional, determines at what time the snapshot process starts, specified using the ISO 8601 time format.
* `peers`: to configure mirroring peers. See the prerequisite [RBD Mirror documentation](ceph-rbd-mirror-crd.md) first.
* `secretNames`: a list of peers to connect to. Currently **only a single** peer is supported where a peer represents a Ceph cluster.

* `statusCheck`: Sets up pool mirroring status
* `mirror`: displays the mirroring status
* `disabled`: whether to enable or disable pool mirroring status
* `interval`: time interval to refresh the mirroring status (default 60s)

* `quotas`: Set byte and object quotas. See the [ceph documentation](https://docs.ceph.com/en/latest/rados/operations/pools/#set-pool-quotas) for more info.
* `maxSize`: quota in bytes as a string with quantity suffixes (e.g. "10Gi")
* `maxObjects`: quota in objects as an integer

- `deviceClass`: Sets up the CRUSH rule for the pool to distribute data only on the specified device class. If left empty or unspecified, the pool will use the cluster's default CRUSH root, which usually distributes data over all OSDs, regardless of their class. If `deviceClass` is specified on any pool, ensure that it is added to every pool in the cluster, otherwise Ceph will warn about pools with overlapping roots.
- `crushRoot`: The root in the crush map to be used by the pool. If left empty or unspecified, the default root will be used. Creating a crush hierarchy for the OSDs currently requires the Rook toolbox to run the Ceph tools described [here](http://docs.ceph.com/docs/master/rados/operations/crush-map/#modifying-the-crush-map).
- `enableRBDStats`: Enables collecting RBD per-image IO statistics by enabling dynamic OSD performance counters. Defaults to false. For more info see the [ceph documentation](https://docs.ceph.com/docs/master/mgr/prometheus/#rbd-io-statistics).
- `name`: The name of Ceph pools is based on the `metadata.name` of the CephBlockPool CR. Some built-in Ceph pools
require names that are incompatible with K8s resource names. These special pools can be configured
by setting this `name` to override the name of the Ceph pool that is created instead of using the `metadata.name` for the pool.
Only the following pool names are supported: `.nfs`, `.mgr`, and `.rgw.root`. See the example
[builtin mgr pool](https://github.com/rook/rook/blob/master/deploy/examples/pool-builtin-mgr.yaml).
- `application`: The type of application set on the pool. By default, Ceph pools for CephBlockPools will be `rbd`,
CephObjectStore pools will be `rgw`, and CephFilesystem pools will be `cephfs`.

- `parameters`: Sets any [parameters](https://docs.ceph.com/docs/master/rados/operations/pools/#set-pool-values) listed to the given pool

- `target_size_ratio:` gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool, for more info see the [ceph documentation](https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size)
- `compression_mode`: Sets up the pool for inline compression when using a Bluestore OSD. If left unspecified does not setup any compression mode for the pool. Values supported are the same as Bluestore inline compression [modes](https://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#inline-compression), such as `none`, `passive`, `aggressive`, and `force`.

- `mirroring`: Sets up mirroring of the pool

- `enabled`: whether mirroring is enabled on that pool (default: false)
- `mode`: mirroring mode to run, possible values are "pool" or "image" (required). Refer to the [mirroring modes Ceph documentation](https://docs.ceph.com/docs/master/rbd/rbd-mirroring/#enable-mirroring) for more details.
- `snapshotSchedules`: schedule(s) snapshot at the **pool** level. One or more schedules are supported.
- `interval`: frequency of the snapshots. The interval can be specified in days, hours, or minutes using d, h, m suffix respectively.
- `startTime`: optional, determines at what time the snapshot process starts, specified using the ISO 8601 time format.
- `peers`: to configure mirroring peers. See the prerequisite [RBD Mirror documentation](ceph-rbd-mirror-crd.md) first.
- `secretNames`: a list of peers to connect to. Currently **only a single** peer is supported where a peer represents a Ceph cluster.

- `statusCheck`: Sets up pool mirroring status

- `mirror`: displays the mirroring status
- `disabled`: whether to enable or disable pool mirroring status
- `interval`: time interval to refresh the mirroring status (default 60s)

- `quotas`: Set byte and object quotas. See the [ceph documentation](https://docs.ceph.com/en/latest/rados/operations/pools/#set-pool-quotas) for more info.

- `maxSize`: quota in bytes as a string with quantity suffixes (e.g. "10Gi")
- `maxObjects`: quota in objects as an integer

!!! note

A value of 0 disables the quota.

### Add specific pool properties
@@ -271,8 +281,8 @@ Here are some examples to illustrate how the number of chunks affects the storag

The `failureDomain` must be also be taken into account when determining the number of chunks. The failure domain determines the level in the Ceph CRUSH hierarchy where the chunks must be uniquely distributed. This decision will impact whether node losses or disk losses are tolerated. There could also be performance differences of placing the data across nodes or osds.

* `host`: All chunks will be placed on unique hosts
* `osd`: All chunks will be placed on unique OSDs
- `host`: All chunks will be placed on unique hosts
- `osd`: All chunks will be placed on unique OSDs

If you do not have a sufficient number of hosts or OSDs for unique placement the pool can be created, writing to the pool will hang.

@@ -6,7 +6,7 @@ This guide assumes you have created a Rook cluster as explained in the main [Qui

RADOS currently uses pools both for data distribution (pools are shared into
PGs, which map to OSDs) and as the granularity for security (capabilities can
restrict access by pool). Overloading pools for both purposes makes it hard to
restrict access by pool). Overloading pools for both purposes makes it hard to
do multi-tenancy because it not a good idea to have a very large number of
pools.

@@ -44,8 +44,8 @@ If any setting is unspecified, a suitable default will be used automatically.

### Metadata

- `name`: The name that will be used for the Ceph BlockPool rados namespace.
- `name`: The name that will be used for the Ceph BlockPool rados namespace.

### Spec

- `blockPoolName`: The metadata name of the CephBlockPool CR where the rados namespace will be created.
- `blockPoolName`: The metadata name of the CephBlockPool CR where the rados namespace will be created.
