docs: update NFS docs #9281

Merged

BlaineEXE merged 1 commit into rook:master from nfs-docs on Dec 8, 2021
Conversation

BlaineEXE (Member):
NFS docs need to be updated to reflect changes between Octopus and
Pacific, how users are affected, how NFS is configured differently
between the Ceph versions, and how to upgrade from one Ceph version to
the other.

Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

@BlaineEXE BlaineEXE force-pushed the nfs-docs branch 2 times, most recently from 3bc6980 to f013032 on December 7, 2021 at 20:37
@BlaineEXE BlaineEXE marked this pull request as ready for review December 7, 2021 20:37

## Overview

Rook allows exporting NFS shares of the filesystem or object store through the CephNFS custom resource definition. This will spin up a cluster of [NFS Ganesha](https://github.com/nfs-ganesha/nfs-ganesha) servers that coordinate with one another via shared RADOS objects. The servers will be configured for NFSv4.1+ access, as serving earlier protocols can inhibit responsiveness after a server restart.
> **WARNING**: We do not recommend using NFS in Ceph v16.2.0 through v16.2.6. If you are using Ceph
> v15, please wait to upgrade until v16.2.7 is released.
Member:

How about moving the warning after the opening paragraph? Then we can start off with a description about the feature before we warn them not to use it. :)

Member Author:

Done 👍


## Overview

Rook allows exporting NFS shares of the filesystem or object store through the CephNFS custom resource definition. This will spin up a cluster of [NFS Ganesha](https://github.com/nfs-ganesha/nfs-ganesha) servers that coordinate with one another via shared RADOS objects. The servers will be configured for NFSv4.1+ access, as serving earlier protocols can inhibit responsiveness after a server restart.
> **WARNING**: We do not recommend using NFS in Ceph v16.2.0 through v16.2.6. If you are using Ceph
> v15, please wait to upgrade until v16.2.7 is released.
Member:

This statement will hopefully be obsolete in the next few days when it's released. What if we just assume it's released so we don't need to revisit?

Member Author:

Done. 👍


The following sample will create a two-node active-active cluster of NFS Ganesha gateways. The recovery objects are stored in a RADOS pool named `myfs-data0` with a RADOS namespace of `nfs-ns`.
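For reference, a minimal CephNFS manifest matching this description might look like the following sketch (the resource name and namespace are illustrative; the pool and RADOS namespace come from the sentence above):

```yaml
apiVersion: ceph.rook.io/v1
kind: CephNFS
metadata:
  # illustrative names; adjust for your cluster
  name: my-nfs
  namespace: rook-ceph
spec:
  rados:
    # RADOS pool and namespace where NFS client recovery data is stored
    pool: myfs-data0
    namespace: nfs-ns
  server:
    # two Ganesha servers in an active-active arrangement
    active: 2
```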
## Samples
The following sample assumes Ceph v15 and will create a two-node active-active cluster of NFS
Member:

Are we actually assuming v16.2.7 now, instead of v15?

Member Author:

I can change this and the example manifest, I think.

When a CephNFS is first created, all NFS daemons within the CephNFS cluster will share a
configuration with no exports defined.

### For Ceph v15
Member:

Do we need to document v15 exports? Seems like we should concentrate docs on 16.2.7+.

Member Author:

We still support v15, and I still took the time to write the documentation, so I don't see the hurt.

Member:

This looks more like Ceph documentation than Rook documentation at this point 😅

@@ -72,55 +82,395 @@ spec:
priorityClassName:
```

Contributor:

@BlaineEXE: I have a generic question: is it planned to support CRs for Exports in the near future?

Member Author:

Seb and I believe that exports are something that should be enabled by a new mode added to the Ceph-CSI driver rather than CRs.


## Overview
Rook allows exporting NFS shares of the filesystem or object store through the CephNFS custom
Member:

For the "object store" are you referring to rgw-nfs? If so, it's not using CephFS but simply RGW.

Member Author:

I will clarify by changing this to CephFilesystem and CephObjectStore. I'm not sure what rgw-nfs is?

NFS configuration is stored in a Ceph pool so that it is highly available and protected. How that is
configured changes depending on the Ceph version. Configuring the pool is done via the `rados` config.

> **WARNING**: Do not use error corrected (EC) pools for NFS. NFS-Ganesha uses OMAP which is not
Member:

Error corrected pools? Did you mean Erasure Coded pools? It's the first time I'm hearing this term in Ceph... If I missed something, how do people know they are using an error corrected pool?

Member Author:

YES LOL!

I'll add a link

When a CephNFS is first created, all NFS daemons within the CephNFS cluster will share a
configuration with no exports defined.

### For Ceph v15
Member:

This looks more like Ceph documentation than Rook documentation at this point 😅

ensure the necessary Ceph mgr modules are enabled and that the Ceph orchestrator backend is set to
Rook.
```console
ceph mgr module enable rook
```
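The excerpt cuts off here; per the sentence above, the orchestrator backend also has to be pointed at Rook. Assuming the standard Ceph orchestrator CLI, that would be a second command in the same toolbox session:

```console
# tell the Ceph orchestrator to use the Rook backend
ceph orch set backend rook
```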
Member:

Shouldn't we recommend setting the enabled module in the mgr spec instead of jumping into the toolbox? Also, I think the orch rook backend might be enabled already...

Member Author:

They have to use the toolbox to create the exports anyway. Because of that, IMO, it's simpler just to say "enable this in the toolbox" so we don't have to cross-link to the CephCluster. But this is a nuance I think we could talk about more. Maybe that would be fine for a follow-up? @leseb

Member:

Yes, and I don't feel strongly about it either. Also, later on you recommend disabling the module, so it's probably not worth editing the CR for this :).
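For readers following this thread: creating the exports from the toolbox is done with the `ceph nfs export` commands. A hedged sketch for a CephFS export on Ceph v16.2.7 (all names are hypothetical, and the argument order changed between Ceph releases, so verify with `ceph nfs export create --help` on your version):

```console
# "myfs" = CephFilesystem name, "my-nfs" = CephNFS cluster name,
# "/test" = pseudo path that clients will mount
ceph nfs export create cephfs myfs my-nfs /test
```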

```console
mount -t nfs4 -o proto=tcp <nfs-service-ip>:/ <mount-location>
```

Member:

unnecessary blank line?


## Upgrading from Ceph v15 to v16
We do not recommend using NFS in Ceph v16.2.0 through v16.2.6 due to bugs in Ceph's NFS
implementation. If you are using Ceph v15, please wait to upgrade until Ceph v16.2.7 is released.
Member:

v16.2.7 is available now.
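(For anyone following along: in Rook, the Ceph version bump itself is just an image change in the CephCluster resource. A minimal sketch, assuming the stock quay.io Ceph image:)

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # moving from an Octopus (v15) image to Pacific v16.2.7+
    image: quay.io/ceph/ceph:v16.2.7
```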

started. Scaling down the cluster requires that clients be migrated from servers that will be
eliminated to others. That process is currently a manual one and should be performed before reducing
the size of the cluster.
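For reference, the server count itself is changed through the CephNFS spec. A sketch, reusing the hypothetical `my-nfs` resource from earlier:

```yaml
spec:
  server:
    # scale down from 2 to 1 only after migrating clients off the
    # server that will be eliminated
    active: 1
```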

Member:

unnecessary blank line?

Member Author:

I've been keeping two newlines between the major level-2 headers to make navigating the raw .md file a bit easier.

# RADOS namespace where NFS client recovery data is stored in the pool.
namespace: nfs-ns

# For Ceph v15, use the block here.
Member:

Do we care about the -test.yaml? The cluster-test.yaml points to v16.2 already, so I don't think we need to repeat this.

Member Author:

Yeah, I can remove the changes in nfs-test.yaml

Rook allows exporting NFS shares of the filesystem or object store through the CephNFS custom
resource definition. This will spin up a cluster of
[NFS Ganesha](https://github.com/nfs-ganesha/nfs-ganesha) servers that coordinate with one another
via shared RADOS objects. The servers will be configured for NFSv4.1+ access only, as serving
Contributor:

Here it mentions NFSv4.1+, but all the examples below are defined with NFSv4.

Member Author:

I think that's fine in the two other places where it appears. The .1+ isn't strictly necessary to know for those since v4.1 is also still v4.
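For clients that want to pin the minor version explicitly rather than rely on negotiation, the standard Linux NFS mount options allow it (placeholders as in the doc's existing mount example):

```console
mount -t nfs -o nfsvers=4.1,proto=tcp <nfs-service-ip>:/ <mount-location>
```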

Comment on lines -13 to -14
# RADOS namespace where NFS client recovery data is stored in the pool.
namespace: nfs-ns
Member Author:

I did remove this from nfs-test.yaml because it is only relevant for Octopus.

@BlaineEXE BlaineEXE force-pushed the nfs-docs branch 2 times, most recently from 24108e2 to 27a0a6e on December 8, 2021 at 16:12
NFS docs need to be updated to reflect changes between Octopus and
Pacific, how users are affected, how NFS is configured differently
between the Ceph versions, and how to upgrade from one Ceph version to
the other.

Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>
rados:
# The Ganesha pool spec. Must use replication.
poolConfig:
Member:

Should the poolConfig be under rados? If so, the comment above on line 8 is confusing, since it says the rados settings aren't necessary now.

Member Author (@BlaineEXE, Dec 8, 2021):

I suppose we do have time to change this before cutting the 1.8 release since it was added recently by Seb. IMO, I would put poolConfig on the top level, like you are suggesting. I think having a rados block seems a little too specific. Let's bring @leseb into this discussion.

Member Author:

For now, I'll merge this since I think changing the spec is orthogonal and good for a follow-up PR.

Member:

Oh right, we still have time before cutting 1.8. I guess this used to be valid while the RadosSpec made sense, but not anymore. +1 for moving to the top-level.

Member Author:

I also wonder whether we should have this here or not. If there are multiple CephNFS specs, they may contend with each other to set different replication settings on the same .nfs pool. Maybe we should just remove this config and allow .nfs to be managed by @travisn 's suggestion here: #9209 (comment)

Member:

I'm not sure, but I agree that multiple CRs are a concern. However, is this a valid use case? Should we support it? Especially if they all end up using the same pools.
I'm not really keen on leaving pool management to the user; the NFS implementation is already complex enough (at least the migration is).
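For readers of this thread: the `poolConfig` block under discussion takes Rook's usual pool spec fields. A hedged sketch of the shape being debated (field names follow Rook's common PoolSpec; exact support in the pre-1.8 CephNFS CRD may differ):

```yaml
rados:
  poolConfig:
    failureDomain: host
    # must be replicated, not erasure coded, because Ganesha uses OMAP
    replicated:
      size: 3
```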

@BlaineEXE BlaineEXE merged commit d20692a into rook:master Dec 8, 2021
@BlaineEXE BlaineEXE deleted the nfs-docs branch December 8, 2021 17:00
BlaineEXE added a commit that referenced this pull request Dec 8, 2021