Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ceph-disk: support osd new #15432

Merged
merged 18 commits into from Jul 18, 2017

Conversation

Projects
None yet
6 participants
@ghost
Copy link

ghost commented Jun 2, 2017

No description provided.

@ghost ghost added core feature labels Jun 2, 2017

@ghost ghost requested a review from liewegas Jun 2, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 2, 2017

It looks like there is no need for an architecture change. Since osd new is idempotent, it can happen while setting up the lockbox and once more when activating. There is not even a need to keep the OSD id since it will be returned in both cases as long as the OSD uuid and the secrets are identicals.

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 2, 2017

@liewegas --replace-osd-id should really be --osd-id don't you think ? ceph osd new will not mind that a non existent osd-id is provided. Or will it ?

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Jun 2, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 2, 2017

@liewegas could you please take a quick look at 6111728 and let me know if it makes sense ?

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Jun 2, 2017

@ghost ghost requested a review from jecluis Jun 2, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 2, 2017

we can't leave it in memory because activate will happen in another invocation of ceph-disk, via systemd most likely. I'll check for ways to securely delete a file on ext4.

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Jun 3, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 3, 2017

Here is an idea to never store the dmcrypt secrets on file. When creating the lockbox we store them in memory and pipe them to osd new. We also write the secrets that the OSD will need for activation (i.e. we only store the cephx_secret part of the secrets file, not the rest).

During activation, if there is a lockbox matching the OSD uuid, we get the secrets from it and move it to the OSD metadata partition. If there is no lockbox we create a secret (it does not matter that it is stored or not because it essentially contains the same information as the keyring anyway). Using that secret file (either coming from the lockbox or created during activation) we obtain the OSD id.

Does that sound sensible ?

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 3, 2017

I don't see in which scenario storing the OSD id obtained while initializing the lockbox would be useful ? It needs to be obtained from the mon when there is no lockbox. If there is a lockbox we could make a special case and not get the OSD id from the mon if the OSD id is found in the lockbox. But we would need to keep the call to the mon anyway so it would be a special case with no real advantage.

Maybe you have another workflow in mind and I don't see it ?

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 3, 2017

Oh, I forgot that osd new will only be idempotent if the secrets are exactly the same. So there is an upside to your idea of storing the OSD id after all. I get it now :-) It would go like this:

During prepare, if there is a lockbox only:

  • call osd new with the secrets.json pipe to it and not store in a file
  • write the OSD id in the whoami file of the lockbox
  • write a keyring containing only the cephx_secret in the lockbox

During activation,

  • if there is a lockbox, move the OSD id and the keyring from the lockbox to the metadata partition and do not call osd new
  • if there is no lockbox, call osd new with a secrets.json file

Ok ?

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 3, 2017

updated the draft implementation to keep the secret.json in memory (does not test or run but I have a good feeling about it ;-)

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 5, 2017

rebased & resolve trivial conflict in header ordering

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 6, 2017

reset --hard e38ca14
cherry-pick e50af6e ae8e959 f4ca9db 2dd8af1 # and resolve conflicts semi-intuitively...

@@ -8996,6 +8996,7 @@ bool OSDMonitor::prepare_command_impl(MonOpRequestRef op,
goto reply;
}

rdata.append(ss);

This comment has been minimized.

Copy link
@jecluis

jecluis Jun 6, 2017

Member

this patch can be dropped.

This comment has been minimized.

Copy link
@ghost

ghost Jun 6, 2017

Author

ack

This comment has been minimized.

Copy link
@ghost

ghost Jun 6, 2017

Author

dropped it

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 6, 2017

unless someone objects I'll move on with testing based on the current draft implementation

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Jun 6, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 7, 2017

@jecluis I added a test that does the following (see https://github.com/ceph/ceph/pull/15432/files#diff-bcd18ee54cb39db59a6d811120b705acR297 for the actual details)

  • ceph-osd --mkfs
  • id=$(ceph osd new UUID)

Works fine, shows in the crushmap and is useable

  • kill ceph-osd.$id
  • ceph osd purge osd.$id

Disapear from the crushmap as expected

  • remove the data directory
  • ceph-osd --mkfs
  • ceph osd new UUID $id

Shows in the crushmap but not in ceph df. And it is not used. Did I do something obviously wrong ?

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Jun 7, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 7, 2017

Nevermind... the test was a) writing an object to a pool with a single OSD, b) destroying the OSD, c) creating a new OSD, d) trying to overwrite the same lost object. It did not go very well...

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 7, 2017

I'll wait until http://tracker.ceph.com/issues/20208 is fixed. Something is going wrong but I can't figure it out because ceph-mgr keeps dying.

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 8, 2017

Applied 79776f5 and it fixes the ceph pg dump issue, great.

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 8, 2017

We're getting closer ! I did the wrong thing with --osd-id : won't work with Lockbox. Other than this it passes tests locally.

Next step is to pass the ceph-disk suite. If all is well it should run unmodified. I don't think it needs an additional test since the --osd-id option tests were added to make check. Unless I'm mistaken there is no need for integration test to verify it.

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 9, 2017

passes the ceph-disk suite manually, almost there ;-)

@ghost ghost changed the title WIP: ceph-disk and osd new DNM: ceph-disk and osd new Jun 9, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jun 9, 2017

@wjwithagen are you ok with c0ea4bc ?

ldachary added some commits Jun 8, 2017

ceph-disk: implement Secrets,LockboxSecrets
Supporting the JSON format accepted by osd new.

Signed-off-by: Loic Dachary <loic@dachary.org>
tests: ceph-disk tests are not shell
Signed-off-by: Loic Dachary <loic@dachary.org>
vstart: it's OK to have very little free space
Signed-off-by: Loic Dachary <loic@dachary.org>
ceph-disk: _dmcrypt_map uses command_with_stdin
Instead of an inlined version of it.

Signed-off-by: Loic Dachary <loic@dachary.org>
ceph-disk: copyright dates updates
Signed-off-by: Loic Dachary <loic@dachary.org>
tests: make ceph-disk timeout environement configurable
Signed-off-by: Loic Dachary <loic@dachary.org>
tests: generalize FreeBSD timeout special case
Signed-off-by: Loic Dachary <loic@dachary.org>
tests: ceph-disk reorganize write tests
Signed-off-by: Loic Dachary <loic@dachary.org>
tests: ceph-disk add ceph-mgr during activation tests
Signed-off-by: Loic Dachary <loic@dachary.org>
ceph-disk: use osd new instead of osd create
When using --dmcrypt, the lockbox stores the OSD id and the cephx secret
that will be used when activating the OSD.

When activating, the OSD id is copied from the lockbox if available,
otherwise it is obtained from osd new.

Add support for re-using an OSD id via the --osd-id option to prepare.

Signed-off-by: Loic Dachary <loic@dachary.org>
tests: ceph-disk test for prepare --osd-id
Signed-off-by: Loic Dachary <loic@dachary.org>
tests: ceph-disk destroy needs --purge
The former semantic of ceph-disk destroy is now implemented with the
--purge flag. Use that for the ceph-disk suite.

Signed-off-by: Loic Dachary <loic@dachary.org>
doc: ceph-disk: fix typos
Signed-off-by: Loic Dachary <loic@dachary.org>
@ghost

This comment has been minimized.

Copy link
Author

ghost commented Jul 14, 2017

@tchaikov fixed & repushed, thanks !

@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Jul 15, 2017

/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh: line 74:  7890 Terminated              rados --pool $poolname put $objname $dir/ORIGINAL
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh:85: rados_put:  return 1

this happens occasionally...

retest this please.

@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Jul 15, 2017

@alfredodeza what are you expecting to be revised/added in this PR by "needs-doc"? "destroy" is an existing sub-command, and ceph-disk is self-documented. so i guess we are ready to merge it, aren't we?

@alfredodeza

This comment has been minimized.

Copy link
Contributor

alfredodeza commented Jul 17, 2017

@tchaikov I see there are a few things that are missing (docs-wise). Here are two that I found:

  • --purge is introduced by this PR but it isn't mentioned anywhere in docs
  • you mentioned destroy is an existing command, but again, I couldn't find it anywhere (not even on the man page)

Not sure what you mean by ceph-disk being self-documented? Like everything is in the --help menus? ceph-disk does so much, in so many different variants that I don't think we can keep everything in --help. It might be insurmountable to require full docs for everything, but at least we should make the effort of adding documentation for things that are being introduced or changed.

@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Jul 17, 2017

  • --purge is introduced by this PR but it isn't mentioned anywhere in docs
  • you mentioned destroy is an existing command, but again, I couldn't find it anywhere (not even on the man page)
usage: ceph-disk destroy [-h] [--cluster NAME] [--destroy-by-id <id>]
                         [--dmcrypt-key-dir KEYDIR] [--zap] [--purge]
                         [PATH]

Destroy the OSD located at PATH.  It removes the OSD from the
cluster and marks it destroyed. An OSD must be down before it
can be destroyed. Once it is destroyed, a new OSD can be created
in its place, reusing the same OSD id and position (e.g. after
a failed HDD or SSD is replaced).  Alternatively, if the
--purge option is also specified, the OSD is removed from the
CRUSH map and the OSD id is deallocated.

positional arguments:
  PATH                  path to block device or directory

optional arguments:
  -h, --help            show this help message and exit
  --cluster NAME        cluster name to assign this disk to
  --destroy-by-id <id>  ID of OSD to destroy
  --dmcrypt-key-dir KEYDIR
                        directory where dm-crypt keys are stored (If you don't
                        know how it work, dont use it. we have default value)
  --zap                 option to erase data and partition
  --purge               option to remove OSD from CRUSH map and deallocate the
                        id

Not sure what you mean by ceph-disk being self-documented? Like everything is in the --help menus?

yeah, see the paste above

ceph-disk does so much, in so many different variants that I don't think we can keep everything in --help.

by "we can", you mean we should?

It might be insurmountable to require full docs for everything, but at least we should make the effort of adding documentation for things that are being introduced or changed.

no, i don't really think it's worthy to keep two copies of the help message in sync. and d355a38 explains the rationale.

@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Jul 18, 2017

@alfredodeza does it make sense? sorry for pushing you, but it'd be great if we can have this change in luminous.

@alfredodeza

This comment has been minimized.

Copy link
Contributor

alfredodeza commented Jul 18, 2017

@tchaikov an option description in --help doesn't mean it is fully documented :( Most of the flags in ceph-disk have very short descriptions in its help output which don't exemplify what they do.

We currently have ceph-disk documentation online. Do you mean to say we shouldn't add to it because the help menus in the CLI are sufficient? When would it be OK to add to the online docs?

As a user, it seems unfair to expect one to go on all the sub-commands running --help to understand what the tool offers, and have a very different situation when browsing the docs.

For example, the 'osd removal' section doesn't mention this:
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/

And neither does the 'ceph-disk' section of the manual deployment.
http://docs.ceph.com/docs/master/install/manual-deployment/#short-form

If the online docs tell a different story than the help menu in the tool, where is the benefit? How can we improve this situation so that a user isn't surprised by options/workflows that are not explained unless they run --help ?

@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Jul 18, 2017

Most of the flags in ceph-disk have very short descriptions in its help output which don't exemplify what they do.

we can always address this in anther PR. but i don't think it's in the scope of this one.

For example, the 'osd removal' section doesn't mention this:

see #16314

As a user, it seems unfair to expect one to go on all the sub-commands running --help to understand what the tool offers, and have a very different situation when browsing the docs.

agreed. so the man pages offers the info of what each subcommand can do, but only briefly.

@alfredodeza

This comment has been minimized.

Copy link
Contributor

alfredodeza commented Jul 18, 2017

This can be merged as-is, but now that -i can take - to read from stdin in master it might be worth to update to that?

pr #16359

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Jul 18, 2017

Let's do that as a follow-on cleanup

@liewegas liewegas merged commit fd9582f into ceph:master Jul 18, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
make check (arm64) make check succeeded
Details
@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Jul 19, 2017

@alfredodeza for the -i - change, see #16362

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.