Skip to content
This repository has been archived by the owner on Nov 7, 2019. It is now read-only.

7584 Improve 'zpool labelclear' command #424

Closed
wants to merge 15 commits into from

Conversation

martymac
Copy link

Hi,

This patch improves the 'zpool labelclear' command.

See :

https://www.illumos.org/issues/7584

for a detailed description and test cases.

Best regards,

Ganael.

@@ -198,6 +198,7 @@ SYMBOL_VERSION SUNW_1.1 {
nvlist_alloc;
nvlist_dup;
nvlist_free;
nvlist_invalidate;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't add the symbols to already existing public version, either move it to SUNWprivate_1.1, or provide new public version.

free(label);
return (-1);
for (l = start; l < end; l++) {
if ((check == B_TRUE) || (cherry == B_TRUE)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

treat these as booleans that they are, i.e., if (check || cherry).

n = VDEV_LABELS / 2;
break;
case 'i':
index = strtoll(optarg, &end, 10);
Copy link

@ghost ghost Jul 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about using our new friend, strtonum(), here? :-)

@@ -105,6 +105,9 @@
.Op Ar interval Op Ar count
.Nm
.Cm labelclear
.Op Fl b | Fl e | Fl i Ar index
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we please make it (here, usage, description) zfs labelclear [-cfm] [-b|-e|-i index] device?

@@ -2355,6 +2355,18 @@ nvlist_common(nvlist_t *nvl, char *buf, size_t *buflen, int encoding,
}

int
nvlist_invalidate(char *buf)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this function be taking the size of the buffer in question that we're modifying? I realize that we're only modifying the first byte at this time which in theory should always be valid for a char *, but seems like if this ever changes, that'll be helpful as the function is making an implicit assumption about the size.

@martymac
Copy link
Author

Thanks for your feedback! I've just committed changes. Can you check them out ?

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me except for the small nit in the Makefile, thanks!

INCS += -I../../common/zfs -I$(STATCOMMONDIR)

C99MODE= -xc99=%all
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the $(C99_ENABLE) macro instead for both.

@martymac
Copy link
Author

Hi,

Thanks for your feedback.

C99LMODE seems to be derived from C99MODE (see Makefile.master) so setting C99MODE only should be enough ? I've modified the Makefile accordingly.

Regards,

Ganael.

@ahrens
Copy link
Member

ahrens commented Jul 25, 2017

This adds quite a lot of flexibility to an already obscure subcommand, making it even harder to know how to use correctly. Is there any reason you wouldn't always want the -cm behavior? What would you think about changing zpool labelclear (without any flags) to always make the minimal modification, and only to labels which we detect are actual ZFS labels? That way it would always behave like your -cm, without the need for 2 additional options.

@martymac
Copy link
Author

Hi Matthew,

Thanks for your feedback.

My original idea was to extend the command while keeping its default behaviour. The reason is simple : that command is quite new in OpenZFS world but has already existed on FreeBSD for about 6 years now, see :

https://svnweb.freebsd.org/base?view=revision&revision=224171

People are already used to it and changing its default behaviour will probably break things (scripts, brains, ...).

That's just my 2 cents by I would personally prefer keeping it that way to avoid -more- confusion :)

@ahrens
Copy link
Member

ahrens commented Jul 25, 2017

@martymac What could break if we change the default behavior? It seems like it would still have the same semantic of leaving the disk without any valid labels.

@martymac
Copy link
Author

What I mean is that people are probably used to the fact that this command blindly zeroes 512KiB at the beginning and the end of the disk and may (ab)use that property in a way or another (e.g. in an install script to wipe other kinds of metadata - shame on me, I remember having done that myself).

Also, if you make -cm the default (and only) available methods, there will be no easy way of "full cleaning" labels. Yes, this can always be replaced by a dd pass but the end of the disk is trickier to reach than a simple labelclear. We could revert options meaning : i.e. make -cm the default and provide 2 options, one for 'not checking before erasing' and another for the 'full mode', but that will leave the same complexity while changing users habits.

@martymac
Copy link
Author

@ahrens Hi Matthew, what do you think about my previous comments ? Can we leave the default options as they are ?

@ahrens
Copy link
Member

ahrens commented Aug 23, 2017

I'm not convinced that people are "used to the fact that this command blindly zeroes 512KiB at the beginning and the end of the disk". I think that this command is seldom-used, and when it's used it's for the documented purpose: "Removes ZFS label information." That said, I'm open to evidence to the contrary.

I agree that with my suggestion, "there will be no easy way of "full cleaning" labels", but I don't know of the use case for that. I don't consider "abusing ... to wipe other kinds of metadata" a legitimate use case.

(one can still override checks using option '-f')
(option '-m' becomes option '-w' to zero labels entirely as was done before)
(an exported pool already needs option -f to clear labels: we cannot use a
single -f to handle invalid labels as well)
@martymac
Copy link
Author

@ahrens Hi Matthew,

Well, I have modified the patch to try to keep simplicity as well as flexibility :

  • As you suggest, always check label validity by default (but one can use the -f flag twice to override that behaviour)
  • Use the minimal mode by default but provide an option (-w) to wipe (zero) labels entirely as was done before
  • To maintain ZFS library compatible, zpool_clear_label() still wipes the full labels and does not check for label validity

I think it is better that way : the base command remains simple but one can always bypass checks and wipe labels if necessary. What do you think ?

Best regards,

Ganael.

@martymac
Copy link
Author

martymac commented Sep 1, 2017

@ahrens Any thoughts on that new patch ?

@ahrens
Copy link
Member

ahrens commented Sep 6, 2017

Sorry for the latency, I was on vacation last week.

That design sounds OK to me. I (or someone else familiar with this code) still need to review the code. @yuripv do you want to take another look?

@@ -47,6 +48,7 @@
#include <zone.h>
#include <zfs_prop.h>
#include <sys/fs/zfs.h>
#include <sys/vdev_impl.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that this is causing the compilation (lint) error (see http://jenkins.open-zfs.org/blue/organizations/jenkins/openzfs%2Fopenzfs/detail/PR-424/4/pipeline). Is this needed just for VDEV_LABELS? Maybe we should move that to a different header file (e.g. zfs.h)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is needed just for VDEV_LABELS. You mean moving VDEV_LABELS definition to sys/fs/zfs.h ? I don't know what that would imply for other parts of the code :/ Help would be welcome here :)

*
* -f Force clearing the label for the vdevs which are
* members of the exported or foreign pools. Also consider
* seemingly invalid labels as valid ones.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this now say, "if specified twice, clear even seemingly invalid labels"?

break;
case 'i':
index = strtonum(optarg, 0, VDEV_LABELS - 1, &errstr);
if(errstr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cstyle: add space after if, and add explicit != NULL:
if (strerr != NULL) {

case 'f':
if(force)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add space after if, and add braces for multi-line body (even if only one statement)

.Bl -tag -width Ds
.It Fl f
Treat exported or foreign devices as inactive.
Force mode: treat exported or foreign devices as inactive.
Specify twice to treat invalid labels as valid ones.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about, Specify twice to clear even seemingly-invalid labels.

@martymac
Copy link
Author

martymac commented Sep 7, 2017

Hi Matthew,

I've pushed fixes following your recommendations, thanks :)

Only remains warnings about abd stuff. Any hint ?

@ahrens
Copy link
Member

ahrens commented Sep 7, 2017

You mean moving VDEV_LABELS definition to sys/fs/zfs.h ?

Yes.

- this removes compilation warnings about static unused abd_* functions related
to inclusion of sys/vdev_impl.h.
- while here, also remove definition of -unused- VDEV_BEST_LABEL.
@martymac
Copy link
Author

martymac commented Sep 8, 2017

@ahrens My last commit should fix compilation warnings.

@martymac
Copy link
Author

Compilation warnings have been fixed and all checks have passed.

@ahrens @rmustacc @yuripv Do you have additional comments regarding that PR ?

@ahrens
Copy link
Member

ahrens commented Sep 11, 2017

@martymac I'll take a look at your latest changes.

@martymac
Copy link
Author

@ahrens Have you had time to take a deeper look at latest changes ?

@martymac
Copy link
Author

martymac commented Jan 8, 2018

Hi and happy new year :)

I've updated the patch with latest changes from master. If there are no more comments, can it be merged upstream ?

Copy link
Member

@grwilson grwilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have concerns about the ultimate goal for this command. It's unclear if we really want to "wipe" or just "forget" about this device.

@@ -709,8 +751,12 @@ zpool_do_labelclear(int argc, char **argv)
}

if (zpool_read_label(fd, &config) != 0) {
(void) fprintf(stderr,
gettext("failed to read label from %s\n"), vdev);
if (force)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this check force_invalid since it's possible for zpool_read_label to return -1 for reasons other than the inability to read the label. If just force is used then we could wipe out the label of a legit pool.

Copy link
Author

@martymac martymac Feb 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @grwilson ,

The original version of FreeBSD's labelclear command allowed to forcibly wipe a label in any case (except when a pool is in use), see :

https://svnweb.freebsd.org/base?view=revision&revision=224171

but that "feature" disappeared when the labelclear command was imported from upstream :

https://svnweb.freebsd.org/base?view=revision&revision=297760

So this is just a fix to re-add that possibility : if we really want to force a wipe out, I think the command should not prevent us from doing so ; and that becomes even more useful when working with individual labels, as that patch now allows.

if (wipe) {
(void) memset(&label, 0, sizeof (vdev_label_t));
} else {
if (nvlist_invalidate(buf, buflen) != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to prevent zpool import from showing the pool, then we should set the TXG value to 0 which is how ZFS clears a label while still leaving some information around for debugging.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main goal of the patch is to allow wiping a single label with minimal changes.

Acting on nvs encoding type (introducing a new type "invalid") allows touching a single byte, which is great because it minimizes the risks of damaging another FS that would have been created over the label. Acting on txg would touch 8 bytes and greatly improve the chances to break something. Also, as a side effect, acting on nvs encoding type keeps last txg value available for debugging.

@grwilson
Copy link
Member

grwilson commented Feb 9, 2018

One of the guiding principles for zfs is simple administration and it seems like we're exposing way too many knobs to the administrator. These knobs expose the internals of the product. For example, if I want to clear the labels, why wouldn't I just run zpool labelclear and have the command figure out if we need to clear label 2 and 3 or 0-3 or any combination? In other words if the user wants to clear the labels then clear whatever valid labels we find. If the user specifies -f then clear them all. I like that this change is trying to protect the user but it seems like we can accomplish this without having to make the user figure out the internal details of the product.

This is less of a concern but one that I want to make sure we can discuss -- I still struggle with invalidating the nvlist encoding vs setting txg = 0. Yes, it's 1 byte vs 8 bytes so we have a smaller chance of impacting any software that is using that disk but the chance is not 0% in either case. I would argue that setting the value of txg=0 at least provides some diagnostics with existing tools and possibly some recovery opportunities that are not available with the nvlist invalidate case. For example, if invalidating the nvlist impacts the software running on that disk, how would you ever know that the disk was once used by zfs and has been invalidated? You would end up with corruption with no way of diagnosing what caused it. We could enhance the tools to look for the invalid encoding making this less of an issue. I recognize this is not a new problem since the current implementation "wipes" the labels. This is why I wonder if we ever need to "wipe" the labels or do we just want zfs to forget about this device? Do we know of cases where we have needed to "wipe" the label?

@jpaetzel
Copy link

jpaetzel commented Feb 9, 2018

I'll chime in here.

There have been a number of cases in the past where I have wanted the moral equivalent to dd if=/dev/zero of=/dev/da0 bs=1M

Typically, and this may be FreeBSD specific, it's a case of ZFS thinks the device is unused, then geom attaches to the device, then ZFS magically sees that oh yes, there is a label there.

An example might be where you GPT partition and put a UFS filesystem on a disk that used to have ZFS on it.

From a ZFS user perspective:

I'd like it if there was a ZF safe and friendly "forget about this device" as well as a "nuke it from orbit" option that was faster than dd. (Yes, you can avoid running dd on the whole disk if you calculate the number of sectors on the disk and use the appropriate seek argument, but sometimes that takes enough time to figure out that I just let the dd run and go do something else for a while)

@martymac
Copy link
Author

Hi @grwilson,

As a recall, the original problem that led to the patch is described here :

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204622

These knobs expose the internals of the product.

Yes, but I don't think exposing the internals of the product is a problem here because it does not make the command more complex : for the average user, its default action is still the same (invalidating all four labels). We just provide an administrator the possibility to go beyond and avoid the hassle of using dd plus a seek argument (this is a kind of simplification, but for the administrator) as @jpaetzel explains.

why wouldn't I just run zpool labelclear and have the command figure out if we need to clear label 2 and 3 or 0-3 or any combination?

I do not always want to let ZFS choose the labels for me because I may want to minimally invalidate them at the beginning of a device (to avoid issues described at the link above) and totally wipe (zeroe) them at the end (e.g. to produce a clean disk image). The current implementation simply does not provide enough flexibility for that.

I still struggle with invalidating the nvlist encoding vs setting txg = 0
[...] I wonder if we ever need to "wipe" the labels or do we just want zfs to forget about this device?

As @jpaetzel wrote, I think we want to be able to do both. The default behaviour is to minimally invalidate labels, but an option is provided to wipe them entirely if one needs it.

Regarding the idea of invalidating a label using its encoding style, you're right, touching 1 byte can break another FS too, but it's always better than modifying 8 bytes (and even better than what is done currently : zeroing 4 * 256 KiB). Also, as you noticed, debugging tools could be expanded to handle that kind of invalid encoding.

Best regards,

Ganael.

@martymac
Copy link
Author

Hi @grwilson,
Do you have any other concerns / comments about that patch ?

@martymac
Copy link
Author

Hi @ahrens, @grwilson,

Are there news about that patch ?

Best regards,
Ganael.

@behlendorf
Copy link

@martymac @ahrens @grwilson @yuripv this is an issue we've definitely seen on the Linux side as well and would like to see resolved.

I've read though the comments above it sounds like there's still some needed discussion over exactly which command lines options to support. However, there is general agreement that zpool labelclear should not be allowed to zero an offset without verifying it contains a valid label.

In order to help move this issue forward I've opened PR openzfs/zfs#8500 against ZFS on Linux which solely addresses that concern. No new options are added and only valid labels are zeroed. Additional improvements are left as future work. This change is in need of reviewers if you have the time. Thanks!

martymac added a commit to martymac/zfs that referenced this pull request Mar 29, 2019
That patch follows :
  openzfs#8500
and is a port to ZoL of:
  openzfs/openzfs#424

It adds the following:
  - ability to clear a specific label (options -b, -e and -i)
  - label invalidation using a single-byte modification (wipe with option -w)
  - option -ff to force invalidation even for invalid labels
martymac added a commit to martymac/zfs that referenced this pull request Mar 29, 2019
That patch follows:
  openzfs#8500
and is a port to ZoL of:
  openzfs/openzfs#424

It adds the following:
  - ability to clear a specific label (options -b, -e and -i)
  - label invalidation using a single-byte modification (wipe with option -w)
  - option -ff to force invalidation even for invalid labels

OpenZFS-issue: https://www.illumos.org/issues/7584
@martymac
Copy link
Author

Hi @behlendorf,

Sorry for lagging. If you are still interested in other parts of my patch, here is a port to ZoL :

openzfs/zfs#8548

There are probably still things to discuss about, probably :

  • do we want to handle single-byte invalidation ?
  • merge -f and -ff options together ?

but that PR can serve as a basis.

Note : I could not review your code in time but it introduces a small mistake : pwrite64() should also skip (2 * VDEV_PAD_SIZE) when reading from label, but as you write zeroes, that does not matter in your patch. That needed to be fixed in my patch to get the single-byte invalidation work.

Best regards,

@ahrens
Copy link
Member

ahrens commented Oct 31, 2019

This repo will be archived in the coming week, and therefore this PR is being closed. Alternate processes for contributing ZFS changes (including those in open PR's) include following the illumos procedures, or opening a PR against ZFSonLinux.

The reasoning behind this are outlined in an email to the developer@open-zfs.org mailing list, which is reproduced below:

The OpenZFS code repository on github (http://github.com/openzfs/openzfs) is a clone of the illumos repo, with basically identical code. The OpenZFS repo made it easier to contribute ZFS code to illumos, by leveraging the github pull request and code review processes, and by automatically building illumos and running the ZFS Test Suite.

Unfortunately, the automated systems have atrophied and we lack the effort and interest to maintain it. Meanwhile, the illumos code review and contribution process has been working well for a lot of ZFS changes (notably including ports from Linux).

Since the utility of this repo has decreased, and the volunteer workforce isn't available to maintain it, we will be archiving http://github.com/openzfs/openzfs in the coming week. Thank you to everyone who helped maintain this infrastructure, and to those who leveraged it to contribute over 500 commits to ZFS on illumos! Alternate processes for contributing ZFS changes (including those in open PR's) include following the illumos procedures, or opening a PR against ZFSonLinux.

The OpenZFS github organization (http://github.com/openzfs) will continue to exist. I will make an announcement about its future role at the OpenZFS Developer Summit on Monday.

Let me know if you have any questions,
--matt

@ahrens ahrens closed this Oct 31, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
6 participants