Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent duplicated xattr between SA and dir #4153

Closed
wants to merge 1 commit into from

Conversation

tuxoko
Copy link
Contributor

@tuxoko tuxoko commented Dec 31, 2015

When replacing an xattr would cause overflowing in SA, we would fallback
to xattr dir. However, current implementation don't clear the one in SA,
so we would end up with duplicated SA.

For example, running the following script on an xattr=sa filesystem
would cause duplicated "user.1".

-- dup_xattr.sh begin --

!/bin/sh

Modified from efault_reproducer.sh by @nedbass

randbase64()
{
dd if=/dev/urandom bs=1 count=$1 2>/dev/null | openssl enc -a -A
}

file=$1
touch $file
setfattr -h -n user.1 -v randbase64 5000 $file
setfattr -h -n user.2 -v randbase64 20000 $file
setfattr -h -n user.3 -v randbase64 20000 $file
setfattr -h -n user.1 -v randbase64 20000 $file
getfattr -m. -d $file
-- dup_xattr.sh end --

Also, when a filesystem is switch from xattr=sa to xattr=on, it will
never modify those in SA. This would cause strange behavior like, you
cannot delete an xattr, or setxattr would cause duplicate and the result
would not match when you getxattr.

For example, the following shell sequence.

-- shell begin --
$ sudo zfs set xattr=sa pp/fs0
$ touch zzz
$ setfattr -n user.test -v asdf zzz
$ sudo zfs set xattr=on pp/fs0
$ setfattr -x user.test zzz
setfattr: zzz: No such attribute
$ getfattr -d zzz
user.test="asdf"
$ setfattr -n user.test -v zxcv zzz
$ getfattr -d zzz
user.test="asdf"
user.test="asdf"
-- shell end --

We fix this behavior, by first finding where the xattr resides before
setxattr. Then, after we successfully updated the xattr in one location,
we will clear the other location. Note that, because update and clear
are not in single tx, we could still end up with duplicated xattr. But
by doing setxattr again, it can be fixed.

Signed-off-by: Chunwei Chen david.chen@osnexus.com

@tuxoko
Copy link
Contributor Author

tuxoko commented Dec 31, 2015

Fix #3472

@behlendorf
Copy link
Contributor

The approach looks reasonable to me. There might be a slightly performance penalty for set since we're now always checking the directory and the SA but it's the price we need to pay for correctness. The get path should still be fast which is the one primarily care about.

@nedbass
Copy link
Contributor

nedbass commented Jan 4, 2016

There might be a slightly performance penalty for set since we're now always checking the directory and the SA but it's the price we need to pay for correctness.

We could always add a "strict" SA mode for xattrs if the performance penalty is significant. One could set xattr=saonly, for example, to tell ZFS to never use dir-based xattrs, allowing the extra checks to be skipped.

@tuxoko
Copy link
Contributor Author

tuxoko commented Jan 4, 2016

"xattr=saonly" sounds like a reasonable option if we need it, but it will need to be fixed at filesystem creation.

@tuxoko
Copy link
Contributor Author

tuxoko commented Jan 4, 2016

I'll elaborate a bit more.

I don't think this is needed. It must be set via __zpl_xattr_where()->zpl_xattr_get_sa().

If zpl_xattr_set_sa failed, it will flush the z_xattr_cached. Then, after zpl_xattr_set_dir success, we need to call zpl_xattr_set_sa again to clear it. So the second turn need it.

Isn't the (value == NULL && where & XATTR_IN_SA case handled below on line 602?

The issue is when we need to clear an xattr in SA while xattr=on. If we don't add this case, the first zpl_xattr_set_sa will be skipped. Then later, zpl_xattr_get_dir will failed with ENOENT.

When replacing an xattr would cause overflowing in SA, we would fallback
to xattr dir. However, current implementation don't clear the one in SA,
so we would end up with duplicated SA.

For example, running the following script on an xattr=sa filesystem
would cause duplicated "user.1".

-- dup_xattr.sh begin --
randbase64()
{
        dd if=/dev/urandom bs=1 count=$1 2>/dev/null | openssl enc -a -A
}

file=$1
touch $file
setfattr -h -n user.1 -v `randbase64 5000` $file
setfattr -h -n user.2 -v `randbase64 20000` $file
setfattr -h -n user.3 -v `randbase64 20000` $file
setfattr -h -n user.1 -v `randbase64 20000` $file
getfattr -m. -d $file
-- dup_xattr.sh end --

Also, when a filesystem is switch from xattr=sa to xattr=on, it will
never modify those in SA. This would cause strange behavior like, you
cannot delete an xattr, or setxattr would cause duplicate and the result
would not match when you getxattr.

For example, the following shell sequence.

-- shell begin --
$ sudo zfs set xattr=sa pp/fs0
$ touch zzz
$ setfattr -n user.test -v asdf zzz
$ sudo zfs set xattr=on pp/fs0
$ setfattr -x user.test zzz
setfattr: zzz: No such attribute
$ getfattr -d zzz
user.test="asdf"
$ setfattr -n user.test -v zxcv zzz
$ getfattr -d zzz
user.test="asdf"
user.test="asdf"
-- shell end --

We fix this behavior, by first finding where the xattr resides before
setxattr. Then, after we successfully updated the xattr in one location,
we will clear the other location. Note that, because update and clear
are not in single tx, we could still end up with duplicated xattr. But
by doing setxattr again, it can be fixed.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
@tuxoko
Copy link
Contributor Author

tuxoko commented Jan 4, 2016

Update: change _where() to use pointer to return where. And delete curly braces for one liner.

@nedbass
Copy link
Contributor

nedbass commented Jan 5, 2016

I'm just wondering if this can be reworked to export an interface like __zpl_xattr_where() for use by Lustre. Perhaps it's not practical to move this interface out of the ZPL. But as it stands the Lustre ZFS-OSD will have to replicate this functionality to fix the same bug.

@lundman
Copy link
Contributor

lundman commented Jan 5, 2016

Not too fond of xattr=saonly, that just seems to complicate matters needlessly. What about making xattr=sa be a create-time only option? Then you can ignore dir-based xattrs completely. (Wasn't xattr=sa's original design focusing on speed?)

@tuxoko
Copy link
Contributor Author

tuxoko commented Jan 5, 2016

@lundman xattr=sa as it currently stands will still use dir when sa is full. So you can't ignore it even if it's fixed at creation.

Regarding the performance penalty, in order for __zpl_xattr_where to have meaningful penalty, you need to already have xattr directory when xattr=sa. And such thing will happen only when SA is full, which should be rare.

@ilovezfs
Copy link
Contributor

ilovezfs commented Jan 5, 2016

The maximum size xattr could be set lower in the case of xattr=sa, rather than failing over to dir, so that's not an especially compelling argument.

@tuxoko
Copy link
Contributor Author

tuxoko commented Jan 5, 2016

@ilovezfs
We cannot break existing users, the only way to add extra limitation is to add new config mode. Unless xattr=sa has always been marked experimental and users are at their own peril until now.

@behlendorf
Copy link
Contributor

@nedbass I'd love to rework the interface for Lustre but I don't think it's practical. We'd need to introduce some new generic structure for this. Maybe someday, but not today.

@lundman I've mixed feeling about adding a xattr=saonly, but that's really a discussion for another time. We can't change the long standing expected behavior of xattr=sa. xattrs are allowed to spill from the sa region to directories for a variety of reasons. We should just address the corner cases which were missed in the original implementation as this patch does.

This patch LGTM but I'd like to see a few more people review and test this before merging if possible.

@dweeezil
Copy link
Contributor

dweeezil commented Jan 8, 2016

I've not been following this closely but it sounds somewhat related to my 0e08566 patch from some time ago. I forgot which issue and/or pull request this was associated with. Tha patch has been sitting idle for a long time now but I do remember that it worked properly, and the main missing feature was that it didn't remove empty xattr directories.

The main reason I'm chiming in on this issue is not to revisit my patch, but because at that time, I think the general consensus was that ultimately, feature flags were the way to go for various xattr storage methods. I remember proposing other xattr=XXX settings as well. Clearly per-dataset feature flags would be very helpful. Eventually, it would be nice to go back to only xattr=on and xattr=off.

@behlendorf
Copy link
Contributor

This patch addresses the issues we're seeing today without changing the existing behavior so after some additional manual testing I've merged it to master. Let's move the discussion of adding per-dataset features flags, which is a great idea, to another issue.

21f604d Prevent duplicated xattr between SA and dir

goulvenriou pushed a commit to Alyseo/zfs that referenced this pull request Jan 17, 2016
When replacing an xattr would cause overflowing in SA, we would fallback
to xattr dir. However, current implementation don't clear the one in SA,
so we would end up with duplicated SA.

For example, running the following script on an xattr=sa filesystem
would cause duplicated "user.1".

-- dup_xattr.sh begin --
randbase64()
{
        dd if=/dev/urandom bs=1 count=$1 2>/dev/null | openssl enc -a -A
}

file=$1
touch $file
setfattr -h -n user.1 -v `randbase64 5000` $file
setfattr -h -n user.2 -v `randbase64 20000` $file
setfattr -h -n user.3 -v `randbase64 20000` $file
setfattr -h -n user.1 -v `randbase64 20000` $file
getfattr -m. -d $file
-- dup_xattr.sh end --

Also, when a filesystem is switch from xattr=sa to xattr=on, it will
never modify those in SA. This would cause strange behavior like, you
cannot delete an xattr, or setxattr would cause duplicate and the result
would not match when you getxattr.

For example, the following shell sequence.

-- shell begin --
$ sudo zfs set xattr=sa pp/fs0
$ touch zzz
$ setfattr -n user.test -v asdf zzz
$ sudo zfs set xattr=on pp/fs0
$ setfattr -x user.test zzz
setfattr: zzz: No such attribute
$ getfattr -d zzz
user.test="asdf"
$ setfattr -n user.test -v zxcv zzz
$ getfattr -d zzz
user.test="asdf"
user.test="asdf"
-- shell end --

We fix this behavior, by first finding where the xattr resides before
setxattr. Then, after we successfully updated the xattr in one location,
we will clear the other location. Note that, because update and clear
are not in single tx, we could still end up with duplicated xattr. But
by doing setxattr again, it can be fixed.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#3472
Closes openzfs#4153
behlendorf pushed a commit that referenced this pull request Jan 29, 2016
When replacing an xattr would cause overflowing in SA, we would fallback
to xattr dir. However, current implementation don't clear the one in SA,
so we would end up with duplicated SA.

For example, running the following script on an xattr=sa filesystem
would cause duplicated "user.1".

-- dup_xattr.sh begin --
randbase64()
{
        dd if=/dev/urandom bs=1 count=$1 2>/dev/null | openssl enc -a -A
}

file=$1
touch $file
setfattr -h -n user.1 -v `randbase64 5000` $file
setfattr -h -n user.2 -v `randbase64 20000` $file
setfattr -h -n user.3 -v `randbase64 20000` $file
setfattr -h -n user.1 -v `randbase64 20000` $file
getfattr -m. -d $file
-- dup_xattr.sh end --

Also, when a filesystem is switch from xattr=sa to xattr=on, it will
never modify those in SA. This would cause strange behavior like, you
cannot delete an xattr, or setxattr would cause duplicate and the result
would not match when you getxattr.

For example, the following shell sequence.

-- shell begin --
$ sudo zfs set xattr=sa pp/fs0
$ touch zzz
$ setfattr -n user.test -v asdf zzz
$ sudo zfs set xattr=on pp/fs0
$ setfattr -x user.test zzz
setfattr: zzz: No such attribute
$ getfattr -d zzz
user.test="asdf"
$ setfattr -n user.test -v zxcv zzz
$ getfattr -d zzz
user.test="asdf"
user.test="asdf"
-- shell end --

We fix this behavior, by first finding where the xattr resides before
setxattr. Then, after we successfully updated the xattr in one location,
we will clear the other location. Note that, because update and clear
are not in single tx, we could still end up with duplicated xattr. But
by doing setxattr again, it can be fixed.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #3472
Closes #4153
goulvenriou pushed a commit to Alyseo/zfs that referenced this pull request Feb 4, 2016
When replacing an xattr would cause overflowing in SA, we would fallback
to xattr dir. However, current implementation don't clear the one in SA,
so we would end up with duplicated SA.

For example, running the following script on an xattr=sa filesystem
would cause duplicated "user.1".

-- dup_xattr.sh begin --
randbase64()
{
        dd if=/dev/urandom bs=1 count=$1 2>/dev/null | openssl enc -a -A
}

file=$1
touch $file
setfattr -h -n user.1 -v `randbase64 5000` $file
setfattr -h -n user.2 -v `randbase64 20000` $file
setfattr -h -n user.3 -v `randbase64 20000` $file
setfattr -h -n user.1 -v `randbase64 20000` $file
getfattr -m. -d $file
-- dup_xattr.sh end --

Also, when a filesystem is switch from xattr=sa to xattr=on, it will
never modify those in SA. This would cause strange behavior like, you
cannot delete an xattr, or setxattr would cause duplicate and the result
would not match when you getxattr.

For example, the following shell sequence.

-- shell begin --
$ sudo zfs set xattr=sa pp/fs0
$ touch zzz
$ setfattr -n user.test -v asdf zzz
$ sudo zfs set xattr=on pp/fs0
$ setfattr -x user.test zzz
setfattr: zzz: No such attribute
$ getfattr -d zzz
user.test="asdf"
$ setfattr -n user.test -v zxcv zzz
$ getfattr -d zzz
user.test="asdf"
user.test="asdf"
-- shell end --

We fix this behavior, by first finding where the xattr resides before
setxattr. Then, after we successfully updated the xattr in one location,
we will clear the other location. Note that, because update and clear
are not in single tx, we could still end up with duplicated xattr. But
by doing setxattr again, it can be fixed.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes openzfs#3472
Closes openzfs#4153
@behlendorf behlendorf added this to the 0.6.5.5 milestone Mar 23, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants