Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/chain_xattr: cope with xfs limit of 254 bytes for inlined xattrs. #4070

Merged
merged 4 commits into from May 29, 2015

Conversation

liewegas
Copy link
Member

No description provided.

@liewegas
Copy link
Member Author

@mslovy how does this look? (untested, and really needs some tests in test/objectstore/chain_xattr.cc, if you have the time!)

@loic-bot
Copy link

FAIL: the output of run-make-check.sh on centos-7 for 0839435 is http://paste2.org/pysp2VEE

:octocat: Sent from GH.

@loic-bot
Copy link

FAIL: the output of run-make-check.sh on centos-7 for f7d6fb4 is http://paste2.org/4hFvgYgm

:octocat: Sent from GH.

@liewegas
Copy link
Member Author

added tests, passes make check

@loic-bot
Copy link

FAIL: the output of run-make-check.sh on centos-7 for 9904310 is http://paste2.org/H0Jk4Gy9

:octocat: Sent from GH.

@loic-bot
Copy link

SUCCESS: the output of run-make-check.sh on centos-7 for d4acd74 is http://paste2.org/MLWfEvGx

:octocat: Sent from GH.

@mslovy
Copy link
Contributor

mslovy commented Mar 19, 2015

@liewegas reasonable, I agree that it becomes awkward if it breaks into too many 250bytes attrs. Xfs does linear search if the number of xattr items is small. But too many xattr items will result in a BTree Extent Struct (which is also not inline) in order to improve the search speed. I remember the critical point it changes to a BTree Format is 10, but not sure now. So We may set CHAIN_XATTR_SHORT_LEN_THRESHOLD this value smaller (say, 1000 or 750)?

@yuyuyu101
Copy link
Member

I wonder whether 254 bytes or 10 items spill out value are configurated or hard-coded?

It looks a little inflexible to let hardcoded a spill out value in ceph? maybe we can detect it via xfs command?

@liewegas
Copy link
Member Author

changed it to 1000 (4x 250byte stripes) and repushed.

i don't think it's a good idea to make this tunable, since the read path explicitly looks for multiples of the stripe length and we will fail to read old ones if the values change. it is annoying that this is working around an XFS thing, though. :/ but, that's what almost everyone is using with FileStore..

@loic-bot
Copy link

SUCCESS: the output of run-make-check.sh on centos-7 for 1465a67 is http://paste2.org/gIfBbb5D

:octocat: Sent from GH.

@yuyuyu101
Copy link
Member

Oh, I think my meaning is whether 254 bytes and 10 items can be configurated in xfs side. So if we can improve 254 to 512 bytes, it should be enough to store ceph internal xattr in one call.

@liewegas
Copy link
Member Author

oh. in xfs the limit is fixed because of the on-disk format. they are
using a single byte to encode the length of the xattr value. the inode
size can change, which effects how small attrs we can store... right now
we recommend (and the tools use) 2k.


r = sys_getxattr(fn, raw_name, (char *)val + pos, chunk_size);
if (i && r == -ENODATA) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ENODATA may be ENOATTR better

BSD, Linux, DARWIN all with return ENOATTR and ENOATTR is defined to be ENODATA in Linux

@liewegas
Copy link
Member Author

liewegas commented Apr 9, 2015

@athanatos I think we should run this through qa?

@ghost
Copy link

ghost commented Apr 9, 2015

please ignore the last bot failure (network problem)

@athanatos
Copy link
Contributor

error: os/chain_xattr.cc:141:20: 'ENOATTR' was not declared in this scope

@yuyuyu101
Copy link
Member

@liewegas It seemed that we need to wrap ENOATTR or ENODATA at lower level:

  1. return ENOATTR, while ENODATA is defined and has other usages: NetBSD
    and MacOS X

  2. return ENODATA, while ENOATTR is not defined: Linux

  3. return ENOATTR, while ENODATA is not defined: FreeBSD

Linux: ENOATTR is defined to be a synonym for ENODATA in <attr/xattr.h>

Since only linux doesn't define ENOATTR and <attr/xattr.h> is included in attr package, maybe we define ENOATTR to ENODATA in common/xattr.c for linux?

@liewegas
Copy link
Member Author

liewegas commented Apr 17, 2015 via email

@liewegas
Copy link
Member Author

error: os/chain_xattr.cc:141:20: 'ENOATTR' was not declared in this scope

Oh, i see.

Yeah, let's do

#ifndef ENOATTR

define ENOATTR ENODATA

#endif

in our errno.h?

@yuyuyu101
Copy link
Member

Yeah, it looks ok.

If we wrote an xattr that was a multiple of a chunk, we will try to read
the next chunk and get ENODATA.  If that happens bail out of the loop and
assume we've read the whole thing.

Backport: hammer, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
XFS has a hard limit of 255 (or 254?) bytes for xattrs to be inlined in the
inode due to a single byte for the length in the on-disk format.  If we
have an xattr that is a bit bigger than that it will get kicked out into a
separate extent, but if we stripe it across a few shorter xattrs it can
be inlined when the xfs inode is e.g. 2K.

Adjust the chain_xattr logic to capture this case.  Note that we are doing
this unconditionally regardless of fs type, but that is probably okay
given that most users use XFS and the cost isn't huge.

Reported-by: Ning Yao <yaoning@ruijie.com.cn>
Signed-off-by: Sage Weil <sage@redhat.com>
ENOATTR == ENODATA on Linux; other systems return ENOATTR.

Signed-off-by: Sage Weil <sage@redhat.com>
...if it isn't already defined.  Haomai reported seeing this somewhere.

Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas
Copy link
Member Author

added the define and rebased

athanatos pushed a commit that referenced this pull request May 29, 2015
os/chain_xattr: cope with xfs limit of 254 bytes for inlined xattrs.

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
Reviewed-by: Ning Yao <zay11022@gmail.com>
@athanatos athanatos merged commit d7fabeb into master May 29, 2015
@athanatos athanatos deleted the wip-chain-xattr branch May 29, 2015 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants