Jewel mds: order directories by hash and fix simultaneous readdir races #9655
Commits on Jun 12, 2016
-
client: simplify 'offset in frag'
Don't distinguish the leftmost frag from other frags; always use 2 as the first entry's offset. Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit 6572c2a) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
-
client: don't allocate dir_result_t::buffer dynamically
Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit c41ceb9) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
-
client: save readdir result into dir_result_t directly
The current code saves the readdir result into MetaRequest, then updates dir_result_t according to MetaRequest. I can't see any reason why we need to do this. Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit db5d60d) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
-
mds: sort dentries in CDir in hash order
This gives us a stable ordering of dentries. (Previously, the ordering of dentries changed after a directory was fragmented.) Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit f483224) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
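To illustrate why hash ordering is stable across fragmentation, here is a minimal sketch. It is not the MDS implementation: `name_hash` (a 32-bit FNV-1a), `DentryMap`, and the helper functions are hypothetical names chosen for this example, and the "split" simply partitions entries by the top bit of the hash. The point is that listing the fragments in hash-range order reproduces the unfragmented order.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the dentry hash (32-bit FNV-1a).
inline uint32_t name_hash(const std::string& name) {
  uint32_t h = 2166136261u;
  for (unsigned char c : name) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Dentries keyed by (hash, name): iteration order is hash order,
// with the name acting as a tie-breaker for colliding hashes.
using DentryMap = std::map<std::pair<uint32_t, std::string>, std::string>;

inline void add_dentry(DentryMap& dir, const std::string& name) {
  dir.emplace(std::make_pair(name_hash(name), name), name);
}

// List names in iteration (hash) order.
inline std::vector<std::string> list_names(const DentryMap& dir) {
  std::vector<std::string> out;
  for (const auto& kv : dir) out.push_back(kv.second);
  return out;
}

// Split the directory into two "frags" by the top bit of the hash,
// then list the low frag followed by the high frag.
inline std::vector<std::string> list_after_split(const DentryMap& dir) {
  DentryMap low, high;
  for (const auto& kv : dir) {
    ((kv.first.first & 0x80000000u) ? high : low).insert(kv);
  }
  std::vector<std::string> out = list_names(low);
  for (const auto& n : list_names(high)) out.push_back(n);
  return out;
}
```

Under these assumptions, `list_names(dir)` and `list_after_split(dir)` return the same sequence for any set of names, which is the property a readdir cursor relies on.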
-
mds: define end/complete in readdir reply as single u16 flags
so that we can introduce new flags for the readdir reply. Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit 92cfbdf) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
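A rough sketch of the idea, assuming two separate booleans are folded into one 16-bit field. The flag names and bit positions below are hypothetical and chosen only for illustration; they are not taken from the Ceph headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag names and bit positions, for illustration only.
constexpr uint16_t READDIR_FLAG_END      = 1u << 0;
constexpr uint16_t READDIR_FLAG_COMPLETE = 1u << 1;

// Pack the former "end" and "complete" booleans into one u16.
inline uint16_t pack_readdir_flags(bool end, bool complete) {
  uint16_t flags = 0;
  if (end) flags |= READDIR_FLAG_END;
  if (complete) flags |= READDIR_FLAG_COMPLETE;
  return flags;
}

// Readers test individual bits, so unknown (future) bits are ignored.
inline bool readdir_is_end(uint16_t flags) {
  return (flags & READDIR_FLAG_END) != 0;
}

inline bool readdir_is_complete(uint16_t flags) {
  return (flags & READDIR_FLAG_COMPLETE) != 0;
}
```

Because each reader masks out only the bit it cares about, a later flag can occupy an unused bit without changing the wire size of the field.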
-
client: fix cached readdir after seekdir
Client::seekdir doesn't reset dirp->at_cache_name for a forward seek within the same frag, so the dentry with name == at_cache_name may not be the one prior to the readdir position. Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit 0e32115) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
-
client: record 'offset' for each entry of dir_result_t::buffer
This is preparation for using the hash value as the dentry 'offset'. Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit bd6546e) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
-
client: using hash value to compose dentry offset
If the MDS sorts dentries in a dirfrag in hash order, we use the hash value to compose the dentry offset. The dentry offset is: (0xff << 52) | ((24 bits hash) << 28) | (the nth entry that has a hash collision). This offset is stable across directory fragmentation. Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit 680766e) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
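The offset layout in this commit message can be sketched as follows. The function and constant names are hypothetical, and the sketch assumes the caller already has a 24-bit hash value (which 24 bits of the full hash are used is not stated here); only the bit layout comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Layout from the commit message:
//   (0xff << 52) | ((24 bits hash) << 28) | (nth entry among hash collisions)
constexpr int      kShift        = 28;
constexpr uint64_t kLowMask      = (1ULL << kShift) - 1;  // low 28 bits
constexpr uint64_t kHashMask     = 0xffffffULL;           // 24-bit hash
constexpr uint64_t kHashOrderTag = 0xffULL << 52;         // marks a hash-based offset

// Compose an offset from a 24-bit hash and a collision index (hypothetical helper).
inline uint64_t make_dentry_offset(uint32_t hash24, uint32_t nth) {
  return kHashOrderTag | ((hash24 & kHashMask) << kShift) | (nth & kLowMask);
}

// Recover the 24-bit hash from an offset.
inline uint32_t offset_hash(uint64_t off) {
  return static_cast<uint32_t>((off >> kShift) & kHashMask);
}

// Recover the collision index from an offset.
inline uint32_t offset_nth(uint64_t off) {
  return static_cast<uint32_t>(off & kLowMask);
}
```

Since the hash sits in higher bits than the collision index, comparing two offsets numerically compares hashes first and collision indices second, which matches the hash-ordered listing.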
-
mds: don't reset readdir offset if client supports hash order dentry
Now the ordering of dentries is stable across directory fragmentation. There is no need to reset the readdir offset if the directory gets fragmented in the middle of a readdir. Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit 98a01af) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
-
ceph_test_libcephfs: check order of entries in readdir result
Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit 9b17d14) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
-
client: move dir_{release,ordered}_count into class Inode
We close Inode::dir when it's empty. Once the dir is closed, we lose track of {release,ordered}_count. This causes the directory to be wrongly marked as complete (when the dir is trimmed to empty in the middle of a readdir). Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit 235fcf6) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
-
client: fix simultaneous readdirs race
The current readdir code uses a list to track the order of the dentries in readdir replies. When handling a readdir reply, it pushes the resulting dentries to the back of the directory's dentry_list. After readdir finishes, the dentry_list reflects how the MDS sorts dentries. This method is racy when there are simultaneous readdirs. The fix is to use a vector instead of a list to track how dentries are sorted in their parent directory. As long as shared_gen doesn't change, each dentry is at a fixed position in the vector, so concurrent readdirs do not affect each other. Fixes: http://tracker.ceph.com/issues/15508 Signed-off-by: Yan, Zheng <zyan@redhat.com> (cherry picked from commit 9d297c5) Signed-off-by: Greg Farnum <gfarnum@redhat.com>
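A simplified model of the vector-based approach, to show why interleaved readdirs stop interfering. This is a sketch, not the client code: `ReaddirCache`, `record`, and `invalidate` are hypothetical names, and strings stand in for dentry pointers. Only `shared_gen` and the fixed-position idea come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of a directory's readdir cache.
struct ReaddirCache {
  uint64_t shared_gen = 0;          // bumped whenever cached ordering is invalidated
  std::vector<std::string> slots;   // index = position of the dentry in MDS order

  // Record the dentry seen at position idx by a readdir that started at
  // generation gen. Replies from an older generation are ignored.
  bool record(uint64_t gen, std::size_t idx, const std::string& name) {
    if (gen != shared_gen) return false;
    if (idx >= slots.size()) slots.resize(idx + 1);
    slots[idx] = name;              // writing the same slot twice is harmless
    return true;
  }

  // Drop the cached ordering and start a new generation.
  void invalidate() {
    ++shared_gen;
    slots.clear();
  }
};
```

With a list and push-back, two readers walking the same directory would append each entry twice; here both readers write the same index, so the result is the same regardless of interleaving.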