Commit 6cccc7d

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph updates from Sage Weil:
 "This includes both the first pile of Ceph patches (which I sent to
  torvalds@vger, sigh) and a few new patches that add support for
  fscache for Ceph. That includes a few fscache core fixes that David
  Howells asked go through the Ceph tree. (Thanks go to Milosz Tanski
  for putting this feature together)

  This first batch of patches (included here) had (has) several
  important RBD bug fixes, hole punch support, several different
  cleanups in the page cache interactions, improvements in the truncate
  code (new truncate mutex to avoid shenanigans with i_mutex), and a
  series of fixes in the synchronous striping read/write code.

  On top of that is a random collection of small fixes all across the
  tree (error code checks and error path cleanup, obsolete wq flags,
  etc)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (43 commits)
  ceph: use d_invalidate() to invalidate aliases
  ceph: remove ceph_lookup_inode()
  ceph: trivial buildbot warnings fix
  ceph: Do not do invalidate if the filesystem is mounted nofsc
  ceph: page still marked private_2
  ceph: ceph_readpage_to_fscache didn't check if marked
  ceph: clean PgPrivate2 on returning from readpages
  ceph: use fscache as a local presisent cache
  fscache: Netfs function for cleanup post readpages
  FS-Cache: Fix heading in documentation
  CacheFiles: Implement interface to check cache consistency
  FS-Cache: Add interface to check consistency of a cached object
  rbd: fix null dereference in dout
  rbd: fix buffer size for writes to images with snapshots
  libceph: use pg_num_mask instead of pgp_num_mask for pg.seed calc
  rbd: fix I/O error propagation for reads
  ceph: use vfs __set_page_dirty_nobuffers interface instead of doing it inside filesystem
  ceph: allow sync_read/write return partial successed size of read/write.
  ceph: fix bugs about handling short-read for sync read mode.
  ceph: remove useless variable revoked_rdcache
  ...
2 parents 255ae3f + a8d436f commit 6cccc7d

27 files changed: +1396 -189 lines changed

Documentation/filesystems/caching/backend-api.txt

Lines changed: 9 additions & 0 deletions
@@ -299,6 +299,15 @@ performed on the denizens of the cache. These are held in a structure of type:
 enough space in the cache to permit this.
 
 
+(*) Check coherency state of an object [mandatory]:
+
+	int (*check_consistency)(struct fscache_object *object)
+
+This method is called to have the cache check the saved auxiliary data of
+the object against the netfs's idea of the state. 0 should be returned
+if they're consistent and -ESTALE otherwise. -ENOMEM and -ERESTARTSYS
+may also be returned.
+
 (*) Update object [mandatory]:
 
 	int (*update_object)(struct fscache_object *object)

Documentation/filesystems/caching/netfs-api.txt

Lines changed: 31 additions & 6 deletions
@@ -32,7 +32,7 @@ This document contains the following sections:
 (9) Setting the data file size
 (10) Page alloc/read/write
 (11) Page uncaching
-(12) Index and data file update
+(12) Index and data file consistency
 (13) Miscellaneous cookie operations
 (14) Cookie unregistration
 (15) Index invalidation
@@ -433,7 +433,7 @@ to the caller. The attribute adjustment excludes read and write operations.
 
 
 =====================
-PAGE READ/ALLOC/WRITE
+PAGE ALLOC/READ/WRITE
 =====================
 
 And the sixth step is to store and retrieve pages in the cache. There are
@@ -499,7 +499,7 @@ Else if there's a copy of the page resident in the cache:
 (*) An argument that's 0 on success or negative for an error code.
 
 If an error occurs, it should be assumed that the page contains no usable
-data.
+data. fscache_readpages_cancel() may need to be called.
 
 end_io_func() will be called in process context if the read is results in
 an error, but it might be called in interrupt context if the read is
@@ -623,6 +623,22 @@ some of the pages being read and some being allocated. Those pages will have
 been marked appropriately and will need uncaching.
 
 
+CANCELLATION OF UNREAD PAGES
+----------------------------
+
+If one or more pages are passed to fscache_read_or_alloc_pages() but not then
+read from the cache and also not read from the underlying filesystem then
+those pages will need to have any marks and reservations removed. This can be
+done by calling:
+
+	void fscache_readpages_cancel(struct fscache_cookie *cookie,
+				      struct list_head *pages);
+
+prior to returning to the caller. The cookie argument should be as passed to
+fscache_read_or_alloc_pages(). Every page in the pages list will be examined
+and any that have PG_fscache set will be uncached.
+
+
 ==============
 PAGE UNCACHING
 ==============
@@ -690,9 +706,18 @@ written to the cache and for the cache to finish with the page generally. No
 error is returned.
 
 
-==========================
-INDEX AND DATA FILE UPDATE
-==========================
+===============================
+INDEX AND DATA FILE CONSISTENCY
+===============================
+
+To find out whether auxiliary data for an object is up to data within the
+cache, the following function can be called:
+
+	int fscache_check_consistency(struct fscache_cookie *cookie)
+
+This will call back to the netfs to check whether the auxiliary data associated
+with a cookie is correct. It returns 0 if it is and -ESTALE if it isn't; it
+may also return -ENOMEM and -ERESTARTSYS.
 
 To request an update of the index data for an index or other object, the
 following function should be called:
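
Reviewer sketch (not part of the commit): the cancellation rule added to netfs-api.txt above can be pictured with a small userspace model. Everything here is hypothetical: `struct mock_page`, its `fscache_marked` flag and `mock_readpages_cancel()` merely stand in for `struct page`, PG_fscache and fscache_readpages_cancel(), and a plain array replaces the kernel's list_head. It models only the sentence "Every page in the pages list will be examined and any that have PG_fscache set will be uncached".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; fscache_marked models PG_fscache. */
struct mock_page {
	bool fscache_marked;
};

/*
 * Model of the documented rule: walk every page that was handed to the
 * read call but never read, and clear the mark on any page that still
 * carries it. Returns the number of pages that were uncached.
 */
static size_t mock_readpages_cancel(struct mock_page *pages, size_t n)
{
	size_t uncached = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].fscache_marked) {
			pages[i].fscache_marked = false;
			uncached++;
		}
	}
	return uncached;
}
```

Calling the model twice on the same array uncaches nothing the second time, which matches the idea that cancellation only touches pages that are still marked.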

drivers/block/rbd.c

Lines changed: 19 additions & 17 deletions
@@ -1561,11 +1561,12 @@ rbd_img_obj_request_read_callback(struct rbd_obj_request *obj_request)
 		obj_request, obj_request->img_request, obj_request->result,
 		xferred, length);
 	/*
-	 * ENOENT means a hole in the image. We zero-fill the
-	 * entire length of the request. A short read also implies
-	 * zero-fill to the end of the request. Either way we
-	 * update the xferred count to indicate the whole request
-	 * was satisfied.
+	 * ENOENT means a hole in the image. We zero-fill the entire
+	 * length of the request. A short read also implies zero-fill
+	 * to the end of the request. An error requires the whole
+	 * length of the request to be reported finished with an error
+	 * to the block layer. In each case we update the xferred
+	 * count to indicate the whole request was satisfied.
 	 */
 	rbd_assert(obj_request->type != OBJ_REQUEST_NODATA);
 	if (obj_request->result == -ENOENT) {
@@ -1574,14 +1575,13 @@ rbd_img_obj_request_read_callback(struct rbd_obj_request *obj_request)
 		else
 			zero_pages(obj_request->pages, 0, length);
 		obj_request->result = 0;
-		obj_request->xferred = length;
 	} else if (xferred < length && !obj_request->result) {
 		if (obj_request->type == OBJ_REQUEST_BIO)
 			zero_bio_chain(obj_request->bio_list, xferred);
 		else
 			zero_pages(obj_request->pages, xferred, length);
-		obj_request->xferred = length;
 	}
+	obj_request->xferred = length;
 	obj_request_done_set(obj_request);
 }
 
@@ -2167,9 +2167,9 @@ static int rbd_img_request_fill(struct rbd_img_request *img_request,
 	struct rbd_obj_request *obj_request = NULL;
 	struct rbd_obj_request *next_obj_request;
 	bool write_request = img_request_write_test(img_request);
-	struct bio *bio_list = 0;
+	struct bio *bio_list = NULL;
 	unsigned int bio_offset = 0;
-	struct page **pages = 0;
+	struct page **pages = NULL;
 	u64 img_offset;
 	u64 resid;
 	u16 opcode;
@@ -2207,6 +2207,11 @@ static int rbd_img_request_fill(struct rbd_img_request *img_request,
 		rbd_segment_name_free(object_name);
 		if (!obj_request)
 			goto out_unwind;
+		/*
+		 * set obj_request->img_request before creating the
+		 * osd_request so that it gets the right snapc
+		 */
+		rbd_img_obj_request_add(img_request, obj_request);
 
 		if (type == OBJ_REQUEST_BIO) {
 			unsigned int clone_size;
@@ -2248,11 +2253,6 @@ static int rbd_img_request_fill(struct rbd_img_request *img_request,
 					obj_request->pages, length,
 					offset & ~PAGE_MASK, false, false);
 
-		/*
-		 * set obj_request->img_request before formatting
-		 * the osd_request so that it gets the right snapc
-		 */
-		rbd_img_obj_request_add(img_request, obj_request);
 		if (write_request)
 			rbd_osd_req_format_write(obj_request);
 		else
@@ -3706,12 +3706,14 @@ static int _rbd_dev_v2_snap_size(struct rbd_device *rbd_dev, u64 snap_id,
 	if (ret < sizeof (size_buf))
 		return -ERANGE;
 
-	if (order)
+	if (order) {
 		*order = size_buf.order;
+		dout(" order %u", (unsigned int)*order);
+	}
 	*snap_size = le64_to_cpu(size_buf.size);
 
-	dout(" snap_id 0x%016llx order = %u, snap_size = %llu\n",
-		(unsigned long long)snap_id, (unsigned int)*order,
+	dout(" snap_id 0x%016llx snap_size = %llu\n",
+		(unsigned long long)snap_id,
 		(unsigned long long)*snap_size);
 
 	return 0;
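
Reviewer sketch (not part of the commit): the read-callback change in the first two rbd.c hunks moves `obj_request->xferred = length` out of the two branches so that it also runs when the request failed with an error other than -ENOENT. The userspace model below illustrates that control flow only. `struct mock_obj_request` and `mock_read_callback()` are hypothetical, and a flat byte buffer replaces the bio chain and page vector that the real code zeroes with zero_bio_chain() and zero_pages().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct rbd_obj_request. */
struct mock_obj_request {
	int result;		/* 0 or a negative errno */
	size_t xferred;		/* bytes actually transferred */
	size_t length;		/* bytes requested */
	unsigned char *buf;	/* flat buffer instead of bios or pages */
};

static void mock_read_callback(struct mock_obj_request *req)
{
	if (req->result == -ENOENT) {
		/* A hole in the image: the whole request reads as zeroes. */
		memset(req->buf, 0, req->length);
		req->result = 0;
	} else if (req->xferred < req->length && !req->result) {
		/* Short read without an error: zero-fill the tail. */
		memset(req->buf + req->xferred, 0, req->length - req->xferred);
	}
	/* Hole, short read or error: account for the full length. */
	req->xferred = req->length;
}
```

With the assignment inside the branches, as before the patch, a request that ended in an error such as -EIO would keep its original `xferred`; in this model it now always ends up equal to `length` while `result` still carries the error.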

fs/cachefiles/interface.c

Lines changed: 26 additions & 0 deletions
@@ -377,6 +377,31 @@ static void cachefiles_sync_cache(struct fscache_cache *_cache)
 		       ret);
 }
 
+/*
+ * check if the backing cache is updated to FS-Cache
+ * - called by FS-Cache when evaluates if need to invalidate the cache
+ */
+static bool cachefiles_check_consistency(struct fscache_operation *op)
+{
+	struct cachefiles_object *object;
+	struct cachefiles_cache *cache;
+	const struct cred *saved_cred;
+	int ret;
+
+	_enter("{OBJ%x}", op->object->debug_id);
+
+	object = container_of(op->object, struct cachefiles_object, fscache);
+	cache = container_of(object->fscache.cache,
+			     struct cachefiles_cache, cache);
+
+	cachefiles_begin_secure(cache, &saved_cred);
+	ret = cachefiles_check_auxdata(object);
+	cachefiles_end_secure(cache, saved_cred);
+
+	_leave(" = %d", ret);
+	return ret;
+}
+
 /*
  * notification the attributes on an object have changed
  * - called with reads/writes excluded by FS-Cache
@@ -522,4 +547,5 @@ const struct fscache_cache_ops cachefiles_cache_ops = {
 	.write_page		= cachefiles_write_page,
 	.uncache_page		= cachefiles_uncache_page,
 	.dissociate_pages	= cachefiles_dissociate_pages,
+	.check_consistency	= cachefiles_check_consistency,
 };
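
Reviewer sketch (not part of the commit): cachefiles_check_consistency() above is declared `bool` but returns the `int` value produced by cachefiles_check_auxdata(), while the backend-api.txt text in this same commit documents an `int` return of 0 or -ESTALE. Whether this matters depends on how callers consume the value, which this page does not show. The hypothetical userspace functions below only demonstrate the C conversion rule involved: any non-zero `int` returned through a `bool` becomes `true`, so the errno value itself does not survive.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper standing in for a check that reports staleness. */
static int mock_check_auxdata(int stale)
{
	return stale ? -ESTALE : 0;
}

/* Declared bool, like the hunk above: any non-zero value becomes true. */
static bool mock_check_as_bool(int stale)
{
	int ret = mock_check_auxdata(stale);

	return ret;
}

/* Declared int, like the documented prototype: the errno is preserved. */
static int mock_check_as_int(int stale)
{
	int ret = mock_check_auxdata(stale);

	return ret;
}
```

A caller comparing the `bool` variant against -ESTALE would never match, whereas a caller that only tests for non-zero would behave the same with either return type.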

fs/cachefiles/internal.h

Lines changed: 1 addition & 0 deletions
@@ -235,6 +235,7 @@ extern int cachefiles_set_object_xattr(struct cachefiles_object *object,
 					  struct cachefiles_xattr *auxdata);
 extern int cachefiles_update_object_xattr(struct cachefiles_object *object,
 					  struct cachefiles_xattr *auxdata);
+extern int cachefiles_check_auxdata(struct cachefiles_object *object);
 extern int cachefiles_check_object_xattr(struct cachefiles_object *object,
 					 struct cachefiles_xattr *auxdata);
 extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,

fs/cachefiles/xattr.c

Lines changed: 36 additions & 0 deletions
@@ -156,6 +156,42 @@ int cachefiles_update_object_xattr(struct cachefiles_object *object,
 	return ret;
 }
 
+/*
+ * check the consistency between the backing cache and the FS-Cache cookie
+ */
+int cachefiles_check_auxdata(struct cachefiles_object *object)
+{
+	struct cachefiles_xattr *auxbuf;
+	struct dentry *dentry = object->dentry;
+	unsigned int dlen;
+	int ret;
+
+	ASSERT(dentry);
+	ASSERT(dentry->d_inode);
+	ASSERT(object->fscache.cookie->def->check_aux);
+
+	auxbuf = kmalloc(sizeof(struct cachefiles_xattr) + 512, GFP_KERNEL);
+	if (!auxbuf)
+		return -ENOMEM;
+
+	auxbuf->len = vfs_getxattr(dentry, cachefiles_xattr_cache,
+				   &auxbuf->type, 512 + 1);
+	if (auxbuf->len < 1)
+		return -ESTALE;
+
+	if (auxbuf->type != object->fscache.cookie->def->type)
+		return -ESTALE;
+
+	dlen = auxbuf->len - 1;
+	ret = fscache_check_aux(&object->fscache, &auxbuf->data, dlen);
+
+	kfree(auxbuf);
+	if (ret != FSCACHE_CHECKAUX_OKAY)
+		return -ESTALE;
+
+	return 0;
+}
+
 /*
  * check the state xattr on a cache file
  * - return -ESTALE if the object should be deleted
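
Reviewer sketch (not part of the commit): in cachefiles_check_auxdata() as shown above, the two early `return -ESTALE` statements are reached after kmalloc() has succeeded but before kfree(auxbuf), so on those paths the buffer appears not to be released. One common way to structure such a function is a single exit label that frees the buffer on every path. The userspace model below is an assumption-laden illustration of that pattern, not a proposed kernel patch: `mock_alloc()`, `mock_free()`, `struct mock_xattr` and `mock_check_xattr()` are hypothetical, and the `xattr_len` and `xattr_type` parameters stand in for what vfs_getxattr() would have filled in.

```c
#include <errno.h>
#include <stdlib.h>

/* Counts live allocations so that a leak becomes observable. */
static int live_allocs;

static void *mock_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void mock_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* Hypothetical, reduced stand-in for struct cachefiles_xattr. */
struct mock_xattr {
	int len;
	unsigned char type;
};

/* Same shape as the function above, but every exit frees the buffer. */
static int mock_check_xattr(int xattr_len, unsigned char xattr_type,
			    unsigned char want_type)
{
	struct mock_xattr *auxbuf;
	int ret = -ESTALE;

	auxbuf = mock_alloc(sizeof(*auxbuf));
	if (!auxbuf)
		return -ENOMEM;

	auxbuf->len = xattr_len;
	auxbuf->type = xattr_type;

	if (auxbuf->len < 1)
		goto out;
	if (auxbuf->type != want_type)
		goto out;

	ret = 0;
out:
	mock_free(auxbuf);
	return ret;
}
```

After any mix of stale and consistent results, `live_allocs` returns to zero in this model, which is the property the early returns in the hunk do not obviously provide.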

fs/ceph/Kconfig

Lines changed: 9 additions & 0 deletions
@@ -16,3 +16,12 @@ config CEPH_FS
 
 	  If unsure, say N.
 
+if CEPH_FS
+config CEPH_FSCACHE
+	bool "Enable Ceph client caching support"
+	depends on CEPH_FS=m && FSCACHE || CEPH_FS=y && FSCACHE=y
+	help
+	  Choose Y here to enable persistent, read-only local
+	  caching support for Ceph clients using FS-Cache
+
+endif

fs/ceph/Makefile

Lines changed: 1 addition & 0 deletions
@@ -9,3 +9,4 @@ ceph-y := super.o inode.o dir.o file.o locks.o addr.o ioctl.o \
 	mds_client.o mdsmap.o strings.o ceph_frag.o \
 	debugfs.o
 
+ceph-$(CONFIG_CEPH_FSCACHE) += cache.o
