kraken: rgw: rgw_file: fix marker computation #13871
@mattbenjamin This passed an RGW suite at http://tracker.ceph.com/issues/19009#note-8 Please review/approve if you agree it can be merged.
@mattbenjamin @yehudasa This PR passed another RGW suite at http://tracker.ceph.com/issues/19009#note-18 Please review/approve if you agree it can be merged.
e30df73 to afbd9c5
Rebased.
lgtm
0103c87 to ec635ac
Rebased. This can be merged, provided it passes one last rgw suite and "make check".
Jenkins re-test this please
The change which introduced this flag also caused it to be given as the flags argument to RGWLibFS::stat_leaf() when called from rgw_lookup(). This was incorrect: in particular, when a directory is known only as a common prefix of other objects, the AWS namespace mapping convention requires lookup("foo") to match a non-materialized instance of "foo/" (case 2 in RGWLibFS::stat_leaf's stat loop). Fixes: http://tracker.ceph.com/issues/18992 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit e31e9eb)
When a POSIX path <bucket>/foo/ is known only as an implicit path segment from other objects (e.g., <bucket>/foo/bar.txt), a case that would usually arise from S3 upload of such an object, an RGWFileHandle object representing "<bucket>/foo/" will be constructed as needed, with no backing in RGW. This is by design, but subsequently, if a setattr is performed on such a handle, we must be ready to create the object inline with storing the attributes. Fixes: http://tracker.ceph.com/issues/18989 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 55eec1c)
This issue has one root cause in librgw, namely that the marker argument to these requests was incorrectly formatted (though the marker cache was working as intended). Secondarily, for nfs-ganesha users, there is a compounding issue that the RGW fsal was required by "temporary" convention to populate the entire dirent cache for a directory on a single readdir() invocation--the cache_inode/mdcache implementations invariantly pass (before future 2.5 changesets, currently in progress) a null pointer for the start cookie offset, intended to convey this. Fixes: http://tracker.ceph.com/issues/18991 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 2cd60ee)
With this change, librgw/rgw_file consumers can provide an invalidation callback, which is used by the library to invalidate directories whose contents should be forgotten. The existing RGWLib GC mechanism is being used to drive this. New configuration params have been added. The main configurable is rgw_nfs_namespace_expire_secs, the expire timeout. Updated post Yehuda review. Fixes: http://tracker.ceph.com/issues/18651 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit deb2c1e) Conflicts: src/rgw/rgw_lib_frontend.h - in class RGWLibProcess : public RGWProcess there was no public method stop() in kraken (now there is)
Fixes: http://tracker.ceph.com/issues/19018 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 4454765)
Bug report and discussion provided by Gui Hecheng <guihecheng@cmss.chinamobile.com> in nfs-ganesha upstream github. Briefly, while a reliable check is potentially costly, it is necessary. Fixes: http://tracker.ceph.com/issues/19270 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit b05f1c6)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 61482c2)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 1100a1c)
Fixes: http://tracker.ceph.com/issues/19435 Signed-off-by: Gui Hecheng <guihecheng@cmss.chinamobile.com> (cherry picked from commit cb6808a)
To avoid a string copy in the common mkdir path, handles for proposed buckets currently are staged in the handle table, before being rejected. They need to be destaged, not just marked deleted (because deleted objects are now assumed not to be linked, as of beaeff0). This triggered an unhandled Boost assert when deleting staged handles, as current safe_link mode requires first removing from the FHCache. Fixes: http://tracker.ceph.com/issues/19036 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 6cde812)
RGW NFS fhcache/RGWFileHandle operators assume existence of the full chain of parents from any object to its fs_root--this is a consequence of the weakly-connected namespace design goal, and not a defect. This change ensures the invariant by taking a parent ref when objects are interned (when a parent ref is guaranteed). Parent refs are returned when objects are destroyed--essentially by the invariant, such a ref must exist. The extra ref is omitted when parent->is_root(), as that node is not in the LRU cache. Fixes: http://tracker.ceph.com/issues/18650 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 0e5299f)
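The parent-ref invariant described in this commit message can be illustrated with a minimal sketch. This is not the RGWFileHandle code: `HandleSketch`, `intern_sketch`, and `destroy_sketch` are hypothetical names, and the plain `int` counter stands in for whatever reference counting the real handle cache uses.

```cpp
#include <cassert>

// Hypothetical, simplified handle: only the fields needed to show the invariant.
struct HandleSketch {
  HandleSketch* parent = nullptr;
  int refcnt = 0;      // stand-in for the real reference count
  bool root = false;   // true only for the fs_root node
  bool is_root() const { return root; }
};

// On intern, the child pins its parent so the parent chain cannot be reclaimed.
// The extra ref is skipped for the root, which is assumed not to be LRU-managed.
inline void intern_sketch(HandleSketch& h, HandleSketch& parent) {
  h.parent = &parent;
  if (!parent.is_root()) {
    ++parent.refcnt;
  }
}

// On destroy, the ref taken at intern time is returned.
inline void destroy_sketch(HandleSketch& h) {
  HandleSketch* p = h.parent;
  if (p != nullptr && !p->is_root()) {
    assert(p->refcnt > 0);  // by the invariant, the ref must exist
    --p->refcnt;
  }
  h.parent = nullptr;
}
```

In this sketch, interning a file under a non-root directory raises the directory's count by one and destroying the file lowers it again, while the root's count is never touched.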
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit ef330f3)
This change includes 3 related changes:
1. add required lock flags for FHCache updates--this is a crash bug under concurrent update/lookup
2. omit to inc/dec refcnt on root filehandles in 2 places--the root handle currently is not on the lru list, so it's not valid to do so
3. based on observation of LRU behavior during creates/deletes, update (cohort) LRU unref to move objects to LRU when their refcount falls to SENTINEL_REFCNT--this cheaply primes the current reclaim() mechanism, so very significantly improves space use (e.g., after deletes) in the absence of scans (which is common due to nfs-ganesha caching)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit beaeff0)
Previously we assumed that !deleted handles were resident--there is an observed case where a !deleted handle is !linked. Since we currently use safe_link mode, an is_linked() check is available, and exhaustive. Fixes: http://tracker.ceph.com/issues/19111 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit c0aa515)
Allow callers of rgw_lookup() on objects attested in an rgw_readdir() callback the ability to bypass exact match in RGWLibFS::stat_leaf() case 2, but restore exact match enforcement for general lookups. This preserves required common_prefix namespace behavior, but prevents clients from eerily permitting things like "cd sara0" via partial name match on "sara01." Fixes: http://tracker.ceph.com/issues/19059 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 70ef7d4)
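The difference between exact-match and relaxed matching on common prefixes can be sketched as follows. This is an illustration under assumptions, not the logic of RGWLibFS::stat_leaf(): the flag name `LOOKUP_FROM_READDIR` and the function `matches_common_prefix` are hypothetical, and the relaxed case is modelled here as a simple leading-substring match.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical flag standing in for "this name was attested by a readdir callback".
enum LookupFlagsSketch : unsigned {
  LOOKUP_NONE = 0,
  LOOKUP_FROM_READDIR = 1
};

// Decide whether `name` resolves to a directory that exists only as a common
// prefix (each entry in `prefixes` ends in '/').
inline bool matches_common_prefix(const std::string& name,
                                  const std::vector<std::string>& prefixes,
                                  unsigned flags) {
  assert(!name.empty());
  const std::string want = name + "/";
  for (const auto& p : prefixes) {
    if (p == want) {
      return true;  // exact match: lookup("foo") matches "foo/"
    }
    // Relaxed match, allowed only when the caller vouches for the name.
    if ((flags & LOOKUP_FROM_READDIR) != 0 &&
        p.compare(0, name.size(), name) == 0) {
      return true;
    }
  }
  return false;
}
```

With no flag set, `"foo"` matches the prefix `"foo/"` but `"sara0"` does not match `"sara01/"`, which is the behavior the commit restores for general lookups.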
The new type hints optimize object type deduction, when the rgw_lookup is called from an rgw_readdir callback. Fixes: http://tracker.ceph.com/issues/19623 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 2e66c7a)
40de683 to 2b9a77e
Fixes: http://tracker.ceph.com/issues/18808 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 4ad5a92)
The logic in RGWLibFS::mkdir() validated bucket names, but not object names (though RGWLibFS::create() did so). The negative side effect of this was not the creation of illegal objects (we won't create them), but a) failing with -EIO and b) more importantly, not removing the proposed object from FHCache, which produced a boost assert when recycled. Fixes: http://tracker.ceph.com/issues/19066 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit eb1cd3b)
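A rough sketch of the "validate, and destage on rejection" pattern this message describes is shown below. Everything in it is an assumption made for illustration: `HandleTableSketch`, `valid_object_name_sketch`, and `mkdir_sketch` are hypothetical, the validation rule (non-empty, at most 1024 bytes) is a placeholder rather than the real check, and returning -EINVAL is a guess at a more fitting error than -EIO.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for the handle cache: just a set of staged names.
struct HandleTableSketch {
  std::set<std::string> staged;
};

// Placeholder rule, NOT the real validator: non-empty and at most 1024 bytes.
inline bool valid_object_name_sketch(const std::string& name) {
  const std::size_t max_len = 1024;
  return !name.empty() && name.size() <= max_len;
}

// Stage the proposed handle first, then validate; on rejection, remove the
// staged entry so nothing stale is left behind in the table.
inline int mkdir_sketch(HandleTableSketch& t, const std::string& name) {
  t.staged.insert(name);
  if (!valid_object_name_sketch(name)) {
    t.staged.erase(name);
    assert(t.staged.count(name) == 0);  // rejected names must not stay staged
    return -EINVAL;                     // assumed error code for this sketch
  }
  return 0;
}
```

The point of the sketch is the cleanup on the failure path: a rejected name leaves the table exactly as it was before the call.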
These refs won't be returned by nfs-ganesha, and are sufficiently magical that other consumers should be persuaded to understand their specialness. Fixes: http://tracker.ceph.com/issues/19060 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit dea8d1e)
Skip unref after unlink to fix the problem. Signed-off-by: Gui Hecheng <guihecheng@cmss.chinamobile.com> (cherry picked from commit bff2287)
When ::getattr returns -ESTALE, rgw_getattr returns ESTALE, which is an unexpected positive value. Signed-off-by: Gui Hecheng <guihecheng@cmss.chinamobile.com> (cherry picked from commit 39203cf)
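The sign problem can be shown with a small sketch. The functions below are hypothetical stand-ins, not the rgw_file API; in particular, modelling the bug as an extra negation in the wrapper is an assumption about how the positive value arose.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical inner call: reports failure as a negative errno, success as 0.
inline int inner_getattr_sketch(bool stale) {
  return stale ? -ESTALE : 0;
}

// Assumed shape of the bug: the wrapper flips the sign, so callers that test
// for rc < 0 never see the failure.
inline int getattr_buggy_sketch(bool stale) {
  return -inner_getattr_sketch(stale);
}

// Fixed shape: the negative errno is passed through unchanged.
inline int getattr_fixed_sketch(bool stale) {
  const int rc = inner_getattr_sketch(stale);
  assert(rc <= 0);  // convention in this sketch: errors are never positive
  return rc;
}
```

A caller that checks `rc < 0` treats the buggy wrapper's positive `ESTALE` as success, which is why the sign matters.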
As a ganesha FSAL backend, rgw_file should properly maintain atime, ctime, and mtime across operations such as: (read, write) for files; (create, unlink, mkdir, rmdir, rename) for dirs; (setattr) for files and dirs. Signed-off-by: Gui Hecheng <guihecheng@cmss.chinamobile.com> (cherry picked from commit ac25da2)
Formerly masked in part by the reclaim() action, direct-delete now substitutes for reclaim() iff its LRU lane is over its high-water mark, and in particular, like reclaim() the destructor is certain to see handles still interned on the FHcache when nfs-ganesha is recycling objects from its own LRU. Fixes: http://tracker.ceph.com/issues/19112 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit d51a3b1)
Adjust readdir callback path for new nfs-ganesha chunked readdir, including changes to respect the result of callback to not continue. Pending introduction of offset name hint, our caller will just be completely enumerating, so it is possible to remove the offset map and just keep a last offset. Fixes: http://tracker.ceph.com/issues/19624 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit e0191d7)
If a readdir expire event turns out to be older than last_readdir, just reschedule it (but actually, we should just discard it, as another expire event must be in queue). Fixes: http://tracker.ceph.com/issues/19625 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit 007b745)
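The staleness check on expire events can be sketched as a single comparison. The names `ExpireActionSketch` and `on_expire_sketch` are hypothetical, timestamps are modelled as plain seconds, and treating an event equal in age to last_readdir as current is an assumption of this sketch.

```cpp
#include <cassert>

// What to do with an expire event once it fires.
enum class ExpireActionSketch { Invalidate, Reschedule };

// An event stamped before the most recent readdir refers to an older listing,
// so it must not invalidate the directory; per the commit message it is
// rescheduled (discarding it would also be defensible).
inline ExpireActionSketch on_expire_sketch(long event_ts, long last_readdir) {
  assert(event_ts >= 0 && last_readdir >= 0);  // timestamps are non-negative here
  return (event_ts < last_readdir) ? ExpireActionSketch::Reschedule
                                   : ExpireActionSketch::Invalidate;
}
```

An event stamped at 10 for a directory last read at 20 is rescheduled, while an event stamped at or after the last readdir proceeds to invalidation.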
Also, fixes link count computation off-by-one, update of state.nlink after computation, link computation reset at start, and a time print in debug log. Fixes: http://tracker.ceph.com/issues/19634 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> link count Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> (cherry picked from commit e0f8026)
I've created a monster!
@mattbenjamin This should be all the rgw_file fixes that were applied to jewel since kraken was split off from master. With the sole exception of 50955a5 they all applied cleanly.
This is included in the integration branch now building at https://shaman.ceph.com/builds/ceph/wip-kraken-backports/241c44b083d6e9fc82ff2d1d10379963d65db866/
@yehudasa @mattbenjamin This PR passed an rgw suite at http://tracker.ceph.com/issues/19009#note-60 and @mattbenjamin already approved it before I added a bunch more commits. If there are no objections, I will merge it.
@mattbenjamin This passed another rgw suite at http://tracker.ceph.com/issues/19009#note-65 with two valgrind-related failures that I'm currently re-running. Do you think it can be merged? (I know you already approved it, but I expanded it significantly since then.)
@smithfarm leaving it for @mattbenjamin
@smithfarm go for it
http://tracker.ceph.com/issues/18810
http://tracker.ceph.com/issues/18902
http://tracker.ceph.com/issues/19144
http://tracker.ceph.com/issues/19149
http://tracker.ceph.com/issues/19154
http://tracker.ceph.com/issues/19162
http://tracker.ceph.com/issues/19166
http://tracker.ceph.com/issues/19168
http://tracker.ceph.com/issues/19170
http://tracker.ceph.com/issues/19229
http://tracker.ceph.com/issues/19470
http://tracker.ceph.com/issues/19471
http://tracker.ceph.com/issues/19526
http://tracker.ceph.com/issues/19661
http://tracker.ceph.com/issues/19663
http://tracker.ceph.com/issues/19723
http://tracker.ceph.com/issues/20158
http://tracker.ceph.com/issues/20195