
rgw: gc use aio #20546

Merged
6 commits merged into ceph:master from yehudasa:wip-rgw-gc-aio on Mar 1, 2018

Conversation

@yehudasa (Member) commented Feb 22, 2018

No description provided.

yehudasa added some commits Feb 22, 2018

- rgw: use aio for gc processing
  still need to deal with index cleanup asynchronously
  Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
- rgw: use a single gc io manager for all shards
  to allow cross shards concurrency
  Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
- rgw: trim gc index using aio
  Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
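
For readers following along, here is a minimal sketch of the pattern these commits move toward: issue each GC chain removal as an asynchronous librados operation and reap completions in FIFO order, rather than blocking on every op. The class and member names are illustrative, not the PR's actual RGWGCIOManager code.

// Illustrative sketch only: the general shape of async GC I/O, not the PR's code.
#include <cerrno>
#include <deque>
#include <string>
#include <rados/librados.hpp>

struct PendingIO {
  librados::AioCompletion *c;
  std::string tag;            // gc chain tag to trim once the removal completes
};

class GCAioSketch {
  std::deque<PendingIO> ios;  // FIFO of in-flight operations
public:
  // Fire the removal op asynchronously and remember its completion.
  int submit(librados::IoCtx& ioctx, const std::string& oid,
             librados::ObjectWriteOperation *op, const std::string& tag) {
    librados::AioCompletion *c = librados::Rados::aio_create_completion();
    int r = ioctx.aio_operate(oid, c, op);
    if (r < 0) {
      c->release();
      return r;
    }
    ios.push_back({c, tag});
    return 0;
  }

  // Reap the oldest completion; on success its tag can be trimmed from the gc index.
  int reap_one(std::string *tag) {
    if (ios.empty()) {
      return -ENOENT;
    }
    PendingIO io = ios.front();
    ios.pop_front();
    io.c->wait_for_complete();
    int r = io.c->get_return_value();
    io.c->release();
    *tag = io.tag;
    return r;
  }
};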

@yehudasa yehudasa requested a review from mattbenjamin Feb 22, 2018

@yehudasa yehudasa force-pushed the yehudasa:wip-rgw-gc-aio branch from ead95e8 to 29f3be1 Feb 22, 2018

@yehudasa yehudasa changed the title [DNM]: rgw gc use aio rgw: gc use aio Feb 23, 2018

@mattbenjamin (Contributor) left a comment

I support, appreciate the fast typing, will test

string tag;
};

list<IO> ios;

@mattbenjamin (Contributor) commented Feb 23, 2018

would love to avoid std::list

@yehudasa (Author, Member) commented Feb 23, 2018

@mattbenjamin need a fifo, vector will not cut it

};

list<IO> ios;
std::list<string> remove_tags;

@mattbenjamin (Contributor) commented Feb 23, 2018

would love to avoid std::list (but I'm not actually suggesting an alternative would be worth the trouble)

@yehudasa (Author, Member) commented Feb 23, 2018

will try to replace this one

}

remove_tags.push_back(io.tag);
#define MAX_REMOVE_CHUNK 16

@mattbenjamin (Contributor) commented Feb 23, 2018

I get now that this was missing from the chmagnus change; what makes 16 a good magic number?

@yehudasa (Author, Member) commented Feb 23, 2018

not too small, not too high... but seriously, this can now be configured; I have no way to tell what the optimal number is. We're bundling this many operations together; more than that (how much more?) could, it seems to me, affect OSD availability. With fewer than that, latency will probably be the biggest factor.
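
To make the trade-off concrete, here is a hedged sketch of the chunking being discussed: tags accumulate and a bundled removal is flushed once the chunk size is reached. flush_tags() is a placeholder for whatever bundled gc-index removal op is actually issued, not the real cls_rgw call.

// Sketch of chunked tag removal; flush_tags() is a stand-in, not the PR's API.
#include <functional>
#include <string>
#include <vector>

static const size_t MAX_REMOVE_CHUNK = 16;

void queue_tag_for_removal(std::vector<std::string>& remove_tags,
                           const std::string& tag,
                           const std::function<void(const std::vector<std::string>&)>& flush_tags) {
  remove_tags.push_back(tag);
  if (remove_tags.size() >= MAX_REMOVE_CHUNK) {
    // One bundled op covers up to 16 tags: large enough to amortize the round
    // trip, small enough to keep any single OSD operation short.
    flush_tags(remove_tags);
    remove_tags.clear();
  }
}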

string tag;
};

list<IO> ios;
std::list<string> remove_tags;
map<int, std::list<string> > remove_tags;

@mattbenjamin (Contributor) commented Feb 23, 2018

would love to avoid std::map of std::list--in this case, since the index represents a stable "slot", could this same mechanism be made to work with a std::vector<std::list>? That could actually be worth doing, I think.

@yehudasa (Author, Member) commented Feb 23, 2018

yeah, can probably do vector.
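
For illustration, a rough sketch of the shard-indexed alternative discussed here, assuming the index is a stable gc shard number bounded by a known shard count, so a pre-sized vector can stand in for the std::map (names are illustrative):

// remove_tags keyed by gc shard index: a vector of per-shard tag lists sized
// up front, replacing std::map<int, std::list<std::string>>.
#include <string>
#include <vector>

struct RemoveTagsByShard {
  std::vector<std::vector<std::string>> slots;

  explicit RemoveTagsByShard(size_t num_shards) : slots(num_shards) {}

  void add(size_t shard, const std::string& tag) {
    slots[shard].push_back(tag);  // shard is a stable index in [0, num_shards)
  }
};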

~RGWGCIOManager() {
for (auto io : ios) {
io.c->release();
}
}

int schedule_io(IoCtx *ioctx, const string& oid, ObjectWriteOperation *op, const string& tag) {
int schedule_io(IoCtx *ioctx, const string& oid, ObjectWriteOperation *op, int index, const string& tag) {
#warning configurable
#define MAX_CONCURRENT_IO 5

@mattbenjamin (Contributor) commented Feb 23, 2018

what makes 5 a good magic number?
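
For context on what the cap does, a hedged sketch of the throttling idea: once MAX_CONCURRENT_IO operations are pending, wait out the oldest in-flight completion before submitting a new one. This is illustrative, not the PR's exact schedule_io().

// Keep at most MAX_CONCURRENT_IO operations in flight; the deque preserves
// FIFO order so the oldest completion is reaped first.
#include <deque>
#include <string>
#include <rados/librados.hpp>

static const size_t MAX_CONCURRENT_IO = 5;

int throttled_submit(std::deque<librados::AioCompletion*>& inflight,
                     librados::IoCtx& ioctx, const std::string& oid,
                     librados::ObjectWriteOperation *op) {
  if (inflight.size() >= MAX_CONCURRENT_IO) {
    librados::AioCompletion *oldest = inflight.front();
    inflight.pop_front();
    oldest->wait_for_complete();        // block until a slot frees up
    int r = oldest->get_return_value();
    oldest->release();
    if (r < 0) {
      return r;
    }
  }
  librados::AioCompletion *c = librados::Rados::aio_create_completion();
  int r = ioctx.aio_operate(oid, c, op);
  if (r < 0) {
    c->release();
    return r;
  }
  inflight.push_back(c);
  return 0;
}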

@mattbenjamin (Contributor) commented Feb 23, 2018

@yehudasa just one suggestion I'd act on, if it worked: can the std::map of slots work as a vector?

@mattbenjamin mattbenjamin self-assigned this Feb 23, 2018

@mattbenjamin (Contributor) commented Feb 23, 2018

@yehudasa tested manually, it worked well for me; I think it delivered about the same gc throughput at concurrent=5 as the threaded version did with 3 threads, so the tuning seems good

@mattbenjamin (Contributor) commented Feb 23, 2018

(scheduling a teuthology run)

@mattbenjamin (Contributor) commented Feb 23, 2018

@yehudasa agree, the interesting one is the std::map<int, std::list<std::string>>; the other uses of list may already be as good as they can be

@mattbenjamin (Contributor) commented Feb 23, 2018

@yehudasa perhaps the one you mentioned could be a std::deque

@yehudasa yehudasa force-pushed the yehudasa:wip-rgw-gc-aio branch from 29f3be1 to 4634585 Feb 23, 2018

yehudasa added some commits Feb 22, 2018

- rgw: make gc concurrent io size configurable
  and another tunable for log trim size
  Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
- rgw: gc aio, replace lists with other types
  Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@mattbenjamin (Contributor) commented Feb 23, 2018

@yehudasa just noticed this; we need something like:

diff --git a/src/test/cls_rgw/test_cls_rgw.cc b/src/test/cls_rgw/test_cls_rgw.cc
index 1d72dce2a1..a9242b03e6 100644
--- a/src/test/cls_rgw/test_cls_rgw.cc
+++ b/src/test/cls_rgw/test_cls_rgw.cc
@@ -673,7 +673,7 @@ TEST(cls_rgw, gc_defer)
   ASSERT_EQ(0, truncated);
 
   librados::ObjectWriteOperation op3;
-  list<string> tags;
+  std::vector<std::string> tags;
   tags.push_back(tag);
 
   /* remove chain */
@mattbenjamin (Contributor) commented Feb 23, 2018

@yehudasa manual testing checks out

@yehudasa (Author, Member) commented Feb 23, 2018

@mattbenjamin I'll push a fix

@yehudasa yehudasa force-pushed the yehudasa:wip-rgw-gc-aio branch from 4634585 to 9d37cac Feb 23, 2018

- rgw: use vector for remove_tags in gc aio
  Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@mattbenjamin (Contributor) commented Feb 23, 2018

@yehudasa does this need a tracker issue?

@yehudasa (Author, Member) commented Feb 23, 2018

@mattbenjamin wouldn't hurt

@yehudasa (Author, Member) commented Feb 26, 2018

@mattbenjamin (Contributor) commented Feb 27, 2018

@yehudasa the only unexpected failure is a known failure of the lifecycle expiration test, which, Casey confirms, does not verify actual GC, only the bucket index (but is still failing, apparently due to a timing issue).

@cbodley cbodley merged commit 855512a into ceph:master Mar 1, 2018

5 checks passed

- Docs: build check: OK - docs built
- Signed-off-by: all commits in this PR are signed
- Unmodified Submodules: submodules for project are unmodified
- make check: make check succeeded
- make check (arm64): make check succeeded
@wjin (Contributor) commented Mar 16, 2018

Hi, how fast will this be?
It seems that after using aio, it just depends on the max_obj and concurrent_io configurations.

I have a requirement here: clients issue 10000 qps of writes/puts, and all objects are given a TTL, e.g. one week or one month. Clients may read an object before expiration (perhaps a few thousand qps), but never access it after expiration.

We have enough SSDs or NVMes for the OSD clusters; disk is not the bottleneck in terms of IOPS or throughput, but space is. So we must trim or delete expired objects aggressively, at better than 10000 qps, otherwise space will run out.

So will this patch be fast enough to trim objects? Are there any other factors that affect the GC speed?

@mattbenjamin (Contributor) commented Mar 16, 2018

@wjin this change actually permits a much higher workload contribution from gc; you'd increase concurrent_io to "go faster", while the max_obj value should just be a "good" value for one gc work unit. The factor left limiting gc speed is then the real workload capacity of the cluster.
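
For anyone who wants to experiment with this, a possible ceph.conf tuning sketch. The two "new" option names below are my reading of the commit messages in this PR and the values are only examples; verify both names and defaults against your release's RGW config reference before relying on them.

# under the appropriate rgw client section of ceph.conf
rgw_gc_max_objs = 32              # number of gc index shards (pre-existing option)
rgw_gc_max_concurrent_io = 10     # cap on in-flight gc aio (assumed name, added by this PR)
rgw_gc_max_trim_chunk = 16        # tags trimmed per bundled op (assumed name, added by this PR)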

@wjin (Contributor) commented Mar 16, 2018

@mattbenjamin Thanks for your quick response. We will set up a very "fast" cluster for clients, on the order of 50000 qps, so that gc does not affect client usage. I will try it later; I wish it could be in 12.2.5.

@bengland2 (Contributor) commented Apr 6, 2018

@mattbenjamin can your PR base the number of concurrent GC requests on a fraction of the number of OSDs in the cluster? Something like max(1, OSDs/10)? The goal is that it would scale naturally without interfering with the application workload, and wouldn't require per-site tuning (normally). Any ceph daemon can ask the monitor for the number of OSDs in the cluster using librados, right?

Also, does your PR spread GC request generators across the cluster, for really big clusters (in the hundreds or thousands of OSDs)?

thx -ben
