rgw: gc use aio #20546
Conversation
still need to deal with index cleanup asynchronously. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
to allow cross-shard concurrency. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Force-pushed from ead95e8 to 29f3be1
I support, appreciate the fast typing, will test
src/rgw/rgw_gc.cc (outdated)

```cpp
  string tag;
};

list<IO> ios;
```
would love to avoid std::list
@mattbenjamin need a FIFO here; a vector will not cut it
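A minimal sketch of the FIFO access pattern under discussion, using std::deque (which a later comment in this thread suggests) in place of std::list; the `IO` struct and `IOQueue` wrapper here are simplified stand-ins, not the actual rgw types.

```cpp
#include <deque>
#include <string>

// Simplified stand-in for the IO entries queued by RGWGCIOManager;
// the real struct also carries an AioCompletion and shard state.
struct IO {
  std::string tag;
};

// FIFO semantics needed for cross-shard concurrency: push new work
// at the back, drain in submission order from the front. std::deque
// offers the same push_back/pop_front interface as std::list but
// with chunked contiguous storage (fewer per-node allocations).
class IOQueue {
  std::deque<IO> ios;
public:
  void push(IO io) { ios.push_back(std::move(io)); }
  bool empty() const { return ios.empty(); }
  IO pop() {
    IO io = std::move(ios.front());
    ios.pop_front();
    return io;
  }
};
```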
src/rgw/rgw_gc.cc (outdated)

```cpp
};

list<IO> ios;
std::list<string> remove_tags;
```
would love to avoid std::list (but I'm not actually suggesting an alternative would be worth the trouble)
will try to replace this one
src/rgw/rgw_gc.cc (outdated)

```cpp
}

remove_tags.push_back(io.tag);
#define MAX_REMOVE_CHUNK 16
```
I get now that this was missing from the chmagnus change; what makes 16 a good magic number?
not too small, not too high... but seriously, this can now be configured; I have no way to tell what the optimal number is. We're bundling this many operations together; more than that (how much more?) could, it seems to me, affect OSD availability. With fewer than that, latency will probably be the biggest factor.
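The tradeoff described above can be illustrated with a small sketch: completed tags accumulate and are flushed in batches of MAX_REMOVE_CHUNK. Larger chunks amortize round trips but keep one OSD op busy longer; smaller chunks let per-op latency dominate. The `add_tag`/`flush_cb` names are hypothetical, not the actual rgw helpers.

```cpp
#include <list>
#include <string>

// Illustrative chunking: flush once the batch reaches
// MAX_REMOVE_CHUNK entries. The PR makes this value configurable.
static const size_t MAX_REMOVE_CHUNK = 16;

// flush_cb stands in for the cls call that removes a batch of gc
// entries in a single operation.
template <typename FlushCb>
void add_tag(std::list<std::string>& remove_tags,
             const std::string& tag, FlushCb flush_cb) {
  remove_tags.push_back(tag);
  if (remove_tags.size() >= MAX_REMOVE_CHUNK) {
    flush_cb(remove_tags);   // one batched remove op for 16 tags
    remove_tags.clear();
  }
}
```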
src/rgw/rgw_gc.cc (outdated)

```diff
   string tag;
 };

 list<IO> ios;
-std::list<string> remove_tags;
+map<int, std::list<string> > remove_tags;
```
would love to avoid std::map of std::list--in this case, since the index represents a stable "slot", could this same mechanism be made to work with a std::vector<std::list>? That could actually be worth doing, I think.
yeah, can probably do vector.
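A sketch of the suggestion above, under the assumption stated in the review comment: because the gc shard index is a stable slot in `[0, num_shards)`, a `std::vector<std::list<...>>` can replace the `std::map<int, std::list<...>>`, trading tree lookups for O(1) slot access. The class and method names are illustrative only.

```cpp
#include <list>
#include <string>
#include <vector>

// Hypothetical slot-indexed container: one list of pending remove
// tags per gc shard. rgw derives the shard count from configuration;
// here it is just a constructor argument.
class RemoveTagsByShard {
  std::vector<std::list<std::string>> slots;
public:
  explicit RemoveTagsByShard(size_t num_shards) : slots(num_shards) {}
  void add(size_t shard, const std::string& tag) {
    slots[shard].push_back(tag);   // O(1) slot access, no map insert
  }
  size_t count(size_t shard) const { return slots[shard].size(); }
};
```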
src/rgw/rgw_gc.cc (outdated)

```diff
 ~RGWGCIOManager() {
   for (auto io : ios) {
     io.c->release();
   }
 }

-int schedule_io(IoCtx *ioctx, const string& oid, ObjectWriteOperation *op, const string& tag) {
+int schedule_io(IoCtx *ioctx, const string& oid, ObjectWriteOperation *op, int index, const string& tag) {
 #warning configurable
 #define MAX_CONCURRENT_IO 5
```
what makes 5 a good magic number?
@yehudasa just one suggestion I'd act on, if it worked: can the std::map of slots work as a vector?
@yehudasa tested manually, it worked well for me; I think it delivered about the same gc throughput at concurrent=5 as the threaded version did w/ 3 threads, so the tuning seems good
(scheduling a teuthology run)
@yehudasa agree, the interesting one is the std::map<int, std::list<std::string>>; the other uses of list may already be as good as they can be
@yehudasa perhaps the one you mentioned could be a std::deque
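The bounded concurrency being tuned in this exchange can be sketched as follows. This is a simplification under stated assumptions: at most MAX_CONCURRENT_IO ops are in flight, and when the window is full the scheduler waits on the oldest completion before submitting more. `BoundedScheduler` and its methods are illustrative names, not the librados/rgw API.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// The PR makes this a tunable; 5 is the default under discussion.
static const std::size_t MAX_CONCURRENT_IO = 5;

class BoundedScheduler {
  std::deque<std::string> in_flight;  // oldest submitted op in front
public:
  // Returns the tag of the op we had to wait for before submitting,
  // or "" if the window still had room. In rgw this wait would be an
  // AioCompletion::wait_for_complete() on the oldest pending op.
  std::string schedule(const std::string& tag) {
    std::string waited;
    if (in_flight.size() >= MAX_CONCURRENT_IO) {
      waited = in_flight.front();
      in_flight.pop_front();
    }
    in_flight.push_back(tag);
    return waited;
  }
  std::size_t pending() const { return in_flight.size(); }
};
```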
Force-pushed from 29f3be1 to 4634585
and another tunable for log trim size Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@yehudasa just noticed this; we need something like:

```diff
diff --git a/src/test/cls_rgw/test_cls_rgw.cc b/src/test/cls_rgw/test_cls_rgw.cc
index 1d72dce2a1..a9242b03e6 100644
--- a/src/test/cls_rgw/test_cls_rgw.cc
+++ b/src/test/cls_rgw/test_cls_rgw.cc
@@ -673,7 +673,7 @@ TEST(cls_rgw, gc_defer)
   ASSERT_EQ(0, truncated);
   librados::ObjectWriteOperation op3;
-  list<string> tags;
+  std::vector<std::string> tags;
   tags.push_back(tag);
   /* remove chain */
```
@yehudasa manual testing checks out
@mattbenjamin I'll push a fix
Force-pushed from 4634585 to 9d37cac
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@yehudasa does this need a tracker issue?
@mattbenjamin wouldn't hurt
@yehudasa the only unexpected failure is a known failure of the lifecycle expiration test, which, Casey confirms, does not verify actual GC, only the bucket index (but is still failing, apparently due to a timing issue).
Hi, how fast will it be? I have a requirement here: clients issue 10000 qps of writes/puts, and all objects have a TTL set, e.g. one week or one month. Clients may read an object before expiration (perhaps a few thousand qps), but never access it after expiration. We have enough SSDs or NVMes for the OSD clusters; disk is not the bottleneck in terms of IOPS or throughput, but space is. So we must trim or delete expired objects aggressively, at better than 10000 qps, otherwise space will run out. Will this patch be fast enough to trim objects? Are there any other factors that affect gc speed?
@wjin this change actually permits a much higher workload contribution from gc--you'd increase concurrent_io to "go faster"; the max_obj value should just be a "good" value for one gc work unit. The factor then left limiting gc speed is the real workload capacity of the cluster.
@mattbenjamin Thanks for your quick response. We will set up a very "fast" cluster for clients, say 50000 qps, so that gc does not affect client usage. I will try it later; I wish it could be in 12.2.5.
@mattbenjamin can your PR make the number of concurrent GC requests based on a fraction of the number of OSDs in the cluster? Something like max(1, OSDs/10)? The goal is that it would scale naturally without interfering with application workload, and wouldn't require per-site tuning (normally). Any Ceph daemon can ask the monitor for the number of OSDs in the cluster using librados, right? Also, does your PR spread GC request generators across the cluster, for really big clusters (in the hundreds or thousands of OSDs)? thx -ben
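The scaling rule proposed in the comment above can be sketched in a few lines. Note this is the commenter's suggestion, not something the PR implements; the function name and the divisor 10 come from the comment, not from rgw.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical sizing rule from the review comment: derive gc
// concurrency from the cluster's OSD count rather than a fixed
// tunable, so larger clusters get proportionally more gc parallelism
// while small clusters keep at least one in-flight op.
uint32_t gc_concurrent_io_for(uint32_t num_osds) {
  return std::max<uint32_t>(1, num_osds / 10);
}
```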