
rgw: gc use aio #20546

Merged: 6 commits, merged Mar 1, 2018

Conversation

yehudasa
Member

No description provided.

still need to deal with index cleanup asynchronously

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
to allow cross-shard concurrency

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@yehudasa changed the title from "[DNM]: rgw gc use aio" to "rgw: gc use aio" on Feb 23, 2018
@mattbenjamin (Contributor) left a comment

I support, appreciate the fast typing, will test

string tag;
};

list<IO> ios;
Contributor

would love to avoid std::list

Member Author

@mattbenjamin need a fifo, vector will not cut it

};

list<IO> ios;
std::list<string> remove_tags;
Contributor

would love to avoid std::list (but I'm not actually suggesting an alternative would be worth the trouble)

Member Author

will try to replace this one

}

remove_tags.push_back(io.tag);
#define MAX_REMOVE_CHUNK 16
Contributor

I get now that this was missing from the chmagnus change; what makes 16 a good magic number?

Member Author

not too small, not too high... but seriously, this can now be configured; I have no way to tell what the optimal number is. We're bundling this many operations together: more than that (how much more?) could, it seems to me, affect OSD availability, and below it latency will probably be the biggest factor.
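
For illustration, a minimal self-contained sketch of the batching idea discussed here (not the PR's code; RemoveBatcher and remove_chunk are made-up names): completed tags accumulate and are flushed in chunks of MAX_REMOVE_CHUNK, so no single bundled remove carries an unbounded number of entries:

#include <iostream>
#include <string>
#include <vector>

constexpr size_t MAX_REMOVE_CHUNK = 16;  // the tunable under discussion

// stand-in for the call that trims gc entries on one shard
static void remove_chunk(int shard, const std::vector<std::string>& tags) {
  std::cout << "shard " << shard << ": removing " << tags.size() << " tags\n";
}

struct RemoveBatcher {  // hypothetical helper, illustration only
  int shard = 0;
  std::vector<std::string> pending;

  void push(const std::string& tag) {
    pending.push_back(tag);
    if (pending.size() >= MAX_REMOVE_CHUNK) {
      flush();  // cap how much work a single bundled op carries
    }
  }

  void flush() {
    if (!pending.empty()) {
      remove_chunk(shard, pending);
      pending.clear();
    }
  }
};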

string tag;
};

list<IO> ios;
std::list<string> remove_tags;
map<int, std::list<string> > remove_tags;
Contributor

would love to avoid std::map of std::list--in this case, since the index represents a stable "slot", could this same mechanism be made to work with a std::vector<std::list>? That could actually be worth doing, I think.

Member Author

yeah, can probably do vector.
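
As a sketch of that suggestion (illustrative only; PerShardTags is a made-up name): the gc shard indices form a dense, stable range, so the per-shard removal queues can live in a std::vector indexed by shard rather than a std::map<int, std::list<string>>:

#include <list>
#include <string>
#include <vector>

struct PerShardTags {
  std::vector<std::list<std::string>> remove_tags;

  explicit PerShardTags(int num_shards) : remove_tags(num_shards) {}

  void add(int index, const std::string& tag) {
    // index is the gc shard, a stable slot in [0, num_shards)
    remove_tags[index].push_back(tag);
  }
};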

  ~RGWGCIOManager() {
    for (auto io : ios) {
      io.c->release();
    }
  }

  int schedule_io(IoCtx *ioctx, const string& oid, ObjectWriteOperation *op, const string& tag) {
  int schedule_io(IoCtx *ioctx, const string& oid, ObjectWriteOperation *op, int index, const string& tag) {
#warning configurable
#define MAX_CONCURRENT_IO 5
Contributor

what makes 5 a good magic number?

@mattbenjamin
Contributor

@yehudasa just one suggestion I'd act on, if it worked: can the std::map of slots work as a vector?

@mattbenjamin self-assigned this on Feb 23, 2018
@mattbenjamin
Contributor

@yehudasa tested manually, it worked well for me; I think it delivered about the same gc throughput at concurrent=5 as the threaded version did with 3 threads, so the tuning seems good

@mattbenjamin
Contributor

(scheduling a teuthology run)

@mattbenjamin
Contributor

@yehudasa agree, the interesting one is the std::map of std::list<std::string>; the other uses of list may already be as good as they can be

@mattbenjamin
Contributor

@yehudasa perhaps the one you mentioned could be a std::deque
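
One reading of the std::deque idea, as a sketch (not the merged code; the IO fields just mirror the snippets above): the in-flight queue needs only FIFO behaviour, push at the back when an aio is scheduled, pop at the front when the oldest completion is reaped, which std::deque provides without per-node allocation:

#include <deque>
#include <string>
#include <utility>

struct IO {
  // librados::AioCompletion *c;  // omitted in this sketch
  std::string oid;
  int index = 0;
  std::string tag;
};

std::deque<IO> ios;              // FIFO of in-flight operations

void push_io(IO io) {
  ios.push_back(std::move(io));  // newest at the back
}

bool pop_oldest(IO *out) {
  if (ios.empty()) {
    return false;
  }
  *out = std::move(ios.front()); // oldest at the front
  ios.pop_front();
  return true;
}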

and another tunable for log trim size

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@mattbenjamin
Contributor

mattbenjamin commented Feb 23, 2018

@yehudasa just noticed this; we need something like:

diff --git a/src/test/cls_rgw/test_cls_rgw.cc b/src/test/cls_rgw/test_cls_rgw.cc
index 1d72dce2a1..a9242b03e6 100644
--- a/src/test/cls_rgw/test_cls_rgw.cc
+++ b/src/test/cls_rgw/test_cls_rgw.cc
@@ -673,7 +673,7 @@ TEST(cls_rgw, gc_defer)
   ASSERT_EQ(0, truncated);
 
   librados::ObjectWriteOperation op3;
-  list<string> tags;
+  std::vector<std::string> tags;
   tags.push_back(tag);
 
   /* remove chain */

@mattbenjamin
Contributor

@yehudasa manual testing checks out

@yehudasa
Member Author

@mattbenjamin I'll push a fix

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@mattbenjamin
Contributor

@yehudasa does this need a tracker issue?

@yehudasa
Member Author

@mattbenjamin wouldn't hurt

@mattbenjamin
Contributor

@yehudasa the only unexpected failure is a known failure of the lifecycle expiration test, which, Casey confirms, does not verify actual GC, only the bucket index (but it is still failing, apparently due to a timing issue).

@wjin
Contributor

wjin commented Mar 16, 2018

Hi, how fast will it be?
It seems that after using aio, it just depends on the max_obj and concurrent_io configurations.

I have a requirement here: clients do 10000 qps of write/put, and all objects have a TTL set, e.g. one week or one month. Clients may read them before expiration (maybe a few thousand qps), but never access them after expiration.

We have enough SSDs or NVMes for the OSD clusters; disk is not the bottleneck in terms of IOPS or throughput, but space is. So we must trim or delete expired objects aggressively. GC must run faster than 10000 qps, otherwise space will run out.

So will this patch be fast enough to trim objects? Are there any other factors that affect the gc speed?

@mattbenjamin
Contributor

mattbenjamin commented Mar 16, 2018

@wjin this change actually permits a much higher workload contribution from gc: you'd increase concurrent_io to "go faster", while the max_obj value should just be a "good" value for one gc work unit. The remaining factor limiting gc speed is then the real workload capacity of the cluster.
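
To make the tuning discussion concrete, a simplified sketch of the bounded-concurrency pattern (GCIOWindow is a made-up name; only the librados calls themselves are real API): aio writes are issued until MAX_CONCURRENT_IO are in flight, then the oldest completion is drained before the next one is scheduled, so raising the bound lets gc go faster at the cost of more simultaneous work on the OSDs:

#include <deque>
#include <string>
#include <rados/librados.hpp>

constexpr size_t MAX_CONCURRENT_IO = 5;   // the tunable under discussion

struct InflightIO {
  librados::AioCompletion *c;
  std::string tag;
};

class GCIOWindow {                        // hypothetical helper
  std::deque<InflightIO> ios;

  void drain_oldest() {
    InflightIO io = ios.front();
    ios.pop_front();
    io.c->wait_for_complete();            // block only on the oldest op
    io.c->release();
    // the real code would now queue io.tag for chunked removal
  }

public:
  int schedule(librados::IoCtx& ioctx, const std::string& oid,
               librados::ObjectWriteOperation *op, const std::string& tag) {
    while (ios.size() >= MAX_CONCURRENT_IO) {
      drain_oldest();                     // keep the window bounded
    }
    librados::AioCompletion *c = librados::Rados::aio_create_completion();
    int r = ioctx.aio_operate(oid, c, op);
    if (r < 0) {
      c->release();
      return r;
    }
    ios.push_back({c, tag});
    return 0;
  }

  void drain_all() {
    while (!ios.empty()) {
      drain_oldest();
    }
  }
};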

@wjin
Contributor

wjin commented Mar 16, 2018

@mattbenjamin Thanks for your quick response. We will set up a very "fast" cluster for clients, something like 50000 qps, so that gc does not affect client usage. I will try it later; I hope it makes it into 12.2.5.

@bengland2
Contributor

@mattbenjamin can your PR base the number of concurrent GC requests on a fraction of the number of OSDs in the cluster, something like max(1, OSDs/10)? The goal is that it would scale naturally without interfering with the application workload and wouldn't (normally) require per-site tuning. Any Ceph daemon can ask the monitor for the number of OSDs in the cluster using librados, right?

Also, does your PR spread GC request generators across the cluster, for really big clusters (hundreds or thousands of OSDs)?

thx -ben
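
As a rough sketch of that idea (illustrative only; osd_count and suggested_gc_concurrency are made-up helpers, the /10 ratio is just the heuristic proposed above, and an already-connected librados::Rados handle is assumed): a client can ask the monitors for the OSD count, e.g. via an "osd stat" mon command, and size gc concurrency from it:

#include <algorithm>
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <string>
#include <rados/librados.hpp>

// crude JSON scrape: good enough for a sketch, not production parsing
static int osd_count(librados::Rados& rados) {
  librados::bufferlist inbl, outbl;
  std::string outs;
  int r = rados.mon_command(
      R"({"prefix": "osd stat", "format": "json"})", inbl, &outbl, &outs);
  if (r < 0) {
    return r;
  }
  std::string json(outbl.c_str(), outbl.length());
  auto pos = json.find("\"num_osds\":");
  if (pos == std::string::npos) {
    return -EINVAL;
  }
  return std::atoi(json.c_str() + pos + std::strlen("\"num_osds\":"));
}

static int suggested_gc_concurrency(librados::Rados& rados) {
  int osds = osd_count(rados);      // any error result collapses to 1 below
  return std::max(1, osds / 10);    // the max(1, OSDs/10) heuristic
}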
