New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
librados: new style (sharded) object listing #6405
Conversation
const size_t n, | ||
const size_t m, | ||
rados_enumerate_cursor *split_start, | ||
rados_enumerate_cursor *split_finish); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this get a comment documetning it? i would expect a single output argument (the cursor at the approx middle/split point)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a minute to remember so docstring definitely needed...
The dual output arguments get you a range from somewhere within the input start/finish, basically (chunk n of m). Maybe "slice" is a better name than "split" here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the C++ API, object_list_{begin,end} is awkward but consistent with the C API and with other list methods... |
Updated with C++ bits. |
The reason I didn't do an STL-style iterator thing is that it's a pretty weird API to hide network IO like that, and exposing the "give me the next N + return an error code" interface adds very little burden to the caller. |
Agreed.. I found the STL-style iterator annoying anyway. |
{ | ||
std::vector<ObjectItem> result; | ||
int r = ioctx.object_list(c, end, 12, &result, &c); | ||
ASSERT_GE(r, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could ASSERT_EQ(r, result.size()) here too
4acc2e8
to
f19390d
Compare
Updated |
loic-bot's failure is weird, because my test branch (including this) is building on the gitbuilders, and when I try an autotools build locally it works as well. It's failing like this: CXXLD unittest_erasure_code_plugin_shec CXXLD unittest_erasure_code_example CXXLD unittest_librados CXXLD unittest_librados_config CXXLD unittest_journal CXXLD unittest_rbd_replay CXXLD unittest_encoding CXXLD unittest_base64 CXXLD unittest_run_cmd CXXLD unittest_simple_spin CXXLD unittest_libcephfs_config CXXLD unittest_mon_moncap CXXLD unittest_mon_pgmap CXXLD unittest_ecbackend CXXLD unittest_osdscrub CXXLD unittest_pglog CXXLD unittest_hitset CXXLD unittest_osd_osdcap CXXLD unittest_chain_xattr CXXLD unittest_lfnindex CXXLD unittest_mds_authcap CXXLD unittest_addrs CXXLD unittest_blkdev CXXLD unittest_bloom_filter CXXLD unittest_histogram CXXLD unittest_prioritized_queue CXXLD unittest_str_map ./.libs/libradostest.a(libradostest_la-test.o): In function `wait_for_healthy(void**)': /home/jenkins/workspace/ceph/LABELS/ceph-centos-7-jenkins/src/test/librados/test.cc:67: undefined reference to `rados_buffer_free' collect2: error: ld returned 1 exit status CXXLD unittest_sharedptr_registry make[4]: *** [unittest_journal] Error 1 make[4]: *** Waiting for unfinished jobs.... make[4]: Leaving directory `/home/jenkins/workspace/ceph/LABELS/ceph-centos-7-jenkins/src' make[3]: *** [check-am] Error 2 make[3]: Leaving directory `/home/jenkins/workspace/ceph/LABELS/ceph-centos-7-jenkins/src' make[2]: *** [check-recursive] Error 1 make[2]: Leaving directory `/home/jenkins/workspace/ceph/LABELS/ceph-centos-7-jenkins/src' make[1]: *** [check] Error 2 make[1]: Leaving directory `/home/jenkins/workspace/ceph/LABELS/ceph-centos-7-jenkins/src' make: *** [check-recursive] Error 1 It also makes no sense that rados_buffer_free would be missing from anything that was linking with libradostest, because that lib was already using rados_mon_command et al. |
It's unittest_journal, which doesn't correctly link with librados |
See a run with no failures attributable to this PR here: http://pulpito.ceph.com/jspray-2015-11-30_02:06:10-fs-wip-testing-jcsp---basic-multi/ This is good to go from my POV, I'll hook DataScan into it in a separate PR. |
Convenience methods to easily get the bounds of the object range that are managed by a given PG. Note that this is a function of the pool pg_num because otherwise the pg_t id only gives us the start position. Signed-off-by: Sage Weil <sage@redhat.com>
For the namespaced PGNLS op, set the handle to the start of the next PG when we reach the end of the current PG. We do this OSD side rather than client side so that the client doesn't have to wait for the OSD's map epoch in order to correctly calculate the next hash in the case of PG splitting. Only do this when the cluster is in bitwise mode to avoid totally weird results when the client is in nibblewise mode--the nibblewise sort order doesn't map contiguous ranges of the hash to PGs. Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: John Spray <john.spray@redhat.com> Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: John Spray <john.spray@redhat.com> Signed-off-by: Sage Weil <sage@redhat.com>
Because some unit tests benefit from having a non-power-of-two pg_num. Signed-off-by: John Spray <john.spray@redhat.com>
Signed-off-by: John Spray <john.spray@redhat.com> Signed-off-by: Sage Weil <sage@redhat.com>
The existing request machinery already maps the hash to a pg; use it. Among other things, it means that a split will make us recalculate and resend using existing code paths. Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: John Spray <john.spray@redhat.com>
Signed-off-by: John Spray <john.spray@redhat.com>
It is possible in pathological cases for the OSD to return 0 items with a non-MAX next value (indicating the objecter should ask for more). In particular, this is easy to reproduce with max_entries = 1 as the OSD will list and ignore the pgmeta object, return 0 items, and have next = the next object. I check back as far as dumpling and the OSD is setting next=MAX at the end of the PG (and result = 1, FWIW), so using next here instead of inferring anything from the number of result items is safe. Signed-off-by: Sage Weil <sage@redhat.com>
These weren't explicitly linking librados, so were prone to missing symbols on changes in librados_test Signed-off-by: John Spray <john.spray@redhat.com>
Oops, this lost the update to unit test linking when I was adding sage's commits. repushed. |
2015-12-03T20:09:44.741 DEBUG:tasks.ceph.mon.d:waiting for process to exit from /a/sage-2015-12-03_13:46:35-rados-wip-sage-testing2---basic-multi/1167298 |
Huh -- I didn't realise we ran unit tests on a thrashing cluster. In this instance it took about 120 seconds for the test's PGs to get created, which exceeded the 30s timeout in wait_for_healthy. Guess I'll crank up that timeout. |
I set this with unperturbed vstart clusters in mind. The tests are sometimes run with thrashers enabled, so things might take much longer to advance. Signed-off-by: John Spray <john.spray@redhat.com>
Pushed update for timeouts. |
ping @liewegas |
@jcsp yeah, just waiting for the lab to come back alive so i can run it through again. |
librados: new style (sharded) object listing Reviewed-by: Sage Weil <sage@redhat.com>
This is a rebase of what was wip-9963-sage