Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

librbd: scatter/gather support for the C API #13447

Merged
merged 3 commits into from Feb 20, 2017
Merged

Conversation

dillaman
Copy link

No description provided.

@dillaman
Copy link
Author

retest this please

#define LIBRBD_VER_MINOR 1
#define LIBRBD_VER_EXTRA 11
#define LIBRBD_VER_MAJOR 1
#define LIBRBD_VER_MINOR 12
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dillaman Just wondering why minor is not reset when major is increased? And actually what was the reason to bump major? I thought it is only when API incompatibility is introduced, and this does not look like this case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I suppose there is no actual version major bump, LIBRBD_VER_MAJOR 0 just was a bug and now it matches librbd SOVERSION. BTW, shouldn't the library VERSION (which is 1.0.0 currently) match major.minor.extra?

@trociny
Copy link
Contributor

trociny commented Feb 19, 2017

@dillaman

beton:~/ceph.ci/build% ./bin/ceph_test_librbd --gtest_repeat=10 --gtest_filter=TestLibRBD.TestScatterGatherIO

Repeating all tests (iteration 1) . . .

Note: Google Test filter = TestLibRBD.TestScatterGatherIO
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from TestLibRBD
seed 1931
[ RUN      ] TestLibRBD.TestScatterGatherIO
using old format
[       OK ] TestLibRBD.TestScatterGatherIO (3102 ms)
/home/mgolub/ceph.ci/src/log/SubsystemMap.h: In function 'bool ceph::logging::SubsystemMap::should_gather(unsigned int, int)' thread 7f1aa4ff9700 time 2017-02-19 10:12:56.878671
/home/mgolub/ceph.ci/src/log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
 ceph version 12.0.0-537-gc1fe62f (c1fe62f57ac0a974470d698a3eb71a685e495122)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f1b404bc582]
 2: (ObjectCacher::flusher_entry()+0xef2) [0x55774206ce82]
 3: (ObjectCacher::FlusherThread::entry()+0xd) [0x55774207976d]
 4: (()+0x76ba) [0x7f1b48f4b6ba]
 5: (clone()+0x6d) [0x7f1b3f4b682d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
zsh: abort (core dumped)  ./bin/ceph_test_librbd --gtest_repeat=10 

@dillaman
Copy link
Author

@trociny unit test fixed by closing the image properly and updated the shared library version for consistency

@@ -204,7 +205,7 @@ void CopyupRequest::send()
<< ", oid " << m_oid
<< ", extents " << m_image_extents
<< dendl;
AioImageRequest<>::aio_read(m_ictx->parent, comp, std::move(m_image_extents),
ImageRequest<>::aio_read(m_ictx->parent, comp, std::move(m_image_extents),
nullptr, &m_copyup_data, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

uint64_t object_no, uint64_t object_off,
const ::SnapContext &snapc,
Context *completion) {
return new ObjectTruncateRequest(util::get_image_ctx(ictx), oid, object_no,
object_off, snapc, completion);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

const ceph::bufferlist &data,
const ::SnapContext &snapc,
Context *completion, int op_flags) {
return new ObjectWriteRequest(util::get_image_ctx(ictx), oid, object_no,
object_off, data, snapc, completion, op_flags);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

uint64_t object_len,
const ::SnapContext &snapc,
Context *completion) {
return new ObjectZeroRequest(util::get_image_ctx(ictx), oid, object_no,
object_off, object_len, snapc, completion);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

Jason Dillaman added 3 commits February 20, 2017 08:19
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The RBD C API now has scatter/gather IO support via the API. The C++
API uses bufferlists which can also support scatter/gather IO.

Fixes: http://tracker.ceph.com/issues/13025
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@dillaman
Copy link
Author

@trociny updated

@trociny trociny merged commit 357cea2 into ceph:master Feb 20, 2017
@dillaman dillaman deleted the wip-13025 branch February 20, 2017 22:32
@yuyuyu101
Copy link
Member

@dillaman do you consider to use buffer::claim_buffer(https://github.com/ceph/ceph/blob/master/src/common/buffer.cc#L646) to solve data copy problem?

example usage can be found at https://github.com/ceph/ceph/blob/master/src/msg/async/dpdk/DPDKStack.h#L116.

deleter could be a lambda which called when ptr released

@dillaman
Copy link
Author

@yuyuyu101 Cool -- didn't know that was already implemented. I brought up an option similar to this on the ceph-devel list [1] (combined w/ delaying the AioCompletion) and Sage mentioned that it would be possible for the memory reference to be held for a potentially long period of time (where such a completion delay wouldn't be feasible).

[1] https://www.spinics.net/lists/ceph-devel/msg34596.html

@yuyuyu101
Copy link
Member

@dillaman yes, we can't release memory with releasing AioCompletion. I suggest that user can provide with callback as revoke method. this callback can be a member of AioCompletion

@dillaman
Copy link
Author

@yuyuyu101 Hmm -- not sure where you are going w/ this. Once we fire the AioCompletion back to the end-user (e.g. QEMU), there is an expectation that the memory buffers they provided to the read/write operation are now free for re-use by the end-user. Providing a mechanism for "revoking" the buffers doesn't solve the problem because semantically the buffers should be revoked from Ceph-usage before we fire the completion. Even if we provided a way to revoke the buffers, bufferlist doesn't have any locks to protect its ptr collection from concurrent access.

@yuyuyu101
Copy link
Member

Hmm, no, I mean callback is related to the bufferptr not AioCompletion. Just using AioCompletion to pass callback to bufferptr.

void revoke_buffer(char *buffer)
{
}

comp = create_rbd_completion(cct, revoke_buffer);
r = aio_write(image, offset, length, comp, iovecs);
........
....... ceph inside
auto callback = comp->revoke_func;
bl.push_back(claim_buffer(offset, length, iovec->buffer, iovec->buffer, callback { callback(iovec->buffer); }));
complete_completion(comp);
...... later
~bufferptr() -> revoke_buffer;

@dillaman
Copy link
Author

@yuyuyu101 Still not understanding what you are proposing. As highlighted Sage, the bufferptr can be held for a potentially long time after the AioComlletion has completed. In the case of something like QEMU w/ virtio, this memory is coming from the ring buffer between the guest OS and QEMU. Therefore, w/o copying the memory, the only safe thing to do would be to stall the AioCompletion until Ceph is no longer referencing the memory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants