New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add timeout to watch/notify #11378
add timeout to watch/notify #11378
Conversation
b6c9dfe
to
ec422d7
Compare
extern "C" int rados_watch3(rados_ioctx_t io, const char *o, uint64_t *handle, | ||
rados_watchcb2_t watchcb, | ||
rados_watcherrcb_t watcherrcb, | ||
uint32_t timeout, | ||
void *arg) | ||
{ | ||
tracepoint(librados, rados_watch2_enter, io, o, handle, watchcb, arg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want to s/rados_watch3_enter/rados_watch3_enter/
,
and could you update src/tracing/librados.tp
accordingly?
@@ -517,6 +517,7 @@ struct ceph_osd_op { | |||
__le64 ver; /* no longer used */ | |||
__u8 op; /* CEPH_OSD_WATCH_OP_* */ | |||
__u32 gen; /* registration generation */ | |||
__u32 timeout; /* connection timeout */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was wondering how we handled the backward/forward compatibility here. the op struct is wrapped in MOSDOp::ops::op
. and ceph_osd_op
is encoded using the raw encoder, i.e. just copy the memory chunk. so the maximum size of the member of union in ceph_osd_op
is sizeof(writesame)
= (64 + 64 + 64) bits. in other words, as long as the size of all struct are less or equal to 24 bytes, we are safe.
could you add a
static_assert(sizeof(ceph_osd_op) == (2+4+(8+8+8)+4),
"sizeof(ceph_osd_op) breaks the compatibility");
somewhere in this header file in a separated commit? in which,
(2+4+(8+8+8)+4) = (sizeof(ceph_osd_op::op) +
sizeof(ceph_osd_op::flags) +
sizeof(ceph_osd_op::writesame) +
sizeof(ceph_osd_op::payload_len))
@athanatos am i right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tchaikov, you are right. There is compile error after adding the static check. Do I need to find another place to pass timeout to OSD?
Following is the compile error.
In file included from /home/ryne/ceph/src/include/ceph_fs.h:16:0,
from /home/ryne/ceph/src/include/types.h:32,
from /home/ryne/ceph/src/msg/msg_types.h:20,
from /home/ryne/ceph/src/common/entity_name.h:24,
from /home/ryne/ceph/src/common/config.h:24,
from /home/ryne/ceph/src/common/dout.h:20,
from /home/ryne/ceph/src/compressor/AsyncCompressor.cc:15:
/home/ryne/ceph/src/include/rados.h:573:1: error: static assertion failed: sizeof(ceph_osd_op) breaks the compatibility
static_assert(sizeof(ceph_osd_op) == (2+4+(8+8+8)+4),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ryneli i was wrong the largest struct in the union is
struct {
int64_t offset, length;
int64_t truncate_size;
int32_t truncate_seq;
} __attribute__ ((packed)) extent;
so the assert should be
static_assert(sizeof(ceph_osd_op) == (2+4+(2*8+8+4)+4),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tchaikov I have finished these changes. Could you please help to review my code again? Also, this static_assert involves some magic number "2+4+(2*8+8+4)+4". Do I need to add a global constant value to represent this magic number, i.e. "const uint32_t MAX_BYTES_OF_CEPH_OSD_OP = 38;"
uint64_t cookie = op.watch.cookie; | ||
entity_name_t entity = ctx->reqid.name; | ||
uint64_t cookie = op.watch.cookie; | ||
entity_name_t entity = ctx->reqid.name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better off not s/<tab>/ /
, if you are not touching this area.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code seems not indent correctly. So I try to modify the indent to align with the code below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ryneli we use tabstop = 8
, so it's okay. and we want to avoid formatting only changes if possible. otherwise it would be more difficult to read the output of git blame
.
250f9c4
to
fae542f
Compare
uint64_t *handle, | ||
rados_watchcb2_t watchcb, | ||
rados_watcherrcb_t watcherrcb, | ||
uint32_t timeout, void *arg) | ||
{ | ||
tracepoint(librados, rados_aio_watch_enter, io, o, completion, handle, watchcb, arg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that rados_aio_watch2()
is delegated to do the work. we can probably remove rados_aio_watch_enter
and rados_aio_watch_exit
tracing points, and add rados_aio_watch2_enter
and rados_aio_watch2_exit
to rados_aio_watch2()
.
* @param completion what to do when operation has been attempted | ||
* @param handle where to store the internal id assigned to this watch | ||
* @param watchcb what to do when a notify is received on this object | ||
* @param watcherrcb what to do when the watch session encounters an error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timeout
is not documented.
* sizeof(ceph_osd_op::payload_len)) | ||
*/ | ||
#ifdef __cplusplus | ||
static_assert(sizeof(ceph_osd_op) == (2+4+(2*8+8+4)+4), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this static_assert involves some magic number "2+4+(2*8+8+4)+4". Do I need to add a global constant value to represent this magic number, i.e. "const uint32_t MAX_BYTES_OF_CEPH_OSD_OP = 38;"
ahh, i just realized that rados.h
is shared by C and C++ source files! MAX_BYTES_OF_CEPH_OSD_OP
does not seem right. the size of ceph_osd_op
must be exactly 38 bytes. and we can only add more fields at the end of union structs (extent
excluded), and the new behavior should be backward compatible if the newly added field(s) reads zero. this static assert is just a better-than-nothing message to posterity.
but i think the comment is good enough to serve this purpose. what do you think?
* the normal recovery process. If the client loses its connection to | ||
* the primary OSD for a watched object, the watch will be removed | ||
* after 30 seconds. Watches are automatically reestablished when a new | ||
* the normal recovery process. The timeout of the connection between |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we keep the original doc which is more specific? and amend it like:
If the client loses its connection to the primary OSD for a watched object, the watch will be removed after a timeout configured with
osd_client_watch_timeout
.
?
also, probably you can split the first commit into two separated ones. one is the changes in osd. the other is for exposing the new feature in librados. and then prefix the title of your commit message withs the subcomponent your are changing ? see https://github.com/ceph/ceph/blob/master/SubmittingPatches.rst#preparing-and-sending-patches |
1bac9f2
to
ed178cc
Compare
@tchaikov, I have split the commits and add prefix to my commit message. Is there anything else to modify? |
ctf_integer(uint64_t, handle, handle) | ||
) | ||
) | ||
|
||
TRACEPOINT_EVENT(librados, rados_aio_watch_enter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we can remove rados_watch2_enter
and rados_watch2_exit
now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed both rados_watch2_enter/exit and rados_aio_watch_enter/exit :)
Signed-off-by: Ryne Li <lizhenqiangsnake@gmail.com>
Signed-off-by: Ryne Li <lizhenqiangsnake@gmail.com>
ed178cc
to
e467337
Compare
|
So could I merge it to the master branch now? |
@ryneli, for most non-trivial changes, we'd like to run them through a qa suite before merging them. and in your case, i'd like to run a rados qa suite. that's what the "needs-qa" label is for. |
So could I do something in the QA test or just wait the result? |
i think your change is already covered by existing tests. you can push a branch with your changes to ceph/ceph, and schedule a qa run with enough coverage in sepia after your branch is built. or just wait. normally, it will take 1 week or two. |
Since this is not an emergent change. Maybe I can just wait for the coverage test. Thanks for your review and comments, @tchaikov |
TP_ARGS( | ||
rados_ioctx_t, ioctx, | ||
const char*, oid, | ||
rados_completion_t, completion, | ||
uint64_t*, phandle, | ||
rados_watchcb2_t, callback, | ||
uint32_t timeout, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ryneli ping?
@ryneli ping |
e467337
to
a0e0ece
Compare
@liewegas I am so sorry for the delay. I didn't receive the message. Just find the message when I check the status of this PR. |
a0e0ece
to
7a344ef
Compare
@tchaikov I am so sorry for the delay. Due to wrong Github setting, I didn't receive your message. I have corrected the error. Could you please help me check it again? |
@ryneli sure, will do. |
lgtm. |
jenkins, retest this please |
void*, arg), | ||
TP_FIELDS( | ||
ctf_integer_hex(rados_ioctx_t, ioctx, ioctx) | ||
ctf_string(oid, oid) | ||
ctf_integer_hex(rados_completion_t, completion, completion) | ||
ctf_integer_hex(uint64_t, phandle, phandle) | ||
ctf_integer_hex(rados_watchcb2_t, callback, callback) | ||
ctf_inteter(uint32_t, timeout, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/ctf_inteter/ctf_integer/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the stupid typo. Do you know how to test librados.tp file? It seems that the librados.tp file isn't be used when I compile the source code locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmake -DWITH_LTTNG=ON ..
make -C src/tracing rados_tp
7a344ef
to
4e23511
Compare
notify_oid = "foo"; | ||
notify_ioctx = &ioctx; | ||
notify_cookies.clear(); | ||
uint32_t timeout = 3; // configured timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http://pulpito.ceph.com/kchai-2016-11-13_05:09:22-rados-wip-kefu-testing---basic-smithi/
quite a few tests with harsh cluster env (high messenger failure rate or thrash interruptions) failed due to -ENOTCONN, i am prolonging the timeout to 12 to see if it helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the tests at http://pulpito.ceph.com/kchai-2016-11-13_07:03:13-rados-wip-kefu-testing---basic-smithi/ look promising.
@ryneli could you change the timeout to 12
or longer?
notify_oid = "foo"; | ||
notify_ioctx = &ioctx; | ||
notify_cookies.clear(); | ||
uint32_t timeout = 3; // configured timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the tests at http://pulpito.ceph.com/kchai-2016-11-13_07:03:13-rados-wip-kefu-testing---basic-smithi/ look promising.
@ryneli could you change the timeout to 12
or longer?
Signed-off-by: Ryne Li <lizhenqiangsnake@gmail.com>
4e23511
to
e3ca1e6
Compare
@tchaikov I have changed the timeout to 12. |
Hi @tchaikov, I have added a timeout parameter to watch-related API. Could you please help me to review my code? Thanks.