Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add timeout to watch/notify #11378

Merged
merged 3 commits into from Nov 14, 2016
Merged

Conversation

ryneli
Copy link
Contributor

@ryneli ryneli commented Oct 8, 2016

Hi @tchaikov, I have added a timeout parameter to watch-related API. Could you please help me to review my code? Thanks.

@tchaikov tchaikov self-assigned this Oct 8, 2016
extern "C" int rados_watch3(rados_ioctx_t io, const char *o, uint64_t *handle,
rados_watchcb2_t watchcb,
rados_watcherrcb_t watcherrcb,
uint32_t timeout,
void *arg)
{
tracepoint(librados, rados_watch2_enter, io, o, handle, watchcb, arg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might want to s/rados_watch3_enter/rados_watch3_enter/,

and could you update src/tracing/librados.tp accordingly?

@@ -517,6 +517,7 @@ struct ceph_osd_op {
__le64 ver; /* no longer used */
__u8 op; /* CEPH_OSD_WATCH_OP_* */
__u32 gen; /* registration generation */
__u32 timeout; /* connection timeout */
Copy link
Contributor

@tchaikov tchaikov Oct 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was wondering how we handled the backward/forward compatibility here. the op struct is wrapped in MOSDOp::ops::op. and ceph_osd_op is encoded using the raw encoder, i.e. just copy the memory chunk. so the maximum size of the member of union in ceph_osd_op is sizeof(writesame) = (64 + 64 + 64) bits. in other words, as long as the size of all struct are less or equal to 24 bytes, we are safe.

could you add a

static_assert(sizeof(ceph_osd_op) == (2+4+(8+8+8)+4),
              "sizeof(ceph_osd_op) breaks the compatibility");

somewhere in this header file in a separated commit? in which,

 (2+4+(8+8+8)+4) =           (sizeof(ceph_osd_op::op) +
                              sizeof(ceph_osd_op::flags) +
                              sizeof(ceph_osd_op::writesame) +
                              sizeof(ceph_osd_op::payload_len))

@athanatos am i right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tchaikov, you are right. There is compile error after adding the static check. Do I need to find another place to pass timeout to OSD?

Following is the compile error.

In file included from /home/ryne/ceph/src/include/ceph_fs.h:16:0,
from /home/ryne/ceph/src/include/types.h:32,
from /home/ryne/ceph/src/msg/msg_types.h:20,
from /home/ryne/ceph/src/common/entity_name.h:24,
from /home/ryne/ceph/src/common/config.h:24,
from /home/ryne/ceph/src/common/dout.h:20,
from /home/ryne/ceph/src/compressor/AsyncCompressor.cc:15:
/home/ryne/ceph/src/include/rados.h:573:1: error: static assertion failed: sizeof(ceph_osd_op) breaks the compatibility
static_assert(sizeof(ceph_osd_op) == (2+4+(8+8+8)+4),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryneli i was wrong the largest struct in the union is

        struct {
          int64_t offset, length;
          int64_t truncate_size;
          int32_t truncate_seq;
        } __attribute__ ((packed)) extent;

so the assert should be

static_assert(sizeof(ceph_osd_op) == (2+4+(2*8+8+4)+4),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tchaikov I have finished these changes. Could you please help to review my code again? Also, this static_assert involves some magic number "2+4+(2*8+8+4)+4". Do I need to add a global constant value to represent this magic number, i.e. "const uint32_t MAX_BYTES_OF_CEPH_OSD_OP = 38;"

uint64_t cookie = op.watch.cookie;
entity_name_t entity = ctx->reqid.name;
uint64_t cookie = op.watch.cookie;
entity_name_t entity = ctx->reqid.name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better off not s/<tab>/ /, if you are not touching this area.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code seems not indent correctly. So I try to modify the indent to align with the code below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryneli we use tabstop = 8, so it's okay. and we want to avoid formatting only changes if possible. otherwise it would be more difficult to read the output of git blame.

@ryneli ryneli force-pushed the watch_notify_add_timeout branch 4 times, most recently from 250f9c4 to fae542f Compare October 9, 2016 19:35
uint64_t *handle,
rados_watchcb2_t watchcb,
rados_watcherrcb_t watcherrcb,
uint32_t timeout, void *arg)
{
tracepoint(librados, rados_aio_watch_enter, io, o, completion, handle, watchcb, arg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that rados_aio_watch2() is delegated to do the work. we can probably remove rados_aio_watch_enter and rados_aio_watch_exit tracing points, and add rados_aio_watch2_enter and rados_aio_watch2_exit to rados_aio_watch2().

* @param completion what to do when operation has been attempted
* @param handle where to store the internal id assigned to this watch
* @param watchcb what to do when a notify is received on this object
* @param watcherrcb what to do when the watch session encounters an error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeout is not documented.

* sizeof(ceph_osd_op::payload_len))
*/
#ifdef __cplusplus
static_assert(sizeof(ceph_osd_op) == (2+4+(2*8+8+4)+4),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this static_assert involves some magic number "2+4+(2*8+8+4)+4". Do I need to add a global constant value to represent this magic number, i.e. "const uint32_t MAX_BYTES_OF_CEPH_OSD_OP = 38;"

ahh, i just realized that rados.h is shared by C and C++ source files! MAX_BYTES_OF_CEPH_OSD_OP does not seem right. the size of ceph_osd_op must be exactly 38 bytes. and we can only add more fields at the end of union structs (extent excluded), and the new behavior should be backward compatible if the newly added field(s) reads zero. this static assert is just a better-than-nothing message to posterity.

but i think the comment is good enough to serve this purpose. what do you think?

* the normal recovery process. If the client loses its connection to
* the primary OSD for a watched object, the watch will be removed
* after 30 seconds. Watches are automatically reestablished when a new
* the normal recovery process. The timeout of the connection between
Copy link
Contributor

@tchaikov tchaikov Oct 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we keep the original doc which is more specific? and amend it like:

If the client loses its connection to the primary OSD for a watched object, the watch will be removed after a timeout configured with osd_client_watch_timeout.

?

@tchaikov tchaikov added cephfs Ceph File System feature labels Oct 10, 2016
@tchaikov
Copy link
Contributor

also, probably you can split the first commit into two separated ones. one is the changes in osd. the other is for exposing the new feature in librados. and then prefix the title of your commit message withs the subcomponent your are changing ? see https://github.com/ceph/ceph/blob/master/SubmittingPatches.rst#preparing-and-sending-patches

@tchaikov tchaikov added core and removed cephfs Ceph File System labels Oct 10, 2016
@ryneli ryneli force-pushed the watch_notify_add_timeout branch 3 times, most recently from 1bac9f2 to ed178cc Compare October 10, 2016 20:00
@ryneli
Copy link
Contributor Author

ryneli commented Oct 10, 2016

@tchaikov, I have split the commits and add prefix to my commit message. Is there anything else to modify?

ctf_integer(uint64_t, handle, handle)
)
)

TRACEPOINT_EVENT(librados, rados_aio_watch_enter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we can remove rados_watch2_enter and rados_watch2_exit now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed both rados_watch2_enter/exit and rados_aio_watch_enter/exit :)

Signed-off-by: Ryne Li <lizhenqiangsnake@gmail.com>
Signed-off-by: Ryne Li <lizhenqiangsnake@gmail.com>
@tchaikov
Copy link
Contributor

tchaikov commented Oct 12, 2016

lgtm. adding needs-qa label.

@ryneli
Copy link
Contributor Author

ryneli commented Oct 12, 2016

So could I merge it to the master branch now?

@tchaikov
Copy link
Contributor

@ryneli, for most non-trivial changes, we'd like to run them through a qa suite before merging them. and in your case, i'd like to run a rados qa suite. that's what the "needs-qa" label is for.

@ryneli
Copy link
Contributor Author

ryneli commented Oct 12, 2016

So could I do something in the QA test or just wait the result?

@tchaikov
Copy link
Contributor

i think your change is already covered by existing tests. you can push a branch with your changes to ceph/ceph, and schedule a qa run with enough coverage in sepia after your branch is built. or just wait. normally, it will take 1 week or two.

@ryneli
Copy link
Contributor Author

ryneli commented Oct 12, 2016

Since this is not an emergent change. Maybe I can just wait for the coverage test. Thanks for your review and comments, @tchaikov

TP_ARGS(
rados_ioctx_t, ioctx,
const char*, oid,
rados_completion_t, completion,
uint64_t*, phandle,
rados_watchcb2_t, callback,
uint32_t timeout,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryneli ping?

@tchaikov tchaikov removed the needs-qa label Oct 28, 2016
@liewegas liewegas added this to the kraken milestone Nov 1, 2016
@liewegas
Copy link
Member

liewegas commented Nov 4, 2016

@ryneli ping

@ryneli
Copy link
Contributor Author

ryneli commented Nov 6, 2016

@liewegas I am so sorry for the delay. I didn't receive the message. Just find the message when I check the status of this PR.

@ryneli
Copy link
Contributor Author

ryneli commented Nov 8, 2016

@tchaikov I am so sorry for the delay. Due to wrong Github setting, I didn't receive your message. I have corrected the error. Could you please help me check it again?

@tchaikov
Copy link
Contributor

tchaikov commented Nov 8, 2016

@ryneli sure, will do.

@tchaikov
Copy link
Contributor

lgtm.

@tchaikov
Copy link
Contributor

jenkins, retest this please

void*, arg),
TP_FIELDS(
ctf_integer_hex(rados_ioctx_t, ioctx, ioctx)
ctf_string(oid, oid)
ctf_integer_hex(rados_completion_t, completion, completion)
ctf_integer_hex(uint64_t, phandle, phandle)
ctf_integer_hex(rados_watchcb2_t, callback, callback)
ctf_inteter(uint32_t, timeout, timeout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ctf_inteter/ctf_integer/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the stupid typo. Do you know how to test librados.tp file? It seems that the librados.tp file isn't be used when I compile the source code locally.

Copy link
Contributor

@tchaikov tchaikov Nov 11, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryneli

cmake -DWITH_LTTNG=ON ..
make -C src/tracing rados_tp

notify_oid = "foo";
notify_ioctx = &ioctx;
notify_cookies.clear();
uint32_t timeout = 3; // configured timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http://pulpito.ceph.com/kchai-2016-11-13_05:09:22-rados-wip-kefu-testing---basic-smithi/

quite a few tests with harsh cluster env (high messenger failure rate or thrash interruptions) failed due to -ENOTCONN, i am prolonging the timeout to 12 to see if it helps.

Copy link
Contributor

@tchaikov tchaikov Nov 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the tests at http://pulpito.ceph.com/kchai-2016-11-13_07:03:13-rados-wip-kefu-testing---basic-smithi/ look promising.

@ryneli could you change the timeout to 12 or longer?

notify_oid = "foo";
notify_ioctx = &ioctx;
notify_cookies.clear();
uint32_t timeout = 3; // configured timeout
Copy link
Contributor

@tchaikov tchaikov Nov 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the tests at http://pulpito.ceph.com/kchai-2016-11-13_07:03:13-rados-wip-kefu-testing---basic-smithi/ look promising.

@ryneli could you change the timeout to 12 or longer?

Signed-off-by: Ryne Li <lizhenqiangsnake@gmail.com>
@ryneli
Copy link
Contributor Author

ryneli commented Nov 13, 2016

@tchaikov I have changed the timeout to 12.

@tchaikov tchaikov merged commit 473ae1d into ceph:master Nov 14, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants