
rbd: show info about mirror daemon instance in image mirror status output #24717

Merged
merged 4 commits into from Nov 6, 2018

Conversation

trociny
Contributor

@trociny trociny commented Oct 23, 2018

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

@trociny
Contributor Author

trociny commented Oct 23, 2018

@dillaman Displaying info about the rbd-mirror daemon instance in the 'mirror status' output looks very useful to me, especially now that we can run multiple instances in A-P or A-A mode.

This information is already stored by the OSDs; the problem, though, is how to get it without breaking the API. The current implementation breaks it by adding a mirror_instance field to the mirror_status structure. Do you have any suggestions?

Another question is what we would want to output, and how. The output in the current implementation looks like this:

% rbd --cluster cluster1 mirror image status mirror/test     
test:
  global_id:   8175f34e-94ff-4992-bd54-7a6c6073e3a6
  state:       up+replaying [client.4268 192.168.1.248:44468/2978164156]
  description: replaying, master_position=[object_number=3, tag_tid=1, entry_tid=3], mirror_position=[object_number=3, tag_tid=1, entry_tid=3], entries_behind_master=0
  last_update: 2018-10-23 16:54:04
% rbd --cluster cluster1 mirror image status mirror/test --format json --pretty-format
{
    "name": "test",
    "global_id": "8175f34e-94ff-4992-bd54-7a6c6073e3a6",
    "state": "up+replaying",
    "mirror_instance": "client.4268 192.168.1.248:44468/2978164156",
    "description": "replaying, master_position=[object_number=3, tag_tid=1, entry_tid=3], mirror_position=[object_number=3, tag_tid=1, entry_tid=3], entries_behind_master=0",
    "last_update": "2018-10-23 16:54:34"
}

@dillaman

@trociny I'm on the fence about whether or not displaying the Ceph entity name is helpful.

Since the global id is auto-generated upon connection, it doesn't provide much value. The IP address (sans nonce) can at least help narrow down the host -- but if you have multiple daemons running on the same host in containers, it's only a partial solution.

However, if the global id is combined w/ the output from ceph service dump, you can get the host name and the "id" -- which ceph-ansible at least keeps unique for this purpose. I would then output the id + hostname (or gid + IP address if the dump is missing the process) as its own field in the output instead of trying to overload the "state" line.

As for breaking both the cls and C/C++ API, I think this needs to be prevented at all costs. The entity can at least be safely added to the MirrorImageStatus type in the cls API w/o breaking backwards compatibility. For the C/C++ APIs, you can either (1) create new mirror_image_status_2_t-style types w/ new matching xyz2 API methods, (2) create new xyz2 API methods w/ an additional parameter for the entity, or (3) add a new API method to retrieve the rbd-mirror image entity and only invoke it when the rbd mirror image status action is used (so it only involves a second round-trip and only if the status is up).

@trociny
Contributor Author

trociny commented Oct 23, 2018

@dillaman Excellent idea about using the id from ceph service dump! The only issue, though, is that I think we can't match this id to the origin currently stored in MirrorImageStatusOnDisk, because we store there the entity_inst_t for the rbd-mirror remote connection, while rbd-mirror registers its service in ceph-mgr using the local connection.

For the C/C++ APIs, yes, I had some ideas about how it could be solved; the question actually was which of these possible solutions you like more (hate less)? Because I am not very happy with adding xyz2 methods, especially considering that the only current users of the xyz methods are the rbd CLI and the dashboard (I suppose).

OK, I think I like best what I assume you propose in (3), that is, adding new API methods to retrieve the service_id:

int rbd_mirror_image_get_service_id(rbd_image_t image,
                                    char *service_id,
                                    size_t service_id_max_length);
int rbd_mirror_image_service_id_list(rados_ioctx_t io_ctx,
                                     const char *start_id, size_t max,
                                     char **image_ids,
                                     char **service_ids,
                                     size_t *len);

And so there will be new cls methods to store/retrieve this info. Both the cls and API mirror_image_status methods and structures remain unchanged.

@dillaman

@dillaman Excellent idea about using the id from ceph service dump! The only issue, though, is that I think we can't match this id to the origin currently stored in MirrorImageStatusOnDisk, because we store there the entity_inst_t for the rbd-mirror remote connection, while rbd-mirror registers its service in ceph-mgr using the local connection.

Good point since each PoolReplayer will open its own connection to the local cluster. You could just have the PoolReplayer register its gid via some new JSON array in the ServiceDaemon.

For the C/C++ APIs, yes, I had some ideas about how it could be solved; the question actually was which of these possible solutions you like more (hate less)? Because I am not very happy with adding xyz2 methods, especially considering that the only current users of the xyz methods are the rbd CLI and the dashboard (I suppose).

I'd vote for option (3).

OK, I think I like best what I assume you propose in (3), that is, adding new API methods to retrieve the service_id:

int rbd_mirror_image_get_service_id(rbd_image_t image,
                                    char *service_id,
                                    size_t service_id_max_length);
int rbd_mirror_image_service_id_list(rados_ioctx_t io_ctx,
                                     const char *start_id, size_t max,
                                     char **image_ids,
                                     char **service_ids,
                                     size_t *len);

And so there will be new cls methods to store/retrieve this info. Both the cls and API mirror_image_status methods and structures remain unchanged.

I was originally thinking of just the former and skipping the latter -- but I can see how adding this to the dashboard for all images could be nice.

@trociny trociny force-pushed the wip-rbd-mirror-status-instance branch 2 times, most recently from 949b812 to 5791d44 on October 25, 2018 14:41
@trociny
Contributor Author

trociny commented Oct 25, 2018

@dillaman updated.

  • Right now the instance_id (or the list of instance_ids) is returned as an additional output param of the mirror_image_status_get/list cls methods. This is how it was in my initial implementation, and I am not sure if we want to provide separate mirror_image_instance_id_get/list cls methods for this -- on the OSD side they would do almost the same thing. On the other hand, it looks a little silly now that we call the mirror_image_status_get/list methods twice: first to get the status, and a second time to get the instance_id, because we have different API methods for these.
  • rbd-mirror has been modified to provide the instance_id for every pool replayer in the service dump.
  • The rbd_mirror_image_get_service_id/service_id_list API methods get the instance_id (or the list of instance_ids) from an OSD and the service status from the mgr, and map the instance_id to the mirror daemon gid (service_id).
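The mapping step in the last bullet can be sketched in Python. This is only an illustration: the dict layout below and the "instance_ids" metadata key are assumptions, not the actual schema of the mgr's service dump output.

```python
import json

# Hypothetical excerpt of `ceph service dump` output, already parsed to a dict.
# The "rbd-mirror" layout and the "instance_ids" metadata key are illustrative
# assumptions, not the exact mgr schema.
service_dump = {
    "services": {
        "rbd-mirror": {
            "daemons": {
                "4268": {"metadata": {"id": "a", "instance_ids": json.dumps(["5481"])}},
                "4301": {"metadata": {"id": "b", "instance_ids": json.dumps(["5520"])}},
            }
        }
    }
}

def map_instance_to_service(dump):
    """Build a map from pool-replayer instance id to rbd-mirror service id."""
    mapping = {}
    daemons = dump["services"]["rbd-mirror"]["daemons"]
    for service_id, daemon in daemons.items():
        # Each daemon advertises the instance ids of its pool replayers.
        for instance_id in json.loads(daemon["metadata"]["instance_ids"]):
            mapping[instance_id] = service_id
    return mapping
```

With the sample dump above, `map_instance_to_service` associates each replayer instance id with the daemon that owns it.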

@dillaman

  • Right now the instance_id (or the list of instance_ids) is returned as an additional output param of the mirror_image_status_get/list cls methods. This is how it was in my initial implementation, and I am not sure if we want to provide separate mirror_image_instance_id_get/list cls methods for this -- on the OSD side they would do almost the same thing. On the other hand, it looks a little silly now that we call the mirror_image_status_get/list methods twice: first to get the status, and a second time to get the instance_id, because we have different API methods for these.

Again, why not just add the entity_type_t origin to cls::rbd::MirrorImageStatus? It's already stored in the cls::rbd::MirrorImageStatusOnDisk struct, so you'd get it for free from the existing cls API methods.

<< " last_update: " << last_update << std::endl;
<< " description: " << status.description << "\n";
if (!service_description.empty()) {
std::cout << " service: " << service_description << "\n";


Nit: is the indent correct here?

Contributor Author


It was intentionally "incorrect", to put the " service: " field exactly under " description: ", so I can be sure it has the correct number of spaces (and will continue to if we change this in the future). But if you don't like it, I will change it.

@@ -292,15 +313,20 @@ int execute_status(const po::variables_map &vm,
formatter->dump_string("global_id", status.info.global_id);
formatter->dump_string("state", state);
formatter->dump_string("description", status.description);
formatter->dump_string("daemon_service_id", service_id);


Nit: if you have the service details, perhaps open a new object section?

@@ -624,6 +649,9 @@ class StatusImageRequest : public ImageRequestBase {
m_formatter->dump_string("state", state);
m_formatter->dump_string("description",
m_mirror_image_status.description);
m_formatter->dump_string("daemon_service_id", service_id);


Nit: ditto here

<< " last_update: " << last_update << std::endl;
<< m_mirror_image_status.description << "\n";
if (!service_description.empty()) {
std::cout << " service: " << service_description << "\n";


Nit: indent correct?

continue;
}
auto &service_id = it.first;
auto &daemon_metada = it.second.get_obj()["metadata"].get_obj();


Nit: typo on daemon_metada

int aio_mirror_image_promote(bool force, RBD::AioCompletion *c);
int aio_mirror_image_demote(RBD::AioCompletion *c);
int aio_mirror_image_get_info(mirror_image_info_t *mirror_image_info,
size_t info_size, RBD::AioCompletion *c);
int aio_mirror_image_get_status(mirror_image_status_t *mirror_image_status,
size_t status_size, RBD::AioCompletion *c);
int aio_mirror_image_get_service_id(std::string *service_id,


Nit: doesn't seem to be needed/used right now?

Contributor Author


Yes, I was using this initially for rbd mirror pool status, in StatusImageRequest, until I realized that I could just call mirror_image_service_id_list once and pass the list to StatusImageRequest.
OK, I will remove those.

@@ -974,6 +986,10 @@ CEPH_RBD_API int rbd_aio_mirror_image_get_status(rbd_image_t image,
rbd_mirror_image_status_t *mirror_image_status,
size_t status_size,
rbd_completion_t c);
CEPH_RBD_API int rbd_aio_mirror_image_get_service_id(rbd_image_t image,


Nit: doesn't seem to be needed/used right now?

@@ -134,6 +136,135 @@ struct C_ImageGetStatus : public C_ImageGetInfo {
}
};

int map_instance_id_to_service_id(librados::IoCtx& io_ctx,


I'm not sure I really like the idea of each API call issuing a service status dump to the mgr. The rbd CLI can do this on its own, and the dashboard will have all this data available in-memory. Therefore, perhaps the API should just return the daemon's pool instance id and leave it up to the caller to map it back to a service?

Contributor Author


I didn't much like this either, but I wanted the API methods to do as much as possible so we would not duplicate this in the CLI and dashboard. But after your argument that the dashboard already has this data available in-memory, I tend to agree, so I will change this to just return the daemon's pool instance id.

json_spirit::mValue status_json_root;
if (json_spirit::read(status_json_str, status_json_root)) {
auto& status = status_json_root.get_obj();
for (auto &iter : status) {


The dict key should be the pool id, so you can just skip right to the pool's json status if it exists.

@trociny
Contributor Author

trociny commented Oct 26, 2018

Again, why not just add the entity_type_t origin to cls::rbd::MirrorImageStatus? It's already stored in the cls::rbd::MirrorImageStatusOnDisk struct, so you'd get it for free from the existing cls API methods.

Because I was still not sure whether we wanted separate cls methods to retrieve the origin, I left it as it was in my initial implementation (as it did not look so straightforward; see my question below).

OK, it looks like you don't want new cls methods and just want to extend cls::rbd::MirrorImageStatus to include the entity_type_t origin. I will do it that way, though it will not be entirely free. When I add encode(origin, bl, features) to cls::rbd::MirrorImageStatus::encode, I will no longer be able to use this function in MirrorImageStatusOnDisk::encode, and will need to encode all MirrorImageStatus fields manually. So we will have two identical structures that just encode/decode their fields differently -- one for the disk and one for the client. Or it could be one MirrorImageStatus with additional encode/decode_ondisk functions. Or, if we want to change the on-disk format to match the client format, we will need to support the old format for one release, for post-upgrade cases when a newer OSD reads a status stored by an older OSD. What do you prefer?
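The compatibility concern above can be illustrated with a toy versioned encoding (this is only a sketch of the general idea, not Ceph's actual ENCODE_START/DECODE_START machinery): if new fields are only ever appended and a version marker leads the payload, one decoder can read both the old and the new format.

```python
def encode_status(state, description, origin=None):
    # Toy versioned encoder: a leading version byte, then NUL-separated fields.
    # New fields (here: origin) are only appended at the end, so a decoder can
    # tell from the version which fields to expect.
    version = 2 if origin is not None else 1
    fields = [state, description] + ([origin] if origin is not None else [])
    return bytes([version]) + b"\x00".join(f.encode() for f in fields)

def decode_status(buf):
    # Read the version first, then only the fields that version guarantees.
    version = buf[0]
    fields = [f.decode() for f in buf[1:].split(b"\x00")]
    state, description = fields[0], fields[1]
    origin = fields[2] if version >= 2 else None
    return state, description, origin
```

A v2 decoder like this handles a v1 payload by falling back to `origin = None`, which is the same shape of compatibility the on-disk struct would need for one release after an upgrade.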

@trociny
Contributor Author

trociny commented Oct 26, 2018

So, now that we have decided to just extend cls::rbd::MirrorImageStatus and return the daemon's pool instance id (instead of a "service_id" obtained from the mgr), I tend to think that the approach of providing a new mirror_image_status_2_t struct and rbd_mirror_image_get_status2/status_list2 API methods, instead of rbd_mirror_image_get_service_id/service_id_list, would be the simplest (most straightforward) and most efficient. I would like to change it this way. Do you agree?

@dillaman

Because I was still not sure whether we wanted separate cls methods to retrieve the origin, I left it as it was in my initial implementation (as it did not look so straightforward; see my question below).

OK, it looks like you don't want new cls methods and just want to extend cls::rbd::MirrorImageStatus to include the entity_type_t origin. I will do it that way, though it will not be entirely free. When I add encode(origin, bl, features) to cls::rbd::MirrorImageStatus::encode, I will no longer be able to use this function in MirrorImageStatusOnDisk::encode, and will need to encode all MirrorImageStatus fields manually. So we will have two identical structures that just encode/decode their fields differently -- one for the disk and one for the client. Or it could be one MirrorImageStatus with additional encode/decode_ondisk functions. Or, if we want to change the on-disk format to match the client format, we will need to support the old format for one release, for post-upgrade cases when a newer OSD reads a status stored by an older OSD. What do you prefer?

I am fine w/ a new cls method -- I just don't like overloading the existing methods' return parameters if it can be avoided. Adding two new methods certainly avoids the ugliness of the on-disk vs on-wire structs.

@trociny trociny force-pushed the wip-rbd-mirror-status-instance branch 2 times, most recently from 647ecd0 to 908d6b3 on October 29, 2018 07:50
@trociny
Contributor Author

trociny commented Oct 29, 2018

@dillaman updated

Note: in the python bindings I added an id field to the dict returned by mirror_image_get_status and mirror_image_status_list, so one can map the results returned by the mirror_image_status_list and mirror_image_instance_id_list calls.
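A caller-side sketch of that mapping, in Python. The dict shapes below are hypothetical illustrations; only the "id" key corresponds to the field mentioned above.

```python
# Hypothetical shapes for the dicts returned by the python bindings; only the
# "id" key corresponds to the field mentioned above, the rest is illustrative.
statuses = [
    {"id": "10a1", "name": "test1", "state": "up+replaying"},
    {"id": "10b2", "name": "test2", "state": "down+unknown"},
]
instance_ids = {"10a1": "5481"}  # image id -> rbd-mirror instance id

def attach_instance_ids(statuses, instance_ids):
    """Join each image status with its daemon instance id, when one exists."""
    merged = []
    for status in statuses:
        entry = dict(status)
        # Down images have no running replayer, so the lookup yields None.
        entry["instance_id"] = instance_ids.get(status["id"])
        merged.append(entry)
    return merged
```

An image that is down simply has no entry in the instance-id map, so its merged status carries `instance_id = None`.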

src/cls/rbd/cls_rbd.cc Outdated Show resolved Hide resolved
src/cls/rbd/cls_rbd.cc Outdated Show resolved Hide resolved
src/cls/rbd/cls_rbd.cc Outdated Show resolved Hide resolved
src/cls/rbd/cls_rbd.cc Outdated Show resolved Hide resolved
src/cls/rbd/cls_rbd_client.h Outdated Show resolved Hide resolved
src/librbd/api/Mirror.cc Outdated Show resolved Hide resolved
src/tools/rbd/Utils.cc Outdated Show resolved Hide resolved
src/tools/rbd/action/MirrorImage.cc Outdated Show resolved Hide resolved
src/tools/rbd/action/MirrorPool.cc Outdated Show resolved Hide resolved
@dillaman

@trociny Since the CLI additions can only be tested when the rbd-mirror daemon is running, can you also add a small test case to qa/workunits/rbd/rbd_mirror.sh?

@trociny trociny force-pushed the wip-rbd-mirror-status-instance branch from 908d6b3 to 74d4ef8 on October 30, 2018 14:34
@trociny
Contributor Author

trociny commented Oct 31, 2018

@dillaman I ran the mirror [1], c_api [2], python [3], and cls [4] tests on teuthology. No issues were found with the code (or the updated qa tests), though stress testing the "service status" mgr command revealed an issue in ceph-mgr's processing of the command [5]. It looks like a race when an rbd-mirror daemon is stopping and the "service status" command is run at that time. I am planning to work on it.

[1] http://pulpito.ceph.com/trociny-2018-10-30_14:49:18-rbd-wip-mgolub-testing-distro-basic-smithi/
[2] http://pulpito.ceph.com/trociny-2018-10-30_15:16:49-rbd-wip-mgolub-testing-distro-basic-smithi/
[3] http://pulpito.ceph.com/trociny-2018-10-30_15:16:27-rbd-wip-mgolub-testing-distro-basic-smithi/
[4] http://pulpito.ceph.com/trociny-2018-10-30_15:17:15-rbd-wip-mgolub-testing-distro-basic-smithi/
[5] http://tracker.ceph.com/issues/36656

qa/workunits/rbd/rbd_mirror_helpers.sh Outdated Show resolved Hide resolved
src/tools/rbd/action/MirrorImage.cc Outdated Show resolved Hide resolved
@trociny trociny force-pushed the wip-rbd-mirror-status-instance branch from 74d4ef8 to 2f50721 on November 1, 2018 18:02
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Mykola Golub <mgolub@suse.com>
@trociny trociny force-pushed the wip-rbd-mirror-status-instance branch from 2f50721 to 15377d3 on November 3, 2018 08:24
@trociny
Contributor Author

trociny commented Nov 4, 2018

@dillaman thanks, fixed. The teuthology (--filter mirror) run:

http://pulpito.ceph.com/trociny-2018-11-03_17:03:25-rbd-wip-mgolub-testing-distro-basic-smithi/


if (status.up) {
r = image.mirror_image_get_instance_id(&instance_id);
if (r < 0 && r != -ENOENT) {

Nit: one last thing, this should probably filter out -EOPNOTSUPP to avoid the case where older OSDs are being used. If the OSDs don't support this new cls method, don't include the status.

Contributor Author


@dillaman I am not sure about this. With the current code it will print the error (warning) but still output the status, so the user will be informed that the information is incomplete and will know the reason. It differs from the -ENOENT case: -ENOENT means the information was not stored (does not exist), while -EOPNOTSUPP means it may be there but we can't obtain it.

But I don't have a strong opinion here and will change it if you still think it is better to suppress the error message in this case.


If a special case were added for -EOPNOTSUPP, a more friendly message could be provided, like rbd: newer release of Ceph OSDs required to map image to rbd-mirror daemon instance (or similar).

while (true) {
std::map<std::string, std::string> ids;
r = rbd.mirror_image_instance_id_list(io_ctx, start_image_id, 1024, &ids);
if (r < 0) {

Nit: same comment here

It is particularly useful when running multiple rbd-mirror instances
in Active-Passive or Active-Active mode.

Signed-off-by: Mykola Golub <mgolub@suse.com>
@trociny trociny force-pushed the wip-rbd-mirror-status-instance branch from 15377d3 to 7d2ffd9 on November 5, 2018 20:35
@trociny
Contributor Author

trociny commented Nov 5, 2018

@dillaman updated


@dillaman dillaman left a comment


lgtm
