Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

librbd: global and pool-level config overrides require image refresh to apply #36309

Merged
merged 4 commits into from
Aug 1, 2020

Conversation

dillaman
Copy link

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard backend
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@dillaman
Copy link
Author

dillaman commented Jul 28, 2020

jenkins test make check (unrelated 'TestLibRBD.QuiesceWatchError' failure)

if (r < 0) {
lderr(cct) << "failed to notify clients of pool config update: "
<< cpp_strerror(r) << dendl;
return r;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: may be do not consider as fatal and proceed (i.e. return 0)?

And AFAIR "config set" was added in mimic. So it will fail in a nautilus cluster (and it seems you want to backport it to nautilus).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, never mind about "nautilus". It is after "mimic" :-)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was on the fence about making it return an error. I'll drop it to avoid the potential issue w/ backward compatibility and access rights.

R"({)"
R"("prefix": "config set", )"
R"("who": "global", )"
R"("name": "rbd_config_pool_override_update_timestamp", )"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks very clever though just a little hackish to me.

Alternatively I think we could add "config refresh [key_prefix]" mon command, and the librbd would send "config refresh rbd_config" here. It will not work for older cluster until backported though.

Anyway, I am fine with your approach if you don't like the "config refresh".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to work from the API as well (dashboard). I think adding a second step makes it more error prone to forget that your changes weren't actually applied.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... I think I misunderstood your original proposal. I thought you were talking about adding something to the rbd CLI. I'm not sure how that would work if there was no global override for the key. Right now it just sends all applicable config values to the client when updated and the client determines if it was actually updated before invoking the observer callback. To implement something like this, we would need a new message (or flag on the MConfig message) to instruct config::set_mon_vals that key XYZ was updated but we don't know the value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a second step? I meant instead of introducing a "strange" (to users) variable and sending to the mon "set this strange rbd variable to a new value" we would send a newly introduced command "refresh all config variables starting with 'rbd_' prefix" (there was a bug in my previous message: s/rbd_config/rbd_/). And this would trigger the config observer handler to refresh the image. The problem as I said though that this command would be available in new releases only, which is probably a stopper if you want to backport this.

Jason Dillaman added 4 commits July 29, 2020 08:45
The 'ImageCtx::apply_metadata' is guaranteed to be thread-safe but
a future commit will attempt to utilize the cached config override
set.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The config watcher will initially observe all "rbd_" configuration
updates received from the MON that have not been locally overridden
at the pool and/or image level.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The new "dev"-level global config setting will be updated when any
pool-level config override is updated. librbd clients will detect
the new global-level config update and trigger a refresh. This avoids
the need for potentially tens of thousands of librbd clients
registering a watch on the pool metadata object or periodically polling
the pool metadata object for updates.

Fixes: https://tracker.ceph.com/issues/46694
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Copy link
Contributor

@trociny trociny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants