
mon: support pool application metadata key/values #15763

Merged
merged 18 commits into ceph:master from dillaman:wip-pool-tags on Jul 19, 2017

Conversation

6 participants
@dillaman
Contributor

dillaman commented Jun 19, 2017

Outstanding items:

  • auto-tag rgw pools when created by the daemon
  • integrate with new rbd CLI pool creation tools

@dillaman dillaman requested a review from liewegas Jun 19, 2017

@@ -1500,7 +1509,7 @@ void pg_pool_t::encode(bufferlist& bl, uint64_t features) const
return;
}
-  uint8_t v = 25;
+  uint8_t v = 26;


@dillaman

dillaman Jun 19, 2017

Contributor

Need a double-check on this logic, since I don't understand what the version override on line 1521 is doing.


@liewegas

liewegas Jun 19, 2017

Member

fwiw that weird condition should instead be written if (!HAVE_FEATURE(features, SERVER_LUMINOUS))


@dillaman

dillaman Jun 20, 2017

Contributor

Cool -- cleaned up as part of the same commit for readability.
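For reference, a minimal sketch of what the cleaned-up guard boils down to (self-contained stand-ins: the HAVE_FEATURE macro and the feature bit are simplified here for illustration; the real function is pg_pool_t::encode and handles many more fields):

#include <cstdint>

// simplified stand-in for Ceph's HAVE_FEATURE macro
#define HAVE_FEATURE(features, bit) (((features) & (bit)) != 0)
// assumed bit position, for illustration only
constexpr uint64_t SERVER_LUMINOUS = 1ull << 21;

uint8_t choose_encoding_version(uint64_t features) {
  uint8_t v = 26;  // bumped from 25 for application_metadata
  if (!HAVE_FEATURE(features, SERVER_LUMINOUS)) {
    v = 25;  // pre-luminous peers still get the older encoding
  }
  return v;
}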

@@ -5072,6 +5072,20 @@ void OSDMonitor::get_pools_health(
detail->push_back(s);
}
}
if (sum.num_objects > 0 && pool.application_metadata.empty()) {


@dillaman

dillaman Jun 19, 2017

Contributor

Restrict this to post-update require_osd_release >= CEPH_RELEASE_LUMINOUS?

@liewegas

liewegas Jun 19, 2017

Member

actually, no.. i think we should condition this on the mon::feature get_required_features() for LUMINOUS.

And we should automagically enable applications for pools we recognized during mon upgrade...


@dillaman

dillaman Jun 19, 2017

Contributor

Any hint as to where such mon upgrade logic currently exists?


@liewegas

liewegas Jun 20, 2017

Member

Yeah, nowhere yet. The sticky feature bit update is buried in MonmapMonitor somewhere, but so far there has been no additional logic as part of that transition.


@liewegas

liewegas Jun 20, 2017

Member

Ok, since these labels go in the OSDMap, and since we only encode them when we set the require_osd_release to luminous, we can do the check on the first map that adds the flag in OSDMonitor::encode_pending(). Then it happens exactly once at the end of the upgrade.


@liewegas

liewegas Jun 20, 2017

Member

(We can't do it earlier than that since we can't encode the result anyway!)


@dillaman

dillaman Jun 20, 2017

Contributor

OK -- so I can tag the MDS pools using the MDS map, and I guess add pool name heuristics (e.g. rbd, cinder, glance, vms => rbd, rgw => rgw) since I can't really query for objects in the OSDs. Sounds good, or did you have another idea?
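A self-contained sketch of the "exactly once" trigger described above (the release ordinal and function name are stand-ins; the real check would live in OSDMonitor::encode_pending()):

#include <cstdint>

constexpr uint8_t RELEASE_LUMINOUS = 12;  // assumed ordinal, illustration only

// fire only on the first map that raises require_osd_release to luminous,
// i.e. exactly once at the end of the upgrade
bool should_auto_tag(uint8_t prev_require_osd_release,
                     uint8_t next_require_osd_release) {
  return prev_require_osd_release < RELEASE_LUMINOUS &&
         next_require_osd_release >= RELEASE_LUMINOUS;
}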

string app;
cmd_getval(g_ceph_context, cmdmap, "app", app);
bool app_exists = (p.application_metadata.count(app) > 0);


@liewegas

liewegas Jun 19, 2017

Member

all mon commands need to be idempotent, so you should be able to enable something that is already enabled and get success. the checks below need to be updated to allow that...


@dillaman

dillaman Jun 19, 2017

Contributor

@liewegas I think I did allow that to occur (?) -- the app_exists is really only used for sanity checks but it doesn't prevent enabling of an app if it's already enabled nor disabling of an app if it doesn't exist.


@liewegas

liewegas Jun 20, 2017

Member

Ah, right.. looks ok!
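For the record, the idempotent semantics discussed here amount to roughly the following sketch (stand-in types; the real handling is in OSDMonitor's command processing):

#include <map>
#include <string>

using app_map = std::map<std::string, std::map<std::string, std::string>>;

int application_enable(app_map& md, const std::string& app) {
  // re-enabling an existing app is a no-op that still returns success;
  // app_exists-style checks are only used for sanity validation elsewhere
  md[app];
  return 0;
}

int application_disable(app_map& md, const std::string& app) {
  // disabling a missing app is likewise a successful no-op
  md.erase(app);
  return 0;
}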

@liewegas liewegas added the core label Jun 20, 2017

}
// RGW heuristics
if (boost::algorithm::contains(pool_name, ".rgw") ||


@dillaman

dillaman Jun 20, 2017

Contributor

@yehudasa please review

@gregsfortytwo

It would be nice if we could look at at least one tool checking the application metadata before doing operations, so we can see how it's used.
(But mostly I'm worried about the warnings and the auto-enable semantics I commented on.)
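As a rough illustration of the kind of tool-side check being asked for, here is a hedged sketch using the librados C++ API added later in this PR (the pool and application names are hypothetical, and application_list is assumed per the "librados: new API" commit below):

#include <rados/librados.hpp>
#include <iostream>
#include <set>
#include <string>

int main() {
  librados::Rados cluster;
  cluster.init("admin");            // assumes a stock client.admin setup
  cluster.conf_read_file(nullptr);  // default config search path
  if (cluster.connect() < 0)
    return 1;

  librados::IoCtx io;
  if (cluster.ioctx_create("mypool", io) < 0)
    return 1;

  // refuse to operate unless the pool has been enabled for our application
  std::set<std::string> apps;
  if (io.application_list(&apps) < 0 || apps.count("rbd") == 0) {
    std::cerr << "pool 'mypool' is not enabled for application 'rbd'\n";
    return 1;
  }
  // ... proceed with application I/O ...
  return 0;
}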

<< "custom applications.";
detail->push_back({HEALTH_WARN, ss.str()});
}
}


@gregsfortytwo

gregsfortytwo Jun 20, 2017

Member

We definitely need to only do this warning for pools which contain data. Otherwise you'll get a warning as soon as you create a pool.


@dillaman

dillaman Jun 20, 2017

Contributor

Indeed -- should be covered by line 5078


@gregsfortytwo

gregsfortytwo Jun 21, 2017

Member

Oh yep, misread that.

updated = true;
}
// OpenStack gnocchi


@gregsfortytwo

gregsfortytwo Jun 20, 2017

Member

Why are we singling out gnocchi as a special thing we recognize? I don't think it belongs here, especially with a fairly common string like "metrics".


@dillaman

dillaman Jun 20, 2017

Contributor

Only because our ceph-ansible tools created it [1]. I'm not married to the approach, but since we did it to ourselves (potentially), it might be nice to avoid the warning.

[1] https://github.com/ceph/ceph-ansible/blob/79cac8043febfd7bd3f8d76f5e7178f864fcc5b4/roles/ceph-mon/defaults/main.yml#L57

updated = true;
}
if (updated) {


@gregsfortytwo

gregsfortytwo Jun 20, 2017

Member

So with the right pool name or cross use of pools, a single pool could get tagged with all of these applications. We should prevent that by making sure that if there's a conflict we don't auto-update it.


@dillaman

dillaman Jun 20, 2017

Contributor

I took the approach that there's no harm in multi-tagging when unsure.


@liewegas

liewegas Jun 21, 2017

Member

It looks like the only case here where we're not looking at the pool name is cephfs. If we want to avoid dual-tagging it's probably just a matter of putting && !updated in the other if conditions (or a big else block, or whatever). I don't have much of an opinion either way.


@gregsfortytwo

gregsfortytwo Jun 21, 2017

Member

My thinking is that if, for instance, we use these tags to inform CephFS that it can assume all objects are owned by itself, then having automatically tagged it with multiple applications is going to be a problem.


@dillaman

dillaman Jun 21, 2017

Contributor

Any more of a problem as compared to the state of the world today? The current plans are for (1) facilitating the dashboard (and related), and (2) restricting access to pools from applications not registered to use the pool (not currently covered in this PR since it's just a stepping stone).


@liewegas

liewegas Jun 21, 2017

Member

It seems like the real question is whether cephfs wants to enforce a constraint that it owns the entire pool and doesn't share. If so, then enforce that here and now and break currently-sharing clusters sooner rather than later. If sharing is maybe okay (e.g., because cephfs just needs to know about it to, say, make scrub more tolerant) then multi-tag?


@dillaman

dillaman Jun 21, 2017

Contributor

Perhaps just don't enable any applications if two or more heuristics match a given pool? That will cause the health warning to activate and draw the operator's attention to resolving the ambiguity.


@gregsfortytwo

gregsfortytwo Jun 21, 2017

Member

Yeah, that's what I was thinking.
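So the auto-tagging would end up looking something like this sketch (self-contained; the pool-name heuristics are the illustrative ones mentioned above, and the real code operates on pg_pool_t inside OSDMonitor):

#include <boost/algorithm/string/predicate.hpp>
#include <map>
#include <string>
#include <vector>

std::vector<std::string> match_applications(const std::string& pool_name) {
  std::vector<std::string> matches;
  // RGW heuristics, per the diff above
  if (boost::algorithm::contains(pool_name, ".rgw") || pool_name == "rgw")
    matches.push_back("rgw");
  // RBD heuristics (names our tools have historically created)
  if (pool_name == "rbd" || pool_name == "cinder" ||
      pool_name == "glance" || pool_name == "vms")
    matches.push_back("rbd");
  return matches;
}

void maybe_auto_enable(
    const std::string& pool_name,
    std::map<std::string, std::map<std::string, std::string>>* app_md) {
  auto matches = match_applications(pool_name);
  if (matches.size() == 1) {
    (*app_md)[matches.front()];  // single unambiguous match: tag it
  }
  // two or more matches: leave the pool untagged so the health warning
  // fires and the operator resolves the ambiguity
}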

@@ -869,6 +869,31 @@ COMMAND("osd pool get-quota " \
"name=pool,type=CephPoolname ",
"obtain object or byte limits for pool",
"osd", "r", "cli,rest")
COMMAND("osd pool application enable " \


@gregsfortytwo

gregsfortytwo Jun 20, 2017

Member

I'm not wedded to it, but we discussed having checked commands for enabling cephfs|rbd|rgw since we know what they should look like and it would be tedious for an admin to typo them. Did you look at that?


@dillaman

dillaman Jun 20, 2017

Contributor

At the CDM, we discussed updating "ceph fs new", the built-in RGW pool creation, and adding new rbd CLI for creating pools to avoid having users pass around magic strings. The magic strings are still available for "advanced" use-cases, but it would be great to hide all of this from the typical end-user.


@gregsfortytwo

gregsfortytwo Jun 21, 2017

Member

Yeah, hopefully, but that's obviously not the case right now, so we should avoid degrading our UI as much as possible as we add these new constraints for them to deal with?


@dillaman

dillaman Jun 21, 2017

Contributor

OK, then I am not sure what you are looking for here on this specific line.


@dillaman

dillaman Jun 21, 2017

Contributor

FYI -- CephFS integration is here [1] and rgw/rbd is forthcoming (I added them as todos above) since I needed a librados API.

[1] bba43fc


@gregsfortytwo

gregsfortytwo Jun 21, 2017

Member

Oh, sorry. I mean that in addition to this set of generics that take whatever string the admin gives us, we can also have

osd pool application enable-rbd

etc. Or maybe that just seems simply repetitive? But it would mean we can tweak the internal representations a bit more (like adding more metadata when we enable pools) and existing deployment tools would still function; it would protect against typos; it would just generally seem a little more polished.


@liewegas

liewegas Jun 21, 2017

Member

IMO the generic and the 'rbd pool-create' (or whatever) is sufficient; we don't need a third.


@gregsfortytwo


Member

gregsfortytwo commented Jun 21, 2017

I think you've got a few extra commit updates since I last looked at this but what I've seen LGTM. Thanks for your patience @dillaman. :)

@dillaman


Contributor

dillaman commented Jun 21, 2017

@gregsfortytwo indeed -- I wanted early feedback on the mon stuff since it's my first foray.

only reviewed rgw-related change

@dillaman


Contributor

dillaman commented Jul 8, 2017

Passing all subsystem test suites now (i.e. failure unrelated to this PR).

@yuyuyu101


Member

yuyuyu101 commented Jul 8, 2017

what's the background of this pr? is it used to indicate to osd/objectstore the usage of objects, so they can pick suitable behavior?


@dillaman dillaman added this to the luminous milestone Jul 17, 2017

@jecluis


Member

jecluis commented Jul 17, 2017

Just to make sure I'm not misunderstanding the scope of this patch set, this is merely meant as non-binding, informational metadata, right? It doesn't appear to be a requirement, at least enforced by the monitor, that a given pool be used for a given application, nor any validation of the application's name or any of the k/v pairs that may eventually be set.
Regardless, it would be nice to document the mon commands in the clearest way possible.

#define dout_subsys ceph_subsys_mon
#define OSD_PG_CREATING_PREFIX "osd_pg_creating"
namespace {
const uint32_t MAX_POOL_APPLICATIONS = 4;


@jecluis

jecluis Jul 17, 2017

Member

what's the reasoning behind bounding the number of applications to 4?


@dillaman

dillaman Jul 17, 2017

Contributor

As discussed in the CDM, we want to prevent an end-user from doing something silly. In reality, it should really be one app per pool, since that's the only tested configuration.


@jecluis

jecluis Jul 18, 2017

Member

We already check if there's an app for a given pool, in which case we require a '--yes-i-really-mean-it', iirc. This seems, therefore, a bit redundant. I don't mean we should not have a maximum, but making it configurable instead of hardcoded could be useful to some (not that I know of any, but I can imagine people wanting to use the same pool for any number of applications that play well with each other, for instance, may it be via librados or something else).

Anyway, I was mostly looking for the reasoning and your reply did just that. Thanks.


@gregsfortytwo

gregsfortytwo Jul 18, 2017

Member

We're starting to get more prescriptive about sane defaults (like the max object size); here we just don't see a good reason for people to mix usage on a pool. If they're doing a mix of custom librados apps that's fine; they can give them the same name. But mixing cephfs and rgw is a good way to confuse recovery tools and have hard-to-manage performance issues, so let's discourage it as much as possible!

In particular, each app increases the size of the OSDMap. You don't get to inflate it as much as you want.
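Putting the pieces together, the cap plus the idempotency requirement amount to roughly this sketch (stand-in types again; the real check sits in the mon's "osd pool application enable" handler):

#include <cstdint>
#include <map>
#include <string>

const uint32_t MAX_POOL_APPLICATIONS = 4;  // per the snippet above

using app_map = std::map<std::string, std::map<std::string, std::string>>;

int enable_application(app_map& md, const std::string& app) {
  // re-enabling an existing app must stay idempotent, so only a *new*
  // app counts against the cap
  if (md.count(app) == 0 && md.size() >= MAX_POOL_APPLICATIONS) {
    return -22;  // -EINVAL: refuse to grow the per-pool OSDMap footprint
  }
  md[app];
  return 0;
}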

@dillaman


Contributor

dillaman commented Jul 19, 2017

@liewegas doc updates pushed

dillaman added some commits Jun 19, 2017

mon: store application metadata in pg_pool_t
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mon: added new "osd pool application" commands
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mon: health warning if in-use pools don't have application enabled
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mon: heuristics for auto-enabling pool applications upon upgrade
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mon: auto-enable CephFS on pools when using "fs new"/"fs add_data_pool"
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mon: default 'rbd' pool to the 'rbd' application
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
librados: new API to manipulate pool application metadata
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
librados: added async application enable API
RGW has numerous pool creation calls, one of which utilizes
an async interface. This adds support for RGW's use-case.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>

dillaman added some commits Jun 23, 2017

rgw: auto-tag created pools as owned by rgw
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
rbd: add new 'pool init' action for initializing a RBD pool
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mon: disable application metadata on cache tier pools
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
test: enable pool applications for new pools
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
test: skip pool application metadata tests if OSDs not at min luminous
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
mon: disable application health warnings for upgrade test cases
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
test/librados_test_stub: added new application metadata APIs
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
doc: document pool tags in rados pool operations doc
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
doc: fixed warning on rbd quick start link reference
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
PendingReleaseNotes: included details for the pool tags
Signed-off-by: Jason Dillaman <dillaman@redhat.com>

@liewegas liewegas merged commit adfe940 into ceph:master Jul 19, 2017

2 of 4 checks passed

  • make check: running make check
  • make check (arm64): running make check
  • Signed-off-by: all commits in this PR are signed
  • Unmodified Submodules: submodules for project are unmodified

@dillaman dillaman deleted the dillaman:wip-pool-tags branch Jul 19, 2017
