Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: Do not archive metadata by default #11051

Merged
merged 1 commit into from Sep 19, 2016
Merged

Conversation

prallabh
Copy link
Contributor

Fixes: http://tracker.ceph.com/issues/17256
Signed-off-by: Pavan Rallabhandi PRallabhandi@walmartlabs.com

@prallabh
Copy link
Contributor Author

@cbodley: can you please take a look

@mattbenjamin
Copy link
Contributor

@prallabh this seems reasonable, Casey had described using a pool name rather than a boolean (Friday standup).

@prallabh
Copy link
Contributor Author

@mattbenjamin @cbodley let me know if you suggest other means, felt this to be easier.

@cbodley
Copy link
Contributor

cbodley commented Sep 12, 2016

@prallabh thanks for taking this on!

i tend to prefer using the zone configuration to enable/disable this, for a few reasons:

  • if you use a config option, you could run into confusing situations where a zone has multiple gateways running with different settings - that would lead to some subset of metadata objects being present in the heap, depending on which gateway performed the operation
  • if the config option is set to false when the zone is created (fix_pool_names() is only called once) but changed to true later on, RGWZoneParams::metadata_heap will remain empty, and attempts to use it will lead to errors
  • adding a new config option that defaults to off would change the behavior of existing clusters without the admin making a conscious decision, and would potentially prevent them from taking advantage of any future features built on the metadata heap

my suggestion was to allow the RGWZoneParams::metadata_heap field to be empty, which the gateways would interpret as 'disabled'. so RGWMetadataManager::store_in_heap() and remove_from_heap() would just return success if (heap_pool.name.empty())

we'd also want this to default to empty, which you've done with metadata_heap = ""; in fix_pool_names(). there are also a couple places during upgrade (replace_region_with_zonegroup()) where we make sure that metadata_heap is not empty, so we'd want to remove those as well

unfortunately, the radosgw-admin commands to change zone pool names are not very user-friendly, so we'll eventually want to look at making that easier. you currently have to manually edit the json format like so:

$ radosgw-admin zone get --rgw-zone=name > zone.json
[edit zone.json to change pool names]
$ radosgw-admin zone set --rgw-zone=name --infile=zone.json
$ radosgw-admin period update --commit

@prallabh
Copy link
Contributor Author

@cbodley thanks for the reasoning, can you please check if it is inline now.

@cbodley cbodley self-assigned this Sep 13, 2016
@@ -1492,7 +1492,7 @@ int RGWZoneParams::fix_pool_names()
}

domain_root = fix_zone_pool_name(pool_names, name, ".rgw.data.root", domain_root.name);
metadata_heap = fix_zone_pool_name(pool_names, name, ".rgw.meta", metadata_heap.name);
metadata_heap = "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in testing, i see that this does work to turn off the metadata_heap by default. however, it also prevents us from using zone set --infile zone.json to turn it back on - we decode the new value for metadata_heap, then call fix_pool_names() and clear it again. i was able to get the zone set to work with the following change:

-  metadata_heap = fix_zone_pool_name(pool_names, name, ".rgw.meta", metadata_heap.name);
+  if (!metadata_heap.name.empty()) { // optional
+    metadata_heap = fix_zone_pool_name(pool_names, name, ".rgw.meta", metadata_heap.name);
+  }

Copy link
Contributor Author

@prallabh prallabh Sep 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, my bad, I couldn't get around testing that with vstart, that said, with the above suggested change, I guess we don't need to default to empty string, isn't it?

Fixes: http://tracker.ceph.com/issues/17256
Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
@prallabh
Copy link
Contributor Author

@cbodley can you please check now

@prallabh
Copy link
Contributor Author

@cbodley was curious to know the status of this, was hoping if this could make it to 10.2.3 in time or is it too late.

@cbodley cbodley merged commit 96fde20 into ceph:master Sep 19, 2016
@cbodley
Copy link
Contributor

cbodley commented Sep 19, 2016

@prallabh thanks again! i've flagged http://tracker.ceph.com/issues/17256 for backport to jewel, but i'm not sure about the timing for 10.2.3

@prallabh prallabh deleted the wip-17256 branch September 20, 2016 02:40
@hansbogert
Copy link
Contributor

What does this mean for an end-user, i.e, ceph-administrator? Can I remove the meta pool and expect things to not break?

@prallabh
Copy link
Contributor Author

prallabh commented Dec 7, 2016

Yeah, with this fix in one can remove the meta pool. If you are on builds that do not have this fix, one can remove the contents of the meta pool but not the pool.

@cbodley
Copy link
Contributor

cbodley commented Dec 7, 2016

@hansbogert You can remove the pool, but if your zone's metadata_heap field is still set, radosgw will continue trying to store entries there (and will likely recreate the pool in doing so).

If you do want to disable the heap permanently, you can edit the zone entry to set metadata_heap to an empty string:

$ radosgw-admin zone get --rgw-zone=default > zone.json
[edit zone.json]
$ radosgw-admin zone set --rgw-zone=default --infile=zone.json

This needs to be documented, thanks for pointing it out - I've opened a doc bug for this at http://tracker.ceph.com/issues/18174.

@hansbogert
Copy link
Contributor

@cbodley Maybe I should address this broader: Your [@cbodley] main gripe in the mailinglist [1] was that metadata stays there even if a bucket is deleted. This resolution of optionally disabling the meta pool, IMO, does not address that issue, although having the option is of course better than the previous situation.

[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/012914.html

@cbodley
Copy link
Contributor

cbodley commented Dec 7, 2016

@hansbogert right. after disabling the metadata_heap pool in the zone, the pool itself should be deleted to reclaim the storage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants