Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd: replace trash delay option, add rbd trash purge command #18323

Merged
merged 1 commit into from Jan 23, 2018

Conversation

thmour
Copy link

@thmour thmour commented Oct 16, 2017

There was a discussion about the trash delay option (that is in seconds) to be replaced with something else. @dvanders proposes a /bin/date style input of the form of 2017-12-30, now, 1 week ago, etc. It is a human-readable text and it can't be confused with something like 60*24*7. This pull request implements it, by calling /bin/date safely (exec*).

There is also a need for a batch removal of images in the trash that have expired. It can be used on Openstack as a cron job. The name of the command is "purge" and deletes every image with expired deferment end time from the defined pool. There is also an option --older-than implemented to remove images with deferment end time older than a given date.

This pull request is still a work in progress and we would like to get some feedback possibly from @dillaman and @rjfd

Thanks

@@ -0,0 +1,36 @@
#include <stdio.h>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are existing wrappers around execv like common/SubProcess.h

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I was searching for them

@@ -260,7 +260,7 @@ CEPH_RBD_API int rbd_rename(rados_ioctx_t src_io_ctx, const char *srcname,
const char *destname);

CEPH_RBD_API int rbd_trash_move(rados_ioctx_t io, const char *name,
uint64_t delay);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the API is released, we cannot create a breaking change. The human-readable parsing can probably be limited to the rbd CLI since any API users will have no issue with supplying a derived time in seconds.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this comment is valid -- from an API perspective, it doesn't make much sense to have a negative delay. If some uses the rbd CLI to specify a time in the past, it should be warned/prohibited.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our use cases we need to set the expiration times to the past for test purposes, debugging and averting mistakes that cost space, we don't like to have unexpected things. In the end, a date is passed to the trashspec, so why do we have to use an offset as an argument instead of just passing the desired date as an absolute value? Because now to set a protected until date for an image, I have to get the date, subtract from now, pass the offset to the function so that it adds the offset from now to store the date that I initially had. With having the "uint64_t delay" as "time_t protected_until", I could just put "time(NULL)+5" in C or "int(time.time())+5" in Python2 and have the same effect. This positive offset has really no flexibility and serves a narrow variety of use cases.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that we cannot change a released API -- so now such a change would require a rbd_trash_move2(..., bool expire_time_valid, time_t expire_time)-like method. The reason that an offset delay was chosen was it solved the issue of specifying a time and also indicating whether or not a "protection period" was required (i.e. any value greater than zero).

@@ -322,6 +325,23 @@ class utime_t {
bdt.tm_hour, bdt.tm_min, bdt.tm_sec);
}

static int invoke_date(const std::string date_str, utime_t *result) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: pass date_str by const reference

@@ -36,8 +36,8 @@ void get_move_arguments(po::options_description *positional,
po::options_description *options) {
at::add_image_spec_options(positional, options, at::ARGUMENT_MODIFIER_NONE);
options->add_options()
(at::DELAY.c_str(), po::value<uint64_t>(),
"time delay in seconds until effectively remove the image");
(at::EXPIRES_IN.c_str(), po::value<std::string>()->default_value("1 week"),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why not default to "now" since "1 week" otherwise seems arbitrary?

Copy link
Author

@thmour thmour Oct 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still in discussion, we wanted some safe date in case someone mistakenly trashes an image and a trash purge cron job runs after a second and deletes it (and then the image can't be restored). It can be changed to a "1 day", or "now" if it makes more sense.

@@ -151,6 +150,8 @@ std::string delete_status(time_t deferment_end_time) {
std::stringstream ss;
if (now < deferment_end_time) {
ss << "protected until " << time_str;
} else {
ss << "expired at " << time_str;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: lots of extra spaces

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it results in clean readable output, imagine many "expired at" and "protected until" one after another

[11:06][root@xxx (ceph_dev:ceph/build*10) build]# bin/rbd trash list --pool mypool --long
ID           NAME SOURCE DELETED_AT               STATUS                                   
10d174b0dc51 bar  USER   Thu Oct 12 10:51:23 2017 expired at      Fri Oct 13 10:51:23 2017 
10d72ae8944a faz  USER   Wed Oct 18 11:06:08 2017 protected until Fri Oct 20 11:06:08 2017 
10eb2ae8944a foo  USER   Mon Oct 16 11:25:10 2017 expired at      Mon Oct 16 11:25:10 2017 
11032ae8944a baz  USER   Wed Oct 18 11:01:02 2017 protected until Thu Oct 19 11:01:02 2017 
11262ae8944a laz  USER   Wed Oct 18 11:05:19 2017 expired at      Tue Oct 17 11:05:19 2017 

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... true, but if using JSON/XML formatted output, the spacing is unnecessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry, thanks again!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries -- thx

@thmour thmour force-pushed the thmour_rbdtrash branch 3 times, most recently from fd29c28 to e556b31 Compare October 23, 2017 11:28
@thmour
Copy link
Author

thmour commented Oct 25, 2017

I now replaced the delay option only in the cli, I also added a trash_move2 function to the c/cpp api instead of breaking the current one. Is it better now?

@dillaman
Copy link

It still seems odd to me to have an API take an English-parseable parameter to specify the delay -- plus that parameter is then fed to a external application to actually parse it. That's why I think I would still vote for leaving the trash_move API as it is but then incorporate the user-friendly parsing within the rbd CLI (which passes the binary delay to the API).

@thmour
Copy link
Author

thmour commented Oct 26, 2017

You are right. I should move this functionality to the cli scope. I just used it in the internal.cc to be able to refer to the past (because the delay argument is unsigned), so I can debug the purge command, but letting delay accept negative values could result in the same flexibility as a date string.

@thmour thmour force-pushed the thmour_rbdtrash branch 3 times, most recently from 4834988 to 3afdc40 Compare November 7, 2017 14:25
@thmour
Copy link
Author

thmour commented Nov 7, 2017

From the api i just changed the delay from uint64_t to int64_t, I use now the /bin/date only on the rbd cli

@thmour thmour force-pushed the thmour_rbdtrash branch 6 times, most recently from 5c9f96e to b48c3f4 Compare November 23, 2017 08:31
@thmour
Copy link
Author

thmour commented Nov 23, 2017

I left the api intact, I focus on the cli now. There is a feature needed called threshold on rbd purge that tries to bring the pool to a certain data usage by removing the oldest trashed images based on the deferment end time. When I use rados.get_pool_stats after the rbd.trash_remove I don't see the updated pool data usage. Is there any better way than this do...while to do this task?

@dillaman
Copy link

@thmour The usage stats are periodically updated over long periods (>30 seconds) so you wouldn't be able to rely on them within a tight loop.

@thmour thmour force-pushed the thmour_rbdtrash branch 2 times, most recently from 4caeae1 to 2d39651 Compare December 1, 2017 15:54
@thmour
Copy link
Author

thmour commented Dec 1, 2017

@dillaman I changed the algorithm to get the disk usage of each image and sum them to reach the threshold instead of deleting images one by one and polling the usage until it reaches the threshold. It seems to have a satisfactory performance (deleted 100GB images with ~50GB disk usage, took a couple of seconds per image). Is it ready to merged? Does the purge command need some kind of test units? I ask because I see no tests for the cli commands, other than the "help.t" for the command output, because the other commands are tested through their equivalent API calls.

@dillaman
Copy link

dillaman commented Dec 4, 2017

@thmour CLI commands don't currently get unit tests but they do have integration tests [1]

[1] https://github.com/ceph/ceph/blob/master/qa/workunits/rbd/cli_generic.sh#L386

@thmour thmour force-pushed the thmour_rbdtrash branch 3 times, most recently from 7161389 to 570e6cc Compare December 6, 2017 13:41
@thmour
Copy link
Author

thmour commented Dec 6, 2017

@dillaman Added a integration test for the trash purge. Anything else?

@dillaman
Copy link

dillaman commented Dec 7, 2017

@thmour Can you run "git rebase origin/master" and "git push thmour thmour_rbdtrash -f" to remove that merge commit?

done

#it should remove 2 images to go under 50% pool data usage
rbd trash purge --threshold 0.5
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this cannot be hard-coded since different cluster test environments will have different usages.

@@ -107,7 +107,7 @@ namespace librbd {
int remove(librados::IoCtx& io_ctx, const std::string &image_name,
const std::string &image_id, ProgressContext& prog_ctx,
bool force=false, bool from_trash_remove=false);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: added white space?

@@ -81,7 +81,9 @@ static const std::string PRETTY_FORMAT("pretty-format");
static const std::string VERBOSE("verbose");
static const std::string NO_ERROR("no-error");

static const std::string DELAY("delay");
static const std::string EXPIRES_AT("expires-at");
static const std::string OLDER_THAN("older-than");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: perhaps for clarity the option should be expired-before since it's based on the expiration time and not the deletion time?

@@ -81,7 +81,9 @@ static const std::string PRETTY_FORMAT("pretty-format");
static const std::string VERBOSE("verbose");
static const std::string NO_ERROR("no-error");

static const std::string DELAY("delay");
static const std::string EXPIRES_AT("expires-at");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since these arguments are really only used by Trash.cc, we can move them all there while making these changes.

@@ -31,13 +31,14 @@ namespace trash {
namespace at = argument_types;
namespace po = boost::program_options;

typedef librbd::trash_image_info_t trash_info_t;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why rename the type?


std::vector<trash_info_t> trash_entries;
r = rbd.trash_list(io_ctx, trash_entries);
if (r < 0) { return r; }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: use multiple lines

if(bytes_to_free >= bytes_threshold) break;
}
}
else {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: move else { back up to previous line

rados.get_pool_stats(vec, stats);

uint64_t bytes_to_free = 0;
auto bytes_threshold = (uint64_t)(stats[pool_name].num_bytes * (1 - threshold));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the threshold is based on the used space of the pool and not the max available space? It seems if you are trying to ensure a pool has X% free space available, this calculation should be based upon max available. This information is available by sending a mon_command of {'prefix': 'df', 'format': 'json'} via librados and using json_spirit to find the pool and parsing out max_avail.

if(r < 0) continue;

uint64_t img_size; curr_img.size(&img_size);
r = curr_img.diff_iterate2(nullptr, 0, img_size, false, false,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you change that second false to true you will enable fast-diff space calculation which will be much faster at the expense of rounding to 4MB object sizes.

if (vm.find(at::OLDER_THAN) != vm.end()) {
utime_t new_time;
r = utime_t::invoke_date(vm[at::OLDER_THAN].as<std::string>(), &new_time);
if (r < 0) return r;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: add error message when date fails

@thmour thmour force-pushed the thmour_rbdtrash branch 4 times, most recently from b7a44c4 to 756d365 Compare December 11, 2017 14:26
@thmour
Copy link
Author

thmour commented Dec 11, 2017

@dillaman A big thanks for your comments!
I added, filled and trashed 4x64M images in my vstart cluster, they filled the 'test' pool by 2.81%.
Then I executed:

$ bin/rbd trash purge test --threshold 0.015 #1.5%
Removing images: 100% complete...done.
$ bin/ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED 
    30.2G     26.4G        3.77G         12.50 
POOLS:
    NAME                  ID     USED      %USED     MAX AVAIL     OBJECTS 
    cephfs_data_a         1          0         0         8.70G           0 
    cephfs_metadata_a     2      2.19K         0         8.70G          21 
    test                  3       128M      1.42         8.70G          39 

Works as intended. As the threshold test can't be hardcoded I added to the test the trivial 0 and 1 values.

@dillaman dillaman changed the title [WIP] rbd trash: replace delay option, add rbd trash purge command rbd: replace trash delay option, add rbd trash purge command Dec 11, 2017


json_spirit::mValue json;
json_spirit::read_or_throw(outbl.to_str(), json);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: would be nicer to not throw and instead log an error and return an error code if you couldn't parse the result

librbd::RBD rbd;

std::vector<librbd::trash_image_info_t> trash_entries;
r = rbd.trash_list(io_ctx, trash_entries);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that purge should probably only handle images where the source == USER to avoid deleting RBD-internally created trash entries (i.e. rbd-mirror deferred deletions and/or migrating images).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link

@dillaman dillaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

if(bytes_to_free >= bytes_threshold) break;
}
} else {
time_t expire_ts = ceph_clock_gettime();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can you rebase your PR since this method was renamed to clock_gettime?

@dillaman
Copy link

dillaman commented Jan 2, 2018

@thmour Can you also update the cli_generic.sh test since it's now failing due to the removal of the --delay optional [1]?

[1] http://qa-proxy.ceph.com/teuthology/jdillaman-2018-01-02_14:09:46-rbd-wip-jd-testing-distro-basic-smithi/2016487/teuthology.log

@dillaman
Copy link

dillaman commented Jan 2, 2018

retest this please

Replaced the delay argument for the trash move
command with a string acceptable by /bin/date, e.g.:

$ rbd trash move --pool foo --image bar --expires-in "2 weeks"

Added a "rbd trash purge" command that deletes any expired
image from the trash, has also a command to alter the current
expiration date with the "--older-than" argument which accepts
again a valid argument for /bin/date, e.g.:

$rbd trash purge mypool --older-than "2017-08-20"

There is also the "threshold" argument which tries to remove the
oldest trashed images (by deferment end time) until the pool space
is freed up to a percentage point, e.g.:

$ rbd trash purge mypool --threshold 0.9

If mypool uses 1GB it will try to remove trashed images until the
pool usage becomes equal to or lower than 900MB.

Signed-off-by: Theofilos Mouratidis <t.mour@cern.ch>
@dillaman dillaman merged commit 4dbcd4d into ceph:master Jan 23, 2018
@thmour thmour deleted the thmour_rbdtrash branch January 24, 2018 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants