New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd: replace trash delay option, add rbd trash purge command #18323
Conversation
5b87db8
to
f2406fc
Compare
src/include/o_execv.h
Outdated
@@ -0,0 +1,36 @@ | |||
#include <stdio.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there are existing wrappers around execv
like common/SubProcess.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I was searching for them
src/include/rbd/librbd.h
Outdated
@@ -260,7 +260,7 @@ CEPH_RBD_API int rbd_rename(rados_ioctx_t src_io_ctx, const char *srcname, | |||
const char *destname); | |||
|
|||
CEPH_RBD_API int rbd_trash_move(rados_ioctx_t io, const char *name, | |||
uint64_t delay); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the API is released, we cannot create a breaking change. The human-readable parsing can probably be limited to the rbd CLI since any API users will have no issue with supplying a derived time in seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this comment is valid -- from an API perspective, it doesn't make much sense to have a negative delay. If some uses the rbd CLI to specify a time in the past, it should be warned/prohibited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our use cases we need to set the expiration times to the past for test purposes, debugging and averting mistakes that cost space, we don't like to have unexpected things. In the end, a date is passed to the trashspec, so why do we have to use an offset as an argument instead of just passing the desired date as an absolute value? Because now to set a protected until date for an image, I have to get the date, subtract from now, pass the offset to the function so that it adds the offset from now to store the date that I initially had. With having the "uint64_t delay" as "time_t protected_until", I could just put "time(NULL)+5" in C or "int(time.time())+5" in Python2 and have the same effect. This positive offset has really no flexibility and serves a narrow variety of use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that we cannot change a released API -- so now such a change would require a rbd_trash_move2(..., bool expire_time_valid, time_t expire_time)
-like method. The reason that an offset delay was chosen was it solved the issue of specifying a time and also indicating whether or not a "protection period" was required (i.e. any value greater than zero).
src/include/utime.h
Outdated
@@ -322,6 +325,23 @@ class utime_t { | |||
bdt.tm_hour, bdt.tm_min, bdt.tm_sec); | |||
} | |||
|
|||
static int invoke_date(const std::string date_str, utime_t *result) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: pass date_str
by const reference
src/tools/rbd/action/Trash.cc
Outdated
@@ -36,8 +36,8 @@ void get_move_arguments(po::options_description *positional, | |||
po::options_description *options) { | |||
at::add_image_spec_options(positional, options, at::ARGUMENT_MODIFIER_NONE); | |||
options->add_options() | |||
(at::DELAY.c_str(), po::value<uint64_t>(), | |||
"time delay in seconds until effectively remove the image"); | |||
(at::EXPIRES_IN.c_str(), po::value<std::string>()->default_value("1 week"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why not default to "now" since "1 week" otherwise seems arbitrary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still in discussion, we wanted some safe date in case someone mistakenly trashes an image and a trash purge cron job runs after a second and deletes it (and then the image can't be restored). It can be changed to a "1 day", or "now" if it makes more sense.
src/tools/rbd/action/Trash.cc
Outdated
@@ -151,6 +150,8 @@ std::string delete_status(time_t deferment_end_time) { | |||
std::stringstream ss; | |||
if (now < deferment_end_time) { | |||
ss << "protected until " << time_str; | |||
} else { | |||
ss << "expired at " << time_str; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: lots of extra spaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it results in clean readable output, imagine many "expired at" and "protected until" one after another
[11:06][root@xxx (ceph_dev:ceph/build*10) build]# bin/rbd trash list --pool mypool --long
ID NAME SOURCE DELETED_AT STATUS
10d174b0dc51 bar USER Thu Oct 12 10:51:23 2017 expired at Fri Oct 13 10:51:23 2017
10d72ae8944a faz USER Wed Oct 18 11:06:08 2017 protected until Fri Oct 20 11:06:08 2017
10eb2ae8944a foo USER Mon Oct 16 11:25:10 2017 expired at Mon Oct 16 11:25:10 2017
11032ae8944a baz USER Wed Oct 18 11:01:02 2017 protected until Thu Oct 19 11:01:02 2017
11262ae8944a laz USER Wed Oct 18 11:05:19 2017 expired at Tue Oct 17 11:05:19 2017
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... true, but if using JSON/XML formatted output, the spacing is unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sorry, thanks again!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries -- thx
fd29c28
to
e556b31
Compare
I now replaced the delay option only in the cli, I also added a trash_move2 function to the c/cpp api instead of breaking the current one. Is it better now? |
It still seems odd to me to have an API take an English-parseable parameter to specify the delay -- plus that parameter is then fed to a external application to actually parse it. That's why I think I would still vote for leaving the |
You are right. I should move this functionality to the cli scope. I just used it in the internal.cc to be able to refer to the past (because the delay argument is unsigned), so I can debug the purge command, but letting delay accept negative values could result in the same flexibility as a date string. |
4834988
to
3afdc40
Compare
From the api i just changed the delay from uint64_t to int64_t, I use now the /bin/date only on the rbd cli |
5c9f96e
to
b48c3f4
Compare
I left the api intact, I focus on the cli now. There is a feature needed called threshold on rbd purge that tries to bring the pool to a certain data usage by removing the oldest trashed images based on the deferment end time. When I use |
@thmour The usage stats are periodically updated over long periods (>30 seconds) so you wouldn't be able to rely on them within a tight loop. |
4caeae1
to
2d39651
Compare
@dillaman I changed the algorithm to get the disk usage of each image and sum them to reach the threshold instead of deleting images one by one and polling the usage until it reaches the threshold. It seems to have a satisfactory performance (deleted 100GB images with ~50GB disk usage, took a couple of seconds per image). Is it ready to merged? Does the purge command need some kind of test units? I ask because I see no tests for the cli commands, other than the "help.t" for the command output, because the other commands are tested through their equivalent API calls. |
2d39651
to
f03cbbd
Compare
@thmour CLI commands don't currently get unit tests but they do have integration tests [1] [1] https://github.com/ceph/ceph/blob/master/qa/workunits/rbd/cli_generic.sh#L386 |
7161389
to
570e6cc
Compare
570e6cc
to
41cb593
Compare
@dillaman Added a integration test for the trash purge. Anything else? |
@thmour Can you run "git rebase origin/master" and "git push thmour thmour_rbdtrash -f" to remove that merge commit? |
qa/workunits/rbd/cli_generic.sh
Outdated
done | ||
|
||
#it should remove 2 images to go under 50% pool data usage | ||
rbd trash purge --threshold 0.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this cannot be hard-coded since different cluster test environments will have different usages.
src/librbd/internal.h
Outdated
@@ -107,7 +107,7 @@ namespace librbd { | |||
int remove(librados::IoCtx& io_ctx, const std::string &image_name, | |||
const std::string &image_id, ProgressContext& prog_ctx, | |||
bool force=false, bool from_trash_remove=false); | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: added white space?
src/tools/rbd/ArgumentTypes.h
Outdated
@@ -81,7 +81,9 @@ static const std::string PRETTY_FORMAT("pretty-format"); | |||
static const std::string VERBOSE("verbose"); | |||
static const std::string NO_ERROR("no-error"); | |||
|
|||
static const std::string DELAY("delay"); | |||
static const std::string EXPIRES_AT("expires-at"); | |||
static const std::string OLDER_THAN("older-than"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: perhaps for clarity the option should be expired-before
since it's based on the expiration time and not the deletion time?
src/tools/rbd/ArgumentTypes.h
Outdated
@@ -81,7 +81,9 @@ static const std::string PRETTY_FORMAT("pretty-format"); | |||
static const std::string VERBOSE("verbose"); | |||
static const std::string NO_ERROR("no-error"); | |||
|
|||
static const std::string DELAY("delay"); | |||
static const std::string EXPIRES_AT("expires-at"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: since these arguments are really only used by Trash.cc
, we can move them all there while making these changes.
src/tools/rbd/action/Trash.cc
Outdated
@@ -31,13 +31,14 @@ namespace trash { | |||
namespace at = argument_types; | |||
namespace po = boost::program_options; | |||
|
|||
typedef librbd::trash_image_info_t trash_info_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why rename the type?
src/tools/rbd/action/Trash.cc
Outdated
|
||
std::vector<trash_info_t> trash_entries; | ||
r = rbd.trash_list(io_ctx, trash_entries); | ||
if (r < 0) { return r; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: use multiple lines
src/tools/rbd/action/Trash.cc
Outdated
if(bytes_to_free >= bytes_threshold) break; | ||
} | ||
} | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: move else {
back up to previous line
src/tools/rbd/action/Trash.cc
Outdated
rados.get_pool_stats(vec, stats); | ||
|
||
uint64_t bytes_to_free = 0; | ||
auto bytes_threshold = (uint64_t)(stats[pool_name].num_bytes * (1 - threshold)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the threshold is based on the used space of the pool and not the max available space? It seems if you are trying to ensure a pool has X% free space available, this calculation should be based upon max available. This information is available by sending a mon_command
of {'prefix': 'df', 'format': 'json'}
via librados and using json_spirit to find the pool and parsing out max_avail
.
src/tools/rbd/action/Trash.cc
Outdated
if(r < 0) continue; | ||
|
||
uint64_t img_size; curr_img.size(&img_size); | ||
r = curr_img.diff_iterate2(nullptr, 0, img_size, false, false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you change that second false
to true
you will enable fast-diff space calculation which will be much faster at the expense of rounding to 4MB object sizes.
src/tools/rbd/action/Trash.cc
Outdated
if (vm.find(at::OLDER_THAN) != vm.end()) { | ||
utime_t new_time; | ||
r = utime_t::invoke_date(vm[at::OLDER_THAN].as<std::string>(), &new_time); | ||
if (r < 0) return r; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: add error message when date
fails
b7a44c4
to
756d365
Compare
@dillaman A big thanks for your comments! $ bin/rbd trash purge test --threshold 0.015 #1.5%
Removing images: 100% complete...done.
$ bin/ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
30.2G 26.4G 3.77G 12.50
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
cephfs_data_a 1 0 0 8.70G 0
cephfs_metadata_a 2 2.19K 0 8.70G 21
test 3 128M 1.42 8.70G 39 Works as intended. As the threshold test can't be hardcoded I added to the test the trivial 0 and 1 values. |
src/tools/rbd/action/Trash.cc
Outdated
|
||
|
||
json_spirit::mValue json; | ||
json_spirit::read_or_throw(outbl.to_str(), json); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: would be nicer to not throw and instead log an error and return an error code if you couldn't parse the result
librbd::RBD rbd; | ||
|
||
std::vector<librbd::trash_image_info_t> trash_entries; | ||
r = rbd.trash_list(io_ctx, trash_entries); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that purge
should probably only handle images where the source == USER
to avoid deleting RBD-internally created trash entries (i.e. rbd-mirror deferred deletions and/or migrating images).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
756d365
to
3ede5fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
src/tools/rbd/action/Trash.cc
Outdated
if(bytes_to_free >= bytes_threshold) break; | ||
} | ||
} else { | ||
time_t expire_ts = ceph_clock_gettime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can you rebase your PR since this method was renamed to clock_gettime
?
@thmour Can you also update the |
retest this please |
498a3b8
to
811a445
Compare
811a445
to
1375ffd
Compare
Replaced the delay argument for the trash move command with a string acceptable by /bin/date, e.g.: $ rbd trash move --pool foo --image bar --expires-in "2 weeks" Added a "rbd trash purge" command that deletes any expired image from the trash, has also a command to alter the current expiration date with the "--older-than" argument which accepts again a valid argument for /bin/date, e.g.: $rbd trash purge mypool --older-than "2017-08-20" There is also the "threshold" argument which tries to remove the oldest trashed images (by deferment end time) until the pool space is freed up to a percentage point, e.g.: $ rbd trash purge mypool --threshold 0.9 If mypool uses 1GB it will try to remove trashed images until the pool usage becomes equal to or lower than 900MB. Signed-off-by: Theofilos Mouratidis <t.mour@cern.ch>
cf42716
to
b9e4aa9
Compare
There was a discussion about the trash delay option (that is in seconds) to be replaced with something else. @dvanders proposes a
/bin/date
style input of the form of2017-12-30
,now
,1 week ago
, etc. It is a human-readable text and it can't be confused with something like60*24*7
. This pull request implements it, by calling/bin/date
safely (exec*).There is also a need for a batch removal of images in the trash that have expired. It can be used on Openstack as a cron job. The name of the command is "purge" and deletes every image with expired deferment end time from the defined pool. There is also an option
--older-than
implemented to remove images with deferment end time older than a given date.This pull request is still a work in progress and we would like to get some feedback possibly from @dillaman and @rjfd
Thanks