Delete executions older than a certain date #976

Closed
danifr opened this issue Oct 17, 2014 · 24 comments

@danifr
Contributor

commented Oct 17, 2014

API enhancement

@ahonor ahonor added the enhancement label Oct 17, 2014

@danifr

Contributor Author

commented Oct 20, 2014

Even though I'm not a great developer, I've written a small script to delete entries older than 90 days.
Someone might find it useful.

You can find it here: https://github.com/danifr/danifr-rundeck/blob/master/py_scripts/deleteoldlogs.py
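For readers who cannot open the link, here is a minimal sketch of the general idea. It is not the linked script. It assumes the Rundeck API's execution query (with the `olderFilter` parameter) and bulk-delete endpoints; the base URL, API version, project name and token are placeholders, and the response is assumed to carry an `executions` list.

```python
import json
import urllib.parse
import urllib.request

# Placeholders (assumptions, not taken from the linked script).
BASE_URL = "http://localhost:4440"
API_VERSION = 14
PROJECT = "myproject"
TOKEN = "changeme"


def query_url(older_than="90d", max_results=200):
    """Build the execution-query URL for executions older than `older_than`."""
    params = urllib.parse.urlencode({"olderFilter": older_than, "max": max_results})
    return f"{BASE_URL}/api/{API_VERSION}/project/{PROJECT}/executions?{params}"


def delete_payload(executions):
    """Build the JSON body for the bulk-delete call from a list of execution dicts."""
    return json.dumps({"ids": [e["id"] for e in executions]})


def call(url, data=None):
    """Send a GET (or a POST when `data` is given) and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=data.encode() if data is not None else None,
        headers={
            "X-Rundeck-Auth-Token": TOKEN,
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def delete_old_executions(older_than="90d"):
    """Delete one page of old executions; returns how many ids were submitted."""
    found = call(query_url(older_than)).get("executions", [])
    if found:
        call(f"{BASE_URL}/api/{API_VERSION}/executions/delete", delete_payload(found))
    return len(found)
```

Calling `delete_old_executions()` repeatedly until it returns 0 would work through the backlog one page at a time.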

@jippi

commented Mar 7, 2015

👍

A way to auto-delete/archive executions would be highly useful

@urgerestraint

commented Jul 20, 2015

+1

@gentunian

commented Nov 24, 2015

This would be a great feature to have, especially when you have jobs that run frequently. You may have a lot of executions consuming a lot of disk space and, more importantly, old executions may be obsolete information for your use case.

@ziurjam

commented Nov 18, 2016

Any updates on this feature? Has it been added to the latest version of Rundeck 2.6.11?

@alexfouche

commented Nov 28, 2016

I have 2.6.11, and in "recent", at the bottom right, there is a bulk delete and a select all.
Unfortunately, this is useless, since select all only selects the 20 entries on the screen, and not the whole history of job runs!

I have created #2203 for this precise issue with the select all behaviour.

@gentunian

commented Nov 28, 2016

@alexfouche that's how any "select all" feature is supposed to work. Selecting everything outside your view is bad UI design. You select everything within your scope. If pagination is set to 20, then it will select those 20.

That behaviour is common to any (decent) application: Gmail, any mail reader, and so on.

Also, you can trick the pagination by changing the query parameters: offset=0&max=20

Try doing something like: activity?offset=0&max=150 and now your scope is 150 items. Selecting all will select all items in your scope.
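As a small sketch of that trick, the query string can be built programmatically. Only the `activity` path and the `offset`/`max` parameters come from the comment above; the function name is made up for illustration.

```python
from urllib.parse import urlencode


def activity_url(base="activity", offset=0, max_items=20):
    """Return the activity page URL with an explicit pagination window."""
    return f"{base}?{urlencode({'offset': offset, 'max': max_items})}"


# Widen the page to 150 items so "select all" covers all of them.
print(activity_url(max_items=150))
```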

@XANi

commented Jan 1, 2017

@gentunian Well, if you are going to use Gmail as an example of "good design", they have an option to do that. If you select all messages in scope, you get a message like this:

All 100 messages on this page are selected. Select all messages that match this search

and clicking it will also select the messages on the rest of the pages.

@gentunian

commented Jan 2, 2017

Hi @XANi, I'm aware of that. That would be a nice addition here, but it is not exactly what @alexfouche is reporting. He is reporting the feature as a bug in #2203 (it's not a bug, it's a feature). What you describe could be proposed as an enhancement.

@XANi

commented Jan 2, 2017

Well, that would depend on the use case. I'm guessing this one is just "I have a lot of old job logs I don't care about and I want to clean them up." But the way of solving it doesn't seem that useful; nobody really wants to waste time going through menus to remove some old logs, even with a bulk option.

Systems like Jenkins just have configurable per-project limits on the number of past jobs to keep (IIRC it was "keep last X executions", "keep last X days", or both). That way, if someone doesn't care about history, or logs take too much space, it is possible to specify, say, "keep last 30 days but at most 50 jobs" and forget about it: even if a job is triggered a few times in a row (for example, debugging a job that normally runs only once a day), the limit still applies.
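A rough sketch of such a combined policy, to make the "keep last 30 days but at most 50 jobs" rule concrete. This is not how Jenkins or Rundeck implement it; the function name and the `(id, started_at)` tuple shape are assumptions for illustration.

```python
from datetime import datetime, timedelta


def executions_to_delete(executions, now, keep_days=30, keep_count=50):
    """Return ids of executions that fall outside the retention policy.

    `executions` is a list of (execution_id, started_at) tuples. An execution
    is kept only if it is newer than `keep_days` days AND among the
    `keep_count` most recent ones.
    """
    cutoff = now - timedelta(days=keep_days)
    newest_first = sorted(executions, key=lambda e: e[1], reverse=True)
    doomed = []
    for rank, (exec_id, started_at) in enumerate(newest_first):
        if rank >= keep_count or started_at < cutoff:
            doomed.append(exec_id)
    return doomed


now = datetime(2017, 1, 2)
history = [(1, now - timedelta(days=40)), (2, now - timedelta(days=10)), (3, now - timedelta(days=1))]
print(executions_to_delete(history, now))                # age limit only bites on id 1
print(executions_to_delete(history, now, keep_count=1))  # count limit also drops id 2
```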

@gentunian

commented Jan 2, 2017

It's not about the use case, it's about UX design. Deleting all data, including hidden data, without an explicit user decision is not a desirable feature. The user must know what action is taken every time they trigger one. You just need to remove the ambiguity to prevent undesired actions.

"I have a lot of old job logs I dont care about and I want to clean it up.". But the way of solving it
doesn't seem to be that useful, nobody really wants to waste time going thru menus to remove
some old logs, even with bulk option.

Yep, but the system doesn't know about that. You must somehow tell it what you want. That's where UX comes in.

Systems like Jenkins just have configurable limits per project of number of last jobs to keep
(IIRC it was "keep last X executions", "keep last X days", or both). That way if someone doesn't care
about history, or logs take too much space it is possible to specify say "keep last 30 days but at
most 50 jobs" and forget about it, as even if job is triggered few times in a row (for example
debugging job that normally runs only once a day) it will limit it.

That's what I'm talking about, and that's what I'm trying to explain. The "select all" feature Rundeck has is what is expected from a good UX approach: it avoids selecting the entire set on the user's behalf by default, and selects only what you are looking at on screen. When you select all the items in view, you could provide a simple label, as Gmail does, giving you the option of actually selecting all the items since the beginning of time.

Also, you can go with @danifr's approach for removing old data; it works well and I'm using it as a custom job inside Rundeck itself. He provided the gist here and you can modify the code to your needs.

@nkadel-skyhook

commented Jun 25, 2017

Just the "select all" feature rundeck has is what is expected in a nice UX approach,

I'll beg to disagree. It's replacing a common English word with the meaning that this particular interface designer chose, instead, because they think their model of what a user should do is superior to that which users have actually asked for in this thread. If the word does not mean "All", then clarify to "All in this display", because that is what it means.

Forcing users to, individually, activate an out-of-band log deletion or log expiration system directly violates several of the guiding principles from an old essay by Eric Raymond about open source interfaces, the "Luxury of Ignorance", at http://www.catb.org/esr/writings/cups-horror.html . It violates principles 1, 5, and 6, and also violates several of the guidelines I suggested to him and which he elected to add. In particular:

  • Are there settings you can do from the command line or hand-editing config files that cannot be done from the GUI?

@gentunian

commented Jun 25, 2017

We will disagree forever :)

Select all should never select things far away from what you are seeing. You are not playing with the backend, you are using a UI. I would agree with you if the context were a command line utility and you "select all" to delete. But this is a UI; your context of ALL is what you are looking at, or the list you have paged. Also, I agree that a specific action should be required to delete all, just like Gmail has.

cheers!

@alexfouche

commented Jun 25, 2017

Well, I believe the context is also what the label or sentence says. It is written "select all", not "select all items on screen or visible", nor "select ALL past logs (for the job)". So, lacking that precision in the label, I simply believe "all" means what the dictionary says, which is the whole set of logs, not just the subset which happens to be shown on screen at that time. So if this is just a matter of context, then please make it explicit on the label and let's avoid implicit confusion. Not everyone sees implicit things the same way, precisely because they are implicit, and we all have a different experience of using software, different cultures, different native languages, ...
And once it is explicit that the action will in fact not delete all past logs, that means there is an option missing.

In fact, writing this last sentence makes me believe that "delete all logs on screen" may not be an action users need, nor especially useful by itself. On the contrary, "delete all and really all logs" is a needed feature. So regardless of the perceived meaning of the word "all", which depends on what everyone sees in their own UX/usage context, why not interpret "all" in the context of what actually makes sense as a functionality?

The fact that we seem to "disagree forever" is unfortunately what seems to happen too often in the open source software world when the main developer believes he holds the truth against everyone else, and then projects begin to fork away. I am not saying who is right or who is wrong, because of course, we all believe we are right :-P

@gentunian

commented Jun 25, 2017

Don't get me wrong. It was a friendly statement. It has to do with what you say about "not everyone sees it that way". I bet that if you gave the UI to a primate, it could not even start using it. You must assume something. For instance, you are assuming your users know how to open a page, use the mouse and browse the site. You may like it or not, but you must assume something. Those assumptions are like axioms you must accept. Once you say "the UI must assume a user level of N", you can then derive your UX.

A great example of selecting things from a list is old Gmail (not Inbox). Are you familiar with it? There's no "select all" text, but there's a checkbox that selects all (in your view of the item list), and next to the selected item count a label appears saying "Select all 3,830 conversations in Primary", which will indeed select all the items out of sight.

I agree with you that the action there was not "select all". Instead, it was a checkbox on top of the list. That UX expects a user level of N that already knows what to expect when clicking that checkbox. And if they didn't know, clicking on it for the first time will show them what it does. But again, it's assumed that in a list of items with checkboxes, a checkbox in the header will select all (from that view). And also, it's assumed that the user will learn how it's done.

Also, I agree with you that Rundeck lacks that feature and it would be nice to have it implemented. And lastly, I'm just a user of this software and I have no voice in this project. But I enjoy discussing open source things as a developer.

@xuther

commented Aug 16, 2017

+1
It would be great to have the ability to set some sort of retention time period, after which executions are purged.

@HowardHong

commented Aug 17, 2017

I agree. A global or project-level bulk-delete expiration setting, similar to the Unix find command (e.g. "find . -mtime +n -exec rm {} \;"), would be very nice to have.
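To illustrate the find-style age check, here is a rough Python sketch that lists files older than a given number of days. The function name is made up, the directory to scan is whatever path holds your logs (an assumption, not a Rundeck setting), and the comparison is a plain seconds-based cutoff rather than find's exact day rounding. It only lists candidates; it deletes nothing and does not touch any database records.

```python
import os
import time


def files_older_than(root, days, now=None):
    """List files under `root` whose modification time is more than `days` days old."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                old.append(path)
    return sorted(old)
```

Printing the result first and deleting only after reviewing it is the safer order of operations.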

@JoshVorick

commented Nov 16, 2017

+1

Just ran out of space on a box because Rundeck slowly accumulated ~100GB of logs. I guess one solution is to write a Rundeck script to delete old logs?

@ahonor

Contributor

commented Nov 16, 2017

@JoshVorick yes the workaround is to use the rd executions deletebulk command.

@ahonor ahonor closed this Nov 16, 2017

@danielmroberts

commented Jan 19, 2018

@ahonor We are using the api workaround but it is very very very very slow. We hit it daily to remove anything over a month old (~7k executions) and it takes an hour and a half to do so. Just realized it is locking up the server and causing scheduled tasks to not be triggered during this time period. Great.

We already had disk space issues with our first machine and couldn't delete old execution logs due to performance, but thought it was because of low disk space. This time we started from scratch with MySQL and made sure to set up a task to perform removal using the workaround, but now we are having more issues with the bad performance of the workaround.

There needs to be an efficient way to accomplish cleanup.

@ahonor

Contributor

commented Jan 19, 2018

@danielmroberts 100% agreed, this needs to be done more automatically and be configurable per job. In the meantime, I'm curious what database you are using. Indexing can improve speed, too.

@danielmroberts

commented Jan 22, 2018

@ahonor Using a local mysql database and the default initialized database structure

@danielmroberts

commented Jan 22, 2018

@ahonor any recommendations on indexes that wouldn't have been implemented automatically?

@danielmroberts

commented Jan 25, 2018

I used the script from danifr but it looks like the underlying issue is just the amount of time it takes to delete one execution, which I assume is what the cli bulk delete workaround is doing underneath. It is taking nearly a full second per deletion. When you have thousands of executions to delete that adds up.
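As a quick sanity check on those numbers (about 7k executions in roughly an hour and a half, and close to one second per deletion), a back-of-the-envelope sketch; the function names are made up for illustration:

```python
def seconds_per_deletion(total_minutes, executions):
    """Average time spent on each deletion, in seconds."""
    return total_minutes * 60 / executions


def estimated_minutes(executions, seconds_each):
    """Total run time in minutes for a given per-deletion cost."""
    return executions * seconds_each / 60


# About 0.77 s each for 7000 executions in 90 minutes.
print(round(seconds_per_deletion(90, 7000), 2))
# At a full second each, the same batch would take close to two hours.
print(round(estimated_minutes(7000, 1.0)))
```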
