[4398] Trash batch-purging #4399
Proposed fixes:
Purge datasets 10 at a time to avoid timeouts when there are a lot of trash items.
Make the trash list a numbered list. Since the list is not paginated and the purge button is at the very bottom, the sysadmin needs to scroll to the bottom of the list to press "Purge." Making it a numbered list gives the sysadmin useful info when there are a lot of datasets available for purging.
Features:
[ ] includes tests covering changes
[ ] includes updated documentation
[X] includes user-visible changes
[ ] includes API changes
[ ] includes bugfix for possible backport
Please [X] all the boxes above that apply
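To illustrate the batching idea, here is a minimal sketch. It is not the code in this PR. `purge_one` and `commit` are hypothetical callables standing in for whatever actually purges a dataset and commits the database session, and committing once per batch is an assumption. Only the batch size of 10 comes from the description.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def purge_in_batches(
    dataset_ids: Iterable[str],
    purge_one: Callable[[str], None],  # hypothetical: purges one dataset
    commit: Callable[[], None],        # hypothetical: commits the DB session
    batch_size: int = 10,              # batch size from the PR description
) -> int:
    """Purge datasets in batches, committing after each one.

    Returns the number of batches processed.
    """
    batch_count = 0
    for chunk in batched(dataset_ids, batch_size):
        for dataset_id in chunk:
            purge_one(dataset_id)
        # Committing per batch keeps each database transaction short.
        commit()
        batch_count += 1
    return batch_count
```

Short transactions only address the database side. The whole loop still runs inside a single request unless it is moved elsewhere.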
I'd suggest doing this a bit differently, especially because we allow hooking package delete/purge which can cause some pretty lengthy tasks to run (such as unmarking it for publication on the source site). Your batching may prevent a database operation from timing out, but it won't stop the HTTP server from timing out.
With the removal of revisions, the purge method can be simplified quite a bit as well.
I'm fine with this quick hack, but @TkTech's approach is the best long-term fix if someone wants to work on it.
We had a large 2.5.x installation with hundreds of datasets in the Trash, and we were getting web server timeouts when their sysadmin tried to purge the trash. We had to pull this together because they were running out of DB space as the datastore filled it up. We even had to use @metaodi's "quick-and-dirty" paster command to delete orphaned tables (#3422). They're now on 2.7.3, and we find we still need this because their heavy usage of the site keeps filling up the trash. Perhaps this could go in until someone works on the long-term fix?
Sorry, my comment really should have been on the issue rather than the PR. I'm 100% 👍 on this change going in, but the issue is still an issue and should remain open until we can move it to the background workers.
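For context on the background-worker idea, here is a rough sketch of the pattern using only the Python standard library. It is not CKAN's job API, and a real fix would presumably use CKAN's own background job machinery rather than a thread. `purge_one`, `start_purge_worker` and `handle_purge_request` are hypothetical names. The point is that the request handler only enqueues work and returns, so slow delete/purge hooks cannot time out the HTTP request.

```python
import queue
import threading
from typing import Callable, List


def start_purge_worker(
    jobs: "queue.Queue",
    purge_one: Callable[[str], None],  # hypothetical: purges one dataset
) -> threading.Thread:
    """Start a worker thread that purges queued dataset ids one by one."""

    def run() -> None:
        while True:
            dataset_id = jobs.get()
            try:
                if dataset_id is None:  # sentinel: stop the worker
                    return
                purge_one(dataset_id)
            finally:
                jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def handle_purge_request(jobs: "queue.Queue", dataset_ids: List[str]) -> int:
    """Stand-in for the HTTP handler: enqueue the work and return at once.

    Returns the number of datasets queued for purging.
    """
    for dataset_id in dataset_ids:
        jobs.put(dataset_id)
    return len(dataset_ids)
```

With this shape, the time a purge takes affects only the worker, not the response to the sysadmin's click.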
This is handled by the Flask views now in master. Can you make the same change to https://github.com/ckan/ckan/blob/master/ckan/views/admin.py ?
…mber of trash items are greater than 10
The Travis build shows up as pending here, but if you click through, all the tests pass. @jqnatividad made the requested changes for Flask. Is there anything else that needs to be done before merging the PR, @wardi?
Closing this stale PR as this has been "partly" fixed with the removal of revisions. "Partly" because the datasets are only marked as |
Related to #4398