This repository has been archived by the owner on May 25, 2021. It is now read-only.

background deletion for soft-deleted database #3

Draft · wants to merge 1 commit into master

Conversation


@jiangphcn jiangphcn commented Apr 21, 2020

Overview

Allow a background job to delete soft-deleted databases according to specified criteria in order to release space. Once a database is hard-deleted, the data can't be fetched back.

Testing recommendations

tbd

Related Issues or Pull Requests

apache/couchdb#2666

Checklist

  • Code is written and works correctly
  • Changes are covered by tests
  • [N/A] Any new configurable parameters are documented in rel/overlay/etc/default.ini
  • [N/A] A PR for documentation changes has been made in https://github.com/apache/couchdb-documentation

allow a background job to delete soft-deleted databases according to
specified criteria in order to release space. Once a database is
hard-deleted, the data can't be fetched back.

nickva commented Apr 21, 2020

@jiangphcn looks like a good start!

I was thinking we could perhaps just add this to the fabric (fabric2_*) modules, like we did with indexing; since soft deletion happens there, the deletion logic would not be out of place there either.

For the general structure, what do you think about starting with something simpler at first: say, have only a singleton job, basically with type = <<"dbdelete">> and job id <<"dbdelete_job">>.

Then we won't even need a couch_dbdelete_server. We just call couch_jobs:set_type_timeout(?DB_DELETE_JOB_TYPE, 6) and couch_jobs:add(undefined, ?DB_DELETE_JOB_TYPE, ?DB_DELETE_JOB, #{}) in some init function called by a supervisor. That would ensure the job exists if it doesn't already.
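A minimal sketch of that init function; the macro values and the `couch_jobs` call shapes here are taken from the comment above, not from verified documentation:

```erlang
%% Sketch only: macro values and couch_jobs call shapes follow the
%% discussion above, not verified documentation.
-define(DB_DELETE_JOB_TYPE, <<"dbdelete">>).
-define(DB_DELETE_JOB, <<"dbdelete_job">>).

init_db_delete_job() ->
    %% Register an activity timeout (seconds) for this job type
    couch_jobs:set_type_timeout(?DB_DELETE_JOB_TYPE, 6),
    %% Add the singleton job; if it already exists this is a no-op
    couch_jobs:add(undefined, ?DB_DELETE_JOB_TYPE, ?DB_DELETE_JOB, #{}).
```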

Then the couch_dbdelete_worker gen_server would wait to accept that job and run it. Because of the locking, we would know only one job runs in the whole cluster.

In that worker we could run fabric2_db:list_deleted_dbs_info(...), but let's use a callback there and accumulate batches of dbs, say 50 or 100 at a time. For each db in the batch, we check the time limit and delete the old db instance. Here you could use fabric2_util:pmap, as I don't think we have a transactional interface to open and delete; or just do a simple foreach loop at first.
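A rough sketch of the batching idea; the fold-style callback for fabric2_db:list_deleted_dbs_info, as well as the expired/2 and delete_db_instance/1 helpers, are hypothetical:

```erlang
%% Hypothetical sketch: the callback shape for list_deleted_dbs_info,
%% expired/2 and delete_db_instance/1 are illustrative, not real APIs.
-define(BATCH_SIZE, 100).

process_deleted_dbs(MaxAgeSec) ->
    %% Accumulate deleted-db infos into batches of ?BATCH_SIZE
    Left = fabric2_db:list_deleted_dbs_info(fun(DbInfo, Batch) ->
        case [DbInfo | Batch] of
            Batch1 when length(Batch1) >= ?BATCH_SIZE ->
                delete_batch(Batch1, MaxAgeSec),
                [];
            Batch1 ->
                Batch1
        end
    end, [], []),
    %% Flush whatever is left over from the last partial batch
    delete_batch(Left, MaxAgeSec).

delete_batch(Batch, MaxAgeSec) ->
    %% Simple foreach loop at first; fabric2_util:pmap could come later
    lists:foreach(fun(DbInfo) ->
        case expired(DbInfo, MaxAgeSec) of
            true -> delete_db_instance(DbInfo);
            false -> ok
        end
    end, Batch).
```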

After we are done, the job can reschedule itself to run at some future point in time, say after 1 hour or so, and then finish. To make the scheduling work in the accept function, we'd use a max_sched_time of now + say 1 minute.

@jiangphcn
Author

@nickva thanks so much for your review and great suggestions, especially on how to better leverage couch_jobs.

Originally, I tried to put the logic into the fabric module. However, I got an error:badarg error when there was a call related to couch_jobs. As discussed, it is related to a circular dependency between couch_jobs and fabric.

Based on all of your comments above, I came up with a branch, apache/couchdb@ae4e9f0, where a retry is implemented to address the circular dependency.


nickva commented Apr 23, 2020

@jiangphcn we chatted on slack but I'll summarize some of that just for visibility

For error:badarg, we just need to find a better way to wait for couch_jobs to initialize there. Maybe call application:which_applications/1, or try to get the list of children from couch_jobs_sup.

For the general execution pattern it would be something like:

init(): respond back so supervisor initialization can proceed, then call wait_for_couch_jobs()

wait_for_couch_jobs() :

  • wait for couch_jobs to initialize and either get the cleanup job, or try to add it, then call run_loop()

run_loop():

  • call couch_jobs:accept() and wait for a job
  • once it gets a job, call process_expirations()
  • when it returns, call couch_jobs:resubmit(..., ScheduledTime = Now + 1 hour)
  • call run_loop() recursively

process_expirations():

  • go through all the dbs
  • periodically update the job's state (we could keep stats there like accept_time, last_update_time, scheduled_time, dbs_processed, dbs_deleted, etc...)
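Put together, the loop above might look roughly like this; the couch_jobs:accept and couch_jobs:resubmit argument shapes, and the process_expirations/2 body, are assumptions based on the outline, not verified APIs:

```erlang
%% Sketch of the run loop; call shapes follow the outline above and
%% are not taken from verified couch_jobs documentation.
run_loop() ->
    case couch_jobs:accept(?DB_DELETE_JOB_TYPE) of
        {ok, Job, JobData} ->
            %% Delete expired dbs, periodically updating job stats such
            %% as last_update_time, dbs_processed and dbs_deleted
            process_expirations(Job, JobData),
            %% Reschedule ourselves to run again in about an hour
            Now = erlang:system_time(second),
            couch_jobs:resubmit(undefined, Job, Now + 3600),
            run_loop();
        {error, not_found} ->
            %% No job available yet; keep waiting
            run_loop()
    end.
```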
