Design reliable expiration #26

djmitche · 2020-11-22T00:05:52Z

The sync model is such that there's a point in the list of operations that is committed. That can be used to determine when to delete an expired task -- basically, when an op after its expiration time has been committed.

That might need to be recorded as an operation? Needs a bit more thought.

djmitche · 2020-12-03T03:16:44Z

Note that this is not about marking a task as Status::Deleted. That's just a visibility thing. This is about actually removing the task from the database.

The problem with deleting tasks is that the only way to combine an Update and a Delete operation is into a Delete (since the deleted properties are all gone). And that is a kind of data loss.

So, we want to make sure that a task is only actually deleted from the DB when it's very unlikely that it will conflict with an update. One way to do that may be to only delete tasks when their modified timestamp is sufficiently far in the past -- and more than just a few days. We'll probably want configurable times, and probably different times for Status::Deleted vs. Status::Completed tasks. Users might want to do historical analyses of completed tasks, but not care about deleted, for example.
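A minimal sketch of that expiry check, under stated assumptions: `ExpirationConfig`, `is_expired`, and the simplified `Status` enum are hypothetical names for illustration, not an existing API, and timestamps are plain seconds since the Unix epoch.

```rust
use std::time::Duration;

/// Hypothetical, simplified task status mirroring the statuses named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pending,
    Completed,
    Deleted,
}

/// Hypothetical configuration: how long a task must go unmodified
/// before it may be removed from the DB, per status.
pub struct ExpirationConfig {
    pub deleted_after: Duration,
    pub completed_after: Duration,
}

/// Returns true when a task is eligible for removal from the DB.
/// `modified` and `now` are seconds since the Unix epoch (an assumption).
pub fn is_expired(status: Status, modified: u64, now: u64, cfg: &ExpirationConfig) -> bool {
    let threshold = match status {
        // Pending tasks never expire in this sketch.
        Status::Pending => return false,
        Status::Deleted => cfg.deleted_after,
        Status::Completed => cfg.completed_after,
    };
    // saturating_sub guards against a `modified` timestamp in the future.
    now.saturating_sub(modified) >= threshold.as_secs()
}
```

With, say, 30 days for deleted tasks and a year for completed ones, a deleted task untouched for 31 days would be eligible for removal while a completed task of the same age would be kept.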

djmitche · 2021-01-09T23:46:12Z

I think this will be one of the "gc" operations, and should probably be done only after a sync. Scan the DB for expired tasks (using the definition of expired described above) and add Delete operations for those tasks.
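Roughly, the post-sync scan could look like the sketch below. Everything here is an assumption made for illustration: `TaskId` is a stand-in for a UUID, `Task` carries only a modified timestamp, `Op` has only the `Delete` variant, and `gc_expired` is a hypothetical helper rather than a function in the codebase.

```rust
use std::collections::HashMap;

/// Stand-in for a task UUID (hypothetical; a real store would use a UUID type).
pub type TaskId = u32;

/// Only the operation this sketch needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Op {
    Delete { id: TaskId },
}

/// Hypothetical minimal task record.
pub struct Task {
    /// Last-modified time, seconds since the Unix epoch.
    pub modified: u64,
}

/// Scan all tasks and produce one Delete operation per expired task.
/// The expiry rule is passed in as a predicate so the policy stays configurable.
pub fn gc_expired<F>(tasks: &HashMap<TaskId, Task>, is_expired: F) -> Vec<Op>
where
    F: Fn(&Task) -> bool,
{
    let mut ids = Vec::new();
    for (id, task) in tasks {
        if is_expired(task) {
            ids.push(*id);
        }
    }
    // HashMap iteration order is unspecified; sort for deterministic output.
    ids.sort_unstable();
    ids.into_iter().map(|id| Op::Delete { id }).collect()
}
```

The returned operations would then be added to the local operation list like any other change, so they reach other replicas on the next sync.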

The result for users will be that any further operations on that task will be ignored, which is not terribly confusing.

savchenko · 2021-03-25T13:00:03Z

Having data actually deleted and not merely hidden is an important property. Can we do something like this?

Each task has the following structure: {UUID: [k, v], [k, v], ...}
When a task is deleted, it becomes {UUID: nop}
When any other client syncs, it deletes the tasks whose entry is nop.

djmitche · 2021-03-25T13:25:26Z

The problem is that we don't store tasks, we store operations, and those operations need to be transposable. An Update(taskId, "status", "deleted") operation is easily transposed with other Update operations, but Delete(taskId) and Update(taskId, "k", "v") don't transpose without losing the k/v data in the second operation. So a delete would "win" over a simultaneous update, for example.
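To make the transposition problem concrete, here is a simplified sketch. It is not the project's actual implementation: the `Op` shape, the `u32` task id, and the "second op wins" tie-break for updates to the same key are all assumptions. `transform(a, b)` returns `(a', b')` such that applying `a` then `b'` yields the same state as applying `b` then `a'`, with `None` meaning "nothing left to apply".

```rust
/// Simplified operations; `id` stands in for a task UUID.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Op {
    Update { id: u32, key: String, value: String },
    Delete { id: u32 },
}

/// Transpose two concurrent operations.
pub fn transform(a: &Op, b: &Op) -> (Option<Op>, Option<Op>) {
    use Op::*;
    match (a, b) {
        // Same task and key: one value has to win. Here `b` wins arbitrarily;
        // a real implementation would more likely compare timestamps.
        (Update { id: i1, key: k1, .. }, Update { id: i2, key: k2, .. })
            if i1 == i2 && k1 == k2 =>
        {
            (None, Some(b.clone()))
        }
        // Delete vs. Update on the same task: the Delete survives and the
        // Update is dropped, so its key/value data is lost.
        (Delete { id: i1 }, Update { id: i2, .. }) if i1 == i2 => (Some(a.clone()), None),
        (Update { id: i1, .. }, Delete { id: i2 }) if i1 == i2 => (None, Some(b.clone())),
        // Both sides deleted the same task: nothing left to do.
        (Delete { id: i1 }, Delete { id: i2 }) if i1 == i2 => (None, None),
        // Different tasks, or different keys on one task: the ops commute.
        _ => (Some(a.clone()), Some(b.clone())),
    }
}
```

The two Delete/Update arms are where the data loss shows up: whichever order the operations arrive in, the Update's key and value never make it into the final state.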

This actually mirrors TW -- task 123 delete just marks the task as deleted. What TW doesn't do is eventually reclaim the space in that task. This proposal fixes the latter bit, by actually deleting the data some time after the deletion operation, when it's more certain that the user won't be making simultaneous updates to the task.

I guess the idea behind "actually deleted" is user data sovereignty? Like when I delete a facebook post it'd be nice if facebook actually deleted it instead of just marking it invisible? I get that, but going back to the data model, we store operations, and those are in an immutable chain. So even if I add a Delete(taskId) operation to that chain, further back in the chain there's an Update(taskId, "description", "Lonnie says PharmaCorp will cure cancer tomorrow, BUY BUY BUY") that can't be removed without invalidating all operations after that one, even if Lonnie's lawyer really wants you to. So, I think the best we can do is to be clear that "deleted" data can still be recovered if necessary -- both by making an "undelete" operation and with a big "NOTE" in the section on expiration. That note can also suggest some kind of export-and-re-import as a way to "start over" without deleted/expired tasks.

Note, again, that Google Docs (the canonical example of an OT-based document) has the same characteristic: you can't delete changes in a gdoc history, but you can make a new gdoc and copy the latest state into it, then delete the entire old document.

djmitche added the rust-api label Jan 2, 2021

savchenko added the TBD label Apr 28, 2021

djmitche removed the TBD label Sep 26, 2021

djmitche added this to the v0.5.0 milestone Sep 26, 2021

dbr mentioned this issue Oct 25, 2021

Add integration tests for snapshots #310

Merged

djmitche self-assigned this Jan 29, 2022

djmitche linked a pull request Mar 6, 2022 that will close this issue

Add support for expiration #341

Merged

djmitche closed this as completed in #341 Mar 10, 2022

This was referenced Apr 25, 2024

Add support for purging tasks #383

Closed

Optional support for automatically expiring old tasks GothenburgBitFactory/taskwarrior#3402

Closed
