Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define Delete to be idempotent #104

Closed
Stebalien opened this issue Oct 23, 2018 · 15 comments · Fixed by #132, ipfs/go-ds-s3#29, ipfs/go-ds-leveldb#30 or ipfs/go-ds-flatfs#54
Closed

Define Delete to be idempotent #104

Stebalien opened this issue Oct 23, 2018 · 15 comments · Fixed by #132, ipfs/go-ds-s3#29, ipfs/go-ds-leveldb#30 or ipfs/go-ds-flatfs#54
Assignees

Comments

@Stebalien
Copy link
Member

Moving from: ipfs/go-ds-flatfs#30

  1. We usually only care if the operation succeeded.
  2. We definitely don't want to abort a batch operation because a delete failed.
  3. User code often just aborts/returns errors without thinking about them.
  4. Not all datastores can tell us if there was nothing to delete (e.g., badger).
  5. Flatfs no longer reliably tells us this.
  6. HTTP delete is defined to be idempotent so many web-backed datastores won't be able to fulfill this contract either. (also, there's a reason HTTP delete is idempotent...).

This boils down to:

  1. We can't reliably return ErrNotFound on delete making it useless except, maybe, as a source of metrics.
  2. Having to handle ErrNotFound on delete in user code is annoying.
  3. Having to try to return ErrNotFound on delete is annoying at best.

Proposal: Define it to be idempotent (i.e., return nil when the key isn't found).

@Stebalien
Copy link
Member Author

For example, see: https://github.com/ipfs/go-datastore/pull/103/files#diff-42ea5667b327bb207485077410d5f499R42. I could fix that, but it would require a bunch of extra state I don't really want to track (and it would cause a failed delete to fail the entire batch.

@kevina
Copy link
Contributor

kevina commented Oct 23, 2018

I don't have a strong objection. I would prefer a way to communicate back that a key is not found if that information is known, but as this may complicate things, this is not a hard requirement.

@Stebalien
Copy link
Member Author

The user can still call Has first. I agree that exposing all information we know is a good idea, it just makes life hard in practice (in this case).

@raulk
Copy link
Member

raulk commented Oct 24, 2018

Generally, I agree. DELETEs are idempotent in SQL, leveldb, badger, redis, but some databases will report the number of keys deleted. Maybe we should add capability for returning an (int, error), where the former can take value -1 if the DB does not support this feature?

For the record, go-ds-badger does not return ErrNotFound on deleting an inexistent key; it simply forwards the error from badger, which does not complain in this case.

  1. We definitely don't want to abort a batch operation because a delete failed.

In most of cases, yes. But ultimately this depends on the application logic. For example, in the Datastore TTL shim I've half-baked, we have two keys per TTL entry (for good reasons), and we might want deletes pertaining to a single TTL entry to be as atomic as possible.

@Stebalien
Copy link
Member Author

In most of cases, yes. But ultimately this depends on the application logic.

Ah, sorry, I meant we don't want to abort if the delete failed because the item wasn't there. We probably do want to abort on other errors.

Maybe we should add capability for returning an (int, error), where the former can take value -1 if the DB does not support this feature?

So we'd return:

  • 1 - We deleted something.
  • 0 - Nothing happened.
  • -1 - No idea.

(where we'd only return an error if something went wrong)

Personally, I'm not sure if implementing that is worth the work/complexity. Making DELETE idempotent is quite easy:

  1. Slowly remove all cases where we return ErrNotFound on delete. We're already failing to do this in some cases so this won't break anything.
  2. Eventually, define DELETE to be idempotent.
  3. Finally, remove all checks for ErrNotFound on delete.

Basically, we can do this slowly without having to fix everything up-front. Changing the method signature, on the other hand, will take a bit more work.

We could still do it, I just want to make sure the change is well motivated (i.e., we have a use-case where we actually care about this). If we do run into a case where we'd like to know, we can change the signature later (well, it'll be a breaking change but it'll be a breaking change either way).

@kevina
Copy link
Contributor

kevina commented Oct 24, 2018

@Stebalien I agree this is over-complicating things. I think the best to not worry about for now. A better way to handle this is to query the datastore afterwards for the status of the last operataion, but this will require the use of handles, which is something we may want to do anyway down the road.

@bigs
Copy link
Contributor

bigs commented Oct 24, 2018

i like @raulk's suggestion about returning the number of keys deleted... DBs that support versioning (badger) could potentially delete more than one key. so i'd think of it as <0, 0, >0

@Stebalien
Copy link
Member Author

@raulk @kevina @bigs so, are you fine making delete idempotent (for now) where, sometime in the future, we could consider adding a delete count? Really, I'm just trying to (a) simplify some code that tries to hard to track whether or not the operation succeeded and (b) trying to nail down semantics (I hate undefined behavior).

@kevina
Copy link
Contributor

kevina commented Oct 24, 2018

@Stebalien that fine with me.

@bigs
Copy link
Contributor

bigs commented Oct 24, 2018 via email

@raulk
Copy link
Member

raulk commented Oct 24, 2018

Idem. I have no immediate use case right now for the delete count.

@schomatis
Copy link

SGTM, in my book Delete just means that following calls to Get are the ones that will actually return the ErrNotFound error, no guarantees of what was there before issuing the call.

@momack2 momack2 added this to Inbox in ipfs/go-ipfs May 9, 2019
@MichaelMure
Copy link
Contributor

I was bitten by this one. In particular, the Delete function of MapDatastore in this repository is not idempotent. As it is often used to implement a test suite, it also force in some situation to deal with this case.

@MichaelMure
Copy link
Contributor

Another one not idempotent is flatfs.

Stebalien added a commit that referenced this issue Aug 21, 2019
Stebalien added a commit to ipfs/go-ds-s3 that referenced this issue Aug 21, 2019
Stebalien added a commit to ipfs/go-ds-s3 that referenced this issue Aug 21, 2019
Stebalien added a commit to ipfs/go-ds-leveldb that referenced this issue Aug 21, 2019
Stebalien added a commit to ipfs/go-ds-s3 that referenced this issue Aug 21, 2019
@Stebalien
Copy link
Member Author

All done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment