
Safe block deletion (with ref count) #631

Open · wants to merge 6 commits into master from safe-block-deletion

Conversation


@tbekas tbekas commented Nov 21, 2023

No description provided.

@tbekas tbekas force-pushed the safe-block-deletion branch 2 times, most recently from 3833072 to 92fea28 Compare November 28, 2023 14:26
@tbekas tbekas force-pushed the safe-block-deletion branch 3 times, most recently from 0f86d9f to eb90288 Compare December 13, 2023 16:11
@tbekas tbekas changed the title WIP, Safe block deletion (with ref count) Safe block deletion (with ref count) Dec 13, 2023
@tbekas tbekas force-pushed the safe-block-deletion branch 2 times, most recently from c96b0f2 to bfc467e Compare December 13, 2023 17:24
@tbekas tbekas requested a review from dryajov December 13, 2023 17:29
@tbekas tbekas requested a review from elcritch December 13, 2023 17:53
@tbekas tbekas marked this pull request as ready for review December 14, 2023 15:32
@tbekas tbekas force-pushed the safe-block-deletion branch 5 times, most recently from 3b2d85e to f8c13ea Compare December 20, 2023 11:15
totalBlocks* {.serialize.}: Natural
quotaMaxBytes* {.serialize.}: NBytes
quotaUsedBytes* {.serialize.}: NBytes
quotaReservedBytes* {.serialize.}: NBytes

Contributor

Great, you just broke a bunch of deserializers.
NBytes serializes to a string like "123'NByte", where previously this JSON field was just a number. Since the field names already contain "Bytes", do we want this? If yes, I'll update the deserializers.
I suppose we could adjust NBytes to JSON-serialize to plain numbers, since we're not building a UI here; NBytes always appends "'NByte" regardless of whether the number is in the KB/MB/GB range, so it's not really human-readable anyway.
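
For illustration, a minimal sketch of serializing NBytes as a plain number (std/json is used here for brevity; the project's own serde hooks may differ):

import std/json

type NBytes = distinct int64

# Emit NBytes as a bare JSON number instead of the "123'NByte" string form.
proc `%`(n: NBytes): JsonNode =
  %n.int64

echo %*{"quotaUsedBytes": NBytes(100)}  # -> {"quotaUsedBytes":100}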

Contributor Author

If I do curl -s 127.0.0.1:8080/api/codex/v1/space | jq I get

{
  "totalBlocks": 0,
  "quotaMaxBytes": 8589934592,
  "quotaUsedBytes": 0,
  "quotaReservedBytes": 0
}

I think the Bytes suffix should stay on the property names.

KeyVal[T] = (?Key, ?!T)
ResIter[T] = Future[KeyVal[T]]


Contributor

I can't find any tests. You could probably run these using SQLiteDatastore.new(Memory), right?
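
Something along these lines, perhaps (a hedged sketch; the asynctest helpers and the exact nim-datastore signatures are assumptions):

import pkg/asynctest
import pkg/datastore
import pkg/questionable/results

test "stores and retrieves a value in memory":
  let
    ds = SQLiteDatastore.new(Memory).tryGet()
    key = Key.init("/a").tryGet()

  # put, then read the same key back
  (await ds.put(key, @[1'u8, 2'u8])).tryGet()
  check (await ds.get(key)).tryGet() == @[1'u8, 2'u8]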

Contributor Author
@tbekas tbekas Jan 11, 2024

It's tested transitively via testrepostore.nim. I think we can live without a dedicated test suite for this file, since the functionality is small and used only in repostore.nim.

Contributor

This looks like something that should live in datastore proper? And I'd actually add some tests for it, as suggested.

proc isFinished(): bool = true

Iter.new(genNext, isFinished)

Contributor

I really like the Iter/AsyncIter types. Given how fundamental they can be to the codebase, I think they really need some tests. Even if the tests are very simple and kinda silly, like testing that an empty iterator is finished.... :b But that just means the tests are easy to write!
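
For example, a minimal sketch of the "silly" case, reusing the Iter.new(genNext, isFinished) constructor shown above (the finished accessor is an assumption about the Iter API):

test "empty iterator is finished":
  proc genNext(): int = 0          # never invoked once finished
  proc isFinished(): bool = true

  let iter = Iter.new(genNext, isFinished)
  check iter.finished()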

Contributor Author

I agree, I will add tests for this file.

pb.finish
pb.buffer

proc autodecode*(T: typedesc[tuple | object], bytes: seq[byte]): ?!T =

Contributor

This file was a good idea, I think. I'd definitely test the autoencode/autodecode. The other ones would be very easy, but I could live without tests for those.
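
A roundtrip sketch of what such a test could look like (autoencode's signature is assumed to mirror the autodecode shown above):

test "autoencode/autodecode roundtrip":
  type Sample = object
    n: uint64
    s: string

  let original = Sample(n: 42, s: "hello")
  # decode(encode(x)) should give back exactly x
  let decoded = Sample.autodecode(autoencode(original)).tryGet()

  check decoded == original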

Contributor Author

I agree, I will add a few simple test cases.

blockTtl*: Duration
started*: bool

QuotaUsage* = object

Contributor

Could maybe make a 'Quota' object, pull the quotaMaxBytes in here too, and put this object + all related procs in a different file? I don't know if there are enough Quota-procs to justify this, but it'd make this file a little smaller maybe.
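
Something like this, hypothetically (field names are assumptions drawn from the serialized properties above):

type
  Quota* = object
    maxBytes*: NBytes       # configured limit
    usedBytes*: NBytes      # bytes currently stored
    reservedBytes*: NBytes  # bytes reserved but not yet stored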

Contributor Author

QuotaUsage is a db entity, while quotaMaxBytes is a configuration property that stays immutable throughout the lifetime of the app. Therefore I think they belong in different places. If we want to present them together, they can be combined in the presentation layer (JSON), as is already done in the RestRepoStore object.

Also, I think that either all db entities should be in separate files or none of them, to keep the codebase consistent. Moving QuotaUsage to a separate file while keeping the other entities here would be inconsistent.

res: DeleteResult

if currMd =? maybeCurrMd:
if currMd.refCount == 0 or currMd.expiry > expiryLimit:

Contributor

currMd.expiry > expiryLimit
expiry is a timestamp, right? Does this mean we delete blocks if their currMd.expiry is further into the future than expiryLimit?

Contributor Author
@tbekas tbekas Jan 11, 2024

You're 100% right, there's a logic error here. It should be currMd.expiry < expiryLimit and the default value of expiryLimit should be SecondsSince1970.low instead of SecondsSince1970.high.

Thank you for catching this! I will add a test case for this.
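
A sketch distilling the corrected guard (shouldDelete is a hypothetical helper, not the actual repostore code):

type SecondsSince1970 = int64

# A block may only be deleted when it is unreferenced or already past the
# expiry limit. With the default of SecondsSince1970.low, expiry alone
# never triggers deletion.
proc shouldDelete(refCount: Natural, expiry: SecondsSince1970,
    expiryLimit = SecondsSince1970.low): bool =
  refCount == 0 or expiry < expiryLimit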

return failure(err)
trace "Leaf metadata stored, block refCount incremented"
else:
trace "Leaf metadata already exists"

Contributor

if (we just stored this block metadata)
-> then we increase the ref count by one
else (it was already stored)
-> we do nothing

Shouldn't that be the other way around?
if (we stored the block)
-> do nothing (it got stored with a ref count of 1)
else (it was already there)
-> increase ref count

Contributor Author
@tbekas tbekas Jan 11, 2024

No, the refCount refers to how many trees are referring to this block.

If you store just a block (without any tree references yet), what should its initial refCount be? In my opinion it's 0, since that's the number of trees referencing it.

Then, when adding new references to this block (new leaf metadata), we increase the refCount by 1.
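
An illustrative sketch of these semantics (type and proc names are assumptions, not the actual API):

type BlockMd = object
  refCount: Natural

# A bare block stored without any tree references starts at 0.
var md = BlockMd(refCount: 0)

# Each new leaf metadata entry (a tree referencing the block) bumps it.
proc addLeafReference(md: var BlockMd) =
  inc md.refCount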

trace "Block Stored"
if err =? (await self.updateQuotaUsage(plusUsed = res.used)).errorOption:
# rollback changes
without delRes =? await self.tryDeleteBlock(blk.cid), err:

Contributor

If I'm reading this right, we store the block first, then check whether this exceeds the quota.
Shouldn't that be the other way around? Ask the quota if it has space for this, and error out immediately if it doesn't.

Contributor Author
@tbekas tbekas Jan 11, 2024

I will think about a way to address it, but it's really not that trivial. First of all, we don't know whether we should attempt to increase the quota until we try to store the block, because it may turn out that the block is already stored and there was no reason to touch the quota in the first place.

It seems easy on the surface, but concurrency errors might occur that eventually lead to permanent inconsistencies in the db, which should be avoided at all cost. Keep in mind that we need to keep 4 different records consistent across 2 different stores:

  • block in repoDs
  • block metadata in metaDs
  • quota usage in metaDs
  • total blocks in metaDs

Contributor

All of this should be written as a batch in one go with the put*(self: Datastore, batch: seq[BatchEntry]) method; this will avoid inconsistencies across these stores, the only exception being the repoDs, which is currently a separate datastore. Another approach would be moving the repoDs into the same table as the rest of the keys, but that might have additional performance consequences - still worth a try.

We can also think about adding a concurrent store on top that will update/rollback a batch of put operations in one transaction.
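
Roughly like this (a sketch against the put(batch: seq[BatchEntry]) method quoted above; the key constants and encoders are assumptions from this discussion, not the actual code):

# All metaDs records are written atomically in one batch, so a failure
# leaves no partial state to roll back.
let batch: seq[BatchEntry] = @[
  (mdKey, encode(blockMd)),
  (QuotaUsedKey, encode(quotaUsage)),
  (TotalBlocksKey, encode(totalBlocks))
]

if err =? (await metaDs.put(batch)).errorOption:
  return failure(err)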

Contributor Author

Following up on an offline discussion regarding this issue. We agreed that all the items we store in metaDs, which are:

  • block metadata
  • quota byte usage
  • total number of blocks

should be saved in a single batch and a single transaction. However, we currently don't have a modify API that accepts batch input, and adding such an API would require some additional development effort in nim-datastore, so we're fine with a weaker consistency guarantee for now: consistency between the block metadata and the block data (a file).

discard (await metaDs.get(QuotaReservedKey)).tryGet
repo.totalUsed == 100'nb
repo.quotaUsedBytes == 100'nb
repo.quotaReservedBytes == 0'nb

test "Should release bytes":
discard createTestBlock(100)

Contributor

why do we make this block, and then throw it away?
that's not very nice to the block. :o

Contributor Author

I haven't added this code, but as far as I can tell the goal is to verify that the quota usage changes after creating a block.


test "Should not update block expiration timestamp when current expiration is farther then new one":
test "Should update block expiration timestamp when new expiration is farther":

Contributor

This one does the opposite, which is correct. The title is a copy-paste error.

Contributor Author

Ah, you're right, thanks for catching it!

@tbekas tbekas force-pushed the safe-block-deletion branch 2 times, most recently from 4ac3464 to 31f431b Compare January 18, 2024 15:28
@tbekas tbekas force-pushed the safe-block-deletion branch 6 times, most recently from d7053f7 to ac896f9 Compare January 31, 2024 16:54
n = uint64.fromBytesBE(data[0..<sizeof(uint64)]).int
cid = ? Cid.init(data[sizeof(uint64)..<sizeof(uint64) + n]).mapFailure
success(cid)
proc encode(t: Cid): seq[byte] = t.data.buffer

Contributor

This file is already quite busy; can we perhaps split these helpers out into their own unit?

Contributor Author

Except for Cid, all encode/decode procs are defined here because the corresponding types are also defined here.

Contributor

Then it's probably good to extract the types into their own types module and share it across both; this is the commonly accepted approach.

Contributor

This file needs to be split into several functional units that group related functionality.

I suggest the following structure:

  • repostore/types.nim - common types shared by all other functional units
  • repostore/coders.nim - encoding/decoding primitives
  • repostore/operations.nim - the update/get/modify operations
  • repostore/store.nim - the actual repostore functionality
  • repostore.nim - top-level module that exports the required public functional units/modules

This structure would make the changes easier to track and compare with previous versions. It's quite hard to follow what's going on right now, and the file is already quite large and busy with unrelated functionality.
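
For instance, the top-level repostore.nim could just re-export the proposed submodules (module names as listed above; purely illustrative):

import ./repostore/types
import ./repostore/coders
import ./repostore/operations
import ./repostore/store

export types, coders, operations, store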

Contributor Author

Ok

Contributor
@dryajov dryajov left a comment

This is quite a lot of work. I think we're at a point now where we can start merging this into master; can we rebase it onto master so I can give it a proper review?

@tbekas tbekas force-pushed the safe-block-deletion branch 4 times, most recently from a951bbf to dd212ef Compare February 16, 2024 17:14
@@ -4,8 +4,8 @@ import pkg/questionable
 import pkg/chronos
 
 type
-  Function*[T, U] = proc(fut: T): U {.raises: [CatchableError], gcsafe, noSideEffect.}
-  IsFinished* = proc(): bool {.raises: [], gcsafe, noSideEffect.}
+  Function*[T, U] = proc(fut: T): U {.raises: [CatchableError], gcsafe, closure.}

Contributor

The closure pragma is unnecessary here; when declaring a proc type, it's implicitly a closure.
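
I.e. the pragma can simply be dropped (same type, minus the redundant closure):

type
  # A proc type already has the closure calling convention by default.
  Function*[T, U] = proc(fut: T): U {.raises: [CatchableError], gcsafe.}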

@@ -0,0 +1,5 @@


type

Contributor

leftover from other work?

Contributor Author

It is 🤦 Removing it

Contributor
@benbierens benbierens left a comment

Overall good improvements.
Commit sha-5e952f0 passes dist-tests. 👍

trace "Stopping repo"
await self.close()

self.started = false

Contributor

Nitpick: we usually have newlines at EOF; maybe we should add a GitHub hook to enforce it...
