
[Colossus] Add pruning service #4845

Merged

Conversation

zeeshanakram3
Contributor

@zeeshanakram3 zeeshanakram3 commented Aug 28, 2023

addresses #4813

This PR adds:

  • a cleanup service to prune outdated assets, and a configuration option to enable/disable it on storage-node startup
  • a 'dev:cleanup' CLI command for performing pruning actions (ideally it should not be used in production)
  • an extended '/status' endpoint with additional info, e.g. the sync, cleanup/pruning & serving bucket configurations

How the pruning of assets works

The cleanupService runs the data objects cleanup/pruning workflow. It removes all locally stored data objects that the operator is no longer obliged to keep, either because the data object has been deleted from the runtime or because it has been moved to some other bucket(s).

PRECONDITIONS:

  • Since the cleanup uses the QueryNode to query the data obligations, the QueryNode processor must not lag more than MAXIMUM_QN_LAGGING_THRESHOLD blocks behind the chain; otherwise the cleanup workflow is not performed, to avoid pruning assets based on outdated state.
  • If an asset being pruned from this storage node still exists in the runtime (i.e. its storage obligation has been moved), then at least "X" other storage nodes must hold the asset, where "X" is defined by MINIMUM_REPLICATION_THRESHOLD; otherwise the cleanup workflow is not performed.
  • If an asset being pruned from this storage node is currently being downloaded by some external actor, the cleanup action for that asset is postponed.
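These preconditions could be sketched roughly as follows. The constant names mirror the PR description, but their values and all helper names here are illustrative assumptions, not the actual Colossus implementation:

```typescript
// Illustrative threshold values; the real values live in the storage-node config.
const MAXIMUM_QN_LAGGING_THRESHOLD = 100 // blocks
const MINIMUM_REPLICATION_THRESHOLD = 2 // other nodes holding the asset

interface CleanupContext {
  qnProcessorBlock: number // last block processed by the QueryNode
  chainHeadBlock: number // current chain head
}

// Precondition 1: skip the whole cleanup run if the QueryNode lags too far,
// so pruning decisions are never based on outdated state.
function shouldRunCleanup(ctx: CleanupContext): boolean {
  return ctx.chainHeadBlock - ctx.qnProcessorBlock <= MAXIMUM_QN_LAGGING_THRESHOLD
}

// Preconditions 2 and 3, decided per data object.
function canPruneObject(
  stillInRuntime: boolean,
  replicaCount: number,
  activeDownloads: number
): 'prune' | 'postpone' | 'skip' {
  // Postpone pruning while the asset is being served to external actors.
  if (activeDownloads > 0) return 'postpone'
  // If the object still exists in the runtime (obligation moved elsewhere),
  // only prune when enough other nodes hold a replica.
  if (stillInRuntime && replicaCount < MINIMUM_REPLICATION_THRESHOLD) return 'skip'
  return 'prune'
}
```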

Contributor

@mnaamani mnaamani left a comment

Overall this looks good and will be quite valuable. Left some feedback. Also needs an update from master to fix some merge conflicts.

storage-node/src/commands/dev/cleanup.ts (outdated, resolved)
@@ -62,6 +66,16 @@ export default class Server extends ApiCommandBase {
description: 'Interval between synchronizations (in minutes)',
default: 1,
}),
cleanup: flags.boolean({
char: 's',
Contributor

char 's' is already assigned to sync. Better to remove it, as I don't know what the behavior is if the short flag is used and multiple flags are assigned the same char.

Contributor Author

Done, changed the short flag to c. WDYT?

cleanupInterval: flags.integer({
char: 'i',
description: 'Interval between periodic cleanup actions (in minutes)',
default: 1,
Contributor

I would say this should be much higher, like 1h, maybe even longer?
Depends on how much work/time it takes per cleanup run.

Contributor Author

Yeah, ideally it should be higher than the sync interval. I just set it to 1 min to test the pruning behavior locally. Now I have set it to 6 hrs.

storage-node/src/commands/server.ts (outdated, resolved)
storage-node/src/services/caching/localDataObjects.ts (outdated, resolved)
): Promise<{ dataObjectId: DataObjectId; pinnedCount: DataObjectPinCount } | undefined> {
if (idCache.has(dataObjectId)) {
await lock.acquireAsync()
const id = { dataObjectId, pinnedCount: idCache.get(dataObjectId) as DataObjectPinCount }
Contributor

The return type of the function doesn't seem to match id: dataObjectId is a string, whereas in the return type it is DataObjectId (u64).

Contributor Author

The DataObjectId is also a string, as you can see.

Contributor

Ah yes, I didn't spot that type definition.


const timeoutMs = 60 * 1000 // 1 minute since it's only a HEAD request
const deletionTasks: DeleteLocalFileTask[] = []
await Promise.all(
Contributor

Suggested change
await Promise.all(
await Promise.allSettled(

To avoid the promise being rejected when HEAD request to one data object fails, Promise.allSettled() is more appropriate.

Also, I can imagine that if the number of data objects is potentially very large, doing a massive number of HTTP HEAD requests in parallel might be a bad idea. Perhaps we should do it in batches.

Contributor Author

Done, but I had to change the target option in tsconfig.json to es2020, as Promise.allSettled does not seem to be available in es2017.
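The batched Promise.allSettled approach discussed above could be sketched as follows; processInBatches is a hypothetical helper, not the PR's actual code:

```typescript
// Run an async task over items in fixed-size batches with Promise.allSettled
// (ES2020), so one failed HEAD request neither rejects the whole run nor
// floods the network with thousands of parallel requests.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  task: (item: T) => Promise<R>
): Promise<PromiseSettledResult<R>[]> {
  const results: PromiseSettledResult<R>[] = []
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize)
    // allSettled collects both fulfilled and rejected outcomes instead of
    // short-circuiting on the first rejection like Promise.all.
    results.push(...(await Promise.allSettled(batch.map(task))))
  }
  return results
}
```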

Contributor

@mnaamani mnaamani left a comment

One final small change and this should be ready to test in production :)
And updating from master to resolve merge conflict.

dataObjectId: string
): Promise<{ dataObjectId: DataObjectId; pinnedCount: DataObjectPinCount } | undefined> {
if (idCache.has(dataObjectId)) {
await lock.acquireAsync()
Contributor

Shouldn't the lock be acquired before checking idCache.has()?

Contributor Author

Addressed. However, placing the lock before the if condition can lead to unnecessary blocking if the dataObjectId is not present in the cache. So I reimplemented it by first checking the cache without acquiring the lock: this is a quick operation, and if the dataObjectId is not present we just return without the overhead of acquiring the lock.
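The check-then-lock pattern described above could be sketched roughly like this, with a simplified async mutex standing in for the node's actual lock; all names here are illustrative, not the PR's real code:

```typescript
// Minimal FIFO async mutex: acquireAsync resolves with a release function
// once all earlier holders have released.
class SimpleLock {
  private tail: Promise<void> = Promise.resolve()
  async acquireAsync(): Promise<() => void> {
    let release!: () => void
    const gate = new Promise<void>((resolve) => (release = resolve))
    const prev = this.tail
    this.tail = prev.then(() => gate)
    await prev
    return release
  }
}

const idCache = new Map<string, number>()
const lock = new SimpleLock()

async function pinDataObject(dataObjectId: string): Promise<number | undefined> {
  // Fast path: a plain Map lookup is cheap, so bail out early without
  // paying the cost of acquiring the lock.
  if (!idCache.has(dataObjectId)) return undefined
  const release = await lock.acquireAsync()
  try {
    // Re-check under the lock: the entry may have been removed while waiting.
    const count = idCache.get(dataObjectId)
    if (count === undefined) return undefined
    idCache.set(dataObjectId, count + 1)
    return count + 1
  } finally {
    release()
  }
}
```

Note the re-check inside the critical section: the lock-free has() check is only an optimization, so the authoritative read still happens under the lock.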

@mnaamani mnaamani changed the base branch from master to colossus-beta November 17, 2023 05:52
@mnaamani mnaamani self-requested a review November 17, 2023 05:53
Contributor

@mnaamani mnaamani left a comment

I will merge this work into a new branch colossus-beta where we can do final tweaks for a "beta" release.

@mnaamani mnaamani merged commit c9ca6f9 into Joystream:colossus-beta Nov 17, 2023
22 of 23 checks passed