Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Colossus 3.6.0 #4796

Merged
merged 14 commits into from
Jul 26, 2023
Merged

Colossus 3.6.0 #4796

merged 14 commits into from
Jul 26, 2023

Conversation

mnaamani
Copy link
Contributor

@mnaamani mnaamani commented Jun 26, 2023

Fixes #4794

  • Added support for multiple transactor accounts when running the server, so we can take advantage of ability for operators to assign different transactor accounts per bucket.

  • Added support specifying which buckets to serve when running a server.

  • Bump version to v3.6.0

In combination these two changes now allows workers who operate multiple buckets to be able to run multiple nodes, for example with each node serving a single bucket.

  • Also finally added behavior to allow server to startup even if transactor account is not provided to one or more buckets. The server will still serve downloads and sync those buckets. This is very useful for testing. I can run a node and specify a valid worker id and test the nodes behavior, open telemetry traces etc.
yarn storage-node server -d ~/tmp/ -o 3333 --apiUrl wss://rpc.joystream.org \
    -s -r 1 -w 8 --buckets 0,1,2,3,4,5,6,7

2023-07-16 03:07:35:735 info: Initializing runtime connection to: wss://rpc.joystream.org
2023-07-16 03:07:38:738 info: Waiting for chain to be synced before proceeding.
2023-07-16 03:07:38:738 info: Query node endpoint set: http://localhost:8081/graphql
2023-07-16 03:07:38:738 debug: Query - storageBucketsConnection
 ›   Warning: Missing transactor key for bucket 1
2023-07-16 03:07:39:739 warn: Only subset of buckets will process uploads!
2023-07-16 03:07:39:739 info: Buckets synced and served: 1
2023-07-16 03:07:39:739 info: Buckets accepting uploads: 
2023-07-16 03:07:39:739 info: Removing temp directory ...
2023-07-16 03:07:39:739 info: Creating temp directory ...
2023-07-16 03:07:39:739 debug: Local ID cache loaded.
2023-07-16 03:07:39:739 info: Synchronization enabled.
2023-07-16 03:07:39:739 debug: Max file size runtime parameter: 64424509440
2023-07-16 03:07:39:739 info: Listening on http://localhost:3333
2023-07-16 03:07:39:739 info: Sync paused for 1 minute(s).

Even an invalid worker id can be specified, in which-case the server will startup, but with very limited functionality. This is useful for preparing a new node, testing endpoint accessibility, and general testing. We should add a separate command for this purpose.

@vercel
Copy link

vercel bot commented Jun 26, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Updated (UTC)
pioneer-testnet ⬜️ Ignored (Inspect) Jul 20, 2023 10:21pm

@mnaamani mnaamani changed the title Colossus 3.5.0 Colossus 3.6.0 Jul 14, 2023
Copy link
Contributor

@zeeshanakram3 zeeshanakram3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good. I think just some cosmetic changes are required.

I tested this seems to be working as expected. The overall code refactorization and customization were long overdue, so great job.

@@ -87,15 +87,13 @@ type DataObject = {
*/
export async function getStorageObligationsFromRuntime(
queryNodeUrl: string,
workerId: number
workerId: number,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workerId param is not used anymore


if (!(await verifyWorkerId(api, workerId))) {
logger.warn(`workerId ${workerId} does not exist in the storage working group`)
// this.exit(ExitCodes.InvalidWorkerId)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kindly remove this commented line, since we have decided to bootstrap storage-node even if an invalid workerId is specified

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thoughts, decided to revert allowing invalid workerId: https://github.com/Joystream/joystream/pull/4796/files#r1270006202

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thoughts, decided to revert allowing invalid workerId: https://github.com/Joystream/joystream/pull/4796/files#r1270006202

Sounds good

@@ -302,19 +354,9 @@ async function verifyWorkerId(
)
)
)
.filter(([bucketId]) => bucketsToServe.length === 0 || bucketsToServe.includes(parseInt(bucketId)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should be the behavior if the workerId input is invalid & list of buckets that needed to be served is valid, should the storage-node serve those buckets? The current implementation does not allow for that.

I think workerId & buckets flags can be mutually exclusive as we can derive the worker info if the buckets flag is specified i.e. if buckets is provided then we can be assured that worker ID is the worker that is assigned to serve these buckets in the runtime. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an excellent point I didn't consider that because allowing invalid worker id was a last minute change i didn't without thinking it through, also I just realized that a storage bucket can only have one operator (unlike distribution buckets).

if buckets is provided then we can be assured that worker ID is the worker that is assigned to serve these buckets in the runtime. WDYT?

But in this case you can pass buckets for different workerId.. should we allow that? (It would only make sense if both 'workers' are the same person or are cooperating (sharing transactor-key and maybe role key))

I'm tempted to completely drop the workerId flag and only have the buckets flag so you always have to be explicit as to what bucket to serve.

I don't want to make this overly complicated and prone to making mistakes, just to allow a more testable node. I think its better that I revert this change so you must provide a valid worker id, and we can separately make have another command just for testing purpouses which would allow specifying any bucket.

Comment on lines +158 to +166
async loadKeys(flags: {
dev: boolean
keyFile?: string
password?: string | string[]
accountUri?: string | string[]
keyStore?: string
}): Promise<void> {
const keyring = this.getKeyring()
const { dev, password, keyFile, accountUri, keyStore } = flags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to comfirm, what is the desired behavior here? Should these flags (keyFile, accountUri, keyStore) be allowed to be simultaneously used or they should be exclusive? Currently, they can be simultaneously used, if desired otherwise, these should be marked as exclusive

Copy link
Contributor Author

@mnaamani mnaamani Jul 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No they are not exclusive, all can be used at the same time.

@@ -223,6 +226,10 @@ async function validateUploadFileParams(req: express.Request, res: express.Respo
const dataObjectId = new BN(req.query.dataObjectId?.toString() || '')
const bagId = req.query.bagId?.toString() || ''

if (!res.locals.uploadBuckets.includes(req.query.storageBucketId?.toString() || '')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess storageBucketId defined above can be used here instead of duplicating the expression req.query.storageBucketId?.toString() || ''?

@@ -56,13 +61,14 @@ export default class DevSync extends Command {

async run(): Promise<void> {
const { flags } = this.parse(DevSync)

const bucketId = '' + flags.bucketId
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const bucketId = '' + flags.bucketId
const bucketId = flags.bucketId.toString()

I think better to use .toString() method?

Copy link
Contributor

@zeeshanakram3 zeeshanakram3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@mnaamani mnaamani merged commit 8458489 into Joystream:master Jul 26, 2023
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Colossus] Storage Node - artificial operating limitations
2 participants