GQL Timeouts are reached on large scale drives with default public gateway #348

Open
madjin opened this issue May 16, 2024 · 8 comments

madjin commented May 16, 2024

Whenever I try any command now, I get this error: Error: GQL Error: Cannot read properties of null (reading 'transactions')

Example commands:

ardrive list-folder --parent-folder-id "f82a6864-d5d5-45a7-9c56-7dc1da125926" --max-depth 1

ardrive create-manifest --folder-id "cbc1a69c-1367-441a-9615-534d3c9c2ad2" -w /path/to/my/key.json --boost 1.5

I mainly want to create a manifest for that folder. Is it because there are too many files in my drive now? Creating the manifest was the final step for this archived collection.

@fedellen self-assigned this May 17, 2024
@fedellen
Collaborator

hey there!

it looks like your query to GraphQL is timing out. recently the public gateway implemented some timeouts as a countermeasure to extreme traffic that was resulting in poor performance. this specific drive is very large and is sadly reaching this timeout consistently with the queries set up in the app

we're looking into this one from multiple levels. it may be possible for us to get a specific index in place on the public gateway that will help alleviate this problem, but it could take some time to add. we'll be digging into whether we can find any workarounds through the app, and also improving the error message so it is more informative when these timeouts happen

thanks for bringing this to our attention!
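
For context on the original error: a GraphQL response can carry `data: null` together with an `errors` array, and reading `data.transactions` in that case produces exactly a "Cannot read properties of null (reading 'transactions')" failure. The following is a minimal sketch of how a more informative error could be surfaced. It is not the ardrive-cli implementation; the `extractEdges` helper and the exact response shape returned by the gateway on timeout are assumptions for illustration.

```typescript
// Generic GraphQL-over-HTTP response body. `data` may be null when the
// server reports errors (assumed here to include gateway query timeouts).
interface GqlResponse<T> {
  data: T | null;
  errors?: { message: string }[];
}

interface TransactionsData {
  transactions: { edges: unknown[] } | null;
}

// Hypothetical helper: returns the transaction edges, or throws an error
// that names the underlying cause instead of failing on a null read.
function extractEdges(body: GqlResponse<TransactionsData>): unknown[] {
  // Prefer the server-reported messages when any are present.
  if (body.errors && body.errors.length > 0) {
    const messages = body.errors.map((e) => e.message).join("; ");
    throw new Error(`GQL Error: ${messages}`);
  }
  // Guard against a null payload even when no errors were reported.
  const transactions = body.data?.transactions ?? null;
  if (transactions === null) {
    throw new Error("GQL Error: gateway returned no data (possible timeout)");
  }
  return transactions.edges;
}
```

With a guard like this, a timed-out query would report the gateway's own message rather than a null-property read.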

@madjin
Author

madjin commented May 17, 2024

Appreciate the update! Do you think, after the issue is solved, there should be some guidelines on drives in terms of capacity / number of items? I didn't read about any such guidelines while researching Arweave, so I just kept all relevant files for my dataset together in one drive.

Happy to help in any way to solve this issue. Feel free to ping me on Discord for quick responses too.

@fedellen
Collaborator

hey @madjin! we just released an update here that might help with the create-manifest command

https://github.com/ardriveapp/ardrive-cli/releases/tag/v2.0.3

there was a certain underperforming query where we look up the owner of a drive. once we have the owner, the public gateway has a strong index on the ArFS queries we need. in the new update we've removed this query for write actions and just use the owner of the provided wallet. not sure if this will be enough for the command to finish for you, and it won't solve it for the list-folder command

> Do you think, after the issue is solved, there should be some guidelines on drives in terms of capacity / number of items?

the public arweave gateway can only handle so much scale and will rate limit and/or time out users. I'm not sure what that limit is for a drive in concrete numbers. it's worth looking into, I can try to get some better answers

I do see goldsky GQL resolves the query, but it doesn't serve the data so it isn't fully compatible with the CLI yet. there might be an opportunity here to build some robustness around that, which I can try to look into
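
To illustrate the owner-lookup change described above: Arweave gateway GraphQL accepts an `owners` filter alongside `tags` on the `transactions` field, so a query can be narrowed to one wallet when the owner is already known. The sketch below only builds the query text. The `buildTransactionsQuery` helper is hypothetical, and the tag and address values in the usage note are placeholders; the actual queries issued by ardrive-cli may differ.

```typescript
interface TagFilter {
  name: string;
  values: string[];
}

// Hypothetical helper: builds a `transactions` query, adding the `owners`
// filter only when the owner address is known up front (for example, taken
// from the provided wallet), so no separate owner-lookup query is needed.
function buildTransactionsQuery(
  tags: TagFilter[],
  owner?: string,
  first = 100
): string {
  // JSON.stringify yields string and list literals that are valid GraphQL.
  const tagArgs = tags
    .map(
      (t) =>
        `{ name: ${JSON.stringify(t.name)}, values: ${JSON.stringify(t.values)} }`
    )
    .join(", ");
  const ownerArg = owner ? `owners: [${JSON.stringify(owner)}], ` : "";
  return (
    `query { transactions(${ownerArg}first: ${first}, tags: [${tagArgs}]) ` +
    `{ pageInfo { hasNextPage } edges { cursor node { id } } } }`
  );
}
```

For example, `buildTransactionsQuery([{ name: "Drive-Id", values: ["<drive-id>"] }], "<wallet-address>")` produces a query restricted to that owner, while omitting the second argument leaves the owner unfiltered.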

@madjin
Author

madjin commented May 29, 2024

Thanks for the update. I got a new error when attempting to list my drive with --max-depth 1:

GQL Error: Defined query timeout of 20000ms exceeded when running query.

This is the command I ran:

ardrive list-folder --parent-folder-id "f82a6864-d5d5-45a7-9c56-7dc1da125926" --max-depth 1

Side note: Maybe I should create separate drives for uploading my data. I think I started experiencing issues around 150-200k files. My dataset is 34,015 items, each stored in several file types (json, vox, glb, etc.), and I tried to keep them all in one drive.

@madjin
Author

madjin commented May 29, 2024

Can I manually override the GQL timeout error? It also happens when running create-manifest, and when I tried to upload files:

ardrive upload-file --local-path glb_xmp_baked/ --parent-folder-id "f82a6864-d5d5-45a7-9c56-7dc1da125926" -w /path/to/key.json --boost 1.5

GQL Error: Defined query timeout of 20000ms exceeded when running query.

@fedellen
Collaborator

fedellen commented Jun 6, 2024

Thanks for your patience here! We're juggling lots of different priorities.

Unfortunately, the timeout you're experiencing is imposed on the public gateway's side as protection from poorly performing queries. I understand you've received support and are trying out other gateways. That is absolutely encouraged in this situation.

I did some investigating of our code and found some areas where we could improve our index usage on the public gateway. We've just released ardrive-cli 2.0.4, which has these improvements. You may still hit a limitation, but we'd be interested to hear whether you see any improvements with your large scale drives.
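
Since the timeout is enforced by the gateway and trying other gateways is the suggested route, one client-side pattern is to attempt the same query against an ordered list of gateways and use the first one that answers. This is only a sketch under stated assumptions: `queryWithFallback` and the `GqlRunner` type are hypothetical names, the URLs in the usage note are placeholders, and nothing here reflects how ardrive-cli itself selects a gateway.

```typescript
// Hypothetical runner signature: sends `query` to one gateway and resolves
// with the parsed result, or rejects on any error (including a timeout).
type GqlRunner = (gatewayUrl: string, query: string) => Promise<unknown>;

// Tries each gateway in order and returns the first successful result,
// along with the gateway that produced it.
async function queryWithFallback(
  gateways: string[],
  query: string,
  run: GqlRunner
): Promise<{ gateway: string; result: unknown }> {
  const failures: string[] = [];
  for (const gateway of gateways) {
    try {
      const result = await run(gateway, query);
      return { gateway, result };
    } catch (err) {
      // Record the failure and move on to the next gateway.
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${gateway}: ${reason}`);
    }
  }
  throw new Error(`All gateways failed: ${failures.join(" | ")}`);
}
```

A caller would pass something like `["https://primary.example", "https://secondary.example"]` plus a runner that performs the HTTP request. Injecting the runner keeps the fallback logic testable without network access.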

@madjin
Author

madjin commented Jun 6, 2024

Sweet, I'll give 2.0.4 a try! I still have to upload another big batch of files today, so it's good timing.

@madjin
Author

madjin commented Jun 8, 2024

I was able to use 2.0.4 to upload through the default gateway, but it was like a one-time thing. If I wanted to create a manifest file, I needed to use a separate gateway; otherwise I got GQL Error: Defined query timeout of 20000ms exceeded when running query.

@fedellen changed the title from "Error: GQL Error: Cannot read properties of null (reading 'transactions')" to "GQLTimeouts are reached on large scale drives with default public gateway" Jul 8, 2024
@fedellen changed the title from "GQLTimeouts are reached on large scale drives with default public gateway" to "GQL Timeouts are reached on large scale drives with default public gateway" Jul 8, 2024