Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build a Dataloader on Async gem primitives #4727

Merged
merged 20 commits into from
Dec 15, 2023

Conversation

rmosolgo
Copy link
Owner

@rmosolgo rmosolgo commented Dec 12, 2023

This is another attempt to get Dataloader to "Just Work" when it comes to running Source#fetch in the background.

This approach uses the async gem to handle backgrounding, using Async do ... end.

TODO:

  • Sadly CI doesn't seem to like this. But It passes locally....
  • Test: during GraphQL execution, does it run resolvers in parallel? (It might, since it uses Async { ... }.wait, but I don't have a test for it at the moment.
    • This was already covered in the existing tests
  • Try a Ruby on Rails example app and make sure this will deliver as promised with ActiveRecord, Redis, and HTTP calls inside a Source.
  • Update the documents to recommend this implementation (remove references to other schedulers)

@rmosolgo rmosolgo marked this pull request as draft December 12, 2023 19:10
@rmosolgo rmosolgo mentioned this pull request Dec 12, 2023
@trevorturk
Copy link

This is a bit in the weeds for me to truly review, but I gave things a read through and it looks good! I'm happy you've been able to make some progress here, and excited for the Async-ified Dataloader!

I see your todos in the PR description, and figured I'd share a bit of code I'm using in a Rails app running Falcon/Async and doing a lot of Async::HTTP work: https://gist.github.com/trevorturk/f8baeaf8a954bc3764d70652db1692fa -- one bit is around testing that things are running in parallel and the other is a fix for integration tests.

I'll keep an eye on this PR and report back if I have anything useful to share. Feel free to ping me if you have any questions I might be able to help with.

I'll also @ioquatix just in case he has the time/interest to put his eyes on this.

@rmosolgo
Copy link
Owner Author

I tried it in Rails and it worked like a charm! Net::HTTP and ActiveRecord ran in parallel without a problem. Demo: https://github.com/rmosolgo/rails-graphql-async-demo

@rmosolgo rmosolgo changed the base branch from master to 2.2-dev December 13, 2023 20:18
@ioquatix
Copy link

You can visualise what is going on using the async-debug gem which is one line to drop into a test application (usually).

@rmosolgo
Copy link
Owner Author

I think I'm all set, application-wise. It's just CI -- it runs for hours but gets nowhere: https://github.com/rmosolgo/graphql-ruby/actions/runs/7196641771/job/19601940761

Actually, the Rails build (sometimes?) gets somewhere, but not too far: https://github.com/rmosolgo/graphql-ruby/actions/runs/7196641771/job/19601942904

I'm going to dig in and straighten it out tomorrow.

@rmosolgo rmosolgo marked this pull request as ready for review December 13, 2023 20:37
@ioquatix
Copy link

It’s not uncommon for CI to expose race conditions because scheduling is so much slower and thus unpredictable. I would suggest looking at scheduling operations.

@ioquatix
Copy link

I should add, tests that depend on sleep actually sleeping the desired amount of time, and sleep N finishing faster than sleep N+M may break. It's extremely tricky to write those sorts of tests on the CI as the actual vCPU seems incredibly unpredictable (and overloaded).

@rmosolgo
Copy link
Owner Author

rmosolgo commented Dec 14, 2023

🎉

It was scheduling code.

Previously I was relying on loops going round and round to eventually let everything resolve. But I had a test to make sure that the dataloader loop didn't exceed 1000 iterations. It was -- in fact, it was neverending -- so the test suite never finished.

I added an Async::Condition instead, so that Tasks which are waiting for some other data can wait on that condition, then when a dataloader loop is finished, it notifies the condition so that those tasks can start again.

@ioquatix, this is hot stuff! Thanks for this great library. I look forward to learning more, for example, applying semaphores to the internal list of dataloader jobs if the need arises (it hasn't yet...).

Edit: I got ahead of myself ... still a few commented-out tests to fix...

@rmosolgo
Copy link
Owner Author

Amazing. @ioquatix and @trevorturk, thanks again for your help along the way on this!

@rmosolgo rmosolgo merged commit ab74e7c into 2.2-dev Dec 15, 2023
12 checks passed
@rmosolgo rmosolgo deleted the async-primitives-dataloader branch December 15, 2023 17:33
@ioquatix
Copy link

Nice work team!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants