Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spike] Query Executor in Rust #135

Merged
merged 4 commits into from
Oct 1, 2020
Merged

[Spike] Query Executor in Rust #135

merged 4 commits into from
Oct 1, 2020

Conversation

jbaxleyiii
Copy link
Contributor

@jbaxleyiii jbaxleyiii commented Sep 8, 2020

spike

What this is

This is a spike of a porting the JavaScript query executor which takes the query plan generated by the new Rust query planner and does a simple http fetch to the federated service and merges the data back together. This is not something we should intend to merge, rather it was a spike to see where things will be tricky as we work to do the port genuinely and where we could slice things for smaller deliverables.

What do I do with this?

What I would to get out of this exercise is a shared understanding of some of the footguns which may lie ahead, what a common data structure needs to look like for the executor, and some of the challenges (and opportunities) in moving from a single threaded environment to a multi-threaded one. So

  1. Don't review this for code quality / idiomatic Rust code. Its not that I'm not looking to improve my Rust, but rather this is a sketch of an idea not a real thing.
  2. Do read through the main executor. Below is a more detailed guide of what is interesting
  3. Do ask questions about the data structures, general shape of the problem.
  4. Don't focus on the http parts (tide, parsing of the response, etc). This was just for local testing.

Detailed Review Notes

  • The JS query executor relies heavily on two things to be successful. First, the single threaded nature of JavaScript and second, the ease of interior mutability within that thread. In particular, the FlattenNode and the FetchNode both mutate the result sink in a way that works if there is only one thread and a shared reference to the sink. For example, flattenResultsAtPath mutates the results while creating a zip for subsequent nodes to work on. This is really hard / not what we should do in a world where you have multiple threads and interior mutability isn't really the way to go.
  • RwLock is incredible. In the executor we need to be able to read from a shared reference at multiple times without block access to the data but block on writes. This is important because prior to sending a fetch we need to read the results to generate variables for downstream requests and we don't want to block on this for every request that is being formed. However, we do need to block on writes so that reads get the updated data correctly. A Mutex blocks on both but RwLock gets us just what we wanted. You can see where we create the lock and an example where we write back to it
  • deep merging of json isn't easy. I found a library but am not 100% sure it actually does what we want. It may or may not be why parallel flatten nodes don't work
  • Instead of manually forking threads, I am relying on futures to manage promises for me. I'm not sure if this is actually doing what I want it to do but it behaves like Promise.allSettled which is similar to the existing one. Maybe we should go more all in on threads here, not sure.
  • We are going to need to do some cloning when it comes time to send requests. Mainly of the variables because these could be used across multiple requests to the executeNode can't take ownership of them. I don't think this is a problem but wanted to flag it here for discussion.
  • Some idioms in JS are just not possible in Rust.. This is on the whole a good thing and it has already helped to improve this implementation but don't count on a direct port being the same shape in each part.
  • We will need to figure out lifetimes / ownership of the schema within threads so it can be created outside of a request. I didn't worry about that right now but that is definitely something we will need to do

Suggested next steps

I think after an initial read through, a pairing session could be a helpful way to talk through this. Overall I'm quite thrilled at how quickly this came together and it makes me even more excited about when we tackle this for real. Let me know if you have any questions! 馃憤

Additional changes

#186 was merged into this one. Its description:

This PR was developed on top of #135

First and foremost, any changes to the existing executor.rs and related code written in #135 are mostly there as suggestions. @jbaxleyiii lmk if you want to pair on these or feel free to take / reject parts. I didn't dig too deep into the implementation of the executor yet, I'm going to start doing that this week.

Second: This PR separates out the Stargate library from the binary, for a few reasons:

  1. Separating out the business logic from the wrapping server makes for easier testing.
  2. The library doesn't choose an async runtime, nor does it assume any specific implementation of an HTTP server.
  3. Use actix-web instead of tide; we've discussed these options and feel more comfortable going with a more mature server implementation, as well as one that uses tokio.

@jbaxleyiii If these changes look good, we can merge them into #135 and we can continue from there. If you want to do it differently, lmk!

  • Address todo! / unimplemented! / FIXME in the code
  • Tests for the executor (ideally these should be unit tests on the Stargate library, for coverage)
  • Integration tests. I was thinking we can set up acephei services locally and then the gateway would have localhost references to those on different ports, integration tests would be entirely localhost.
  • Add tracing/metrics middleware -- I believe there are some existing crates.
  • Write more log lines so that they're integrated with the logger of the http server (use env_logger ?)

http/src/main.rs Outdated

// Start server
app.at("/").post(|req: Request<()>| async move {
let request_context = req.body_graphql().await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:O

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The quality contributions Avery is known for. Keep up the good work!

trevor-scheer added a commit that referenced this pull request Sep 24, 2020
This commit introduces a Dockerfile for building a stargate image.
Currently this is building from an in-flight PR (#135), which we
can update as soon as it lands on `main`.

The additional GH action publishes the container to the new
GitHub Container Registry. For now, this requires a PAT (as noted
in the comments). We should remove this in favor of GITHUB_TOKEN
as soon as it's supported.
Enrico2 pushed a commit that referenced this pull request Sep 30, 2020
This PR was developed on top of #135

First and foremost, any changes to the existing _executor.rs_ and related code written in #135 are mostly there as **suggestions**. @jbaxleyiii lmk if you want to pair on these or feel free to take / reject parts. I didn't dig too deep into the implementation of the executor yet, I'm going to start doing that this week.

Second: This PR separates out the Stargate library from the binary, for a few reasons:
1. Separating out the business logic from the wrapping server makes for easier testing.
2. The library doesn't choose an async runtime, nor does it assume any specific implementation of an HTTP server.
3. Use actix-web instead of tide; we've discussed these options and feel more comfortable going with a more mature server implementation, as well as one that uses tokio.

@jbaxleyiii If these changes look good, we can merge them into #135 and we can continue from there. If you want to do it differently, lmk!

- [ ] Address `todo!` / `unimplemented!` / `FIXME` in the code
- [ ] Tests for the executor (ideally these should be unit tests on the Stargate library, for coverage)
- [ ] Integration tests. I was thinking we can set up acephei services locally and then the gateway would have `localhost` references to those on different ports, integration tests would be entirely localhost.
- [ ] Add tracing/metrics middleware -- I believe there are some existing crates.
- [ ] Write more log lines so that they're integrated with the logger of the http server (use `env_logger` ?)
This commit is a first take at porting over execution from the
TypeScript implementation as well as surrounding scaffolding for a PoC
server runtime and process manager. This is an initial stab for feedback
and will be split up into multiple commits for review in different PRs
around structure, execution, and basic server scaffold after testing is
done.
Enrico2 pushed a commit that referenced this pull request Sep 30, 2020
This PR was developed on top of #135

First and foremost, any changes to the existing _executor.rs_ and related code written in #135 are mostly there as **suggestions**. @jbaxleyiii lmk if you want to pair on these or feel free to take / reject parts. I didn't dig too deep into the implementation of the executor yet, I'm going to start doing that this week.

Second: This PR separates out the Stargate library from the binary, for a few reasons:
1. Separating out the business logic from the wrapping server makes for easier testing.
2. The library doesn't choose an async runtime, nor does it assume any specific implementation of an HTTP server.
3. Use actix-web instead of tide; we've discussed these options and feel more comfortable going with a more mature server implementation, as well as one that uses tokio.

@jbaxleyiii If these changes look good, we can merge them into #135 and we can continue from there. If you want to do it differently, lmk!

- [ ] Address `todo!` / `unimplemented!` / `FIXME` in the code
- [ ] Tests for the executor (ideally these should be unit tests on the Stargate library, for coverage)
- [ ] Integration tests. I was thinking we can set up acephei services locally and then the gateway would have `localhost` references to those on different ports, integration tests would be entirely localhost.
- [ ] Add tracing/metrics middleware -- I believe there are some existing crates.
- [ ] Write more log lines so that they're integrated with the logger of the http server (use `env_logger` ?)
@Enrico2 Enrico2 marked this pull request as ready for review September 30, 2020 21:23
@Enrico2 Enrico2 self-assigned this Sep 30, 2020
This PR was developed on top of #135

First and foremost, any changes to the existing _executor.rs_ and related code written in #135 are mostly there as **suggestions**. @jbaxleyiii lmk if you want to pair on these or feel free to take / reject parts. I didn't dig too deep into the implementation of the executor yet, I'm going to start doing that this week.

Second: This PR separates out the Stargate library from the binary, for a few reasons:
1. Separating out the business logic from the wrapping server makes for easier testing.
2. The library doesn't choose an async runtime, nor does it assume any specific implementation of an HTTP server.
3. Use actix-web instead of tide; we've discussed these options and feel more comfortable going with a more mature server implementation, as well as one that uses tokio.

@jbaxleyiii If these changes look good, we can merge them into #135 and we can continue from there. If you want to do it differently, lmk!

- [ ] Address `todo!` / `unimplemented!` / `FIXME` in the code
- [ ] Tests for the executor (ideally these should be unit tests on the Stargate library, for coverage)
- [ ] Integration tests. I was thinking we can set up acephei services locally and then the gateway would have `localhost` references to those on different ports, integration tests would be entirely localhost.
- [ ] Add tracing/metrics middleware -- I believe there are some existing crates.
- [ ] Write more log lines so that they're integrated with the logger of the http server (use `env_logger` ?)
@Enrico2
Copy link
Contributor

Enrico2 commented Sep 30, 2020

@trevor-scheer @abernix @jbaxleyiii That last commit should bring this branch to a mergeable state without breaking the release process. I've created an issue for follow ups (#198)

Copy link
Member

@abernix abernix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feedback is meant for @Enrico2, who has taken over this PR, and only about one small part that I had my attention called to in #135 (comment)

stargate/Dockerfile Outdated Show resolved Hide resolved
Comment on lines 5 to 11
ARG BASE_IMAGE=ekidd/rust-musl-builder:latest
FROM alpine/git:latest AS cloner

RUN git clone https://github.com/apollographql/federation.git /src && \
git -C /src checkout jbaxleyiii/gateway-spike
RUN git clone https://github.com/apollographql/federation.git /src

# Our first FROM statement declares the build environment.
FROM ${BASE_IMAGE} AS builder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only existed as a workaround for the time-being while we were based on this branch. This is meant to be replaced with a COPY ../ /src (inside the builder stage) so that this can be built from CI from the current commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My Dockerfile foo isn't great to make sure I'm doing the right thing, I see what you're saying though, I hope it's ok but I'm going to merge this and maybe you can fix this later / we can do it tomorrow together?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @abernix , @trevor-scheer and I tried to address this comment, but couldn't quite figure it out quickly, we can't COPY ../ . (/src would be wrong) because it's a parent directory.
We tried some variations but failed at various stages. I had to stop for lunch and then do other more pressing stuff :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Enrico2 @trevor-scheer All good. I can handle this!

stargate/Dockerfile Outdated Show resolved Hide resolved
@Enrico2 Enrico2 merged commit 065a672 into main Oct 1, 2020
@Enrico2 Enrico2 deleted the jbaxleyiii/gateway-spike branch October 1, 2020 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants