-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stargate suggestions to #135 + actix-web support #186
Stargate suggestions to #135 + actix-web support #186
Conversation
e2bc774
to
f7fc123
Compare
fff5959
to
feaf031
Compare
47c61ee
to
1ee6ad1
Compare
936d683
to
962e6f7
Compare
…s dependencies are in `crates`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's problematic or blocking, but it seems strange to have graphql-parser
"belong" to stargate
. I'd expect we'll be using this crate in all kinds of tooling and it seems pretty standalone.
@@ -1,7 +1,6 @@ | |||
[package] | |||
name = "query-planner-wasm" # this ends up being the name in the published npm package. | |||
version = "0.0.0-DO.NOT.CHANGE" # The version is managed in the package.json, by Lerna. | |||
private = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentional? I suspect @abernix could speak to its existence. This seems likely to be a relic of package.json
's private
key - maybe it's reflected in a generated package.json
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
afaict this had no affect, and in fact caused a warning that said there's an unknown key in the file.
Also, for future reference, the only keys that get copied to the package.json
that gets generated by wasm-pack
are here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, all good to remove this; this doesn't appear to do what I had thought it would. I was trying to prevent this package from getting accidentally published to Crates.io. Though I think that what I had meant to do was probably to use the publish = false
option.
#[structopt(long, parse(from_os_str))] | ||
pub manifest: PathBuf, | ||
|
||
/// Where to write the output: to `stdout` or `file` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copypasta?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes thanks!
let planner = QueryPlanner::new(schema); | ||
let service_list = get_service_list(&planner.schema); | ||
Stargate { | ||
planner, | ||
service_list, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can get_service_list
be a fn that lives on a QueryPlanner
instance? Then Stargate
would only need a planner passed in to its new
fn, which could then get the service_list
itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can, I tried that at some point. I think I landed on this because it makes sense for the service list to be a property of Stargate. It felt a little awkward to get that from the planner in terms of ownership.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I remember, in order to do that, query-planner (which is a different crate) would need to either be aware of ServiceDefinition
which comes from stargate-lib, and so that definition would have to move, and that has cascading effects.
This PR was developed on top of #135 First and foremost, any changes to the existing _executor.rs_ and related code written in #135 are mostly there as **suggestions**. @jbaxleyiii lmk if you want to pair on these or feel free to take / reject parts. I didn't dig too deep into the implementation of the executor yet, I'm going to start doing that this week. Second: This PR separates out the Stargate library from the binary, for a few reasons: 1. Separating out the business logic from the wrapping server makes for easier testing. 2. The library doesn't choose an async runtime, nor does it assume any specific implementation of an HTTP server. 3. Use actix-web instead of tide; we've discussed these options and feel more comfortable going with a more mature server implementation, as well as one that uses tokio. @jbaxleyiii If these changes look good, we can merge them into #135 and we can continue from there. If you want to do it differently, lmk! - [ ] Address `todo!` / `unimplemented!` / `FIXME` in the code - [ ] Tests for the executor (ideally these should be unit tests on the Stargate library, for coverage) - [ ] Integration tests. I was thinking we can set up acephei services locally and then the gateway would have `localhost` references to those on different ports, integration tests would be entirely localhost. - [ ] Add tracing/metrics middleware -- I believe there are some existing crates. - [ ] Write more log lines so that they're integrated with the logger of the http server (use `env_logger` ?)
Co-authored-by: Ran Magen <ran@apollographql.com> ![spike](https://media.giphy.com/media/XuSS2uqOPf83K/giphy.gif) ## What this is This is a spike of a porting the JavaScript [query executor](https://github.com/apollographql/apollo-server/blob/60d974240769aac3e1c75ff968657b2d3e60e9b6/packages/apollo-gateway/src/executeQueryPlan.ts) which takes the query plan generated by the new Rust query planner and does a simple http fetch to the federated service and merges the data back together. This is **not something we should intend to merge**, rather it was a spike to see where things will be tricky as we work to do the port genuinely and where we could slice things for smaller deliverables. ## What do I do with this? What I would to get out of this exercise is a shared understanding of some of the footguns which may lie ahead, what a common data structure needs to look like for the executor, and some of the challenges (and opportunities) in moving from a single threaded environment to a multi-threaded one. So 1. Don't review this for code quality / idiomatic Rust code. Its not that I'm not looking to improve my Rust, but rather this is a sketch of an idea not a real thing. 2. Do read through the [main executor](http/src/main.rs). Below is a more detailed guide of what is interesting 3. Do ask questions about the data structures, general shape of the problem. 4. Don't focus on the http parts (tide, parsing of the response, etc). This was just for local testing. ## Detailed Review Notes - The JS query executor relies heavily on two things to be successful. First, the single threaded nature of JavaScript and second, the ease of interior mutability within that thread. In particular, the `FlattenNode` and the `FetchNode` both mutate the result sink in a way that works if there is only one thread and a shared reference to the sink. For example, [flattenResultsAtPath](https://github.com/apollographql/apollo-server/blob/60d974240769aac3e1c75ff968657b2d3e60e9b6/packages/apollo-gateway/src/executeQueryPlan.ts#L165) mutates the results while creating a zip for subsequent nodes to work on. This is really hard / not what we should do in a world where you have multiple threads and interior mutability isn't really the way to go. - `RwLock` is incredible. In the executor we need to be able to read from a shared reference at multiple times without block access to the data but block on writes. This is important because prior to sending a fetch we need to read the results to generate variables for downstream requests and we don't want to block on this for every request that is being formed. However, we do need to block on writes so that reads get the updated data correctly. A `Mutex` blocks on both but `RwLock` gets us just what we wanted. You can see where we [create the lock](main...jbaxleyiii/gateway-spike#diff-1d121b833f22f1fd43615c8db9fa578bR78) and an example where we [write back to it](main...jbaxleyiii/gateway-spike#diff-1d121b833f22f1fd43615c8db9fa578bR293) - deep merging of json isn't easy. I found a library but am not 100% sure it actually does what we want. It may or may not be why parallel flatten nodes [don't work](main...jbaxleyiii/gateway-spike#diff-1d121b833f22f1fd43615c8db9fa578bR170-R175) - Instead of manually forking threads, I am relying on [futures to manage promises](main...jbaxleyiii/gateway-spike#diff-1d121b833f22f1fd43615c8db9fa578bR109-R117) for me. I'm not sure if this is actually doing what I want it to do but it behaves like `Promise.allSettled` which is similar to the existing one. Maybe we should go more all in on threads here, not sure. - We are going to need to do [some cloning](main...jbaxleyiii/gateway-spike#diff-1d121b833f22f1fd43615c8db9fa578bR215-R227) when it comes time to send requests. Mainly of the variables because these could be used across multiple requests to the executeNode can't take ownership of them. I don't think this is a problem but wanted to flag it here for discussion. - [Some idioms in JS are just not possible in Rust.](https://github.com/apollographql/apollo-server/blob/main/packages/apollo-gateway/src/executeQueryPlan.ts#L206). This is on the whole a good thing and it has already helped to improve this implementation but don't count on a direct port being the same shape in each part. - We will need to figure out lifetimes / ownership of the schema within threads so it can be created outside of a request. I [didn't worry about that right now](main...jbaxleyiii/gateway-spike#diff-1d121b833f22f1fd43615c8db9fa578bR24-R50) but that is definitely something we will need to do ## Suggested next steps I think after an initial read through, a pairing session could be a helpful way to talk through this. Overall I'm quite thrilled at how quickly this came together and it makes me even more excited about when we tackle this for real. Let me know if you have any questions! 👍 ## Additional changes #186 was merged into this one. Its description: This PR was developed on top of #135 First and foremost, any changes to the existing _executor.rs_ and related code written in #135 are mostly there as **suggestions**. @jbaxleyiii lmk if you want to pair on these or feel free to take / reject parts. I didn't dig too deep into the implementation of the executor yet, I'm going to start doing that this week. Second: This PR separates out the Stargate library from the binary, for a few reasons: 1. Separating out the business logic from the wrapping server makes for easier testing. 2. The library doesn't choose an async runtime, nor does it assume any specific implementation of an HTTP server. 3. Use actix-web instead of tide; we've discussed these options and feel more comfortable going with a more mature server implementation, as well as one that uses tokio. @jbaxleyiii If these changes look good, we can merge them into #135 and we can continue from there. If you want to do it differently, lmk! - [ ] Address `todo!` / `unimplemented!` / `FIXME` in the code - [ ] Tests for the executor (ideally these should be unit tests on the Stargate library, for coverage) - [ ] Integration tests. I was thinking we can set up acephei services locally and then the gateway would have `localhost` references to those on different ports, integration tests would be entirely localhost. - [ ] Add tracing/metrics middleware -- I believe there are some existing crates. - [ ] Write more log lines so that they're integrated with the logger of the http server (use `env_logger` ?)
This PR was developed on top of #135
First and foremost, any changes to the existing executor.rs and related code written in #135 are mostly there as suggestions. @jbaxleyiii lmk if you want to pair on these or feel free to take / reject parts. I didn't dig too deep into the implementation of the executor yet, I'm going to start doing that this week.
Second: This PR separates out the Stargate library from the binary, for a few reasons:
@jbaxleyiii If these changes look good, we can merge them into #135 and we can continue from there. If you want to do it differently, lmk!
todo!
/unimplemented!
/FIXME
in the codelocalhost
references to those on different ports, integration tests would be entirely localhost.env_logger
?)