-
Notifications
You must be signed in to change notification settings - Fork 3
Use sonic-rs #215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use sonic-rs #215
Conversation
This avoids allocating a new string for each intermediate result (same for vector).
Single allocation of the vector. No more `flat_map` and `collect` - this way we remove the allocation and copying of intermediate vectors.
In short, instead of creating a new string, we go character by character and escape the string and we write to buffer directly
entity_satisfies_type_condition in two code paths
The previous version was never reaching primitive cases because for primitives selection is empty already. This fixes that so it uses our serialization instead of serde_json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @kamilkisiela, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request primarily focuses on a significant performance optimization by migrating the project's JSON serialization and deserialization infrastructure from serde_json to sonic-rs. This change impacts core components like the gateway and query plan executor, aiming to improve overall throughput and efficiency. The migration involved extensive refactoring to align existing logic with the sonic-rs API, alongside build system adjustments to further enhance performance.
Highlights
- Core JSON Library Migration: The project has largely transitioned from
serde_jsontosonic-rsfor JSON serialization and deserialization, aiming for significant performance improvements. This change is pervasive across the gateway, query plan executor, and various utility modules. - Performance-Oriented Build Configuration: New
rustdocflagsandrustflagshave been added to.cargo/config.tomlto enabletarget-cpu=nativefor bothbuildandbenchprofiles, allowing the Rust compiler to generate highly optimized code for the host CPU, which is particularly beneficial for data-intensive operations like JSON parsing. - Extensive Codebase Adaptation: Numerous files required refactoring to align with the
sonic-rsAPI, involving updates to JSON value manipulation, deep merging logic, GraphQL request parsing, and query plan execution/projection. This includes leveragingsonic-rs's specific traits and methods for efficient data handling. - Partial
async_graphqlIntegration: Whilesonic-rsis being adopted, theasync_graphqlexecutor module has been temporarily disabled, and its full integration withsonic-rsis noted as a pending task.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request successfully migrates the project from serde_json to sonic-rs for JSON serialization and deserialization, which is a positive step towards potential performance improvements. The changes are extensive and cover many files, indicating a thorough effort. However, I've identified a few areas where the migration could be more complete, more efficient, or more robust in terms of error handling and observability.
| return; | ||
| } | ||
|
|
||
| panic!("Unexpected data type {}", data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The panic! macro here will cause the application to crash if an unexpected data type is encountered during projection. While this might indicate a serious logical error, a graceful error handling mechanism (e.g., returning a Result or logging an error and returning Value::Null) would be more robust for a production system, preventing a full application crash.
| async_graphql::Value::from_json(json!(sonic_rs::to_string(value))) | ||
| .unwrap_or_default(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This conversion involves an inefficient double serialization/deserialization. async_graphql::Value::from_json expects a serde_json::Value, but sonic_rs::to_string produces a string. This means the sonic_rs::Value is first converted to a string, then that string is parsed back into a serde_json::Value by async_graphql. If sonic-rs provides a direct conversion from sonic_rs::Value to serde_json::Value (e.g., value.to_serde_json_value()), that would be more efficient. Otherwise, consider if async_graphql can be configured to work directly with sonic_rs::Value or if this conversion is truly necessary.
async_graphql::Value::from_json(serde_json::from_str(&sonic_rs::to_string(value)).unwrap_or_default()),| fn from(response: async_graphql::Response) -> Self { | ||
| ExecutionResult { | ||
| data: Some(response.data.into_json().unwrap()), | ||
| data: Some(sonic_json!(serde_json::to_string(response.data.into_json().unwrap()))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the previous comment, this line performs an inefficient double serialization/deserialization. response.data.into_json().unwrap() yields a serde_json::Value, which is then converted to a string, and then that string is likely parsed back into a sonic_rs::Value by sonic_json!. This adds unnecessary overhead. Ideally, async_graphql should be configured to produce sonic_rs::Value directly, or a more efficient conversion method should be used if available.
data: Some(sonic_rs::from_str(&serde_json::to_string(&response.data.into_json().unwrap()).unwrap()).unwrap()),|
|
||
| let response_bytes = response.bytes().await?; | ||
| let execution_result = | ||
| sonic_rs::from_slice::<ExecutionResult>(&response_bytes).expect("parse(response)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using .expect() for deserialization can lead to a panic if the response body is not valid JSON or does not conform to the ExecutionResult structure. This could crash the application. It's generally safer to handle this Result explicitly, perhaps by returning an Err or logging the error and constructing a default ExecutionResult to prevent a crash.
sonic_rs::from_slice::<ExecutionResult>(&response_bytes).unwrap_or_else(|e| {
tracing::error!("Failed to parse HTTP response: {}", e);
ExecutionResult::from_error_message(format!("Failed to parse HTTP response: {}", e))
});| async-trait = "0.1" | ||
| serde = "1.0.219" | ||
| serde_json = "1.0.140" | ||
| serde_json = { version = "1.0.140" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The serde_json dependency is still present in query-plan-executor/Cargo.toml even though sonic-rs has been introduced and seems to replace most JSON operations. If serde_json is no longer directly used by this crate, it should be removed to reduce the dependency footprint and build times. If it's still indirectly required (e.g., by async-graphql which is conditionally compiled), consider adding a comment explaining its necessity.
| target_map: &mut serde_json::Map<String, Value>, | ||
| source_map: serde_json::Map<String, Value>, | ||
| ) { | ||
| pub fn deep_merge_objects(target_map: &mut sonic_rs::Object, source_map: sonic_rs::Object) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The #[instrument] macro was removed from the deep_merge_objects function. This function is critical for merging GraphQL responses. Removing the instrumentation might reduce observability and make debugging harder if issues arise during deep merging. Consider re-adding the instrumentation to maintain visibility into its execution.
|
✅
|
cdff818 to
d84b729
Compare
|
@kamilkisiela close for now? |
d84b729 to
4922535
Compare
b146ee7 to
61535e0
Compare
bdd3056 to
64dae35
Compare
2f95327 to
757a21c
Compare
still a few things to cover (async_graphql stuff and local execution) but for gateway to run it's not needed.