@kamilkisiela
Contributor

There are still a few things to cover (the async_graphql parts and local execution), but they aren't needed for the gateway to run.

ardatan and others added 20 commits June 27, 2025 13:30
This avoids allocating a new string for each intermediate result (same
for vector).
Single allocation of the vector. No more `flat_map` and `collect` - this
way we remove the allocation and copying of intermediate vectors.
Follows the same pattern as the response projection. One string buffer
we write to.
In short, instead of creating a new string, we escape the string
character by character and write it directly to the buffer.
entity_satisfies_type_condition in two code paths
The previous version was never reaching primitive cases because for
primitives selection is empty already.
This fixes that so it uses our serialization instead of serde_json.
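
The single-buffer approach these commit messages describe can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the PR: the function names `write_escaped_json_str` and `write_string_array` are hypothetical, and the escaping rules shown are the usual JSON ones rather than anything confirmed by the commits.

```rust
use std::fmt::Write;

/// Appends `input` to `out` as a quoted JSON string, escaping as it goes.
/// No intermediate `String` is created; every character lands in `out`.
/// (Hypothetical helper, not taken from the PR.)
pub fn write_escaped_json_str(out: &mut String, input: &str) {
    // Reserve once up front: the input plus the two surrounding quotes.
    out.reserve(input.len() + 2);
    out.push('"');
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            '\u{08}' => out.push_str("\\b"),
            '\u{0C}' => out.push_str("\\f"),
            c if (c as u32) < 0x20 => {
                // Remaining control characters use the \u00XX form.
                // Writing into a String cannot fail, so the Result is ignored.
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out.push('"');
}

/// Writes a JSON array of strings into the same buffer, without building
/// an intermediate vector of already-escaped strings.
pub fn write_string_array(out: &mut String, items: &[&str]) {
    out.push('[');
    for (i, item) in items.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        write_escaped_json_str(out, item);
    }
    out.push(']');
}
```

The caller owns one `String` and passes it down, so nested values append to the same allocation instead of returning fresh strings that then have to be copied.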

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @kamilkisiela, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily focuses on a significant performance optimization by migrating the project's JSON serialization and deserialization infrastructure from serde_json to sonic-rs. This change impacts core components like the gateway and query plan executor, aiming to improve overall throughput and efficiency. The migration involved extensive refactoring to align existing logic with the sonic-rs API, alongside build system adjustments to further enhance performance.

Highlights

  • Core JSON Library Migration: The project has largely transitioned from serde_json to sonic-rs for JSON serialization and deserialization, aiming for significant performance improvements. This change is pervasive across the gateway, query plan executor, and various utility modules.
  • Performance-Oriented Build Configuration: New rustdocflags and rustflags have been added to .cargo/config.toml to enable target-cpu=native for both build and bench profiles, allowing the Rust compiler to generate highly optimized code for the host CPU, which is particularly beneficial for data-intensive operations like JSON parsing.
  • Extensive Codebase Adaptation: Numerous files required refactoring to align with the sonic-rs API, involving updates to JSON value manipulation, deep merging logic, GraphQL request parsing, and query plan execution/projection. This includes leveraging sonic-rs's specific traits and methods for efficient data handling.
  • Partial async_graphql Integration: While sonic-rs is being adopted, the async_graphql executor module has been temporarily disabled, and its full integration with sonic-rs is noted as a pending task.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request successfully migrates the project from serde_json to sonic-rs for JSON serialization and deserialization, which is a positive step towards potential performance improvements. The changes are extensive and cover many files, indicating a thorough effort. However, I've identified a few areas where the migration could be more complete, more efficient, or more robust in terms of error handling and observability.

return;
}

panic!("Unexpected data type {}", data);

critical

The panic! macro here will cause the application to crash if an unexpected data type is encountered during projection. While this might indicate a serious logical error, a graceful error handling mechanism (e.g., returning a Result or logging an error and returning Value::Null) would be more robust for a production system, preventing a full application crash.
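
One way to read this suggestion is sketched below, assuming a simplified stand-in `Value` enum instead of `sonic_rs::Value`. The names `ProjectionError`, `project_scalar`, and `project_or_null` are hypothetical and only illustrate returning a `Result` and falling back to `null`; they are not the PR's actual projection code.

```rust
use std::fmt;

/// Simplified stand-in for a JSON value type (an assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Value>),
    Object(Vec<(String, Value)>),
}

/// Hypothetical error type replacing the panic.
#[derive(Debug, PartialEq)]
pub enum ProjectionError {
    UnexpectedDataType(&'static str),
}

impl fmt::Display for ProjectionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProjectionError::UnexpectedDataType(kind) => {
                write!(f, "unexpected data type: {}", kind)
            }
        }
    }
}

impl std::error::Error for ProjectionError {}

/// Writes a scalar into `out`; composite values yield an error instead of
/// panicking. String escaping is omitted to keep the sketch short.
pub fn project_scalar(data: &Value, out: &mut String) -> Result<(), ProjectionError> {
    match data {
        Value::Null => out.push_str("null"),
        Value::Bool(b) => out.push_str(if *b { "true" } else { "false" }),
        Value::Number(n) => out.push_str(&n.to_string()),
        Value::String(s) => {
            out.push('"');
            out.push_str(s);
            out.push('"');
        }
        Value::Array(_) => return Err(ProjectionError::UnexpectedDataType("array")),
        Value::Object(_) => return Err(ProjectionError::UnexpectedDataType("object")),
    }
    Ok(())
}

/// Caller-side fallback: report the error and emit `null` so one bad value
/// does not take down the whole process.
pub fn project_or_null(data: &Value, out: &mut String) {
    let start = out.len();
    if let Err(err) = project_scalar(data, out) {
        eprintln!("projection failed: {}", err);
        out.truncate(start);
        out.push_str("null");
    }
}
```

Whether to propagate the error to the client or substitute `null` is a design decision the review comment leaves open; the sketch shows both halves so either can be used.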

Comment on lines +45 to +46
async_graphql::Value::from_json(json!(sonic_rs::to_string(value)))
.unwrap_or_default(),

high

This conversion involves an inefficient double serialization/deserialization. async_graphql::Value::from_json expects a serde_json::Value, but sonic_rs::to_string produces a string. This means the sonic_rs::Value is first converted to a string, then that string is parsed back into a serde_json::Value by async_graphql. If sonic-rs provides a direct conversion from sonic_rs::Value to serde_json::Value (e.g., value.to_serde_json_value()), that would be more efficient. Otherwise, consider if async_graphql can be configured to work directly with sonic_rs::Value or if this conversion is truly necessary.

                    async_graphql::Value::from_json(serde_json::from_str(&sonic_rs::to_string(value).unwrap_or_default()).unwrap_or_default()).unwrap_or_default(),

     fn from(response: async_graphql::Response) -> Self {
         ExecutionResult {
-            data: Some(response.data.into_json().unwrap()),
+            data: Some(sonic_json!(serde_json::to_string(response.data.into_json().unwrap()))),

high

Similar to the previous comment, this line performs an inefficient double serialization/deserialization. response.data.into_json().unwrap() yields a serde_json::Value, which is then converted to a string, and then that string is likely parsed back into a sonic_rs::Value by sonic_json!. This adds unnecessary overhead. Ideally, async_graphql should be configured to produce sonic_rs::Value directly, or a more efficient conversion method should be used if available.

            data: Some(sonic_rs::from_str(&serde_json::to_string(&response.data.into_json().unwrap()).unwrap()).unwrap()),
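
The "more efficient conversion" mentioned in these two comments would, in principle, walk one value tree and build the other without going through a string. The sketch below shows that idea with two hypothetical stand-in enums, `SourceValue` and `TargetValue`. It does not use or imply any real `sonic_rs`, `serde_json`, or `async_graphql` conversion API; the thread does not establish whether one exists.

```rust
/// Stand-in for the value type of the library being converted from
/// (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum SourceValue {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<SourceValue>),
    Object(Vec<(String, SourceValue)>),
}

/// Stand-in for the value type of the library being converted to
/// (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum TargetValue {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<TargetValue>),
    Object(Vec<(String, TargetValue)>),
}

impl From<SourceValue> for TargetValue {
    /// Structural conversion: each node is mapped directly, and owned
    /// strings are moved rather than serialized and parsed again.
    fn from(value: SourceValue) -> Self {
        match value {
            SourceValue::Null => TargetValue::Null,
            SourceValue::Bool(b) => TargetValue::Bool(b),
            SourceValue::Number(n) => TargetValue::Number(n),
            SourceValue::String(s) => TargetValue::String(s),
            SourceValue::Array(items) => {
                TargetValue::Array(items.into_iter().map(TargetValue::from).collect())
            }
            SourceValue::Object(fields) => TargetValue::Object(
                fields
                    .into_iter()
                    .map(|(key, child)| (key, TargetValue::from(child)))
                    .collect(),
            ),
        }
    }
}
```

Compared with serialize-then-parse, this skips the text encoding step entirely, at the cost of having to maintain the mapping between the two value models.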


let response_bytes = response.bytes().await?;
let execution_result =
sonic_rs::from_slice::<ExecutionResult>(&response_bytes).expect("parse(response)");

high

Using .expect() for deserialization can lead to a panic if the response body is not valid JSON or does not conform to the ExecutionResult structure. This could crash the application. It's generally safer to handle this Result explicitly, perhaps by returning an Err or logging the error and constructing a default ExecutionResult to prevent a crash.

            sonic_rs::from_slice::<ExecutionResult>(&response_bytes).unwrap_or_else(|e| {
                tracing::error!("Failed to parse HTTP response: {}", e);
                ExecutionResult::from_error_message(format!("Failed to parse HTTP response: {}", e))
            });
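
The shape of that fallback can be tried out in isolation with the standard library alone. In the sketch below, `ExecutionResult` is a simplified stand-in struct, `from_error_message` mirrors the name used in the suggestion but is written here as an assumption, and `parse_body` only validates UTF-8 where the real code would call a JSON deserializer.

```rust
/// Simplified stand-in for the executor's result type (an assumption).
#[derive(Debug, PartialEq)]
pub struct ExecutionResult {
    pub data: Option<String>,
    pub errors: Vec<String>,
}

impl ExecutionResult {
    /// Hypothetical constructor matching the name used in the suggestion.
    pub fn from_error_message(message: String) -> Self {
        ExecutionResult {
            data: None,
            errors: vec![message],
        }
    }
}

/// Stand-in parser: UTF-8 validation plays the role of JSON deserialization
/// so the example has a realistic failure path without external crates.
pub fn parse_body(bytes: &[u8]) -> Result<ExecutionResult, std::str::Utf8Error> {
    let text = std::str::from_utf8(bytes)?;
    Ok(ExecutionResult {
        data: Some(text.to_owned()),
        errors: Vec::new(),
    })
}

/// Converts a parse failure into an error-carrying result instead of
/// panicking, which is the behavior the review comment asks for.
pub fn execution_result_from_bytes(bytes: &[u8]) -> ExecutionResult {
    parse_body(bytes).unwrap_or_else(|e| {
        eprintln!("Failed to parse HTTP response: {}", e);
        ExecutionResult::from_error_message(format!("Failed to parse HTTP response: {}", e))
    })
}
```

A malformed subgraph response then surfaces as an entry in `errors` rather than as a crashed gateway.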

 async-trait = "0.1"
 serde = "1.0.219"
-serde_json = "1.0.140"
+serde_json = { version = "1.0.140" }

medium

The serde_json dependency is still present in query-plan-executor/Cargo.toml even though sonic-rs has been introduced and seems to replace most JSON operations. If serde_json is no longer directly used by this crate, it should be removed to reduce the dependency footprint and build times. If it's still indirectly required (e.g., by async-graphql which is conditionally compiled), consider adding a comment explaining its necessity.

-    target_map: &mut serde_json::Map<String, Value>,
-    source_map: serde_json::Map<String, Value>,
-) {
+pub fn deep_merge_objects(target_map: &mut sonic_rs::Object, source_map: sonic_rs::Object) {

medium

The #[instrument] macro was removed from the deep_merge_objects function. This function is critical for merging GraphQL responses. Removing the instrumentation might reduce observability and make debugging harder if issues arise during deep merging. Consider re-adding the instrumentation to maintain visibility into its execution.
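
For readers unfamiliar with what a deep merge of two JSON objects involves, here is a minimal sketch. It uses a hypothetical `Value` enum backed by `BTreeMap` rather than `sonic_rs::Object`, and the merge rule (nested objects merge recursively, anything else is overwritten by the source) is an assumption about typical behavior, not something the thread confirms for this function.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a JSON value type (an assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

/// Merges `source` into `target`. When both sides hold an object under the
/// same key, the objects are merged recursively; otherwise the source value
/// replaces whatever the target had.
pub fn deep_merge_objects(target: &mut BTreeMap<String, Value>, source: BTreeMap<String, Value>) {
    for (key, source_value) in source {
        match source_value {
            Value::Object(source_child) => {
                if let Some(Value::Object(target_child)) = target.get_mut(&key) {
                    // Both sides are objects: descend instead of overwriting.
                    deep_merge_objects(target_child, source_child);
                } else {
                    target.insert(key, Value::Object(source_child));
                }
            }
            other => {
                target.insert(key, other);
            }
        }
    }
}
```

Because the function recurses once per nested object, a span or timing hook around the top-level call is usually enough to observe it, which is the visibility the comment is concerned about losing.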

@github-actions

Tests                                                    Ran      Passed ☑️   Skipped  Failed ❌️   Time ⏱
federation-audit | abstract-types.xml                    18 ran   3 ✅                 15 ❌       536ms
federation-audit | child-type-mismatch.xml               4 ran    1 ✅                 3 ❌        126ms
federation-audit | circular-reference-interface.xml      2 ran    1 ✅                 1 ❌        45ms
federation-audit | complex-entity-call.xml               1 ran    1 ✅                             87ms
federation-audit | corrupted-supergraph-node-id.xml      10 ran   8 ✅                 2 ❌        143ms
federation-audit | enum-intersection.xml                 5 ran    5 ✅                             63ms
federation-audit | fed1-external-extends-resolvable.xml  1 ran    1 ✅                             24ms
federation-audit | fed1-external-extends.xml             4 ran    4 ✅                             59ms
federation-audit | fed1-external-extension.xml           4 ran    4 ✅                             60ms
federation-audit | fed2-external-extends.xml             4 ran    4 ✅                             65ms
federation-audit | fed2-external-extension.xml           4 ran    4 ✅                             61ms
federation-audit | include-skip.xml                      4 ran    4 ✅                             79ms
federation-audit | input-object-intersection.xml         3 ran    3 ✅                             36ms
federation-audit | interface-object-with-requires.xml    7 ran    3 ✅                 4 ❌        122ms
federation-audit | keys-mashup.xml                       1 ran    1 ✅                             46ms
federation-audit | mutations.xml                         4 ran    4 ✅                             77ms
federation-audit | mysterious-external.xml               2 ran    2 ✅                             38ms
federation-audit | nested-provides.xml                   2 ran    2 ✅                             32ms
federation-audit | node.xml                              1 ran    1 ✅                             25ms
federation-audit | non-resolvable-interface-object.xml   7 ran    7 ✅                             79ms
federation-audit | null-keys.xml                         1 ran                        1 ❌        35ms
federation-audit | override-type-interface.xml           4 ran    3 ✅                 1 ❌        64ms
federation-audit | override-with-requires.xml            4 ran    4 ✅                             89ms
federation-audit | parent-entity-call-complex.xml        1 ran    1 ✅                             32ms
federation-audit | parent-entity-call.xml                1 ran    1 ✅                             37ms
federation-audit | provides-on-interface.xml             2 ran    2 ✅                             39ms
federation-audit | provides-on-union.xml                 2 ran    1 ✅                 1 ❌        44ms
federation-audit | requires-interface.xml                5 ran    5 ✅                             83ms
federation-audit | requires-requires.xml                 5 ran    5 ✅                             105ms
federation-audit | requires-with-argument.xml            5 ran    3 ✅                 2 ❌        113ms
federation-audit | requires-with-fragments.xml           6 ran    4 ✅                 2 ❌        120ms
federation-audit | shared-root.xml                       2 ran    2 ✅                             45ms
federation-audit | simple-entity-call.xml                1 ran    1 ✅                             23ms
federation-audit | simple-inaccessible.xml               4 ran    4 ✅                             51ms
federation-audit | simple-interface-object.xml           13 ran   8 ✅                 5 ❌        200ms
federation-audit | simple-override.xml                   2 ran    2 ✅                             36ms
federation-audit | simple-requires-provides.xml          12 ran   12 ✅                            163ms
federation-audit | typename.xml                          6 ran    6 ✅                             97ms
federation-audit | unavailable-override.xml              2 ran    2 ✅                             35ms
federation-audit | union-interface-distributed.xml       10 ran   10 ✅                            110ms
federation-audit | union-intersection.xml                12 ran   12 ✅                            137ms

@github-actions

github-actions bot commented Jun 27, 2025

k6-benchmark results

     ✓ response code was 200
     ✓ no graphql errors
     ✓ valid response structure

     █ setup

     checks.........................: 100.00% ✓ 8439      ✗ 0   
     data_received..................: 249 MB  8.2 MB/s
     data_sent......................: 3.3 MB  110 kB/s
     http_req_blocked...............: avg=130.24µs min=1.43µs  med=3.32µs   max=22.74ms  p(90)=5.27µs   p(95)=7.46µs  
     http_req_connecting............: avg=121.77µs min=0s      med=0s       max=22.58ms  p(90)=0s       p(95)=0s      
     http_req_duration..............: avg=107.3ms  min=4.45ms  med=93.56ms  max=508.91ms p(90)=161.62ms p(95)=210.81ms
       { expected_response:true }...: avg=107.3ms  min=4.45ms  med=93.56ms  max=508.91ms p(90)=161.62ms p(95)=210.81ms
     http_req_failed................: 0.00%   ✓ 0         ✗ 2833
     http_req_receiving.............: avg=1.65ms   min=34.45µs med=57.45µs  max=135.71ms p(90)=584.72µs p(95)=2.25ms  
     http_req_sending...............: avg=642.17µs min=8.3µs   med=16.7µs   max=80.72ms  p(90)=109.17µs p(95)=1.76ms  
     http_req_tls_handshaking.......: avg=0s       min=0s      med=0s       max=0s       p(90)=0s       p(95)=0s      
     http_req_waiting...............: avg=105.01ms min=4.38ms  med=91.83ms  max=503.63ms p(90)=158.41ms p(95)=206.75ms
     http_reqs......................: 2833    93.369378/s
     iteration_duration.............: avg=536.37ms min=99.45ms med=522.22ms max=979.37ms p(90)=649.13ms p(95)=709.79ms
     iterations.....................: 2813    92.710222/s
     vus............................: 50      min=50      max=50
     vus_max........................: 50      min=50      max=50

Base automatically changed from kamil-no-write to callback June 30, 2025 09:40
Base automatically changed from callback to no-preserve-order June 30, 2025 09:41
@ardatan ardatan force-pushed the no-preserve-order branch 2 times, most recently from cdff818 to d84b729 Compare July 1, 2025 12:35
@dotansimha
Member

@kamilkisiela close for now?

@ardatan ardatan force-pushed the no-preserve-order branch from d84b729 to 4922535 Compare July 2, 2025 13:16
@kamilkisiela kamilkisiela marked this pull request as draft July 2, 2025 14:02
@ardatan ardatan force-pushed the no-preserve-order branch 4 times, most recently from b146ee7 to 61535e0 Compare July 9, 2025 09:34
@ardatan ardatan force-pushed the no-preserve-order branch 3 times, most recently from bdd3056 to 64dae35 Compare July 18, 2025 10:42
@ardatan ardatan mentioned this pull request Jul 19, 2025
@ardatan ardatan force-pushed the no-preserve-order branch from 2f95327 to 757a21c Compare July 21, 2025 10:09
@ardatan ardatan closed this in #202 Jul 23, 2025
@dotansimha dotansimha deleted the kamil-perf-sonic branch September 1, 2025 12:42
3 participants