Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CherryPick] Improve Execution Performance (#1392) #2068

Merged
merged 2 commits into from
Mar 4, 2024

Conversation

seantleonard
Copy link
Contributor

@seantleonard seantleonard commented Mar 1, 2024

(note: descriptions that are quoted were intially authored by GitHub Copilot and modified for brevity, correctness, and conciseness. @seantleonard)

Why this change?

Take advantage from Hot Chocolates execution pipeline to improve execution of pre-acquired data. Without this change, there exists inefficiencies in JsonDocument object allocations and disposal, inefficiencies in GraphQL result processing in the ResolverMiddleware class

What is this change?

  • Introduction of ResolverTypeInterceptor.cs class
  • Introduction of ExecutionHelper.cs class
  • Introduction of ArrayPoolWriter.cs class

The ArrayPoolWriter.cs file provides a way to manage and interact
with pooled arrays, mainly byte arrays, in a more efficient and controlled manner. Here are some key benefits:

Memory Efficiency: It uses ArrayPool<byte>, a shared pool of
byte arrays. This is an efficient way to handle memory because it reuses arrays, thereby reducing the frequency of garbage collection and the memory footprint of the application.

Buffer Management: The class ArrayPoolWriter implements the
IBufferWriter<byte> interface, providing a standard way to write to the buffer. It also implements the IDisposable interface, ensuring proper cleanup of resources.

Exception Handling: The class includes robust error handling,
throwing exceptions when methods are used incorrectly, such as attempting to advance the buffer past its capacity or trying to use the writer after it has been disposed.

Dynamic Buffer Expansion: If the required capacity exceeds the
current buffer's capacity, it automatically rents a larger buffer from the pool, copies existing data, and returns the original buffer, ensuring that the buffer size can grow as needed.
Access Flexibility: It provides methods to access written data as both
ReadOnlyMemory<byte> and ReadOnlySpan<byte>, offering flexible ways to interact with the data.

This class can be beneficial in scenarios where you anticipate writing
to a byte buffer repeatedly or where the amount of data to write may grow over time.

  • JsonObjectExtensions.cs

Easy Conversion: Provides a straightforward way to convert
JsonObject instances to JsonElement or JsonDocument.

Memory Optimization: The conversion process leverages the
ArrayPoolWriter for writing the JsonObject to a pooled buffer, which avoids the need for serializing to a full JSON string. This can potentially save memory and improve performance.

  • SqlQueryEngine updates

The changes in the SqlQueryEngine.cs file can be summarized as
follows:

The ResolveInnerObject method is now renamed to ResolveObject. The
new method handles the case where the JsonElement is of JsonValueKind.String type and requires parsing into a JSON object. It also handles the case where the JsonElement is of JsonValueKind.Object type, directly returning the element.

The ResolveListType method is now renamed to ResolveList. It has
been refactored to handle two types of JsonElement: JsonValueKind.Array and JsonValueKind.String. It deserializes both types into a list of JsonElements. The method now also checks and handles the case where metadata is not null.

The public JsonDocument? ResolveInnerObject(JsonElement element, IObjectField fieldSchema, ref IMetadata metadata) method signature has been changed to public JsonElement ResolveObject(JsonElement element, IObjectField fieldSchema, ref IMetadata metadata) to account for no longer passing back and forth JsonDocument objects.

The public object? ResolveListType(JsonElement element, IObjectField fieldSchema, ref IMetadata metadata) method signature has been changed to public object ResolveList(JsonElement array, IObjectField fieldSchema, ref IMetadata? metadata) to account for no longer passing back and forth JsonDocument objects.

The ResolveObject and ResolveList methods now have detailed
summary comments explaining their functionality.
An extra check for
parentMetadata.Subqueries.TryGetValue(QueryBuilder.PAGINATION_FIELD_NAME, out PaginationMetadata? paginationObjectMetadata) has been added in the ResolveObject method. If true, parentMetadata is updated with paginationObjectMetadata.

Overall, these changes focus on handling the cases where JsonElement
is in the form of a string and needs to be parsed into a JSON object or array, as well as improving the metadata handling.

ExecutionHelper.cs

The ExecutionHelper class contains methods that are primarily used
for interacting with the query engine and mutation engine to execute queries and mutations, respectively.

ExecuteQueryAsync(...): Represents the root query resolver and
fetches initial data from the query engine. It accepts a context parameter and uses the query engine to execute the query. If the selection type is a list, it will execute a list query, register a cleanup action to dispose the documents after the query, and set the result. If not, it will execute a single item query and set the result.

ExecuteMutateAsync(...): Represents the root mutation resolver and
invokes the mutation on the query engine. Similar to the ExecuteQueryAsync method, it handles both list and single item mutations.

The ExecuteLeafField, ExecuteObjectField, ExecuteListField
methods are HotChocolate coined "pure resolvers" used to resolve the results of specific types of fields in the GraphQL request.

The GetMetadata, GetMetadataObjectField, SetNewMetadata,
SetNewMetadataChildren methods handling storing and retreiving SqlPaginationMetadata objects utilized by each level of depth in the path of a GraphQL result.

  • Added fix for DW JSON result differing from MSSQL json result. DW has extraneous escape characters in the json which results in JsonDocument/Element not resolving the expected JsonValueKind, and requires additional JsonDocument.Parse operation to remediate.

Testing

  • Unit testing added for the ArrayPoolWriter functionality. This was adopted from HotChocolate code.
  • GraphQL integration tests that already existed were used to ensure the refactor maintained the ability to handle request scenarios we test for. There were tests that initially failed that were used to help validate SqlPaginationMetadata usage, storage, and creation within the PureResolver context. The following test classes are directly applicable for testing query execution through the PureResolver code paths established in ExecutionHelper
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLFilterTests
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLMutationTests
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLPaginationTests
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLQueryTests -
      Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLSupportedTypesTests

(note: descriptions that are quoted were intially authored by GitHub
Copilot and modified for brevity, correctness, and conciseness.
@seantleonard)

## Why this change?

Take advantage from Hot Chocolates execution pipeline to improve
execution of pre-acquired data. Without this change, there exists
inefficiencies in `JsonDocument` object allocations and disposal,
inefficiencies in GraphQL result processing in the `ResolverMiddleware`
class

## What is this change?

+ Introduction of `ResolverTypeInterceptor.cs` class which 
+ Introduction of `ExecutionHelper.cs` class w

+ Introduction of `ArrayPoolWriter.cs` class
> The `ArrayPoolWriter.cs` file provides a way to manage and interact
with pooled arrays, mainly byte arrays, in a more efficient and
controlled manner. Here are some key benefits:

> **Memory Efficiency**: It uses `ArrayPool<byte>`, a shared pool of
byte arrays. This is an efficient way to handle memory because it reuses
arrays, thereby reducing the frequency of garbage collection and the
memory footprint of the application.
>
> **Buffer Management**: The class ArrayPoolWriter implements the
`IBufferWriter<byte>` interface, providing a standard way to write to
the buffer. It also implements the `IDisposable` interface, ensuring
proper cleanup of resources.
>
> **Exception Handling**: The class includes robust error handling,
throwing exceptions when methods are used incorrectly, such as
attempting to advance the buffer past its capacity or trying to use the
writer after it has been disposed.
>
> **Dynamic Buffer Expansion**: If the required capacity exceeds the
current buffer's capacity, it automatically rents a larger buffer from
the pool, copies existing data, and returns the original buffer,
ensuring that the buffer size can grow as needed.
> Access Flexibility: It provides methods to access written data as both
`ReadOnlyMemory<byte>` and `ReadOnlySpan<byte>`, offering flexible ways
to interact with the data.
>
> This class can be beneficial in scenarios where you anticipate writing
to a byte buffer repeatedly or where the amount of data to write may
grow over time.

+ JsonObjectExtensions.cs
> **Easy Conversion**: Provides a straightforward way to convert
`JsonObject` instances to `JsonElement` or `JsonDocument`.
>
> **Memory Optimization**: The conversion process leverages the
ArrayPoolWriter for writing the JsonObject to a pooled buffer, which
avoids the need for serializing to a full JSON string. This can
potentially save memory and improve performance.

+ SqlQueryEngine updates
> The changes in the `SqlQueryEngine.cs` file can be summarized as
follows:
>
> The `ResolveInnerObject` method is now renamed to `ResolveObject`. The
new method handles the case where the JsonElement is of
JsonValueKind.String type and requires parsing into a JSON object. It
also handles the case where the JsonElement is of JsonValueKind.Object
type, directly returning the element.
>
> The `ResolveListType` method is now renamed to `ResolveList`. It has
been refactored to handle two types of `JsonElement`:
`JsonValueKind.Array` and `JsonValueKind.String`. It deserializes both
types into a list of JsonElements. The method now also checks and
handles the case where metadata is not null.
>
> The `public JsonDocument? ResolveInnerObject(JsonElement element,
IObjectField fieldSchema, ref IMetadata metadata)` method signature has
been changed to `public JsonElement ResolveObject(JsonElement element,
IObjectField fieldSchema, ref IMetadata metadata)` to account for no
longer passing back and forth `JsonDocument` objects.
>
> The `public object? ResolveListType(JsonElement element, IObjectField
fieldSchema, ref IMetadata metadata)` method signature has been changed
to `public object ResolveList(JsonElement array, IObjectField
fieldSchema, ref IMetadata? metadata)` to account for no longer passing
back and forth `JsonDocument` objects.
>
> The `ResolveObject` and `ResolveList` methods now have detailed
summary comments explaining their functionality.
An extra check for
`parentMetadata.Subqueries.TryGetValue(QueryBuilder.PAGINATION_FIELD_NAME,
out PaginationMetadata? paginationObjectMetadata)` has been added in the
`ResolveObject` method. If true, `parentMetadata` is updated with
`paginationObjectMetadata`.
>
> Overall, these changes focus on handling the cases where `JsonElement`
is in the form of a string and needs to be parsed into a JSON object or
array, as well as improving the metadata handling.

ExecutionHelper.cs
> The `ExecutionHelper` class contains methods that are primarily used
for interacting with the query engine and mutation engine to execute
queries and mutations, respectively.

> `ExecuteQueryAsync(...)`: Represents the root query resolver and
fetches initial data from the query engine. It accepts a context
parameter and uses the query engine to execute the query. If the
selection type is a list, it will execute a list query, register a
cleanup action to dispose the documents after the query, and set the
result. If not, it will execute a single item query and set the result.
>
> `ExecuteMutateAsync(...)`: Represents the root mutation resolver and
invokes the mutation on the query engine. Similar to the
ExecuteQueryAsync method, it handles both list and single item
mutations.
>
> The `ExecuteLeafField`, `ExecuteObjectField`, `ExecuteListField`
methods are HotChocolate coined "pure resolvers" used to resolve the
results of specific types of fields in the GraphQL request.
>
> The `GetMetadata`, `GetMetadataObjectField`, `SetNewMetadata`,
`SetNewMetadataChildren` methods handling storing and retreiving
SqlPaginationMetadata objects utilized by each level of depth in the
path of a GraphQL result.

- Added fix for DW JSON result differing from MSSQL json result. DW has
extraneous escape characters in the json which results in
JsonDocument/Element not resolving the expected JsonValueKind, and
requires additional JsonDocument.Parse operation to remediate.

## Testing

- Unit testing added for the ArrayPoolWriter functionality. This was
adopted from HotChocolate code.
- GraphQL integration tests that already existed were used to ensure the
refactor maintained the ability to handle request scenarios we test for.
There were tests that initially failed that were used to help validate
SqlPaginationMetadata usage, storage, and creation within the
PureResolver context. The following test classes are directly applicable
for testing query execution through the PureResolver code paths
established in `ExecutionHelper`
  - `Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLFilterTests`
  - `Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLMutationTests`
  - `Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLPaginationTests`
  - `Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLQueryTests`
-
`Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLSupportedTypesTests`

---------

Co-authored-by: Sean Leonard <sean.leonard@microsoft.com>
@seantleonard
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@seantleonard
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@seantleonard seantleonard merged commit 36c3791 into release/0.11 Mar 4, 2024
9 checks passed
@seantleonard seantleonard deleted the dev/sean/perf_cherrypick011 branch March 4, 2024 20:15
@seantleonard seantleonard added the 🍒Cherrypick Cherry-picking another commit/PR label May 15, 2024
rohkhann added a commit that referenced this pull request May 24, 2024
## Why make this change?
When we moved to the new form of query execution:
#2068, we got rid of the
resolver middleware that had the function:
`        public static bool RepresentsNullValue(JsonElement element)
        {
return (string.IsNullOrEmpty(element.ToString()) && element.GetRawText()
== "null");
        }`

which would check if the returned string was null. This was useful for
datetime values in datawarehouse where because of the use of string agg
we have to return null as a string.

The error that is seen is: "The string 'null' was not recognized as a
valid DateTime. There is an unknown word starting at index '0'.
## What is this change?
This change switches to leveraging tryparse for the datetime field
value. This allows us to handle the null case, as if the string is null,
tryparse will just fail and return null for date.

## How was this tested?
A test has been added to ensure that datetime null case is handled.
Before:
Running a sample query which checks for a date column:
`query
{
  tests
  {
    items
    {
      ReleaseDate
    }
  }
}`

![image](https://github.com/Azure/data-api-builder/assets/124841904/e29516b4-a61f-4805-bb9e-4fd942786185)

After fix:


![image](https://github.com/Azure/data-api-builder/assets/124841904/fb2cb85b-6188-49f9-92f4-56b6fb0ed7b0)

---------

Co-authored-by: Ayush Agarwal <34566234+ayush3797@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🍒Cherrypick Cherry-picking another commit/PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants