Skip to content

Conversation

severussundar
Copy link
Contributor

@severussundar severussundar commented Jan 30, 2024

Why make this change?

This PR tackles the a)Database Query Generation and b)Selection Set Resolution components of Multiple Mutation/Multiple Create feature.

What is this change?

1. Enhancing the Hotchocolate input parsing logic

At the moment, DAB supports create mutation operation on a single entity and a single item. The current parsing logic BaseSqlQueryStructure.GQLMutArgumentToDictParams() is written with a foundational idea that a) All the fields present in the create mutation input request belong to the top-level entity b) All the fields will be scalar fields.

With the multiple create feature, these no longer hold true. Here, are some items that needs to be accounted for by the parsing logic.
a) The fields present in the input request, could correspond to columns of the top-level entity or could be relationship fields
b) Relationship fields are not scalar fields. They could be of Object type or List type depending on the type of relationship between top level entity and the related entity.
c) Since, nesting levels without any limit is supported, a relationship field can further have a combination of scalar fields + relationship fields. This inner relationship field can again be a combination of scalar fields + relationship fields and so on...
d) In addition to point create mutation, a many type mutation operation is also generated. Ex: createbooks, createBookmarks_multiple. These operations let the user create multiple items of the top-level entity. The root input field for this operation will be items as opposed to item (for a point create mutation operation).

New parsing logic: SqlMutationEngine.GQLMultipleCreateArgumentToDictParams()
Final data structure of the input params after parsing:
Many type Multiple Create: List<IDictionary<string, object?>>
Point Multiple Create: IDictionary<string, object?>

2. MultipleCreateStructure - New Wrapper class introduced

MultipleCreateStructure is used to hold all the relevant information about an entity needed for processing it - determining if there are referencing/referenced entities, determining if there is a linking table insertion necessary along with it, populating the relationship fields in the current entity and linking entity.

3. Identifying and handling the implications of a point vs many multiple create operation

Logic for identifying and handling different multiple type mutation operations - Point vs Many multiple create operation is added in SqlMutationEngine.PerformMultipleCreateOperation()

4. Understanding and processing the parsed mutation input request

Logic added in SqlMutationEngine.PerformDbInsertOperation().

This function handles the core logic from understanding the different fields of the parsed input parameters all the way till the successful creation of records in the database. This function is invoked for both point and many type multiple create operations.
The logic in this function at a high-level can be summarized as follows:

  • Identifying whether the input type is of list type vs object type (indicative of many vs one relationship cardinality)
  • From the parsed input parameters, identifying the referencing and referenced entities
  • Identifying and populating the fields belonging to the current and linking table entity.
  • Recurse for all the referenced entities - the top-level entity depends on the PKs of these entities. Hence, the referenced entities have to processed before the top-level entity.
  • Once the logic for all the referenced entities have been executed, all the foreign keys needed for the top-level entity should be available. Validate and populate all the foreign keys needed.
  • Build and execute the database query for creating a new record in the table backing the top-level entity. When building the database query, predicates corresponding to create policy (if defined) are added.
  • Store the PKs of the created record - this is needed for resolving the selection set.
  • Check if a record has to be created in the linking table. If so, build and execute the database query necessary.
  • Recurse for all the referencing entities - these entities depend on the PKs of the current entity. Hence, these entities need to processed after the top-level entity.

5. Introduction of synchronous methods in QueryExecutor:

Given the nature of multiple create feature, the order of execution of code statements is highly important. Consider the example of Book - Publisher which are related through a N:1 relationship.

Here, the insertion of Publisher item has to be completed before the logic for populating the foreign keys of Book item can begin to execute.

To guarantee the correct order of execution as well maintain the transactional aspects of the feature (successfully rollback all the inserted items if there any failures), the following equivalent synchronous methods were needed.

  • ExecuteQuery
  • ExecuteQueryAgainstDb
  • SetManagedIdentityAccessTokenIfAny
  • Read
  • ExtractResultSetFromDbDataReader
  • GetResultProperties

Note: These synchronous methods are used only for the create part of the multiple create feature. The selection set resolution still continues to use the asynchronous equivalent of these methods and takes advantage of the benefits they offer. All the other features continue to use the asynchronous version of these methods.

6. Selection Set Resolution

  • Selection set resolution involves fetching the requested fields on items created as result of current mutation operation. The logic for selection set resolution of a point create mutation operation is the same as the one that is present today. However, to resolve the selection set of a many type multiple create operation such as createbooks or createBookmarks_multiple, new logic had to be introduced.
    The current limitation with SqlInsertQueryStructure is that it can accept PKs of 0 or 1 item. It cannot accept PKs for a list of items which is precisely what is needed for many type multiple create mutation operation.
    New constructor for SqlInsertQueryStructure to accept PKs for a list of items is introduced.

  • To account for a list of PKs in the eventual SELECT statement, the logic for adding predicates is updated.

  • Logic changes in SqlQueryStructure.AddPrimaryKeyPredicates() and MsSqlQueryBuilder.BuildQueryForMultipleCreateOperation()
    Note: Exact database queries for each scenario and query plan analysis are added in the design doc.

  • For all the entities involved in the selection set, read policy will be honored.

7. Some more notes about the feature

  • When the feature flag for multiple create is disabled (or) when used with database types that do not support multiple create operations, a create mutation will follow the current logic of processing a create mutation. Only for databases that support multiple create and when the feature is enabled, this new logic kicks in.
    Rationale: Accounting for related entity fields in every step of processing the mutation request is uncessary. So, when it can be determined that the intention is to not use multiple create feature (either feature flag is disabled or database type does not support the feature), a lot of unnecessary logic can be totally skipped.

  • All or None Behavior: The scope of the transaction applies to the mutation operation as a whole. Irrespective of a point or many type multiple create operation, either all of the items will be created successfully or none of them will be created.

8. Introduced new entities and tables for testing

Entity Name Table backing the entity
Publisher_MM publishers_mm
Book_MM books_mm
Author_MM authors_mm
Review_MM reviews_mm
WebsiteUser_MM website_users_mm

The intent of introducing these new tables and entities is to define relationships between only through config file and validate different create mutation scenarios. There are no foreign key constraints defined in any of the tables. The relationships defined are only through config file.

9. Database Queries executed:

Consider the following multiple create mutation request

mutation multipleCreateExample {
  createbook(
    item: {
      title: "Harry Potter and the Goblet of Fire"
      publishers: { name: "Bloomsbury" }
      reviews: [
        {
          content: "Review #1"
          website_users: { id: 5100, username: "Website User #1" }
        }
        { content: "Review #2", websiteuser_id: 1 }
      ]
      authors: [
        {
          name: "J.K Rowling"
          birthdate: "1965-07-31"
          royalty_percentage: 100.0
        }
      ]
    }
  ) {
    id
    title
    publisher_id
    publishers {
      id
      name
    }
    reviews {
      items {
        book_id
        id
        content
      }
    }
    authors {
      items {
        id
        name
        birthdate
      }
    }
  }
}

The following database queries are executed in the same order:

CREATE STATEMENTS

1. Publisher
INSERT INTO [dbo].[publishers] ([name]) OUTPUT Inserted.[id] AS [id], Inserted.[name] AS [name] VALUES (@param0);
2. Book
INSERT INTO [dbo].[books] ([title], [publisher_id]) OUTPUT Inserted.[id] AS [id], Inserted.[title] AS [title], Inserted.[publisher_id] AS [publisher_id] VALUES (@param0, @param1);
3. Website User
INSERT INTO [dbo].[website_users] ([id], [username]) OUTPUT Inserted.[id] AS [id], Inserted.[username] AS [username] VALUES (@param0, @param1);
4. Reviews
INSERT INTO [dbo].[reviews] ([book_id], [content], [websiteuser_id]) OUTPUT Inserted.[book_id] AS [book_id], Inserted.[id] AS [id], Inserted.[content] AS [content], Inserted.[websiteuser_id] AS [websiteuser_id] VALUES (@param0, @param1, @param2);
INSERT INTO [dbo].[reviews] ([book_id], [content], [websiteuser_id]) OUTPUT Inserted.[book_id] AS [book_id], Inserted.[id] AS [id], Inserted.[content] AS [content], Inserted.[websiteuser_id] AS [websiteuser_id] VALUES (@param0, @param1, @param2);
5. Authors
INSERT INTO [dbo].[authors] ([name], [birthdate]) OUTPUT Inserted.[id] AS [id], Inserted.[name] AS [name], Inserted.[birthdate] AS [birthdate] VALUES (@param0, @param1);
6. Linking table
INSERT INTO [dbo].[book_author_link] ([book_id], [royalty_percentage], [author_id]) OUTPUT Inserted.[book_id] AS [book_id], Inserted.[author_id] AS [author_id], Inserted.[royalty_percentage] AS [royalty_percentage] VALUES (@param0, @param1, @param2);

SELECT STATEMENT

SELECT TOP 1 [table0].[id] AS [id], [table0].[title] AS [title], [table0].[publisher_id] AS [publisher_id], JSON_QUERY ([table1_subq].[data]) AS [publishers], JSON_QUERY (COALESCE([table4_subq].[data], '[]')) AS [reviews], JSON_QUERY (COALESCE([table8_subq].[data], '[]')) AS [authors] FROM [dbo].[books] AS [table0] OUTER APPLY (SELECT TOP 1 [table1].[id] AS [id], [table1].[name] AS [name] FROM [dbo].[publishers] AS [table1] WHERE [table0].[publisher_id] = [table1].[id] ORDER BY [table1].[id] ASC FOR JSON PATH, INCLUDE_NULL_VALUES,WITHOUT_ARRAY_WRAPPER) AS [table1_subq]([data]) OUTER APPLY (SELECT TOP 100 [table4].[book_id] AS [book_id], [table4].[id] AS [id], [table4].[content] AS [content] FROM [dbo].[reviews] AS [table4] WHERE [table4].[book_id] = [table0].[id] ORDER BY [table4].[book_id] ASC, [table4].[id] ASC FOR JSON PATH, INCLUDE_NULL_VALUES) AS [table4_subq]([data]) OUTER APPLY (SELECT TOP 100 [table8].[id] AS [id], [table8].[name] AS [name], [table8].[birthdate] AS [birthdate] FROM [dbo].[authors] AS [table8] INNER JOIN [dbo].[book_author_link] AS [table12] ON [table12].[author_id] = [table8].[id] WHERE [table12].[book_id] = [table0].[id] ORDER BY [table8].[id] ASC FOR JSON PATH, INCLUDE_NULL_VALUES) AS [table8_subq]([data]) WHERE [table0].[id] = @param19 ORDER BY [table0].[id] ASC FOR JSON PATH, INCLUDE_NULL_VALUES,WITHOUT_ARRAY_WRAPPER

How was this tested?

  • Existing unit tests and integration tests - validate the correct functioning of all other features in DAB
  • Integration and Unit tests specific to multiple create validate that correctness of multiple create feature.
  • Manual Tests

Sample Request(s)

image

image

@severussundar severussundar self-assigned this Jan 30, 2024
@severussundar
Copy link
Contributor Author

/azp run

Copy link
Contributor

@seantleonard seantleonard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments so far, still have more to review.

@severussundar
Copy link
Contributor Author

/azp run

Copy link
Contributor

@seantleonard seantleonard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more comments. Still reviewing

Copy link
Contributor

@seantleonard seantleonard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more feedback. Now on to test files.

Copy link
Contributor

@seantleonard seantleonard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Massive PR, and looks like good test coverage across various scenarios that are possible. Thank you for pushing through on this. Looking forward to further using this and additional bug bashing to play around with this feature!

Copy link
Contributor

@seantleonard seantleonard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback is mostly nits but necessary to implement to improve code readability for others' understanding. No logic changes requested.

Copy link
Contributor

@ayush3797 ayush3797 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after addressing unresolved feedback. Thanks for completing the puzzle :)

@severussundar
Copy link
Contributor Author

/azp run

@severussundar severussundar merged commit f5faec7 into main Apr 26, 2024
@severussundar severussundar deleted the dev/shyamsundarj/nested-inserts-db-query-generation branch April 26, 2024 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Multiple mutations Fixes/enhancements related to nested mutations.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiple Create - Database Query Generation and Selection Set Resolution
3 participants