Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(GraphQL): Generate GraphQL query response by optimized JSON encoding (GraphQL-730) #7371

Merged
merged 20 commits into from Jan 30, 2021

Conversation

abhimanyusinghgaur
Copy link
Contributor

@abhimanyusinghgaur abhimanyusinghgaur commented Jan 27, 2021

This PR moves JSON generation for all the GraphQL queries and mutations inside Dgraph.
Previously, the GraphQL layer would get a JSON response from Dgraph in the form expected for DQL. That JSON response had to go through the following pipeline:

  • Unmarshal
  • Run GraphQL completion
  • Marshal

After this pipeline was executed, then the output JSON would be in the form as expected by GraphQL.

Now, the response from Dgraph would already be in the form expected by GraphQL. So, we no more need to run the above pipeline. Hence, saving valuable time and memory.

With this change, the total time taken for any GraphQL query has now reduced by ~30%. Also, the GraphQL layer now amounts for only ~0.5% of the total query time compared to ~35% previously (14-35% for simple queries and 35-70% for @custom HTTP queries, reference: Discuss).


This change is Reviewable

…eries (GRAPHQL-877) (#7230)

This PR moves the GraphQL JSON generation for simple queries inside Dgraph.

Previously, the GraphQL layer would get a JSON response from Dgraph in the form expected for DQL. That JSON response had to go through the following pipeline:

Unmarshal
Run GraphQL completion
Marshal
After this pipeline was executed, then the output JSON would be in the form as expected by GraphQL.

Now, the response from Dgraph would already be in the form expected by GraphQL. So, we no more need to run the above pipeline. Hence, saving valuable time and memory.
…tations (GRAPHQL-877) (#7237)

This PR is a continuation of #7230.

It makes standard GraphQL mutations work with the JSON changes.

Each mutation now returns the GraphQL JSON required for that mutation as a byte array from the MutationResolver itself, so no further processing of the JSON is required.
Delete mutations now query the `queryField` explicitly before executing the mutation, to make it work with the JSON changes.
Continued from:
* #7230
* #7237

This PR makes Geo queries/mutations work with the new JSON changes.
…s (GRAPHQL-936) (#7240)

Continued from:
* #7230
* #7237
* #7238

This PR makes aggregate queries at root and child levels work with the new JSON changes.
It removes the default behaviour of rewriting the count field to DQL query everytime (introduced in #7119). Now, rewritten DQL will have count field only if it was requested in GraphQL.
…fields asked multiple times at same level (GRAPHQL-938) (#7283)

Previously, for a GraphQL query like this:

```
query {
    queryThing {
      i1: id
      i2: id
      name
      n: name
      n1: name
    }
}
```

GraphQL layer used to generate following DQL query:

```
query {
    queryThing(func: type(Thing)) {
      id : uid
      name : Thing.name
    }
}
```

This can’t work with the new way of JSON encoding for GraphQL because it expects that everything asked for in GraphQL query is present in DQL response. So, Now, the GraphQL layer generates the following DQL to make it work:

```
query {
    queryThing(func: type(Thing)) {
      Thing.i1 : uid
      Thing.i2: uid
      Thing.name : Thing.name
      Thing.n : Thing.name
      Thing.n1 : Thing.name
    }
}
```

Basically, the `DgraphAlias()` is now generated using the `TypeCondition.GraphqlAlias` format. That works even with fragments as they have their specific type conditions. Here are related thoughts:
```
query {
	queryHuman {
		name
		n: name
	}
	# DQL: use field.name as DgAlias()
	# queryHuman(func: type(Human)) {
	#   name: Human.name
	#   name: Human.name
	# }
	# => just using field.name doesn't work => use field.alias
	# queryHuman(func: type(Human)) {
	#   name: Human.name
	#   n: Human.name
	# }
	# but there are interfaces and fragments
	queryCharacter {
		... on Human {
			n: name
		}
		... on Droid {
			n: bio
		}
	}
	# DQL: use field.alias as DgAlias()
	# queryCharacter(func: type(Character)) {
	#   n: Character.name
	#   n: Character.bio
	# }
	# => just using field.alias doesn't work => use field.name + field.alias
	# queryCharacter(func: type(Character)) {
	#   name.n: Character.name
	#   bio.n: Character.bio
	# }
	# but the implementing types may have fields with same name not inherited from the interface
	queryThing {
		... on ThingOne {
			c: color
		}
		... on ThingTwo {
			c: color
		}
	}
	# DQL: use field.name + field.alias as DgAlias()
	# queryThing(func: type(Thing)) {
	#   color.c: ThingOne.color
	#   color.c: ThingTwo.color
	# }
	# => even that doesn't work => use Typename + field.name + field.alias
	# queryThing(func: type(Thing)) {
	#   ThingOne.color.c: ThingOne.color
	#   ThingTwo.color.c: ThingTwo.color
	# }
	# nice! but then different frags explicitly ask for an inherited field with same name & alias
	queryCharacter {
		... on Human {
			n: name
		}
		... on Droid {
			n: name
		}
	}
	# DQL: use Typename + field.name + field.alias as DgAlias()
	# queryCharacter(func: type(Character)) {
	#   Character.name.n: Character.name
	#   Character.name.n: Character.name
	# }
	# => doesn't work => use typeCond + Typename + field.name + field.alias
	# queryCharacter(func: type(Character)) {
	#   Human.Character.name.n: Character.name
	#   Droid.Character.name.n: Character.name
	# }
	# but wait, wouldn't just typeCond + field.alias work?
	# queryCharacter(func: type(Character)) {
	#   Human.n: Character.name
	#   Droid.n: Character.name
	# }
	# yeah! that works even for all the above cases.
	#
	# OR
	#
	# just appending the count when encountering duplicate alias at the same level also works.
	# But, we need to maintain the count map every time we need DgAlias().
	# Also, this requires query pre-processing and the generated aliases are non-deterministic.
	#
	# So, we are going with the typeCond + field.alias approach. That will work with @Custom(dql:...) too.
}
```
…on are now resolved separately (GRAPHQL-937) (#7317)

This PR:
* makes the admin server on `/admin` and `@custom(http: {...})` Query/Mutation work through a common pipeline separate from the queries/mutations resolved through Dgraph
* adds scalar coercion to queries resolved through Dgraph
* and, removes completion as a step from resolvers, as it is not needed anymore.
…raphql-json

# Conflicts:
#	edgraph/graphql.go
#	edgraph/server.go
…TP fields (GRAPHQL-878) (#7361)

This PR moves the resolution and GraphQL JSON generation for `@custom(http: {...})` fields inside Dgraph.
It also fixes following existing bugs with `@custom` field resolution:
* https://discuss.dgraph.io/t/slash-graphql-lambda-bug/12233 (GRAPHQL-967)
* race condition in custom logic with repeated fields in implementing types (GRAPHQL-860)
* GRAPHQL-639
@github-actions github-actions bot added area/enterprise Related to proprietary features area/graphql Issues related to GraphQL support on Dgraph. labels Jan 27, 2021
@abhimanyusinghgaur abhimanyusinghgaur added area/performance Performance related issues. and removed area/enterprise Related to proprietary features labels Jan 27, 2021
@abhimanyusinghgaur abhimanyusinghgaur changed the title perf(GraphQL): JSON is generated by Dgraph for GraphQL queries (GraphQL-730) perf(GraphQL): Generate GraphQL query response by optimized JSON encoding (GraphQL-730) Jan 28, 2021
This is a cleanup PR for the JSON changes. It is mostly about nit-picks, making some old tests work, skipping some tests to be ported later to e2e and making the CI green overall. It also fixes a [Discuss Issue](https://discuss.dgraph.io/t/custom-field-resolvers-are-not-always-called/12489) (GRAPHQL-989).
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: The code seems a bit hard to understand. Having smaller and more relevant names would make it more comprehensible.

I didn't check the logic. Defer to @pawanrawal for the final approval.

Reviewed 70 of 70 files at r1.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @abhimanyusinghgaur, @jarifibrahim, @pawanrawal, and @vvbalaji-dgraph)


edgraph/server.go, line 942 at r1 (raw file):

}

type RequestWithContext struct {

Just call it Request, or find a better name.


query/outputnode.go, line 1125 at r1 (raw file):

		// implying that the root fastJson node will always have at least one child. So, no need
		// to check for the case where there are no children for the root fastJson node.
		gqlEncCtx := &graphQLEncodingCtx{

Possible to push this to a separate func?


query/outputnode.go, line 1125 at r1 (raw file):

		// implying that the root fastJson node will always have at least one child. So, no need
		// to check for the case where there are no children for the root fastJson node.
		gqlEncCtx := &graphQLEncodingCtx{

Rename to graphQLEncoder? Also, var = genc


query/outputnode.go, line 1127 at r1 (raw file):

		gqlEncCtx := &graphQLEncodingCtx{
			buf:              &bufw,
			dgraphTypeAttrId: enc.idForAttr("dgraph.type"),

typeAttrId?

Unless there's non-dgraph stuff here, the prefix dgraph seems redundant.


query/outputnode.go, line 1136 at r1 (raw file):

			//  * benchmark and find how much load should be put on HttpClient concurrently.
			//  * benchmark and find a default buffer capacity for these channels
			gqlEncCtx.errChan = make(chan x.GqlErrorList, 100)

Why 100? Why not 3?


query/outputnode.go, line 1137 at r1 (raw file):

			//  * benchmark and find a default buffer capacity for these channels
			gqlEncCtx.errChan = make(chan x.GqlErrorList, 100)
			gqlEncCtx.customFieldResultChan = make(chan customFieldResult, 100)

Just set to 3.


query/outputnode.go, line 1144 at r1 (raw file):

			go func() {
				for gqlErrs := range gqlEncCtx.errChan {
					gqlEncCtx.gqlErrs = append(gqlEncCtx.gqlErrs, gqlErrs...)

genc.Errors


query/outputnode.go, line 1161 at r1 (raw file):

				// The last option seems better.
				for result := range gqlEncCtx.customFieldResultChan {
					childFieldAttr := enc.idForAttr(result.childField.DgraphAlias())

attr?


query/outputnode.go, line 1165 at r1 (raw file):

						if childFieldNode, err := enc.makeCustomNode(childFieldAttr,
							result.childVal); err == nil {
							childFieldNode.next = parent.child

move this below, so the two lines of code are together.

Also, can you just call it childNode?


query/outputnode.go, line 1175 at r1 (raw file):

							parent.child = childFieldNode
						} else {
							gqlEncCtx.errChan <- x.GqlErrorList{result.childField.GqlErrorf(nil,

Deal with the else condition first. And continue from it. Then, you can have one less indentation for the main logic.


query/outputnode.go, line 1186 at r1 (raw file):

				field.SelectionSet())
			// close the error and result channels, to terminate the goroutines started above
			close(gqlEncCtx.errChan)

errCh


query/outputnode.go, line 1187 at r1 (raw file):

			// close the error and result channels, to terminate the goroutines started above
			close(gqlEncCtx.errChan)
			close(gqlEncCtx.customFieldResultChan)

Rename Chan to just Ch.


query/outputnode_graphql.go, line 79 at r1 (raw file):

		// range of int32.
		// TODO: think if we can do this check in a better way without parsing the bytes.
		if _, err := strconv.ParseInt(string(val), 10, 32); err != nil {

ParseInt(..., 0, 32)


query/outputnode_graphql.go, line 106 at r1 (raw file):

func toString(val []byte) string {
	strVal := string(val)
	strVal = strVal[1 : len(strVal)-1] // remove `"` from beginning and end

https://play.golang.org/p/7LyONcVF6ti

Can use strings.Trim with cutset = ".


query/outputnode_graphql.go, line 112 at r1 (raw file):

// encode creates a JSON encoded GraphQL response.
func (gqlCtx *graphQLEncodingCtx) encode(enc *encoder, fj fastJsonNode, fjIsRoot bool,
	childSelectionSet []gqlSchema.Field, parentField gqlSchema.Field,

We typically want 3 args. Max 4. This is 6 args.


query/outputnode_graphql.go, line 397 at r1 (raw file):

				// the current fastJson node.
				child = child.next
			} else if !curSelectionIsDgList && (!enc.getList(cur) || (fjIsRoot && (next == nil || enc.

100 chars.


query/outputnode_graphql.go, line 866 at r1 (raw file):

			gqlCtx.errChan <- append(errs, childField.GqlErrorf(nil,
				"Evaluation of custom field failed because expected result of "+
					"external BATCH request to be of list type, got: %v for field: %s within type: %s.",

100 chars.


query/outputnode_graphql.go, line 874 at r1 (raw file):

				"Evaluation of custom field failed because expected result of "+
					"external request to be of size %v, got: %v for field: %s within type: %s.",
				len(uniqueParents), len(batchedResult), childField.Name(), childField.GetObjectName()))

100 chars.

Copy link
Contributor Author

@abhimanyusinghgaur abhimanyusinghgaur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 44 of 71 files reviewed, 18 unresolved discussions (waiting on @jarifibrahim, @manishrjain, @pawanrawal, and @vvbalaji-dgraph)


edgraph/server.go, line 942 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Just call it Request, or find a better name.

Done.


query/outputnode.go, line 1125 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Possible to push this to a separate func?

Done.


query/outputnode.go, line 1125 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Rename to graphQLEncoder? Also, var = genc

Done.


query/outputnode.go, line 1127 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

typeAttrId?

Unless there's non-dgraph stuff here, the prefix dgraph seems redundant.

Done.


query/outputnode.go, line 1136 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Why 100? Why not 3?

Done.


query/outputnode.go, line 1137 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Just set to 3.

Done.


query/outputnode.go, line 1144 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

genc.Errors

Done.


query/outputnode.go, line 1161 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

attr?

Done.


query/outputnode.go, line 1165 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

move this below, so the two lines of code are together.

Also, can you just call it childNode?

Done.


query/outputnode.go, line 1175 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Deal with the else condition first. And continue from it. Then, you can have one less indentation for the main logic.

Done.


query/outputnode.go, line 1186 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

errCh

Done.


query/outputnode.go, line 1187 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Rename Chan to just Ch.

Done.


query/outputnode_graphql.go, line 79 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

ParseInt(..., 0, 32)

Done.


query/outputnode_graphql.go, line 106 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

https://play.golang.org/p/7LyONcVF6ti

Can use strings.Trim with cutset = ".

Done.


query/outputnode_graphql.go, line 112 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

We typically want 3 args. Max 4. This is 6 args.

Done.


query/outputnode_graphql.go, line 397 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

100 chars.

Done.


query/outputnode_graphql.go, line 866 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

100 chars.

Done.


query/outputnode_graphql.go, line 874 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

100 chars.

Done.

Copy link
Contributor

@pawanrawal pawanrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Parts of the PR were already reviewed in detail earlier.

@abhimanyusinghgaur abhimanyusinghgaur merged commit 74e3aa7 into master Jan 30, 2021
@abhimanyusinghgaur abhimanyusinghgaur deleted the abhimanyu/graphql-json branch February 8, 2021 09:34
abhimanyusinghgaur added a commit that referenced this pull request Feb 18, 2021
This PR makes `@custom(dql: ...)` work again after the GraphQL response JSON generation changes went in and broke it in #7371.
@NamanJain8 NamanJain8 restored the abhimanyu/graphql-json branch September 8, 2021 12:17
@joshua-goldstein joshua-goldstein deleted the abhimanyu/graphql-json branch August 11, 2022 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/graphql Issues related to GraphQL support on Dgraph. area/performance Performance related issues.
Development

Successfully merging this pull request may close these issues.

None yet

3 participants