Schema and statement validation prior to execution #40

aew · 2016-11-11T11:17:09Z

Fixes a number of outstanding bugs and incompatibilities between validation output and execution input. Backwards compatible.

Changes include:

Perform schema and statement validation prior to execution, with more robust error handling (backwards compatible)
Line and column numbers in validation error output
Fix bug that could cause an infinite loop in the absence of a parent spec
Enable memoization of schema and statement preparation prior to execution
Post-process validated schema to eliminate unnecessary complexity
Fix potential collision of specs for root query and mutation fields sharing the same type
Support validation of arguments for root fields
Fix bug in validation of unused fragments
Fix bugs in validation of scalar leafs for lists of scalars and lists of objects
Deprecate graphql-clj.type/create-schema in favor of graphql-clj.validator/validate-schema
Fix bug in fragment cycles validation
Properly assign specs for recursively nested fields (switch to pre-order traversal for adding statement specs)

This is a large change because the parser was eliminating line and column metadata necessary for producing validation error messages. The executor was coupled to parser output rather than validator output. This PR proposes that the executor take the validator output as an input rather than the parser output.
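To illustrate the general idea of keeping location data attached to parsed values, here is a minimal sketch. It is not the library's actual box type: `LocBox`, `box`, `unbox`, `loc-of`, and `error-at` are hypothetical names used only for this example. One reason a wrapper like this can help is that plain strings cannot carry Clojure metadata.

```clojure
;; Hypothetical wrapper type: holds a parsed value plus its source location.
(deftype LocBox [value loc]
  clojure.lang.IDeref
  (deref [_] value))

(defn box
  "Wrap a value together with its line and column."
  [value line column]
  (LocBox. value {:line line :column column}))

(defn unbox
  "Return the wrapped value, or x itself if it is not boxed."
  [x]
  (if (instance? LocBox x) @x x))

(defn loc-of
  "Return the location map of a boxed value, or nil for plain values."
  [x]
  (when (instance? LocBox x) (.-loc ^LocBox x)))

(defn error-at
  "Build an error map in the same shape as the validation output in this thread."
  [message boxed]
  {:error message :loc (loc-of boxed)})

(def field-name (box "kind" 21 1))

(error-at "Cannot query field 'kind' on type 'FullType'." field-name)
```

After validation, a post-processing step could call `unbox` on every node so that the executor only sees plain values.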

Related discussion: #22

aew · 2016-11-11T22:29:37Z

I think I understand the issue - a query failing validation (because of this subscriptionType issue) is returning more than just the errors key, including things that don't serialize to JSON. I'll patch this.

aew · 2016-11-12T15:37:08Z

There were actually additional bugs hidden by a bug in the fragment cycles validator, which was inappropriately short-circuiting the DFS. I've fixed these bugs and added a couple of missing unit tests for things that were breaking the GraphiQL starter project.

tendant · 2016-11-12T15:41:13Z

I just ran a simple test in GraphiQL, and it looks like it works fine now.

Thanks a lot for your great work. The validation really makes a huge difference to the usability of the library. It helps provide much better error messages.

tendant · 2016-11-12T16:06:06Z

Do you have more commits coming? If not, I might release a new version tonight.

aew · 2016-11-12T16:18:06Z

I'm not planning to commit any more unless we discover more bugs, so feel free to release a new version. Thanks!

tendant · 2016-11-14T15:29:00Z

Found a new issue in inspector:

https://github.com/tendant/graphql-clj-starter/tree/testing-validation-loc

Error result:

{
  "errors": [
    {
      "error": "Cannot query field 'kind' on type 'FullType'.",
      "loc": {
        "line": 21,
        "column": 1
      }
    }
  ]
}

aew · 2016-11-14T19:25:04Z

Ok - taking a look at this. Can you clarify which query you used to produce this error?

tendant · 2016-11-14T20:12:43Z

It came from the introspection query.

Just open GraphiQL and the error will show up in the GraphQL result.

tendant · 2016-11-15T00:45:28Z

The problem might be related to reloading of the schema.

It looks like, after updating the schema in the dev environment, the introspection query returns:

{
  "errors": [
    {
      "error": "Cannot query field 'kind' on type 'FullType'.",
      "loc": {
        "line": 21,
        "column": 1
      }
    }
  ]
}

When I try to run any query after changing the schema in the dev server, I get the error below for all root query fields:

{:errors [{:node-type :field, :parent {:node-type :operation-definition, :v/parent {:node-type :statement-root}, :v/path [Query]}, :node {:node-type :field, :field-name #object[graphql_clj.box.Box 0x231e19a3 {:status :ready, :val chapters}], :selection-set [{:node-type :field, :field-name #object[graphql_clj.box.Box 0x702e24d5 {:status :ready, :val id}], :v/path [Query chapters id], :spec :graphql-clj.1623193917/id}], :v/path [Query chapters], :spec :graphql-clj.1623193917/chapters, :v/parent {:node-type :operation-definition, :v/parent {:node-type :statement-root}, :v/path [Query]}, :v/parentk :selection-set}, :error Parent type not found}]}

tendant · 2016-11-15T00:46:09Z

BTW: I tried using the validator/validate-schema* function, and it still has the same issue.

tendant · 2016-11-15T03:52:36Z

src/graphql_clj/executor.clj

+          (let [resolver (resolver/create-resolver-fn (:schema (:state validated-statement)) resolver-fn)]
+            (assoc-in validated-statement [:state :resolver] resolver)))))))
+
+(def prepare (memoize prepare*))


It is better not to use memoize here. Users can choose to memoize if needed, and they can also choose how to memoize.

memoize gives higher performance at the expense of higher memory use, but it is better to leave it to the user to decide whether to use it.

Another benefit of leaving this to the user is that they can use an enhanced version of memoize with a customized caching policy.
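As a rough illustration of what user-side caching could look like, the sketch below compares `clojure.core/memoize` with a small hand-rolled cache that can be cleared, for example after a schema reload in development. `prepare-statement` is a hypothetical stand-in for an expensive preparation step, not a function from this library, and `resettable-memo` is an assumed helper written only for this example.

```clojure
;; Counts how often the (hypothetical) expensive step actually runs.
(def prepare-calls (atom 0))

(defn prepare-statement
  "Hypothetical stand-in for validating and preparing a statement."
  [schema statement]
  (swap! prepare-calls inc)
  {:schema schema :statement statement :prepared? true})

;; Option 1: unbounded cache from clojure.core. Entries are never evicted.
(def prepare-memo (memoize prepare-statement))

;; Option 2: a cache the caller can clear explicitly.
(defn resettable-memo
  "Return a map with :call (a cached version of f) and :reset (clears the cache)."
  [f]
  (let [cache (atom {})]
    {:call  (fn [& args]
              (if-let [hit (find @cache args)]
                (val hit)
                (let [result (apply f args)]
                  (swap! cache assoc args result)
                  result)))
     :reset (fn [] (reset! cache {}))}))

(def prepare-cache (resettable-memo prepare-statement))

;; Usage: the second call with the same arguments is served from the cache.
((:call prepare-cache) "schema-v1" "{droid}")
((:call prepare-cache) "schema-v1" "{droid}")

;; After reloading the schema in a dev session, drop stale entries.
((:reset prepare-cache))
```

Keeping this choice in user code means the library does not hold on to entries for schemas that are no longer in use.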

tendant · 2016-11-15T03:54:42Z

src/graphql_clj/executor.clj

-           (graphql-clj.type/create-type-meta-fn graphql-clj.type/demo-schema))
-  (let [type-schema (type/create-schema graphql-clj.introspection/introspection-schema)]
-    (execute nil type-schema nil "query{__schema{types {name kind}}}")))
+  ([validated-document]


For the public API of execute, we should try to keep it to the bare minimum. Users can create their own version of execute if needed.

It is easier to support only one execute function and maintain backward compatibility in the long run.

tendant · 2016-11-15T04:22:23Z

The problem goes away when all memoize function calls are removed.

Also, I would like to propose keeping the exposed execute function as simple as we can:

(defn execute
  [context schema-or-state resolver-fn statement-or-state variables]
  ...)

Users can cache the validated schema, validated statement, and resolver-fn as they wish.
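One way to read the `schema-or-state` parameter is that the function accepts either raw input or an already validated value and only validates when needed. The sketch below shows that dispatch under loud assumptions: `validate-schema-stub` is a hypothetical stand-in for real validation, and treating "is a string" as "not yet validated" is an assumption made for this example, not the library's documented behavior.

```clojure
(defn validate-schema-stub
  "Hypothetical stand-in for schema validation; returns a map marked as validated."
  [schema-str]
  {:validated? true :source schema-str})

(defn ensure-validated
  "Validate raw string input; pass already validated values through untouched."
  [schema-or-state]
  (if (string? schema-or-state)
    (validate-schema-stub schema-or-state)
    schema-or-state))

;; A caller can validate once at startup and reuse the result for every request.
(def validated-schema (ensure-validated "type Query { droid: String }"))

;; Passing the cached value again does no further validation work.
(ensure-validated validated-schema)
```

With this shape, callers who pass raw strings keep working, while callers who care about latency can do the validation outside the request path.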

tendant · 2016-11-15T06:19:20Z

Another issue with the validated schema and statement: the validation result is more than 10x larger than the parsing result.

Size comparison for the schema "graphql-clj-starter.graphql/starter-schema":

6.2K  parsed-schema
70K   validated-schema

Size comparison for the statement "{droid}":

242B  parsed-statement
71K   validated-statement

It looks like the validated data contains the schema in addition to the result.
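The thread does not say how these sizes were measured. One rough way to compare two Clojure data structures is to count the characters of their printed form, as in the sketch below. The sample maps and the `:schema` key are hypothetical and exist only to show the effect of embedding a schema in every result.

```clojure
(defn printed-size
  "Approximate size of x as the number of characters in its printed form."
  [x]
  (count (pr-str x)))

;; Hypothetical parsed statement for "{droid}".
(def parsed-statement
  {:node-type :field :field-name "droid"})

;; Hypothetical validated statement that also carries a copy of the schema.
(def validated-statement
  (assoc parsed-statement
         :schema (zipmap (map #(str "Type" %) (range 50))
                         (repeat {:kind :OBJECT :fields ["id" "name"]}))))

(printed-size parsed-statement)
(printed-size validated-statement)

;; Dropping the embedded schema brings the size back to the parsed size.
(printed-size (dissoc validated-statement :schema))
```

This counts characters rather than bytes in memory, so it is only useful for relative comparisons like the one above.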

tendant · 2016-11-16T05:56:16Z

src/graphql_clj/visitor.clj

+           (filter #(= (:type-name %) root-name))
+           first
+           :fields
+           (map (juxt (comp box/box->val :field-name) (comp box/box->val :type-name)))


Bug here: root fields return nil as the field type when the root field uses a wrapping type such as NOT_NULL or LIST.

A root field's type can't always be expressed using only the string value of type-name.

Fixed this for list types. I think non-null types are working (there is a test for this with loremIpsum of type String!) but I may be missing something.
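The following sketch shows how this kind of nil can arise and one way to avoid it. The node shapes (`:inner-type`, `:node-type :list`) and the `named-type` helper are hypothetical and are not the parser's real output format; the point is only that a wrapped type has no top-level `:type-name`.

```clojure
;; Hypothetical root field nodes: one plain named type, one list type.
(def root-fields
  [{:field-name "hero"     :type-name "Character"}
   {:field-name "chapters" :node-type :list
    :inner-type {:type-name "Chapter"}}])

;; Reading :type-name directly yields nil for the list-typed field.
(map (juxt :field-name :type-name) root-fields)

(defn named-type
  "Walk through wrapper nodes until a :type-name is found; nil if there is none."
  [field]
  (loop [node field]
    (cond
      (nil? node)       nil
      (:type-name node) (:type-name node)
      :else             (recur (:inner-type node)))))

;; Unwrapping first gives a usable type name for both fields.
(map (juxt :field-name named-type) root-fields)
```

Note that unwrapping discards the list or non-null wrapper, so a caller that needs the full type would have to keep the original node as well.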

tendant · 2016-11-16T05:57:55Z

src/graphql_clj/spec.clj

+  (if (or recursive? (not (meta d)))
+    (do
+      (assert (= (first d) 'clojure.spec/def))               ;; Protect against unexpected statement eval
+      (try (eval d) (catch Compiler$CompilerException _ d))) ;; Squashing errors here to provide better error messages in validation


(assert (= (first d) 'clojure.spec/def)) does not protect against all unexpected statement evaluation. The best approach is not to use eval at all.

I couldn't think of another way to dynamically generate these specs. We could avoid using spec - we could use prismatic schema instead or something like that. Other suggestions?
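To make the concern concrete, the sketch below shows a form that passes a head-only check while still containing arbitrary code, plus a stricter check that walks the whole form against a whitelist. Nothing here calls eval. The whitelist contents and the function names are assumptions for illustration, and a whitelist narrows the risk rather than replacing the advice to avoid eval.

```clojure
(def allowed-symbols
  "Hypothetical whitelist; a real one would list every symbol the generator emits."
  '#{clojure.spec/def clojure.spec/keys clojure.spec/coll-of string? integer?})

(defn head-only-check
  "The check from the diff: only inspects the first element of the form."
  [d]
  (= (first d) 'clojure.spec/def))

(defn whitelisted-form?
  "True when d is a clojure.spec/def form and every symbol anywhere in it is allowed."
  [d]
  (and (seq? d)
       (= (first d) 'clojure.spec/def)
       (every? #(or (not (symbol? %)) (contains? allowed-symbols %))
               (tree-seq coll? seq d))))

;; A benign generated form.
(def benign '(clojure.spec/def :demo/name string?))

;; A form with the right head but arbitrary code nested inside (never evaluated here).
(def hostile '(clojure.spec/def :demo/name (do (System/exit 1) string?)))

(head-only-check hostile)   ; the head-only check accepts this form
(whitelisted-form? hostile) ; the whitelist walk rejects it
(whitelisted-form? benign)
```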

aew · 2016-11-16T09:14:03Z

Sounds good re: removing memoize and the single arity for the execute function - I'll make those changes.

Re: size of the validation output - I'm waiting to see how validation output interacts with a simplified execution phase to understand exactly what we need to keep, then we can remove everything else and it will be small.

aew · 2016-11-16T13:02:23Z

I've made the changes discussed above - @tendant worth taking another look.

For additional context, there are a lot of outstanding issues that concern me, but I don't know how many of them we want to fix in a single PR:

Getting to "of type" is complex, happens in different ways in different places, and most likely needs to be solved a different way to maintain the same behavior with simpler code
I'm not very happy with the dynamic use of spec - we may be better served using prismatic schema given spec's global state, eval, minimal leverage from the specs after they are created, etc.
I would expect the validation phase to be slow (I haven't stopped to benchmark it). I don't think that optimizing this is a high priority at the moment, because ideally validation would behave like JIT compilation, whereas execution affects latency of serving each request. I think we want the output of the validation phase to inline everything necessary to execute a query (probably including the specific resolver function for each field, which would be a breaking change from today). I don't have a clear idea of what the statement validation output format should be until we re-implement the execution phase.
With this change, current library users may start doing validation inside the inner loop, which could be an unexpected breaking change from a performance perspective (this is why I initially thought to do memoization inside the library to keep things working as expected). We may want to think about how we make it clear where people should be memoizing, or bite the bullet and make an actual breaking change once the execution phase is clear.

Given all of this, we could either merge this and hope for the best, or remove validation as a requirement for the execution phase, and perhaps add a different execution function for previously "prepared" inputs. WDYT?

tendant · 2016-11-16T20:15:33Z

I will test it and merge it, if there is no major issue.

As long as we keep the public API simple and consistent, it is not too hard to change the internal implementation if needed.

tendant · 2016-11-16T21:36:58Z

Merged. Thank you @aew for your great contribution!

aew added 30 commits November 5, 2016 16:54

Experiment with boxed values

8f1f7c9

Uncomment validation tests

efdf88f

Support loc for default values of correct type

406f4df

Support loc for fields on correct type

6ed9db3

Support loc for fragments on composite types

784155c

Support loc for known argument names

17a6bdd

Support loc for known directives

99c7160

Support loc for known fragment names

6ece740

Replace boxing with a deftype implementation

251b7f5

Box type names

68634c2

Support loc for known type names

c906637

Support loc for fragment cycles

ab380b9

Support loc for no undefined variables

4026738

Support loc for no unused fragments

fd33ca2

Support loc for no unused variables

20c109a

Support loc for provided non null arguments

ed3c36a

Support loc for scalar leafs

06ee4ea

Support loc for unique names

12a10ce

Support loc for variables are input types

f138f0b

Support loc for variables in allowed position

b66c2f3

Support loc for lone anonymous operation

15eb20f

Add changelog

517f12d

Cleanup boxing and other validation-specific complexity after validation

73f914d

Inject introspection schema and post-process schema into a map as part of a mandatory validation phase

6f449f6

Simplify map schema representation, as types are unique

412c0e5

Remove unnecessary unbox

092b97d

Move introspection schema manipulation into tests

ff80e46

Remove state wrapper on validate schema output

438a454

Support mutation and introspection statements with mandatory validation

a3c30ff

Update readme and confirm it still works

a226867

aew added 2 commits November 12, 2016 11:14

Fix bugs in fragment cycles validation and recursively nested field specs

78caa7f

Include introspection field arguments in schema

678deed

Extract unbox node function

6704bfe

tendant mentioned this pull request Nov 14, 2016

spec/format for "Transformed Map" #35

Closed

tendant reviewed Nov 15, 2016

tendant reviewed Nov 16, 2016

aew added 2 commits November 16, 2016 07:43

Remove memoization and extra arities of execute function

cf14db0

Bug fix for list root types

e40a050

Update README and add unit tests to ensure readme code works

5453c97

tendant merged commit 973fcb8 into tendant:master Nov 16, 2016

aew deleted the validation-loc branch November 16, 2016 23:10

tendant mentioned this pull request Nov 17, 2016

Reduce the size of validated schema and statement #42

Closed
