Rewrite GraphQL schema generation and query parsing (close #2801) #4111

Merged


lexi-lambda
Contributor

@lexi-lambda lexi-lambda commented Mar 13, 2020

This PR is a work in progress. This description will be updated to fit the usual format once it is closer to done.

Overview

This pull request is a complete rearchitecting of the way we build GraphQL schemas and parse GraphQL queries. It implements some of the ideas in #2801, but the approach is rather different from the one described there (as that approach turned out to be too naïve in various ways).

The core API is built around a new type, Parser, which has the following interface:

data Parser (k :: Kind) (m :: * -> *) (a :: *)

parserType :: Parser k m a -> Type k
runParser :: Parser k m a -> ParserInput k -> m a

instance Functor (Parser k m)

This may raise more questions than it answers. How do you do anything with a Parser if it’s only a Functor? What is ParserInput? And what on earth is a Kind?

Before we answer those questions, let’s start with a recap of the Big Idea:

  • Currently, schema generation and query parsing are separate code paths, so they can get out of sync. We want a new approach that can do both at once.

  • We want to parse, not validate, so there shouldn’t be a separate GraphQL validation pass. Checking consistency against the schema should happen naturally as part of query parsing.

  • We need to be able to reflect on the GraphQL schema without even necessarily having a query to execute, since we need to serve GraphQL introspection queries.

The Parser abstraction is designed to solve all of the above problems. It works like this:

  • A Parser is something that knows how to parse a piece of a GraphQL AST. (Note that it doesn’t have anything to do with parsing GraphQL query concrete syntax; that step is still handled by graphql-parser-hs.)

  • You build Parsers using parser combinators, just like other Haskell parsing libraries. Since a Parser parses a GraphQL AST, the primitive combinators in this library are things like scalar, enum, and object.

  • However, unlike a traditional parser combinator library, every Parser is associated with a GraphQL type. For example, if you have a Parser that parses an input object, you can reflectively get information about what fields it contains and what their types are.
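To make the idea above concrete, here is a minimal sketch of a value that carries both a type description and a parsing function. The names ToyParser, TypeDesc, Value, int, and listOf are hypothetical and far simpler than the real Parser and GraphQL AST types in this PR; the sketch only illustrates why a single value can serve both schema reflection and query parsing.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Hypothetical, heavily simplified stand-in for a GraphQL input value.
data Value = VInt Int | VString String | VList [Value]
  deriving (Show, Eq)

-- Hypothetical stand-in for the schema-level description of a type.
data TypeDesc = TScalar String | TListOf TypeDesc
  deriving (Show, Eq)

-- One value holds both halves: what the type is, and how to parse it.
data ToyParser a = ToyParser
  { toyType :: TypeDesc
  , toyRun  :: Value -> Either String a
  } deriving (Functor)

-- An "atomic" parser for integers.
int :: ToyParser Int
int = ToyParser (TScalar "Int") $ \value -> case value of
  VInt n -> Right n
  other  -> Left ("expected Int, got " ++ show other)

-- A combinator: the resulting type description is derived from the argument's.
listOf :: ToyParser a -> ToyParser [a]
listOf p = ToyParser (TListOf (toyType p)) $ \value -> case value of
  VList vs -> traverse (toyRun p) vs
  other    -> Left ("expected a list, got " ++ show other)
```

In this sketch, toyType (listOf int) can be inspected without any query in hand (which is what introspection needs), while toyRun (listOf int) rejects anything that is not a list of integers, so the two views cannot drift apart.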

Challenges

This idea is simple on the surface, but subtle to implement. Broadly speaking, there are three significant sources of complexity:

  1. GraphQL ASTs are much more complicated than something like JSON. Field sets, fragments, nullability, and variables all add rules that need to be captured in the type system.

  2. Because our GraphQL schema is generated dynamically, based on information that isn’t known until runtime, we essentially have to generate a program on the fly. This stratifies the schema generation logic into two distinct phases: schema generation and query parsing. Both these phases can potentially fail due to invalid input, and in Haskell, that means we need two monads nested inside one another!

  3. The need to be able to reflect on the schema imposes strong constraints on the expressiveness of the combinator language. Especially challenging are mutually-recursive types, since the Parser abstraction requires these actually be represented in Haskell as a cyclic data structure!

Given the above, I think the solution I’ve come up with strikes a good compromise. The types are not so trivial that they are immediately obvious at first glance, but I think they’re actually quite readable once you understand what’s happening. The internal implementation of the combinators is very subtle in places, but I think that’s okay: that should have to change very infrequently. The important thing is what using the combinators looks like, and that, happily, is fairly simple.

API overview

GraphQL schemas

Before talking about parsers, let’s look at the representation of the GraphQL schema itself. GraphQL types are represented in a way that should be at least somewhat recognizable:

data Type k
  = NonNullable (NonNullableType k)
  | Nullable (NonNullableType k)

data NonNullableType k
  = TNamed (Definition (TypeInfo k))
  | TList (Type k)

data TypeInfo k where
  TIScalar :: TypeInfo 'Both
  TIEnum :: NonEmpty (Definition EnumValueInfo) -> TypeInfo 'Both
  TIInputObject :: [Definition (FieldInfo 'Input)] -> TypeInfo 'Input
  TIObject :: [Definition (FieldInfo 'Output)] -> TypeInfo 'Output

Here we have the usual suspects accounted for. What’s most interesting about these definitions is that they’re all indexed by k, which describes their kinds. These are not Haskell kinds, but “GraphQL kinds,” which have the following definition:

data Kind = Both | Input | Output

The GraphQL specification does not actually use the word “kinds” when it discusses the classification of types into input types and output types, but the name is apt. Since GraphQL types are Haskell values, it follows naturally that GraphQL kinds should be Haskell types—everything is shifted one level down—and that is precisely the case. The Kind datatype is used with DataKinds as a type-level index to TypeInfo, and this classification is used to enforce various invariants.
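As a rough illustration of how the kind index enforces invariants, the following sketch uses a cut-down TypeInfo whose field lists are plain names rather than full definitions. The functions describe and inputFieldNames are hypothetical and exist only for this example.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

data Kind = Both | Input | Output

-- Simplified: fields are just names here, not Definition (FieldInfo k).
data TypeInfo (k :: Kind) where
  TIScalar      :: TypeInfo 'Both
  TIInputObject :: [String] -> TypeInfo 'Input
  TIObject      :: [String] -> TypeInfo 'Output

-- Works for a TypeInfo of any kind.
describe :: TypeInfo k -> String
describe TIScalar           = "scalar"
describe (TIInputObject fs) = "input object with fields " ++ show fs
describe (TIObject fs)      = "output object with fields " ++ show fs

-- Only accepts input objects; passing a TIObject is a compile-time error,
-- and the single equation below is exhaustive for kind 'Input.
inputFieldNames :: TypeInfo 'Input -> [String]
inputFieldNames (TIInputObject fs) = fs
```

The point of the index is that a mistake such as using an output object where an input type is expected is rejected by the Haskell type checker rather than discovered at runtime.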

Parsers

As mentioned earlier, the Parser type is at the center of the API, and it has the following signature:

data Parser (k :: Kind) (m :: * -> *) (a :: *)

Here, the k parameter describes the Type associated with the Parser, and it also determines what the input to the parser will be:

runParser :: Parser k m a -> ParserInput k -> m a

type family ParserInput k where
  ParserInput 'Both = Value Variable
  ParserInput 'Input = Value Variable
  ParserInput 'Output = SelectionSet Variable

(The fact that we can treat 'Both the same as 'Input here is something of a happy coincidence that we can use to simplify the types; comments in the code explain in more detail.)
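A small sketch of the same trick in isolation may help: a closed type family picks the input type from the kind index, so one runner function covers both input parsers and selection set parsers. InputValue, Selections, ToyInput, and ToyParser are hypothetical stand-ins, not the types used in this PR.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

data Kind = Both | Input | Output

-- Hypothetical stand-ins for Value Variable and SelectionSet Variable.
newtype InputValue = InputValue String deriving (Show, Eq)
newtype Selections = Selections [String] deriving (Show, Eq)

-- The kind index decides what a parser consumes.
type family ToyInput (k :: Kind) where
  ToyInput 'Both   = InputValue
  ToyInput 'Input  = InputValue
  ToyInput 'Output = Selections

newtype ToyParser (k :: Kind) a = ToyParser
  { runToy :: ToyInput k -> Either String a }

-- A 'Both parser consumes an input value.
scalarText :: ToyParser 'Both String
scalarText = ToyParser $ \(InputValue s) -> Right s

-- An 'Output parser consumes a selection set.
selectedFields :: ToyParser 'Output [String]
selectedFields = ToyParser $ \(Selections fs) -> Right fs
```

Calling runToy scalarText with a Selections value does not type check, because ToyInput 'Both reduces to InputValue.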

Parsing atoms

Parsers are constructed using some parser combinator functions. The “atomic” parser constructors are scalar and enum:

scalar
  :: MonadParse m
  => Name
  -> Maybe Description
  -> ScalarRepresentation a
  -> Parser 'Both m a

data ScalarRepresentation a where
  SRBoolean :: ScalarRepresentation Bool
  SRInt :: ScalarRepresentation Int32
  SRFloat :: ScalarRepresentation Double
  SRString :: ScalarRepresentation Text

enum
  :: MonadParse m
  => Name
  -> Maybe Description
  -> NonEmpty (Definition EnumValueInfo, a)
  -> Parser 'Both m a

Other combinators accept parsers as input to produce new parsers. For example, parsers for list types and nullable types can be created with list and nullable, respectively:

nullable :: (MonadParse m, 'Input <: k) => Parser k m a -> Parser k m (Maybe a)
list :: (MonadParse m, 'Input <: k) => Parser k m a -> Parser k m [a]

These type signatures introduce a use of the <: operator, which is a subtyping constraint on kinds. The constraint 'Input <: k specifies that k can be either 'Input or 'Both but cannot be 'Output.
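One way to picture such a constraint is as a multi-parameter type class with an instance for each allowed pair of kinds. The sketch below uses a hypothetical prefix-named class SubKind in place of <:, and a ToyParser that only records a description; it is an illustration of the idea and not necessarily how <: is defined in this PR.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}

data Kind = Both | Input | Output

-- Hypothetical stand-in for (<:): one instance per allowed pair.
class SubKind (sub :: Kind) (sup :: Kind)
instance SubKind 'Input 'Input
instance SubKind 'Input 'Both
instance SubKind 'Output 'Output
instance SubKind 'Output 'Both
instance SubKind 'Both 'Both

-- A parser reduced to a description; the kind and result type are phantom.
newtype ToyParser (k :: Kind) a = ToyParser { toyDescription :: String }

int :: ToyParser 'Both Int
int = ToyParser "Int"

-- k may be 'Input or 'Both, but there is no instance SubKind 'Input 'Output.
nullable :: SubKind 'Input k => ToyParser k a -> ToyParser k (Maybe a)
nullable (ToyParser d) = ToyParser ("nullable " ++ d)

list :: SubKind 'Input k => ToyParser k a -> ToyParser k [a]
list (ToyParser d) = ToyParser ("list of " ++ d)
```

With these instances, nullable (list int) type checks, while applying nullable to a ToyParser 'Output fails with a missing-instance error at compile time.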

Parsing input objects

One of the most important combinators is object, which builds a parser capable of parsing input objects:

object
  :: MonadParse m
  => Name
  -> Maybe Description
  -> FieldsParser 'Input m a
  -> Parser 'Input m a

The object combinator uses a separate FieldsParser abstraction to specify the fields of the input object. A FieldsParser is similar to a Parser, but a key difference is that FieldsParser has an Applicative instance, while Parser only has a Functor instance. This is because any two FieldsParsers can be combined to create a bigger FieldsParser that parses a union of their fields.

Field parsers are created using combinators that accept ordinary Parsers as arguments. The simplest of these is field:

field
  :: (MonadParse m, 'Input <: k)
  => Name
  -> Maybe Description
  -> Parser k m a
  -> FieldsParser 'Input m a

This builds a FieldsParser that parses exactly one field, using the given Parser to parse its value. If the field’s type is nullable, the field will be optional with a default value of null; otherwise, the field will be required. If a default value other than null is desired, the fieldWithDefault combinator can be used instead:

fieldWithDefault
  :: (MonadParse m, 'Input <: k)
  => Name
  -> Maybe Description
  -> Value Void -- ^ default value
  -> Parser k m a
  -> FieldsParser 'Input m a
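The three behaviours described above (required, optional with a null default, and optional with an explicit default) can be sketched over a plain association list. The functions requiredField, nullableField, and defaultedField are hypothetical and only model what happens when a field is absent; they are not the combinators from this PR.

```haskell
-- Hypothetical, minimal input values.
data Value = VInt Int | VNull
  deriving (Show, Eq)

type InputObject = [(String, Value)]

-- Non-nullable field without a default: absence is an error.
requiredField :: String -> InputObject -> Either String Value
requiredField name obj =
  maybe (Left ("missing required field " ++ name)) Right (lookup name obj)

-- Nullable field: absence behaves like an explicit null.
nullableField :: String -> InputObject -> Either String Value
nullableField name obj = Right (maybe VNull id (lookup name obj))

-- Field with an explicit default: absence yields the default value.
defaultedField :: String -> Value -> InputObject -> Either String Value
defaultedField name def obj = Right (maybe def id (lookup name obj))
```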

Field parsers are combined using the Applicative instance, which allows them to be used with ApplicativeDo and sequenceA, both of which are very useful for defining composite parsers. For example, a parser for comparison expressions can be defined this way:

let name = getName columnParser <> $$(litName "_comparison_exp")
object name Nothing $ catMaybes <$> sequenceA
  [ field $$(litName "_cast") Nothing (ACast <$> castParser)
  , field $$(litName "_eq") Nothing (AEQ True . mkParameter <$> columnParser)
  , field $$(litName "_ne") Nothing (ANE True . mkParameter <$> columnParser)
  -- etc.
  ]
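To see why an Applicative instance is a natural fit for field parsers, consider this toy version, which tracks the field names it knows about and parses from an association list. ToyFields, intField, and point are hypothetical; the real FieldsParser also carries type information and runs in a MonadParse monad.

```haskell
-- A toy fields parser: the names it declares, plus a parsing function.
data ToyFields a = ToyFields
  { fieldNames :: [String]
  , runFields  :: [(String, Int)] -> Either String a
  }

instance Functor ToyFields where
  fmap f (ToyFields names run) = ToyFields names (fmap f . run)

-- Combining two parsers unions their declared fields and runs both.
instance Applicative ToyFields where
  pure x = ToyFields [] (const (Right x))
  ToyFields names1 runF <*> ToyFields names2 runX =
    ToyFields (names1 ++ names2) (\input -> runF input <*> runX input)

-- Parses exactly one required integer field.
intField :: String -> ToyFields Int
intField name = ToyFields [name] $ \input ->
  maybe (Left ("missing field " ++ name)) Right (lookup name input)

-- A composite parser built purely with the Applicative interface.
point :: ToyFields (Int, Int)
point = (,) <$> intField "x" <*> intField "y"
```

Because the set of fields is known without running the parser (fieldNames point lists both fields), the schema for the enclosing object can be generated up front. A Monad instance would not allow that, since later fields could depend on earlier parsed values.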

Parsing selection sets

Finally, there are combinators for parsing selection sets. The selectionSet combinator is essentially the same as object, just for output objects instead of input objects:

selectionSet
  :: MonadParse m
  => Name
  -> Maybe Description
  -> FieldsParser 'Output m a
  -> Parser 'Output m a

Output field parsers are a little more complicated than input field parsers for two reasons:

  1. Output fields can have arguments.

  2. Output fields that return atomic values like scalars and enums don’t have subselection sets, but fields that return other selection sets do.

Both of these are handled by the selection function, which has the most complicated type of all the combinators:

selection
  :: forall k m a b
   . (MonadParse m, 'Output <: k)
  => Name
  -> Maybe Description
  -> FieldsParser 'Input m a
  -> Parser k m b
  -> FieldsParser 'Output m (Maybe (SelectionResult k a b))

type family SelectionResult k a b = r | r -> k a where
  SelectionResult 'Both   a _ = (Name, a)
  SelectionResult 'Output a b = (Name, a, b)

The extra FieldsParser 'Input argument is fairly straightforward (it parses the field’s arguments), but the SelectionResult return type is a bit trickier. It handles point 2 above by returning an extra result if the subparser is an output parser; if the subparser has kind 'Both (a scalar or an enum), it just returns the field’s arguments and only uses the subparser for its type.

This is a bit subtle to describe, but when using it, it mostly just works: the result of selection is just slightly different depending on the type of its subparser.
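As a small illustration of how the result shape depends on the kind, here is a cut-down version of the type family with String standing in for Name and without the injectivity annotation. The values leafResult and nestedResult are hypothetical examples.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

data Kind = Both | Input | Output

-- Simplified: String instead of Name, no injectivity annotation.
type family SelectionResult (k :: Kind) a b where
  SelectionResult 'Both   a b = (String, a)
  SelectionResult 'Output a b = (String, a, b)

-- A scalar field: only the field name and its parsed arguments.
leafResult :: SelectionResult 'Both Int ()
leafResult = ("age", 3)

-- An object field: name, parsed arguments, and the parsed subselection.
nestedResult :: SelectionResult 'Output Int [String]
nestedResult = ("friends", 10, ["name"])
```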

Tying the knot

As mentioned in the challenges section, one of the trickiest parts of the implementation involves mutually-recursive types. Mutual recursion didn’t cause any trouble when we represented a reference to a type as that type’s Name, but with the Parser abstraction, the reference must be more direct (since the Parser needs to actually call the Parser for the referenced type).

This requirement means that Parsers for mutually-recursive types must be represented as a cyclic data structure. If we were to build that data structure by hand, it would be an enormous pain: we’d have to very carefully thread around (lazy) references to all the parsers we construct. To make things easier, we use a MonadSchema class that mostly handles the details automatically using memoization.

When constructing a Parser for a recursive or mutually-recursive type, the constructor function should be wrapped in memoize or one of its variants:

memoize
  :: (HasCallStack, MonadSchema n m, Ord a, Typeable a, Typeable b, Typeable k)
  => TH.Name
  -> (a -> m (Parser k n b))
  -> (a -> m (Parser k n b))

This type signature looks very scary, but it’s actually not nearly as intimidating as it looks. The HasCallStack constraint is just for error reporting, and the Typeable constraints just get magically solved behind the scenes by the compiler. That means this function morally has a type signature like this:

memoize
  :: (MonadSchema n m, Ord a)
  => TH.Name
  -> (a -> m (Parser k n b))
  -> (a -> m (Parser k n b))

Explaining how to use it is best done through an example. Consider the types of GraphQL boolean expressions, which are recursive:

input type_bool_exp {
  _or: [type_bool_exp!]
  _and: [type_bool_exp!]
  _not: type_bool_exp
  ...
}

A naïve implementation of a Parser for this type might look like this:

boolExp
  :: (MonadError QErr m, MonadParse n)
  => QualifiedTable -> m (Parser 'Input n BoolExp)
boolExp tableName = do
  name <- ...
  tableFieldParsers <- ...
  recur <- boolExp tableName -- direct recursion!
  pure $ BoolAnd <$> P.object name Nothing $ sequenceA
    [ P.fieldOptional $$(G.litName "_or") Nothing (BoolOr <$> P.list recur)
    , P.fieldOptional $$(G.litName "_and") Nothing (BoolAnd <$> P.list recur)
    , P.fieldOptional $$(G.litName "_not") Nothing (BoolNot <$> recur)
    ] ++ tableFieldParsers

Note the direct recursive call to boolExp. Because boolExp is monadic (it does some error checking during schema construction, so it must be), laziness won’t help us here. If we were to run this, we would just loop forever!

One solution would be to use MonadFix to add a little explicit sharing, which would work okay for this simple self-recursive example. Unfortunately, it’s much less nice for mutually-recursive types, since it becomes unclear where to put the call to mfix. To avoid that problem, we can use memoize:

boolExp
  :: (MonadSchema n m, MonadError QErr m)
  => QualifiedTable -> m (Parser 'Input n BoolExp)
boolExp = P.memoize 'boolExp \tableName -> do -- explicit memoization
  name <- ...
  tableFieldParsers <- ...
  recur <- boolExp tableName -- recursion will be memoized
  ...

Now each call to boolExp will be cached, and further calls with the same QualifiedTable argument will return the same Parser. The TH.Name argument to memoize is a bit of a trick: we need to somehow distinguish different parser constructors from each other, and Template Haskell name quotations provide a convenient, statically-known unique key. The actual name doesn’t have any significance; it’s just used for its Ord instance.
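For intuition about how memoization can tie the knot, here is a self-contained sketch in IO that uses an IORef cache and fixIO from base. Everything in it is an assumption made for illustration: the cache is keyed by a plain String rather than a TH.Name plus argument, Parser is a bare function, and the real MonadSchema implementation may work quite differently.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import System.IO (fixIO)

-- Toy input values and the expression type we parse into.
data Value = VBool Bool | VList [Value] | VObject [(String, Value)]

data BoolExp = BoolLit Bool | BoolNot BoolExp | BoolAnd [BoolExp]
  deriving (Show, Eq)

newtype Parser a = Parser { runParser :: Value -> Either String a }

-- Hypothetical cache: an association list from key to (lazy) parser.
type Cache a = IORef [(String, Parser a)]

newCache :: IO (Cache a)
newCache = newIORef []

-- On a cache miss, register a lazy reference to the parser *before*
-- building it, so recursive calls find that reference instead of looping.
memoize :: Cache a -> String -> IO (Parser a) -> IO (Parser a)
memoize cache key build = do
  cached <- readIORef cache
  case lookup key cached of
    Just parser -> pure parser
    Nothing -> fixIO $ \parser -> do
      modifyIORef cache ((key, parser) :)
      build

boolExp :: Cache BoolExp -> String -> IO (Parser BoolExp)
boolExp cache tableName = memoize cache tableName $ do
  recur <- boolExp cache tableName -- cache hit; recur is not forced here
  pure $ Parser $ \value -> case value of
    VBool b                      -> Right (BoolLit b)
    VObject [("_not", v)]        -> BoolNot <$> runParser recur v
    VObject [("_and", VList vs)] -> BoolAnd <$> traverse (runParser recur) vs
    _                            -> Left "unsupported boolean expression"
```

The sketch only works because recur is used inside the parsing function and never forced while the parser is being built; forcing it during construction would make fixIO fail. That is the same laziness discipline the cyclic Parser structure relies on.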

Remaining work

The Parser abstraction seems to work nicely, and I think the tricky cases in the interface are now worked out. However, there is still a lot of programming left to be done:

  • selectionSet needs to be improved to handle fragments. This shouldn’t be too hard, since we can still use the normalization strategy we currently do.

  • All the existing code needs to be ported to use the new abstractions. I do not expect this to be challenging, but I do expect it to take time.

Also, the current interface does not itself include support for union types and interfaces, so it only achieves feature parity with the current approach. I believe adding union types and interfaces should not be challenging, but I have not yet thought about it in detail.

@lexi-lambda lexi-lambda added the c/server Related to server label Mar 13, 2020
@netlify

netlify bot commented Mar 13, 2020

Deploy preview for hasura-docs ready!

Built with commit af72e73

https://deploy-preview-4111--hasura-docs.netlify.app

@0x777
Member

0x777 commented Apr 20, 2020

@lexi-lambda

There are three use cases related to Unions and Interfaces that need to be handled:

Allow executing top level fields from multiple sources in the same query.

Currently, when you have a remote schema, you could have a query as follows:

query q ($order_id: Int!) {
  order {
  }
  getPaymentdetail(..) {
  }
}

where order is a Postgres table but getPaymentDetail comes from an external remote schema.

What do we do currently?

We use a hacky approach to determine whether all of the root fields belong to the same source. If so, we allow the query; otherwise, we reject it.

If the source is a remote schema, we simply forward the request to the remote server without any validation.

The current approach is hacky because it doesn't handle fragment spreads on query_root:

fragment queryAndPayment on query_root {
  order {
  }
  getPaymentdetail(..) {
  }
}
query q ($order_id: Int!) {
  ...queryAndPayment
}

Queries such as these will result in internal errors.

How can we fix this?

I'm not sure, but I can share my thoughts on how this could be fixed with our current validation approach:

By the end of the validation phase, we have an AST representing a 'denormalized' GraphQL query (i.e., no fragments or variables). We can extend this AST to support denormalized Unions and Interfaces.

Consider this schema and query:

interface Attachment {
  name: String!
  location: Url!
}

type Picture implements Attachment {
  name: String!
  location: Url!
  thumbnail: Url!
  ...
}

type Document implements Attachment {
  name: String!
  location: Url!
  previewLink: Url!
  ...
}

type Query {
  getAttachments(emailId: ID!): [Attachment!]!
}

Imagine such a query:

fragment EmailAttachmentDetails on Attachment {
  ... on Picture {
    thumbnail
  }
  ... on Document {
    previewLink
  }
}

query {
  getAttachments(emailId: 1) {
    name
    location
    ... EmailAttachmentDetails
  }
}

This can be denormalized as follows:

query {
  getAttachments(emailId: 1) {
    name
    location

    ... on Picture {
      name
      location
      thumbnail
    }

    ... on Document {
      name
      location
      previewLink
    }
  }
}

i.e.,

  1. Get rid of named fragments and only use inline fragments.
  2. Merge all the common fields into the fragment of the implemented type in the correct order.

Consider this query for example:

fragment EmailAttachmentDetails on Attachment {
  ... on Picture {
    thumbnail
  }
  ... on Document {
    previewLink
  }
}

query {
  getAttachments(emailId: 1) {
    name
    ... EmailAttachmentDetails
    location
  }
}

The denormalized query should look like this:

query {
  getAttachments(emailId: 1) {
    name
    location

    ... on Picture {
      name
      thumbnail
      location
    }

    ... on Document {
      name
      previewLink
      location
    }
  }
}
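The merging step for the query above can be sketched as a single in-order pass per concrete type. The sketch assumes named fragments have already been replaced by inline fragments, and it ignores nested selection sets, aliases, and arguments. Selection, fieldsForType, and attachmentSelections are hypothetical names, and Picture and Document are the implementing types from the schema earlier in this comment.

```haskell
-- A flat selection: either a plain field or an inline fragment
-- with a type condition and the fields it selects.
data Selection
  = Field String
  | InlineFragment String [String]
  deriving (Show, Eq)

-- Walk the selections in order. Plain fields apply to every type;
-- fragment fields apply only when the type condition matches.
fieldsForType :: String -> [Selection] -> [String]
fieldsForType typeName = concatMap select
  where
    select (Field name) = [name]
    select (InlineFragment condition fields)
      | condition == typeName = fields
      | otherwise             = []

-- The second query above, after inlining EmailAttachmentDetails.
attachmentSelections :: [Selection]
attachmentSelections =
  [ Field "name"
  , InlineFragment "Picture" ["thumbnail"]
  , InlineFragment "Document" ["previewLink"]
  , Field "location"
  ]
```

Because the pass preserves the original position of the fragment spread, fieldsForType "Picture" attachmentSelections yields name, thumbnail, location in that order.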

The Selection Set of an interface can be represented by this product type:

type InterfaceSelectionSet = (List Field, Map InterfaceImplementation Field)

Similarly, a Union type's selection set can be captured with:

type UnionSelectionSet = Map UnionMember Field

So our current SelSet type changes to

data SelectionSet
  = SelSetObject !(Seq Field)
  | SelSetInterface !InterfaceSelectionSet
  | SelSetUnion !UnionSelectionSet
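A compilable approximation of these types might look as follows. It makes several assumptions that go beyond the pseudo-types above: fields are plain strings, association lists stand in for Map, and each implementation or union member maps to a list of fields rather than a single Field. The helper fieldsFor is hypothetical.

```haskell
type Field = String
type TypeName = String

-- Assumption: lists of pairs stand in for Map, and each concrete type
-- maps to a list of fields.
data SelectionSet
  = SelSetObject [Field]
  | SelSetInterface [Field] [(TypeName, [Field])]
  | SelSetUnion [(TypeName, [Field])]
  deriving (Show, Eq)

-- Which fields should be resolved for a value of the given concrete type?
fieldsFor :: TypeName -> SelectionSet -> [Field]
fieldsFor _ (SelSetObject fields) = fields
fieldsFor t (SelSetInterface common perType) =
  maybe common id (lookup t perType)
fieldsFor t (SelSetUnion perType) =
  maybe [] id (lookup t perType)

-- The denormalized getAttachments selection from the first example.
attachments :: SelectionSet
attachments = SelSetInterface
  ["name", "location"]
  [ ("Picture",  ["name", "location", "thumbnail"])
  , ("Document", ["name", "location", "previewLink"])
  ]
```

For an interface, a concrete type without its own entry falls back to the common fields; for a union, it selects nothing.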

If we can get rid of named fragments from the query_root, it should be straightforward to generate the correct execution plan based on the list of Fields.

Allow specifying permissions on remote schema types

We would like to support specifying permissions on remote schema's types. This is useful in cases where the 'remote schema' is not under our user's control, say for example Stripe's API. In this case our user may want to hide fields of certain types for certain roles (like column permissions on tables). Further, we may want to add argument presets like column presets. Currently the way we enforce permissions is at the validation layer, which means that we'll need to validate interfaces and unions.

If we generate the correct schema, the same approach as above could solve this problem too.

Relay

WIP: I'll need some time to first implement the Global Object Specification before I can share any insights on this.

codingkarthik and others added 22 commits July 23, 2020 20:10
…a#3239) (hasura#4551)

* Add support for multiple top-level fields in a subscription to improve testability of subscriptions

* Add an internal flag to enable multiple subscriptions

* Add missing call to withConstructorFn in live queries (fix hasura#3239)

Co-authored-by: Alexis King <lexi.lambda@gmail.com>
server: add scheduled triggers

Co-authored-by: Alexis King <lexi.lambda@gmail.com>
Co-authored-by: Marion Schleifer <marion@hasura.io>
Co-authored-by: Karthikeyan Chinnakonda <karthikeyan@hasura.io>
Co-authored-by: Aleksandra Sikora <ola.zxcvbnm@gmail.com>
…asura#4661)

Introspection queries accept variables, but we need to make sure to
also touch the variables that we ignore, so that an introspection
query is marked not reusable if we are not able to build a correct
query plan for it.

A better solution here would be to deal with such unused variables
correctly, so that more introspection queries become reusable.

An even better solution would be to type-safely track *how* to reuse
which variables, rather than to split the reusage marking from the
planning.

Co-authored-by: Tirumarai Selvan <tiru@hasura.io>
…#4801)

* flush log buffer on exception in mkWaiApp

* add comment to explain the introduced change

* add changelog
* changes for poller-log

add various multiplexed query info in poller-log

* minor cleanup, also fixes a bug which will return duplicate data

* Live query poller stats can now be logged

This also removes in-memory stats that are collected about batched
query execution as the log lines when piped into an monitoring tool
will give us better insights.

* allow poller-log to be configurable

* log minimal information in the livequery-poller-log

Other information can be retrieved from /dev/subscriptions/extended

* fix few review comments

* avoid marshalling and unmarshalling from ByteString to EncJSON

* separate out SubscriberId and SubscriberMetadata

Co-authored-by: Anon Ray <rayanon004@gmail.com>
Store the admin secret only as a hash to prevent leaking the secret
inadvertently, and to prevent timing attacks on the secret.

NOTE: best practice for stored user passwords is a function with a
tunable cost like bcrypt, but our threat model is quite different (even
if we thought we could reasonably protect the secret from an attacker
who could read arbitrary regions of memory), and bcrypt is far too slow
(by design) to perform on each request. We'd have to rely on our
(technically savvy) users to choose high entropy passwords in any case.

Referencing hasura#4736
…reSQL <= 11 (hasura#5187)

This adds a server flag, --pg-connection-options, that can be used to set a PostgreSQL connection parameter, extra_float_digits, that needs to be used to avoid loss of data on older versions of PostgreSQL, which have odd default behavior when returning float values. (fixes hasura#5092)
Co-authored-by: Vamshi Surabhi <vamshi@hasura.io>
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>
* generalize PGExecCtx to support specialized functions for various operations

* fix tests compilation

* allow customising PGExecCtx when starting the web server
@hasura-bot
Contributor

Review app for commit c350585 deployed to Heroku: https://hge-ci-pull-4111.herokuapp.com
Docker image for server: hasura/graphql-engine:pull4111-c3505857

@hasura-bot
Contributor

Review app for commit 07f511d deployed to Heroku: https://hge-ci-pull-4111.herokuapp.com
Docker image for server: hasura/graphql-engine:pull4111-07f511de

Co-authored-by: Auke Booij <auke@tulcod.com>
@hasura-bot
Contributor

Review app for commit 4fd8e4d deployed to Heroku: https://hge-ci-pull-4111.herokuapp.com
Docker image for server: hasura/graphql-engine:pull4111-4fd8e4d6

@hasura-bot
Contributor

Review app for commit c177d88 deployed to Heroku: https://hge-ci-pull-4111.herokuapp.com
Docker image for server: hasura/graphql-engine:pull4111-c177d886

@hasura-bot
Contributor

Review app for commit 07e38e8 deployed to Heroku: https://hge-ci-pull-4111.herokuapp.com
Docker image for server: hasura/graphql-engine:pull4111-07e38e8e

@hasura-bot
Contributor

Review app for commit 8c7d38d deployed to Heroku: https://hge-ci-pull-4111.herokuapp.com
Docker image for server: hasura/graphql-engine:pull4111-8c7d38d7

@hasura-bot
Contributor

Review app for commit f930962 deployed to Heroku: https://hge-ci-pull-4111.herokuapp.com
Docker image for server: hasura/graphql-engine:pull4111-f9309627

Member

@tirumaraiselvan tirumaraiselvan left a comment

changelog

@hasura-bot
Contributor

Review app for commit 69c4c6a deployed to Heroku: https://hge-ci-pull-4111.herokuapp.com
Docker image for server: hasura/graphql-engine:pull4111-69c4c6a9

@lexi-lambda lexi-lambda merged commit 7e97017 into hasura:master Aug 21, 2020
19 checks passed
@hasura-bot
Contributor

Review app https://hge-ci-pull-4111.herokuapp.com is deleted

stevefan1999-personal pushed a commit to stevefan1999-personal/graphql-engine that referenced this pull request Sep 12, 2020
… (hasura#4111)

Aka “the PDV refactor.” History is preserved on the branch 2801-graphql-schema-parser-refactor.

* [skip ci] remove stale benchmark commit from commit_diff

* [skip ci] Check for root field name conflicts between remotes

* [skip ci] Additionally check for conflicts between remotes and DB

* [skip ci] Check for conflicts in schema when tracking a table

* [skip ci] Fix equality checking in GraphQL AST

* server: fix mishandling of GeoJSON inputs in subscriptions (fix hasura#3239) (hasura#4551)

* Add support for multiple top-level fields in a subscription to improve testability of subscriptions

* Add an internal flag to enable multiple subscriptions

* Add missing call to withConstructorFn in live queries (fix hasura#3239)

Co-authored-by: Alexis King <lexi.lambda@gmail.com>

* Scheduled triggers (close hasura#1914) (hasura#3553)

server: add scheduled triggers

Co-authored-by: Alexis King <lexi.lambda@gmail.com>
Co-authored-by: Marion Schleifer <marion@hasura.io>
Co-authored-by: Karthikeyan Chinnakonda <karthikeyan@hasura.io>
Co-authored-by: Aleksandra Sikora <ola.zxcvbnm@gmail.com>

* dev.sh: bump version due to addition of croniter python dependency

* server: fix an introspection query caching issue (fix hasura#4547) (hasura#4661)

Introspection queries accept variables, but we need to make sure to
also touch the variables that we ignore, so that an introspection
query is marked not reusable if we are not able to build a correct
query plan for it.

A better solution here would be to deal with such unused variables
correctly, so that more introspection queries become reusable.

An even better solution would be to type-safely track *how* to reuse
which variables, rather than to split the reusage marking from the
planning.

Co-authored-by: Tirumarai Selvan <tiru@hasura.io>

* flush log buffer on exception in mkWaiApp ( fix hasura#4772 ) (hasura#4801)

* flush log buffer on exception in mkWaiApp

* add comment to explain the introduced change

* add changelog

* allow logging details of a live query polling thread (hasura#4959)

* changes for poller-log

add various multiplexed query info in poller-log

* minor cleanup, also fixes a bug which will return duplicate data

* Live query poller stats can now be logged

This also removes in-memory stats that are collected about batched
query execution as the log lines when piped into an monitoring tool
will give us better insights.

* allow poller-log to be configurable

* log minimal information in the livequery-poller-log

Other information can be retrieved from /dev/subscriptions/extended

* fix few review comments

* avoid marshalling and unmarshalling from ByteString to EncJSON

* separate out SubscriberId and SubscriberMetadata

Co-authored-by: Anon Ray <rayanon004@gmail.com>

* Don't compile in developer APIs by default

* Tighten up handling of admin secret, more docs

Store the admin secret only as a hash to prevent leaking the secret
inadvertently, and to prevent timing attacks on the secret.

NOTE: best practice for stored user passwords is a function with a
tunable cost like bcrypt, but our threat model is quite different (even
if we thought we could reasonably protect the secret from an attacker
who could read arbitrary regions of memory), and bcrypt is far too slow
(by design) to perform on each request. We'd have to rely on our
(technically savvy) users to choose high entropy passwords in any case.

Referencing hasura#4736

* server/docs: add instructions to fix loss of float precision in PostgreSQL <= 11 (hasura#5187)

This adds a server flag, --pg-connection-options, that can be used to set a PostgreSQL connection parameter, extra_float_digits, that needs to be used to avoid loss of data on older versions of PostgreSQL, which have odd default behavior when returning float values. (fixes hasura#5092)

* [skip ci] Add new commits from master to the commit diff

* [skip ci] serve default directives (skip & include) over introspection

* [skip ci] Update non-Haskell assets with the version on master

* server: refactor GQL execution check and config API (hasura#5094)

Co-authored-by: Vamshi Surabhi <vamshi@hasura.io>
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* [skip ci] fix js issues in tests by pinning dependencies version

* [skip ci] bump graphql version

* [skip ci] Add note about memory usage

* generalize query execution logic on Postgres (hasura#5110)

* generalize PGExecCtx to support specialized functions for various operations

* fix tests compilation

* allow customising PGExecCtx when starting the web server

* server: changes catalog initialization and logging for pro customization (hasura#5139)

* new typeclass to abstract the logic of QueryLog-ing

* abstract the logic of logging websocket-server logs

  introduce a MonadWSLog typeclass

* move catalog initialization to init step

  expose a helper function to migrate catalog
  create schema cache in initialiseCtx

* expose various modules and functions for pro

* [skip ci] cosmetic change

* [skip ci] fix test calling a mutation that does not exist

* [skip ci] minor text change

* [skip ci] refactored input values

* [skip ci] remove VString Origin

* server: fix updating of headers behaviour in the update cron trigger API and create future events immediately (hasura#5151)

* server: fix bug to update headers in an existing cron trigger and create future events

Co-authored-by: Tirumarai Selvan <tiru@hasura.io>

* Lower stack chunk size in RTS to reduce thread STACK memory (closes hasura#5190)

This reduces memory consumption for new idle subscriptions significantly
(see linked ticket).

The hypothesis is: we fork a lot of threads per websocket, and some of
these use slightly more than the initial 1K stack size, so the first
overflow balloons to 32K, when significantly less is required.

However: running with `+RTS -K1K -xc` did not seem to show evidence of
any overflows! So it's a mystery why this improves things.

GHC should probably also be doubling the stack buffer at each overflow
or doing something even smarter; the knobs we have aren't so helpful.

* [skip ci] fix todo and schema generation for aggregate fields

* 5087 libpq pool leak (hasura#5089)

Shrink libpq buffers to 1MB before returning connection to pool. Closes hasura#5087

See: hasura/pg-client-hs#19

Also related: hasura#3388 hasura#4077

* bump pg-client-hs version (fixes a build issue on some environments) (hasura#5267)

* do not use prepared statements for mutations

* server: unlock scheduled events on graceful shutdown (hasura#4928)

* Fix buggy parsing of new --conn-lifetime flag in 2b0e377

* [skip ci] remove cherry-picked commit from commit_diff.txt

* server: include additional fields in scheduled trigger webhook payload (hasura#5262)

* include scheduled triggers metadata in the webhook body

Co-authored-by: Tirumarai Selvan <tiru@hasura.io>

* server: call the webhook asynchronously in event triggers (hasura#5352)

* server: call the webhook asynchronosly in event triggers

* Expose all modules in Cabal file (hasura#5371)

* [skip ci] update commit_diff.txt

* [skip ci] fix cast exp parser & few TODOs

* [skip ci] fix remote fields arguments

* [skip ci] fix few more TODO, no-op refactor, move resolve/action.hs to execute/action.hs

* Pass environment variables around as a data structure, via @sordina (hasura#5374)

* Pass environment variables around as a data structure, via @sordina

* Resolving build error

* Adding Environment passing note to changelog

* Removing references to ILTPollerLog as this seems to have been reintroduced from a bad merge

* removing commented-out imports

* Language pragmas already set by project

* Linking async thread

* Apply suggestions from code review

Use `runQueryTx` instead of `runLazyTx` for queries.

* remove the non-user facing entry in the changelog

Co-authored-by: Phil Freeman <paf31@cantab.net>
Co-authored-by: Phil Freeman <phil@hasura.io>
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* [skip ci] fix: restrict remote relationship field generation for hasura queries

* [skip ci] no-op refactor; move insert execution code from schema parser module

* server: call the webhook asynchronously in event triggers (hasura#5352)

* server: call the webhook asynchronously in event triggers

* Expose all modules in Cabal file (hasura#5371)

* [skip ci] update commit_diff.txt

* Pass environment variables around as a data structure, via @sordina (hasura#5374)

* Pass environment variables around as a data structure, via @sordina

* Resolving build error

* Adding Environment passing note to changelog

* Removing references to ILTPollerLog as this seems to have been reintroduced from a bad merge

* removing commented-out imports

* Language pragmas already set by project

* Linking async thread

* Apply suggestions from code review

Use `runQueryTx` instead of `runLazyTx` for queries.

* remove the non-user facing entry in the changelog

Co-authored-by: Phil Freeman <paf31@cantab.net>
Co-authored-by: Phil Freeman <phil@hasura.io>
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* [skip ci] implement header checking

Probably closes hasura#14 and hasura#3659.

* server: refactor 'pollQuery' to have a hook to process 'PollDetails' (hasura#5391)

Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* update pg-client (hasura#5421)

* [skip ci] update commit_diff

* Fix latency buckets for telemetry data

These must have gotten messed up during a refactor. As a consequence
almost all samples received so far fall into the single erroneous 0 to
1K seconds (originally supposed to be 1ms?) bucket.

I also re-thought what the numbers should be, but these are still
arbitrary and might want adjusting in the future.

* [skip ci] include the latest commit compared against master in commit_diff

* [skip ci] include new commits from master in commit_diff

* [skip ci] improve description generation

* [skip ci] sort all introspect arrays

* [skip ci] allow parsers to specify error codes

* [skip ci] fix integer and float parsing error code

* [skip ci] scalar from json errors are now parse errors

* [skip ci] fixed negative integer error message and code

* [skip ci] Re-fix nullability in relationships

* [skip ci] no-op refactor and removed couple of FIXMEs

* [skip ci] uncomment code in 'deleteMetadataObject'

* [skip ci] Fix re-fix of nullability for relationships

* [skip ci] fix default arguments error code

* [skip ci] updated test error message

!!! WARNING !!!
Since all fields accept `null`, they all are technically optional in
the new schema. Meaning there's no such thing as a missing mandatory
field anymore: a field that doesn't have a default value, and which
therefore isn't labelled as "optional" in the schema, will be assumed
to be null if it's missing, meaning it isn't possible anymore to have
an error for a missing mandatory field. The only possible error is now
when an optional positional argument is omitted but is not the last
positional argument.
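
The fallback rule described in this warning can be sketched as follows. This is only an illustration under simplifying assumptions, not the parser code from this branch: `Value`, `FieldSpec`, and `resolveField` are hypothetical names, and the positional-argument check is left out.

```haskell
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)

-- Hypothetical, simplified input value type (not the branch's InputValue).
data Value = VNull | VInt Int | VString String
  deriving (Eq, Show)

-- Hypothetical field description: a name and an optional default value.
data FieldSpec = FieldSpec
  { fieldName    :: String
  , fieldDefault :: Maybe Value
  }

-- Resolve one field against the values the user supplied.
-- A missing field falls back to its default if it has one, and to
-- null otherwise, so a "missing mandatory field" error cannot arise.
resolveField :: Map.Map String Value -> FieldSpec -> Value
resolveField supplied (FieldSpec name def) =
  case Map.lookup name supplied of
    Just v  -> v
    Nothing -> fromMaybe VNull def
```

With this shape, a field without a default and a field with a `null` default are indistinguishable once the field is omitted, which is why only the positional-argument case can still produce an error.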

* [skip ci] cleanup of int scalar parser

* [skip ci] retro-compatibility of offset as string

* [skip ci] Remove commit from commit_diff.txt

Although strictly speaking we don't know if this will work correctly in PDV
if we would implement query plan caching, the fact is that in the theoretical
case that we would have the same issue in PDV, it would probably apply not just
to introspection, and the fix would be written completely differently.  So this
old commit is of no value to us other than the heads-up "make sure query plan
caching works correctly even in the presence of unused variables", which is
already part of the test suite.

* Add MonadTrace and MonadExecuteQuery abstractions (hasura#5383)

* [skip ci] Fix accumulation of input object types

Just like object types, interface types, and union types, we have to avoid
circularities when collecting input types from the GraphQL AST.

Additionally, this fixes equality checks for input object types (whose fields
are unordered, and hence should be compared as sets) and enum types (ditto).
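
Both points can be sketched in a few lines. The types below are hypothetical stand-ins for the real schema types, and `collect` is an illustrative name rather than the function in this branch: a visited set stops the traversal from looping on self-referential input objects, and equality compares fields as sets.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical, simplified model: scalar leaves and input objects
-- whose fields refer to other types by name.
data InputType
  = Scalar
  | InputObject [(String, String)] -- (field name, field type name)
  deriving (Show)

-- Field order is irrelevant for input objects, so compare as sets.
instance Eq InputType where
  Scalar == Scalar = True
  InputObject xs == InputObject ys = Set.fromList xs == Set.fromList ys
  _ == _ = False

type Schema = Map.Map String InputType

-- Collect every type name reachable from a root. Each name is
-- expanded at most once, so circular references terminate.
collect :: Schema -> String -> Set.Set String
collect schema root = go Set.empty [root]
  where
    go seen [] = seen
    go seen (n : rest)
      | n `Set.member` seen = go seen rest
      | otherwise =
          case Map.lookup n schema of
            Just (InputObject fields) ->
              go (Set.insert n seen) (map snd fields ++ rest)
            Just Scalar -> go (Set.insert n seen) rest
            Nothing -> go seen rest -- unknown names are skipped in this sketch
```

A type such as a boolean expression whose `_and` field refers back to itself is the typical case where the visited set matters.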

* [skip ci] fix fragment error path

* [skip ci] fix node error code

* [skip ci] fix paths in insert queries

* [skip ci] fix path in objects

* [skip ci] manually alter node id path for consistency

* [skip ci] more node error fixups

* [skip ci] one last relay error message fix

* [skip ci] update commit_diff

* Propagate the trace context to event triggers (hasura#5409)

* Propagate the trace context to event triggers

* Handle missing trace and span IDs

* Store trace context as one LOCAL

* Add migrations

* Documentation

* changelog

* Fix warnings

* Respond to code review suggestions

* Respond to code review

* Undo changelog

* Update CHANGELOG.md

Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* server: log request/response sizes for event triggers (hasura#5463)

* server: log request/response sizes for event triggers

  event triggers (and scheduled triggers) now have request/response size
  in their logs.

* add changelog entry

* Tracing: Simplify HTTP traced request (hasura#5451)

Remove the Inversion of Control (SuspendRequest) and simplify
the tracing of HTTP Requests.

Co-authored-by: Phil Freeman <phil@hasura.io>

* Attach request ID as tracing metadata (hasura#5456)

* Propagate the trace context to event triggers

* Handle missing trace and span IDs

* Store trace context as one LOCAL

* Add migrations

* Documentation

* Include the request ID as trace metadata

* changelog

* Fix warnings

* Respond to code review suggestions

* Respond to code review

* Undo changelog

* Update CHANGELOG.md

* Typo

Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* server: add logging for action handlers (hasura#5471)

* server: add logging for action handlers

* add changelog entry

* change action-handler log type from internal to non-internal

* fix action-handler-log name

* server: pass http and websocket request to logging context (hasura#5470)

* pass request body to logging context in all cases

* add message size logging on the websocket API

  this is required by graphql-engine-pro/hasura#416

* message size logging on websocket API

  As we need to log all messages received/sent by the websocket server,
  it makes sense to log them as part of the websocket server event logs.
  Previously, messages received were logged inside the onMessage handler,
  and messages sent were logged only for "data" messages (as a server event log)

* fix review comments

Co-authored-by: Phil Freeman <phil@hasura.io>

* server: stop eventing subsystem threads when shutting down (hasura#5479)

* server: stop eventing subsystem threads when shutting down

* Apply suggestions from code review

Co-authored-by: Karthikeyan Chinnakonda <chkarthikeyan95@gmail.com>

Co-authored-by: Phil Freeman <phil@hasura.io>
Co-authored-by: Phil Freeman <paf31@cantab.net>
Co-authored-by: Karthikeyan Chinnakonda <chkarthikeyan95@gmail.com>

* [skip ci] update commit_diff with new commits added in master

* Bugfix to support 0-size HASURA_GRAPHQL_QUERY_PLAN_CACHE_SIZE

Also some minor refactoring of bounded cache module:
 - the maxBound check in `trim` was confusing and unnecessary
 - consequently trim was unnecessary for lookupPure

Also add some basic tests

* Support only the bounded cache, with default HASURA_GRAPHQL_QUERY_PLAN_CACHE_SIZE of 4000. Closes hasura#5363

* [skip ci] remove merge commit from commit_diff

* server: Fix compiler warning caused by GHC upgrade (hasura#5489)

Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* [skip ci] update all non server code from master

* [skip ci] aligned object field error message with master

* [skip ci] fix remaining undefined?

* [skip ci] remove unused import

* [skip ci] revert to previous error message, fix tests

* Move nullableType/nonNullableType to Schema.hs

These are functions on Types, not on Parsers.

* [skip ci] fix setup to fix backend only test

the order in which permission checks are performed on the branch is
slightly different than on master, resulting in a slightly different
error if there are no other mutations the user has access to. By
adding update permissions, we go back to the expected case.

* [skip ci] fix insert geojson tests to reflect new paths

* [skip ci] fix enum test for better error message

* [skip ci] fix header test for better error message

* [skip ci] fix fragment cycle test for better error message

* [skip ci] fix error message for type mismatch

* [skip ci] fix variable path in test

* [skip ci] adjust tests after bug fix

* [skip ci] more tests fixing

* Add hdb_catalog.current_setting abstraction for reading Hasura settings

As the comment in the function’s definition explains, this is needed to
work around an awkward Postgres behavior.

* [skip ci] Update CONTRIBUTING.md to mention Node setup for Python tests

* [skip ci] Add missing Python tests env var to CONTRIBUTING.md

* [skip ci] fix order of result when subscription is run with multiple nodes

* [skip ci] no-op refactor: fix a warning in Internal/Parser.hs

* [skip ci] throw error when a subscription contains remote joins

* [skip ci] Enable easier profiling by hiding AssertNF behind a flag

In order to compile a profiling build, run:

$ cabal new-build -f profiling --enable-profiling

* [skip ci] Fix two warnings

We used to look up the objects that implement a given interface by filtering all
objects in the schema document.  However, one of the tests expects us to
generate a warning if the provided `implements` field of an introspection query
specifies an object not implementing some interface.  So we use that field
instead.

* [skip ci] Fix warnings by commenting out query plan caching

* [skip ci] improve masking/commenting query caching related code & few warning fixes

* [skip ci] Fixed compiler warnings in graphql-parser-hs

* Sync non-Haskell assets with master

* [skip ci] add a test inserting invalid GraphQL but valid JSON value in a jsonb column

* [skip ci] Avoid converting to/from Map

* [skip ci] Apply some hlint suggestions

* [skip ci] remove redundant constraints from buildLiveQueryPlan and explainGQLQuery

* [skip ci] add NOTEs about missing Tracing constraints in PDV from master

* Remove -fdefer-typed-holes, fix warnings

* Update cabal.project.freeze

* Limit GHC’s heap size to 8GB in CI to avoid the OOM killer

* Commit package-lock.json for Python tests’ remote schema server

* restrict env variables starting with HASURA_GRAPHQL_ for headers configuration in actions, event triggers & remote schemas (hasura#5519)

* restrict env variables starting with HASURA_GRAPHQL_ for headers definition in actions & event triggers

* update CHANGELOG.md

* Apply suggestions from code review

Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* add test for table_by_pk node when role doesn't have permission to PK

* [skip ci] fix introspection query if any enum column present in primary key (fix hasura#5200) (hasura#5522)

* [skip ci] test case fix for a6450e1

* [skip ci] add tests to agg queries when role doesn't have access to any cols

* fix backend test

* Simplify subscription execution

* [skip ci] add test to check if required headers are present while querying

* Suppose table B is related to table A, and querying B requires certain
  headers. The test checks that we throw an error when B is queried through A
  without those headers set.

* fix mutations not checking for view mutability

* [skip ci] add variable type checking and corresponding tests

* [skip ci] add test to check if update headers are present while doing an upsert

* [skip ci] add positive counterparts to some of the negative permission tests

* fix args missing their description in introspect

* [skip ci] Remove unused function; insert missing markNotReusable call

* [skip ci] Add a Note about InputValue

* [skip ci] Delete LegacySchema/ 🎉

* [skip ci] Delete GraphQL/{Resolve,Validate}/ 🎉

* [skip ci] Delete top-level Resolve/Validate modules; tidy .cabal file

* [skip ci] Delete LegacySchema top-level module

Somehow I missed this one.

* fix input value to json

* [skip ci] elaborate on JSON objects in GraphQL

* [skip ci] add missing file

* [skip ci] add a test with subscription containing remote joins

* add a test with remote joins in mutation output

* [skip ci] Add some comments to Schema/Mutation.hs

* [skip ci] Remove no longer needed code from RemoteServer.hs

* [skip ci] Use a helper function to generate conflict clause parsers

* [skip ci] fix type checker error in fields with default value

* capitalize the header keys in select_articles_without_required_headers

* Somehow, this was the reason the tests were failing. I have no idea why!

* [skip ci] Add a long Note about optional fields and nullability

* Improve comments a bit; simplify Schema/Common.hs a bit

* [skip ci] full implementation of 5.8.5 type checking.

* [skip ci] fix validation test teardown

* [skip ci] fix schema stitching test

* fix remote schema ignoring enum nullability

* [skip ci] fix fieldOptional to not discard nullability

* revert nullability of use_spheroid

* fix comment

* add required remote fields with arguments for tests

* [skip ci] add missing docstrings

* [skip ci] fixed description of remote fields

* [skip ci] change docstring for consistency

* fix several schema inconsistencies

* revert behaviour change in function arguments parsing

* fix remaining nullability issues in new schema

* minor no-op refactor; use isListType from graphql-parser-hs

* use nullability of remote schema node, while creating a Remote reln

* fix 'ID' input coercing & action 'ID' type relationship mapping

* include ASTs in MonadExecuteQuery

* needed for PRO code-base

* Delete code for "interfaces implementing ifaces" (draft GraphQL spec)

Previously I started writing some code that adds support for a future GraphQL
feature where interfaces may themselves be sub-types of other interfaces.
However, this code was incomplete, and partially incorrect.  So this commit
deletes support for that entirely.

* Ignore a remote schema test during the upgrade/downgrade test

The PDV refactor does a better job at exposing a minimal set of types through
introspection.  In particular, not every type that is present in a remote schema
is re-exposed by Hasura.  The test
test_schema_stitching.py::TestRemoteSchemaBasic::test_introspection assumed that
all types were re-exposed, which is not required for GraphQL compatibility, in
order to test some aspect of our support for remote schemas.

So while this particular test has been updated on PDV, the PDV branch now does
not pass the old test, which we argue to be incorrect.  Hence this test is
disabled while we await a release, after which we can re-enable it.

This also re-enables a test that was previously disabled for similar, though
unrelated, reasons.

* add haddock documentation to the action's field parsers

* Deselecting some tests in server-upgrade

Some tests fail on server upgrade with the current build, although
they should not: the response is now more accurate than
it was.

Also the upgrade tests were not throwing errors when the test is
expected to return an error, but succeeds. The test framework is
patched to catch this case.

* [skip ci] Add a long Note about interfaces and object types

* send the response headers back to client after running a query

* Deselect a few more tests during upgrade/downgrade test

* Update commit_diff.txt

* change log kind from db_migrate to catalog_migrate (hasura#5531)

* Show method and complete URI in traced HTTP calls (hasura#5525)

Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* restrict env variables starting with HASURA_GRAPHQL_ for headers configuration in actions, event triggers & remote schemas (hasura#5519)

* restrict env variables starting with HASURA_GRAPHQL_ for headers definition in actions & event triggers

* update CHANGELOG.md

* Apply suggestions from code review

Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>

* fix introspection query if any enum column present in primary key (fix hasura#5200) (hasura#5522)

* Fix telemetry reporting of transport (websocket was reported as http)

* add log kinds in cli-migrations image (hasura#5529)

* add log kinds in cli-migrations image

* give hint to resolve timeout error

* minor changes and CHANGELOG

* server: set hasura.tracecontext in RQL mutations [hasura#5542] (hasura#5555)

* server: set hasura.tracecontext in RQL mutations [hasura#5542]

* Update test suite

Co-authored-by: Tirumarai Selvan <tiru@hasura.io>

* Add bulldozer auto-merge and -update configuration

We still need to add the github app (as of time of opening this PR)

Afterwards devs should be able to allow bulldozer to automatically
"update" the branch, merging in parent when it changes, as well as
automatically merge when all checks pass.

This is opt-in by adding the `auto-update-auto-merge` label to the PR.

* Remove 'bulldozer' config, try 'kodiak' for auto-merge

see: https://github.com/chdsbd/kodiak

The main issue that bit us was not being able to auto update forked
branches, also:
palantir/bulldozer#66
palantir/bulldozer#145

* Cherry-picked all commits

* [skip ci] Slightly improve formatting

* Revert "fix introspection query if any enum column present in primary key (fix hasura#5200) (hasura#5522)"

This reverts commit 0f9a5af.

This undoes a cherry-pick of 34288e1 that was
already done previously in a6450e1, and
subsequently fixed for PDV in 70e89dc

* Do a small bit of tidying in Hasura.GraphQL.Parser.Collect

* Fix cherry-picking work

Some previous cherry-picks ended up modifying code that is commented out

* [skip ci] clarified comment regarding insert representation

* [skip ci] removed obsolete todos

* cosmetic change

* fix action error message

* [skip ci] remove obsolete comment

* [skip ci] synchronize stylish haskell extensions list

* use previously defined scalar names in parsers rather than ad-hoc literals

* Apply most syntax hlint hints.

* Clarify comment on update mutation.

* [skip ci] Clarify what fields should be specified for objects

* Update "_inc" description.

* Use record types rather than tuples for IntrospectionResult and ParsedIntrospection

* Get rid of checkFieldNamesUnique (use Data.List.Extended.duplicates)

* Throw more errors when collecting query root names

* [skip ci] clean column parser comment

* Remove dead code inserted in ab65b39

* avoid converting to non-empty list where not needed

* add note and TODO about the disabled checks in PDV

* minor refactor in remoteField' function

* Unify two getObject methods

* Nitpicks in Remote.hs

* Update CHANGELOG.md

* Revert "Unify two getObject methods"

This reverts commit bd6bb40.

We do need two different getObject functions as the corresponding error message is different

* Fix error message in Remote.hs

* Update CHANGELOG.md

Co-authored-by: Auke Booij <auke@tulcod.com>

* Apply suggested Changelog fix.

Co-authored-by: Auke Booij <auke@tulcod.com>

* Fix typo in Changelog.

* [skip ci] Update changelog.

* reuse type names to avoid duplication

* Fix Hashable instance for Definition

The presence of `Maybe Unique`, and an optional description, as part of
`Definition`s, means that `Definition`s that are considered `Eq`ual may get
different hashes.  This can happen, for instance, when one object is memoized
but another is not.
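
The invariant at stake is that values which compare equal must hash equally. A minimal sketch, assuming (as the message above implies) that equality ignores the memoization tag and the description: the record and `hashDefinition` below are hypothetical stand-ins, with a plain function in place of a real `Hashable` instance so the snippet needs only `base`.

```haskell
import Data.Char (ord)
import Data.List (foldl')

-- Hypothetical, simplified stand-in for the Definition type.
data Definition = Definition
  { dName        :: String
  , dUnique      :: Maybe Int    -- present only when the object was memoized
  , dDescription :: Maybe String
  }

-- Equality deliberately ignores the memoization tag and description.
instance Eq Definition where
  a == b = dName a == dName b

-- Stand-in for the hash function: it looks at exactly the fields
-- that Eq compares, so equal definitions always hash the same.
hashDefinition :: Definition -> Int
hashDefinition = foldl' (\h c -> h * 31 + ord c) 7 . dName
```

Hashing `dUnique` or `dDescription` as well would reintroduce the bug: a memoized and a non-memoized copy of the same definition would be equal yet land in different hash buckets.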

* [skip ci] Update commit_diff.txt

* Bump parser version.

* Bump freeze file after changes in parser.

* [skip ci] Incorporate commits from master

* Fix developer flag in server/cabal.project.freeze

Co-authored-by: Auke Booij <auke@tulcod.com>

* Deselect a changed ENUM test for upgrade/downgrade CI

* Deselect test here as well

* [skip ci] remove dead code

* Disable more tests for upgrade/downgrade

* Fix which test gets deselected

* Revert "Add hdb_catalog.current_setting abstraction for reading Hasura settings"

This reverts commit 66e85ab.

* Remove circular reference in cabal.project.freeze

Co-authored-by: Karthikeyan Chinnakonda <karthikeyan@hasura.io>
Co-authored-by: Auke Booij <auke@hasura.io>
Co-authored-by: Tirumarai Selvan <tiru@hasura.io>
Co-authored-by: Marion Schleifer <marion@hasura.io>
Co-authored-by: Aleksandra Sikora <ola.zxcvbnm@gmail.com>
Co-authored-by: Brandon Simmons <brandon.m.simmons@gmail.com>
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>
Co-authored-by: Anon Ray <rayanon004@gmail.com>
Co-authored-by: rakeshkky <12475069+rakeshkky@users.noreply.github.com>
Co-authored-by: Anon Ray <ecthiender@users.noreply.github.com>
Co-authored-by: Vamshi Surabhi <vamshi@hasura.io>
Co-authored-by: Antoine Leblanc <antoine@hasura.io>
Co-authored-by: Brandon Simmons <brandon@hasura.io>
Co-authored-by: Phil Freeman <phil@hasura.io>
Co-authored-by: Lyndon Maydwell <lyndon@sordina.net>
Co-authored-by: Phil Freeman <paf31@cantab.net>
Co-authored-by: Naveen Naidu <naveennaidu479@gmail.com>
Co-authored-by: Karthikeyan Chinnakonda <chkarthikeyan95@gmail.com>
Co-authored-by: Nizar Malangadan <nizar-m@users.noreply.github.com>
Co-authored-by: Antoine Leblanc <crucuny@gmail.com>
Co-authored-by: Auke Booij <auke@tulcod.com>
abooij (Contributor) commented Jul 20, 2021

This branch has been archived here.

hasura-bot pushed a commit that referenced this pull request Jul 27, 2021
Query plan caching was introduced by - I believe - #1934 in order to reduce the query response latency. During the development of PDV in #4111, it was found out that the new architecture (for which query plan caching wasn't implemented) performed comparably to the pre-PDV architecture with caching. Hence, it was decided to leave query plan caching until some day in the future when it was deemed necessary.

Well, we're in the future now, and there still isn't a convincing argument for query plan caching. So the time has come to remove some references to query plan caching from the codebase. For the most part, any code being removed would probably not be very well suited to the post-PDV architecture of query execution, so arguably not much is lost.

Apart from simplifying the code, this PR will contribute towards making the GraphQL schema generation more modular, testable, and easier to profile. I'd like to eventually work towards a situation in which it's easy to generate a GraphQL schema parser *in isolation*, without being connected to a database, and then parse a GraphQL query *in isolation*, without even listening on any HTTP port. It is important that both of these operations can be examined in detail, and in isolation, since they are two major performance bottlenecks, as well as phases that many important upcoming features hook into.

Implementation

The following have been removed:
- The entirety of `server/src-lib/Hasura/GraphQL/Execute/Plan.hs`
- The core phases of query parsing and execution no longer have any references to query plan caching. Note that this is not to be confused with query *response* caching, which is not affected by this PR. This includes removal of the types:
  - `Opaque`, which is replaced by a tuple. Note that the old implementation was broken and did not adequately hide the constructors.
  - `QueryReusability` (and the `markNotReusable` method). Notably, the implementation of the `ParseT` monad now consists of two, rather than three, monad transformers.
- Cache-related tests (in `server/src-test/Hasura/CacheBoundedSpec.hs`) have been removed.
- References to query plan caching in the documentation.
- The `planCacheOptions` in the `TenantConfig` type class was removed. However, during parsing, unrecognized fields in the YAML config get ignored, so this does not cause a breaking change. (Confirmed manually, as well as in consultation with @sordina.)
- The metrics no longer send cache hit/miss messages.

There are a few places in which one can still find references to query plan caching:

- We still accept the `--query-plan-cache-size` command-line option for backwards compatibility. The `HASURA_QUERY_PLAN_CACHE_SIZE` environment variable is not read.

hasura/graphql-engine-mono#1815

GitOrigin-RevId: 17d92b2
hasura-bot pushed a commit that referenced this pull request Feb 17, 2022
We build the GraphQL schema by combining building blocks such as `tableSelectionSet` and `columnParser`. These building blocks individually build `{InputFields,Field,}Parser` objects. Those objects specify the valid GraphQL schema.

Since the GraphQL schema is role-dependent, at some point we need to know what fragment of the GraphQL schema a specific role is allowed to access, and this is stored in `{Sel,Upd,Ins,Del}PermInfo` objects.

We have passed around these permission objects as function arguments to the schema building blocks since we first started dealing with permissions during the PDV refactor - see 5168b99 in #4111. This means that, for instance, `tableSelectionSet` has as its type:
```haskell
tableSelectionSet ::
  forall b r m n.
  MonadBuildSchema b r m n =>
  SourceName ->
  TableInfo b ->
  SelPermInfo b ->
  m (Parser 'Output n (AnnotatedFields b))
```

There are three reasons to change this.

1. We often pass a `Maybe (xPermInfo b)` instead of a proper `xPermInfo b`, and it's not clear what the intended semantics of this is. Some potential improvements on the data types involved are discussed in issue hasura/graphql-engine-mono#3125.
2. In most cases we also already pass a `TableInfo b`, and together with the `MonadRole` that is usually also in scope, this means that we could look up the required permissions regardless: so passing the permissions explicitly undermines the "single source of truth" principle. Breaking this principle also makes the code more difficult to read.
3. We are working towards role-based parsers (see hasura/graphql-engine-mono#2711), where the `{InputFields,Field,}Parser` objects are constructed in a role-invariant way, so that we have a single object that can be used for all roles. In particular, this means that the schema building blocks _need_ to be constructed in a role-invariant way. While this PR doesn't accomplish that, it does reduce the number of role-specific arguments being passed, thus fixing hasura/graphql-engine-mono#3068.

Concretely, this PR simply drops the `xPermInfo b` argument from almost all schema building blocks. Instead these objects are looked up from the `TableInfo b` as-needed. The resulting code is considerably simpler and shorter.

One way to interpret this change is as follows. Before this PR, we figured out permissions at the top-level in `Hasura.GraphQL.Schema`, passing down the obtained `xPermInfo` objects as required. After this PR, we have a bottom-up approach where the schema building blocks themselves decide whether they want to be included for a particular role.

So this moves some permission logic out of `Hasura.GraphQL.Schema`, which is very complex.
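
The shift from top-down to bottom-up can be illustrated with a small sketch. Everything here is a hypothetical simplification: the real building blocks return parsers in a `MonadBuildSchema` context and obtain the role from the monad, whereas this sketch passes the role explicitly and returns a plain column list.

```haskell
import qualified Data.Map.Strict as Map

type RoleName = String

-- Hypothetical, simplified stand-ins for SelPermInfo and TableInfo.
newtype SelPermInfo = SelPermInfo { allowedColumns :: [String] }
  deriving (Eq, Show)

data TableInfo = TableInfo
  { tableName        :: String
  , tableSelectPerms :: Map.Map RoleName SelPermInfo
  }

-- Before: the caller resolves the permissions and passes them down,
-- duplicating information that TableInfo already holds.
selectionSetBefore :: TableInfo -> SelPermInfo -> [String]
selectionSetBefore _table perms = allowedColumns perms

-- After: the building block looks the permissions up itself and
-- opts out (Nothing) when the role has no select permission.
selectionSetAfter :: RoleName -> TableInfo -> Maybe [String]
selectionSetAfter role table =
  allowedColumns <$> Map.lookup role (tableSelectPerms table)
```

In the second form the `TableInfo` is the only source of permission data, and the decision to omit a field for a role lives next to the code that builds it.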

PR-URL: hasura/graphql-engine-mono#3608
GitOrigin-RevId: 51a744f34ec7d57bc8077667ae7f9cb9c4f6c962
hasura-bot pushed a commit that referenced this pull request May 23, 2022
This file was no longer needed after the PDV refactor (#4111). It was replaced by `Hasura.GraphQL.Schema.Introspect`.

PR-URL: hasura/graphql-engine-mono#4532
GitOrigin-RevId: 61a5f44a9e68238d61095a3f176b2d5847c63307
Labels
c/server Related to server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFC] Refactor GraphQL schema generation and query validation to move parsing out of resolvers