Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change scalar to support serialization. #194

Open
yaacovCR opened this issue Mar 17, 2020 · 10 comments
Open

Change scalar to support serialization. #194

yaacovCR opened this issue Mar 17, 2020 · 10 comments

Comments

@yaacovCR
Copy link

Helpful for schema proxying, see:

https://github.com/yaacovCR/graphql-tools-fork/blob/with-apollo-upload-client/src/scalars/GraphQLUpload.ts#L10

Maybe your package could provide two scalars, one for server?

@jaydenseric
Copy link
Owner

Are you able to explain in more detail what "schema proxying" is, and how it works, for this change to be necessary? It's a little hard to make an informed decision without trying to reverse engineer the tools you’re working on.

If there is no better way, I'm not against the idea of changing the GraphQLUpload scalar to serialize: value => value — as long as there are still helpful errors somehow for people misusing the Upload scalar; perhaps if they mistakenly use it as the type for fields in non input types.

@yaacovCR
Copy link
Author

Link: https://www.graphql-tools.com/docs/schema-stitching

Schema proxying is inaccurate, in a nutshell it is one graphql server proxying a portion of a query to another.

It requires reserializing parsed input arguments.

@mike-marcacci
Copy link
Collaborator

Hi @yaacovCR, upload objects contain streams of bytes that were originally sent via multipart HTTP request. I'm certainly no expert when it comes to Apollo's schema stitching pattern, but it's my understanding that it assumes that all scalar values can be represented in JSON. This assumption directly opposes the premise and design of this library which leverages the GraphQL multipart request spec to efficiently make large binary payloads accessible to GraphQL resolvers.

The Upload scalar is designed to be used as an input only (notice that it is not called File or Blob) because there was no desire to change the response type of a GraphQL server from application/json. There may be some real benefit to using a multipart/mixed content type in responses, and if such behavior were to be specified, then support for large binary scalars become more symmetrical. Of course, this doesn't really help your use case though.

I'm going to leave this open for now, but I anticipate that it will get closed as being "out of scope."

Does this conflicting ideology make sense to you? Is there a specific serialization format that you think would actually be appropriate here?

@yaacovCR
Copy link
Author

yaacovCR commented Nov 30, 2020

Hi @mike-marcacci !

Great to get some attention on this, even if negative. Thanks for all your work on this library as well as fs-capacitor (and for your advocacy in favor of TS in upstream graphql-js!). While I'm at it, thanks as well to @jaydenseric for his work on this library!!!

In terms of this issue, I do hope you and @jaydenseric reconsider. The only reason I see for serialization being disabled is the suggestion that there is no use case, but there is indeed a use case....while serialization is usually used for preparing GraphQLOutputType values, but is also can be used to reverse the parsing of GraphQLInputType values for when a resolver needs to proxy them to a different service.

A few points:

(1) Apollo spun off schema stitching in favor of Federation some time ago. Currently, schema stitching as part of the graphql-tools library is managed by the Guild. I also help out when I can.

(2) Federation is a pattern of GraphQL subschema development -- that when adhered to -- is compatible with Federation-compatible gateway servers, with a canonical implementation by Apollo. Schema stitching is the creation of a new gateway schema that proxies requests to any GraphQL subschema, merging types as necessary. Schema stitching therefore creates a new schema that can be served by any GraphQL server. Configuration allows any subschema to be merged on the gateway, although certain patterns of subschema development may require less configuration.

(3) I am not familiar with how Federation handles scalars, but based on above, schema stitching handles them perfectly fine no matter what the internal representation is, as the subschemas are all perfectly valid GraphQL schemas, and the gateway schema is also a canonical GraphQL-JS GraphQLSchema.

(4) In fact, I spun up a demo using graphql-upload in which files are proxied as streams via the gateway schema to the subschemas (https://github.com/ardatan/graphql-tools/blob/be1a1575bb299b64993e81bca78cd053789cfee0/packages/links/tests/upload.test.ts)

(5) The only "trick" required to make this work is that the executor for a subschema needs to replace the individual upload scalars (which are promises resolve to an object from which the stream can be extracted) with the stream itself. @jaydenseric was kind enough to make the createUploadLink configurable with custom options that allow this to be easily done when combined with a custom Link that awaits the results of all those promises.

(6) Well, the other trick is that there is a bug/limitation in the more widely-available FormData server polyfill or a bug/limitation within the node-fetch v2, depending on how you look at it, that does not let FormData stream streams. The example above uses a custom FormData polyfill as a workaround, but there is an easier workaround with node-fetch v3. See form-data/form-data#394 (comment).

(7) In any case, my demonstration relies on a custom scalar that simply allows GraphQLUpload variables to be serialized. Here it is:

const GraphQLUpload = new GraphQLScalarType({
  name: 'Upload',
  description: 'The `Upload` scalar type represents a file upload.',
  parseValue: value => {
    if (value != null && isPromise(value.promise)) {
      // graphql-upload v10
      return value.promise;
    } else if (isPromise(value)) {
      // graphql-upload v9
      return value;
    }
    throw new GraphQLError('Upload value invalid.');
  },
  // serialization requires to support schema stitching
  serialize: value => value,
  parseLiteral: ast => {
    throw new GraphQLError('Upload literal unsupported.', ast);
  },
});

Note that the above scalar also works when used with v9 and v10, but that's completely besides the point.

The key is that serialization should just not throw, which is the current behavior....that is really unhelpful.

Anyways, if you've gotten this far, thanks for reading!

To return to where I started, this library seems to forbid serialization because there is no use case, and attempts to warn end-users that they are using the library incorrectly--but there is a use case. Serialization is usually for preparing GraphQLOutputType values, but is also can be used to reverse the parsing of GraphQLInputType values for when they need to be sent to a different service.

On the other hand, perhaps you might feel that this use case is so specialized, and so easy to work around by simply redefining the GraphQLUpload scalar, as I have done, that this is not worth changing. I happen to disagree, but I see that point as pretty fair.

I think that if you decide that you truly won't fix, you may as well close the issue, however. :)

@yaacovCR
Copy link
Author

Above has been edited.

@jaydenseric
Copy link
Owner

@yaacovCR

The problem is that what you are trying to do is not really serialization. The custom scalar serialize method is supposed to serialize the scalar to a JSON value, which doesn’t make sense for a file upload promise/stream.

In my earlier comment:

Are you able to explain in more detail what "schema proxying" is, and how it works, for this change to be necessary?

I'm conceptually familiar with what schema stitching is; when I asked for clarification about "schema proxying" it was because I wasn't sure if that was what you were referring to as I hadn't heard it called "proxying" before.

By asking "how it works", I was hoping you would share links to specific lines of code where the schema stitching implementation tries to "serialize" the scalar. Chances are there is an obvious, elegant solution. For example:

 new GraphQLScalarType({
   name: 'Upload',
   description: 'The `Upload` scalar type represents a file upload.',
   parseValue(value) {
     if (value instanceof Upload) return value.promise;
     throw new GraphQLError('Upload value invalid.');
   },
   parseLiteral(ast) {
     throw new GraphQLError('Upload literal unsupported.', ast);
   },
   serialize() {
     throw new GraphQLError('Upload serialization unsupported.');
   },
+  extensions: {
+    // Need to decide on the best name for this…
+    serializeForSchemaStitching: value => value
+  }
 })

Then in the schema stitching implementation code, by convention it could first check for scalar.extensions.serializeForSchemaStitching and use that, or else fall back to regular scalar.serialize.

I haven't used extensions for a custom scalar yet, but this seems to be the sort of thing it’s intended for.

Something else I hoped you would answer:

as long as there are still helpful errors somehow for people misusing the Upload scalar; perhaps if they mistakenly use it as the type for fields in non input types.

With your alternative GraphQLUpload scalar that does serialize: value => value, what will happen if a user accidentally makes a schema like this:

scalar Upload

type Query {
  userCount: Upload! # Was supposed to be Int!
}

Then queries userCount? I'm a bit rusty, so not really sure off the top of my head what will happen. Would they correctly get a GraphQL serialization error, is that what happens with the official GraphQLUpload scalar?

Maybe they would get a Upload value invalid GraphQL error before that point:

throw new GraphQLError('Upload value invalid.');

It's super dumb, but what if a user tries to use the Upload scalar going the other way, as a "download". What if they even construct an Upload instance for it's value so it won't error at the parseValue step.

@yaacovCR
Copy link
Author

Hi @jaydenseric !

Thanks again to you both for engaging with me on this. I appreciate your time, it's definitely not taken for granted.

I think you are correct that I don't have a good workaround available for guarding against uses of the Upload scalar as an output value if you remove the throw from serialize. From what I recall, that's what discouraged me from replying to your earlier post, and I guess when I saw @mike-marcacci's follow-up, I didn't scroll up and see what we were up to in this discussion.

I also should have acknowledged your point instead of just not responding!

I think the workaround you have suggested is fine enough, although it does make me a little sad to have to resort to that just to still be able to guard against using the scalar as an input. We can insert a check for that within makeExecutableSchema from graphql-tools, but that will not help the wider ecosystem. From what I recall, there has been some discussion within the graphql spec working groups about how to define some types to be able to be used for BOTH input and output, and in that context it might make sense to formally restrict a type to just input, but that would have to wait for another day, probably far in the future.

Of course, I watch this repository, and so I see how many questions you get in terms of basic usage of this very popular package, so I imagine from your perspective and time, the more checks the better!

Here are those code links. They confirm that the workaround you have suggested would be viable:
https://github.com/ardatan/graphql-tools/blob/294dedda609e366c60d149b194714d54210c4a03/packages/delegate/src/transforms/AddArgumentsAsVariables.ts#L148

https://github.com/ardatan/graphql-tools/blob/ae7c968deb2e6f51024a385abfc6455c0db5a5df/packages/utils/src/transformInputValue.ts#L36-L38

So, would you be interested in adding it? Or do you think the few users who would actually make use of this would be better served by just using my current workaround to override the scalar?

All the best, @yaacovCR

@yaacovCR
Copy link
Author

yaacovCR commented May 6, 2021

Note that graphql/graphql-js#3049 alludes to above discussion. Looks like graphql-js will start using serialize for input values as workaround for issues involving default values. This, as noted there, is not technically correct, perhaps temporary. Movement on that issue might help here as well.

@jaydenseric
Copy link
Owner

@yaacovCR thanks for your comment there, glad to have you on the case :)

I had a look to see if there was anything actionable on this issue to include in our coming major release, but it seems to be in a state of flux so we might just ship this version and look at a permanent solution based on what GraphQL.js comes up with in a following release.

@Songworks
Copy link

Hi there!

Sort of came upon this issue because I got the Upload literal unsupported. exception when using a delegateToSchema from graphql-tools and I guess somewhere in the depths of that one serialization happens...
Anyway, for for now I use a custom scalar with serialization sort of allowed (serialize: value => value) as mentioned before (thanks by the way! @yaacovCR ).

But my lazy +1 for this anyway. :o)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants