
Upload stream fires "data" after "end" #131

Closed
juona opened this issue Jan 14, 2019 · 10 comments · Fixed by #132
juona commented Jan 14, 2019

Hi again,

Node v10.14.1, graphql-upload v8.0.3.

An odd one here: possibly not a bug, and it might also be a mistake on my side. Either way, I would appreciate your advice! : ]

First, some simplified example code. I use this code to upload a file to my GraphQL server; the type of input.icon is Upload.

The call to setTimeout simulates certain asynchronous logic present in my code. I perform an asynchronous call somewhere in the program, wait for a response, and only then I pipe the upload stream to where I want it to go.

const create = input => 
  new Promise(resolve => {
    input.icon.then(file => {
      const stream = file.createReadStream();

      setTimeout(() => {
        stream.on("data", () => {
          console.log("Receiving data");
        });

        stream.on("end", () => {
          console.log("Stream ended");
          resolve();
        });
      }, 300);
    });
  });

This code works fine: it prints Receiving data a number of times and finishes with Stream ended. However, it breaks when both of the following are true:

  • The uploaded file is really small - smaller than the readableHighWaterMark.
  • The "end" event handler is literally attached before the data handler, like this:
setTimeout(() => {
  stream.on("end", () => {
    console.log("Stream ended");
    resolve();
  });

  stream.on("data", () => {
    console.log("Receiving data");
  });
}, 300);

In this case, the code will print Stream ended first, and then Receiving data.

Some more notes:

  • The code works fine if I use a different type of stream, not the one from graphql-upload. E.g. if I write const stream = fs.createReadStream("/home/juona/ex.tar.bz2"); to read the same file from the local file system, instead of uploading it via const stream = file.createReadStream();, it works just fine.
  • Using stream.pause() before setTimeout and then stream.resume() anywhere in the timeout callback fixes the problem.
  • The code also works correctly if setTimeout is removed or if the timeout is very low (e.g. if I use process.nextTick() instead), regardless of the order of event handler attachment. I would assume that the timeout has to be longer than the time it takes to upload the file.
  • It does not matter what kind of file is being used, only its size.

I am having trouble understanding what is happening here. In the actual application I use a bunch of other stream libraries to transform the upload, and one of them fails because it expects stream.end to be called after at least one stream.write, which I guess is a fair expectation. The problem is also more complex in the real scenario, since stream.pause() does not really help there, but I have to start the investigation somewhere...

Could perhaps be an issue in Node itself or a compatibility problem, but since it matters whether or not I use the upload stream or a different one, I am coming to you first.

Any thoughts?

@mike-marcacci
Collaborator

Hey @juona, thank you so much for the fantastic details here. I have a suspicion, and I will probably have some time later this afternoon to investigate further. I'll comment back once I have more details.

@jaydenseric
Owner

I'm pretty sure the right event to listen for when handling file upload streams in resolvers is finish, not end:

https://github.com/jaydenseric/apollo-upload-examples/blob/d2bf708591ad276929976d3e4e26ae8005bf4b87/api/resolvers.mjs#L31

@jaydenseric
Owner

Well, it depends on whether or not you attach the listener after piping to a writable stream.

https://stackoverflow.com/a/34310963/1596978

@juona
Author

juona commented Jan 15, 2019

Hey,

Thank you for your responses!

@jaydenseric end looks to be for read streams whereas finish is for writing. While this example code is only a demonstration, the behaviour causes issues when piping too, since under the hood pipes work based on these same events and the write stream finishes before it's done writing.

@mike-marcacci thanks, I hope this does not turn into a big waste of time!

@juona
Author

juona commented Jan 15, 2019

A bit more information, if that helps. If I pipe the stream and use stream.pause() after piping, the end event still fires, which is somewhat unexpected. I understand that pausing a piped stream is not quite reliable, but the behaviour differs from that of a stream created via a regular fs.createReadStream(), which I believe is odd.

const stream = file.createReadStream();

stream.pause();

setTimeout(() => {
  stream.pipe(
    through( // the "through" npm package
      function () {
        logger.info("DATA"); // <-- this will not be logged
      },
      function () {
        logger.info("END"); // <-- this WILL be logged
        resolve();
      }
    )
  );
  stream.pause();
}, 1000);

@mike-marcacci
Collaborator

mike-marcacci commented Jan 15, 2019

OK, I got a moment to track down some stuff here. Before I go any further, I want to make sure you are talking about data events, and not write events (the latter is in this issue's title, but the former in its content).

Assuming this is data, I tried to reproduce this in a standalone way with fs-capacitor. Based on the behavior you're describing, if there's a bug, it would likely be coming from there, since it manages the streams returned by .createReadStream().

Despite many attempts, I couldn't get the same kind of behavior you're talking about. However, fs-capacitor does do something that I suspect might be related. For terminating events (error, finish, end) it will fire off new listeners added after the event has already occurred. This is tremendously convenient, but may be contributing to your problem. As soon as you add the data listener, it resumes the stream; if the entirety of the contents is contained in its internal buffer, then it will end. For this reason, and because I've seen inconsistent behavior across Node stream versions and various libraries, I always attach the data listener last. The problem here is that I replay these past terminating events in process.nextTick, which happens before the data event, hence the behavior you're seeing.

I'm going to remove this special behavior from fs-capacitor and then make a PR here; none of the core functionality will be affected, but several of the test utilities will.

Thanks again for all the great investigation prior to posting an issue.

EDIT: I'm going to make this change regardless, because these features were more a convenience for me, and aren't necessary. But if you want to test my theory against your codebase, remove this and this in your fs-capacitor dependency, in the lib/index.mjs if you're using native esm or lib/index.js otherwise.

@mike-marcacci
Collaborator

I just added PR #132 from branch fix/131, which removes the special handling for terminating events. @juona, when you get a chance, can you try using that branch, or v2.0.0 of fs-capacitor?

@juona juona changed the title Upload stream fires "write" after "end" Upload stream fires "data" after "end" Jan 15, 2019
@juona
Author

juona commented Jan 15, 2019

Hey @mike-marcacci,

I believe I roughly understand your explanation (I don't know the internals of fs-capacitor, or even what it's for), but I do know for sure that your fix works! : ] Well done, and thank you very much. I am using the fix branch for now, hoping that a new release of graphql-upload comes soon.

@jaydenseric
Owner

@juona it's on the way, give me a few minutes 🚀

jaydenseric added a commit that referenced this issue Jan 15, 2019
Update fs-capacitor and corresponding tests, fixes #131 .
@jaydenseric
Owner

The fix is published in v8.0.4!

michaelbromley added a commit to vendure-ecommerce/vendure that referenced this issue Jan 16, 2019
When running the populate script, I was getting an ERR_STREAM_WRITE_AFTER_END error each time. It seems related to jaydenseric/graphql-upload#131 but unfortunately this is a transitive dependency of apollo-server-express and I can't directly upgrade to the latest version of graphql-upload. This work-around seems to fix things though.
krasivyy3954 added a commit to krasivyy3954/react-graphql-upload that referenced this issue Jan 6, 2023
Update fs-capacitor and corresponding tests, fixes jaydenseric/graphql-upload#131 .