Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not call ".end()" in the callback of ".write" #195

Closed
kettanaito opened this issue Mar 10, 2022 · 11 comments
Closed

Do not call ".end()" in the callback of ".write" #195

kettanaito opened this issue Mar 10, 2022 · 11 comments
Assignees

Comments

@kettanaito
Copy link

kettanaito commented Mar 10, 2022

Hey 👋

I've got a question about the way this library calls the original .end() method on the ClientRequest. It does so in the callback of .write() (given there is data to send so that the callback gets called):

follow-redirects/index.js

Lines 127 to 130 in 13136e9

this.write(data, encoding, function () {
self._ended = true;
currentRequest.end(null, null, callback);
});

I think I understand the intention here: send this chunk, when done (regardless of whether it's successful or not—because you're not checking the error callback argument) mark the request as ready.

Question

Can we execute .write and .end in parallel?

this.write(data, encoding, function () {
  self._ended = true;
});

// This call is moved out of the write's callback.
currentRequest.end(null, null, callback);

this._ending = true;

Context

I'm a maintainer of the API mocking library called Mock Service Worker. We intercept requests in Node.js using the @mswjs/interceptors library, which augments the native ClientRequest class. We use class augmentation instead of stubbing it to allow as much of the Node.js default behavior to execute without having to polyfill it. Since we are an API mocking tool, we cannot really write request body chunks until we know there's (or isn't) a request handler for this particular request. That's why we only collect body chunks in the .write() method:

https://github.com/mswjs/interceptors/blob/219f9f3cd82f5baf3297f7c6095caa356c148d31/src/interceptors/ClientRequest/NodeClientRequest.ts#L85-L97

We then write those chunks only for passthrough requests when the request handler is known. That is possible to check only in the .end method:

https://github.com/mswjs/interceptors/blob/219f9f3cd82f5baf3297f7c6095caa356c148d31/src/interceptors/ClientRequest/NodeClientRequest.ts#L170-L179

Now since RedirectableRequest only calls .end in the callback of .write, when our library is applied the .end is never called. That happens because we do not call the .write callbacks immediately (not semantic, no chunks are being written just yet so it feels wrong to confirm the result by calling a callback), but instead are referred to the .end. Note that this difference is purely an implementation detail which should have no effect on the end-consumer. For them, the expected result is intercepted request and mocked response.

Common usage of .end

I've also noticed that commonly the .write and .end are called in parallel:

const req = http.get()
req.write('hello world')
req.end()

That is the way to send request body chunk and end a request in about any example I can find. I've never seen the write(chunk, () => this.end()) usage before but I understand it must have reasons behind it.

What I'd like to know

  1. What are the reasons to call .end only in the callback of .write?
  2. What are the implications of calling these two methods in parallel?
  3. Do you have any recommendations on how to handle your flow in our "interceptors" library?

Thank you for looking into this.

@kettanaito
Copy link
Author

If there are no objections to this, I'd like to open a pull request with the proposed changes.

@kettanaito
Copy link
Author

kettanaito commented Apr 20, 2022

I've opened a pull request with the proposed change. All tests are passing. As I've mentioned above, this is not a significant change behavior-wise but it's rather crucial in the context of how @mswjs/interceptors can listen to the end calls on ClientRequest.

I'm not proposing this change for follow-directs to be compatible with arbitrary third-party tooling though. It's an uncommon pattern to end client request within the write pipeline, so the proposed change aligns the source with how client request is commonly used. As a result of this change, it does become compatible with MSW toolchain, for which thousands of developers would be thankful.

@RubenVerborgh
Copy link
Collaborator

Thanks for your patience, @kettanaito.

Can we execute .write and .end in parallel?

I see no explicit expressions either way in the Stream documentation, so I'd be inclined to err on the side of caution and say "no". But if the tests run, then we should actually be safe.

That happens because we do not call the .write callbacks immediately (not semantic, no chunks are being written just yet so it feels wrong to confirm the result by calling a callback), but instead are referred to the .end.

Mhm interesting. However, it seems you'd run into trouble in several other scenarios as well.

I think that your collecting of chunks constitutes writing in your case, so the callback should be called. I.e., you mock awat the API, so you have flushed it in your stream (but perhaps not in underlying objects). Any objections to that other than "it feels wrong"; i.e., would things break?

@kettanaito
Copy link
Author

kettanaito commented Apr 20, 2022

I think that your collecting of chunks constitutes writing in your case, so the callback should be called

I considered that but was not sure of exposing the request instance in those callbacks which has not been initiated yet. Calling write callbacks is safe if people introduce their own side-effects but if they decide to analyze the state of the request upon each write we'd be providing them with an idle state, as the request hasn't even started yet in the mocked context.

I'm not that experienced with Node.js, so perhaps my worry is premature. But it's important to note the time difference at which those callbacks will be called:

# Regular (non-mocked) request
connection
write
 - cb
write
 - cb
end
# Mocked request
— connection (errors suppressed)
write (gather written body)
 - cb
write (...)
 - cb
end
  - can finally determine if request should be mocked
  - mock() / original.end()

Do you see any potential issues if we call write callbacks while the request hasn't actually written anything yet?

I.e., you mock awat the API, so you have flushed it in your stream (but perhaps not in underlying objects).

No actual objections but the one above.

@kettanaito
Copy link
Author

I honestly don't mind adjusting the interceptors library as well. What matters is finding the proper solution, and I know I could use your expertise in discussing whether our current write behavior in the interceptor is efficient.

@RubenVerborgh
Copy link
Collaborator

Do you see any potential issues if we call write callbacks while the request hasn't actually written anything yet?

No; from the perspective of the direct consumer, it is written: it's just that your writing is a memory buffer. It's success, so the callback should be invoked.

What matters is finding the proper solution

I think both work, but I think changing the interceptor is the conceptually correct and safest one.

@kettanaito
Copy link
Author

Regarding this change on the interceptor's side, I've discovered one more thing that may block us from doing so. If I invoke the write callback immediately, then I have no access to the potential error argument that ClientRequest exposes when writing a request body chunk fails. We can treat write callbacks as optimistic in the mocked scenario, but in the passthrough scenario, there's no way to know if writing request body failed or not. To find that out, we'd have to actually write the chunk, which we do in the end method via super.write(). Because only in the end method can we be sure that a particular request is mocked.

@RubenVerborgh
Copy link
Collaborator

@kettanaito I'm not sure I follow. In the passthrough scenario, if the error occurs, you can still emit it during end?
Could you provide a bulleted example where it fails?

@kettanaito
Copy link
Author

Sure thing!

Imagine the following usage scenario:

const req = http.request('/resource')

req.write('chunk', (error) => {
  if (error) console.error('writing chunk failed!')
})

req.end()

Next, let's assume we're mocking the http.get using the mentioned interceptors library. We've gathered the call to req.write and, as I'm exploring right now, we've called the provided write callback immediately. But here's the thing: we don't know if GET /resource should be mocked until we've called the .end() (that's the only method that can safely dispatch a Promise to look up any relevant request handlers).

Now, if this request should be bypassed, we have to actually write its request body (sent it to the actual server) upon .end(). Until then, we were in an ambiguous scenario, so the .write() call has only gathered the chunks (and executed the callbacks). Since we need to send the actual request body, we're calling super.write() on each gathered chunk in the .end() as the next step. Now we've got two issues:

  1. Write callback(s) have already been called. Now it's either we call them again upon the actual chunk write, or ignore them entirely to prevent duplicate calls.
  2. If we choose to ignore them, we don't know whether writing them to the actual socket connection succeeded (the server might've rejected a chunk). And it's only in the .write() call where we are exposed the error argument of the callback to know that.

@RubenVerborgh
Copy link
Collaborator

RubenVerborgh commented May 23, 2022

  • Write callback(s) have already been called. Now it's either we call them again upon the actual chunk write, or ignore them entirely to prevent duplicate calls.

You've fulfilled the contract; the write callbacks have been called.

  • If we choose to ignore them, we don't know whether writing them to the actual socket connection succeeded (the server might've rejected a chunk). And it's only in the .write() call where we are exposed the error argument of the callback to know that.

There will be no errors here, because none occur at that stage. Any errors will be emitted as an error event on the stream or through the end callback, and that's fine.

@kettanaito
Copy link
Author

I'm closing this because the higher level issues has been fixed (see this). Thanks for all the help and suggestions on the matter!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants