nextTick cause data loss in first tick #243

Closed
Shogun147 opened this Issue Feb 24, 2013 · 8 comments

Projects

None yet

3 participants

@Shogun147

Few days ago there was a commit automatically call nextTick on some synchronous function calls.

What this does is that any code from series for example will run in the nextTick, but what if we need to attach to some object in same tick and listen his events? In previous version this works ok. But now we just lose first tick.

Maybe will be better to have an optional third parameter for this functions which will run them without nextTick?

@brianmaissy
Contributor

I'm very interested in this issue. Commit 6ad64ac also seems a bit strange to me, but for a different reason I think.

@Shogun147, can you provide an example of your use case?

@Shogun147

@brianmaissy - the general idea is like so:

var Http = require('http');
var Fs   = require('fs');

var Formidable = require('formidable');
var Async      = require('async');

var content = Fs.readFileSync('./home.html');

Http.createServer(function(request, response) {
  if (request.method === 'POST') {
    console.log('Tick:', process.tickId);

    // suppose this is a list of "middlewares"
    Async.series([
      function(callback) {
        console.log('Tick:', process.tickId); // only first one in run in the same tick
        console.log('new '+ request.method +' request');

        callback();
      },
      function(callback) {
        console.log('Tick:', process.tickId); // this is in nextTick

        var form = new Formidable.IncomingForm();

        // formidable will never get 'end' event and we will block here
        form.parse(request, function(error, fields, files) {
          console.log(error || fields);

          callback();
        });
      }
    ], function(error, results) {
      response.writeHead(200, { 'Content-Type': 'text/html' });
      response.end(content);
    });

    return;
  }

  response.writeHead(200, { 'Content-Type': 'text/html' });
  response.end(content);

}).listen(8000);

(function () { 
  process.tickId= 0; 
  (function fn() { 
    process.tickId++; 
    process.nextTick(fn);
  })(); 
})(); 
@brianmaissy
Contributor

I see the problem now.

For anyone having trouble seeing the problem:

Add the following inside the server's request handler:

request.on('end', function(){
  console.log('Tick:', process.tickId, 'request end');
});

The http.ServerRequest fires its end event immediately. If a handler is not attached in the same event loop cycle, the event will be lost.

Now here's my answer:

Since async is intended to solve the problem of managing asynchronous control flow, it was written with the expectation that the functions passed to it would be asynchronous. Therefore, making guarantees about things running in the same tick of the event loop was not something of interest, even where it is possible and desirable.

However, your problem here is not just with async. Assuming async's behavior here was changed, if any of your pieces of middleware were asynchronous in nature (that is, they waited until a later tick before calling back) then you would have the same problem.

Therefore I think we should look at this issue as a more inherent problem with using asynchronous code in this way. That is, the programmer must be careful when using any asynchronous calls inside a handler which takes as a parameter an EventEmitter.

In terms of practical solutions for you:

Either:
a. Get all the data out of the request before making any asynchronous calls. Either pass the request to formidable immediately, or if your middleware needs to do things before the data gets to formidable, read the data from the request manually, and then pass that to formidable (assuming that is possible with formidable's API).
b. Attach a little handler to the event in question which will store the event until later, when you can release it at will. This might be cleaner in terms of the logical flow of your function. Something along the lines of:

// do this in the server's request handler, before calling any asynchronous functions:
request.on('end', function(e){
  request.releaseEvent = function(){
    request.emit('end');
  };
});
form.parse(request, function(error, fields, files) {
  console.log(error || fields);
  callback();
});
// when ready to receive the event:
request.releaseEvent();

Obviously you need more than just the end event, so you could build fancier versions of this that 'freeze' an EventEmitter completely, storing a queue of all its events, and then release them upon command.

@Shogun147

The http.ServerRequest fires its end event immediately. If a handler is not attached in the same event loop cycle, the event will be lost.

Well not quite so i think, it emits end event in the same tick only if there is no enough data (request is to small).

This is 50/50 really, i know what Async is for and i know what i try to do with it, and this works ok in previews versions. Async Control Flow methods are exactly for this i think.
I thought would not be a problem to add an optional parameter which will run this functions sync or async(by default)...

@brianmaissy
Contributor

I'm actually of the opinion that all iterators should be invoked by async on the next tick as a convention, but @caolan disagrees with me. So I'd be interested to see what he thinks about this.

@Shogun147

@brianmaissy - both approaches have the right to life, the developer must have an option to choose what he really need for, sync or async. But yeah, need more opinions about this.

@caolan
Owner
caolan commented Mar 2, 2013

@Shogun147 I'm sorry we introduced a backwards incompatible change in your case, I didn't think it was likely to affect anyone's use of async. That said, your use of the library seems quite unusual, why would you sequence two synchronous functions using async?

I'm on the fence about this. I wasn't particularly fond of introducing the automatic nextTick anyway, but a huge number of issues on here are about stack overflows due to the use of sync functions. Personally I think it's a misuse of the library, but people are determined to keep doing it.

@brianmaissy
Contributor

I think it's reasonable to expect the users to understand that async makes no promises about the ticks in which functions are called, given that it was designed for async functions. Then we are free to use it when necessary, and not use it otherwise (for simplicity and speed).

Async is useful for synchronous functions written in the callback style, but anyone who wants to use it that way should just know to wrap their functions in nextTick. I propose adding a note somewhere in the README about this. I can go ahead and do it and submit a pull request, or if you want we can open a new issue to discuss the wording.

Either this or #232 should be closed for now, and the other also when the README gets amended.

@caolan caolan closed this Mar 28, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment