Same emit gets picked up by all listening workers #10

Closed
joscha opened this Issue Dec 11, 2012 · 48 comments

Projects

None yet

2 participants

@joscha

It seems as if sometimes (you can see it using test.js) a single emitted event gets picked up by all workers - I guess this is because they read it both from MongoDB before the handled flag gets set. Any ideas on how to prevent this? Or do I have to implement my own locking for this?

@jdarling
Owner

What version of Mongo and what version of Node? I've not seen this with Mongo 2.0.2 and Node 8.11. The way the selector code is setup it has to get a lock/update on the record before it can update. Also, what version of the Mongo Native Driver are you using? I've noticed that the newest version I've had to make updates (that are currently being tested) to default SafeMode to true (forcing the write locks). This may be related to that, I'll load test with SafeMode false tomorrow if I have the systems available to see if that may be the issue.

@joscha

Using Mongo 2.2.0 and Node 0.8.11 with mongodb driver 1.2.2. I currently use the pull request #6 from @mfrobben, maybe that's the problem, because SafeMode has been enabled on that one as well...if I can help you with anything, just let me know!

@jdarling jdarling added a commit that referenced this issue Dec 12, 2012
@jdarling Upgrades and fixes
v0.2.9
* Change SafeDBDriver default value from false to true, this fixes the
issue with multiple listeners picking up the same message since Mongo
doesn't perform record locking on updates if this isn't true.
* Fix autoStart
* Resolves #9 and #10

v0.2.8
* Upgraded code for new MongoDB Native Drivers (thanks mfrobben for
starting points)
* Readme cleanup (thanks ttezel for pointing this out and fixing it)
* Resolves #7 and #6
f6617da
@jdarling
Owner

This should be fixed in the latest version 0.2.9, please test and see if you can re-create.

@joscha

Ah, nice - I will check it out over the weekend!

@joscha

I just checked with the test.js sample from readme.md - emits still seem to get picked up multiple times.
What I did:

  • call node test.js in three different console windows.
  • call start() in all of them
  • call eatTest() in two (1 and 2) of them
  • call load() in the other one (3)

Check (1) and (2) for similar eat: ### outputs.

@jdarling
Owner

Will do some more testing with this in the coming days and see if I can't repeat it and resolve this. May be a bit though as I am having to deal with some family issues that are taking much of my free time.

@joscha

Hey, are there any news on this? I can still see this issue...

@jdarling
Owner

I'm busy re-working (refactoring if you prefer) the code base for the 0.3.0 release and hope to address this in that release. From what I can tell this has to do with the write concerns and the fact that two different threads can get a valid write even though neither gets a majority write. In the new code base I'm taking advantage of the writer concern 'majority' built into MongoDB to hopefully address this.

The other thing I'm working on is when the Mongo instance goes away, today at least, the listeners get lost in lala land since they don't get the disconnect notice. I'm working closely with the MongoDB Native guys to work through this problem in the best possible way.

As a temporary work around, you might try removing the safe: true setting, and instead putting in a w: 'majority' option in the code that calls findAndModify inside QueueListener.js

Note: I have not tried the above, but it is one of the major changes from 0.2.x to 0.3.x

@joscha

My project is not due for a few weeks... Refactoring sounds great - if there is anything to test for 0.3.x, let me know!

@jdarling
Owner

I have a very loose version of 0.3.0 put together that seems to fix the issue with multiple clients getting the same message and processing it. If you are interested in testing it out it can be downloaded from: https://skydrive.live.com/redir?resid=36F0B4E3286E9F61!7767&authkey=!AOtZQk38fVdJ3-o

This should be backwards compatible for the most part. I've removed partial callbacks as they don't make sense for the most part. Other than that the methods should be the same and the options should work properly. You can now use connectionString in the MQ options instead of having to supply all the details separate if you would like.

The underlying code will likely change as this version still does not seem to get notifications when a server goes away. It could hang the listeners. This mainly affects systems where there are bad network latency so may not have shown up for you yet.

@jdarling
Owner

Updated the above link to allow for callbacks and broadcast. If you don't expect a response broadcast would be the better thing to use :)

@joscha

With this I am getting a

TypeError: Object #<MongoClient> has no method 'collection'
    at MongoConnection.collection (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:275:13)
    at MongoConnection.checkConnection (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:248:5)
    at MongoConnection.collection (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:274:8)
    at MongoConnection.ensureCapped (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:304:10)
    at MongoConnection.checkConnection (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:248:5)
    at MongoConnection.ensureCapped (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:293:8)
    at MongoMQ.start (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoMQ.js:41:14)
    at MongoConnection.use (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:268:5)
    at MongoConnection.open.switchToDb (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:215:12)
    at connected (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:170:11)
@joscha

I use it like this (worked with 0.2.x):

MongoMQ = require 'mongomq'

queue = new MongoMQ
    autoStart:      true
    host:           config.db.host
    collectionName: 'mongomq'
    database:       config.db.name
@jdarling
Owner

Looks like your using Coffee Script. Mind posting a simple sample that shows the issue? The new version does an explicit test for undefined variables and won't override NULL's. My first thought is that maybe config.db.host and/or name are returning NULL instead of undefined.

When I try basically the exact same thing in pure JS everything works fine. All I did was change the hello example to use autoStart instead of calling start:

greeter.js

var MQ = require('../../lib/MongoMQ');
var MC = require('../../lib/MongoConnection');
var UUID = require('node-uuid');

var options = {databaseName: 'tests', queueCollection: 'capped_collection', autoStart: true};

var mq = module.exports = new MQ(options);

var log;

var handleRecord = function(err, data, next){
  if(!err){
    log.insert({handled: data}, {w:0});
    next('Hello '+(data||'world')+'!');
  }else{
    console.log('err: ');
    next();
  }
};

mq.on('greet', handleRecord);

(function(){
  var logger = new MC(options);
  logger.open(function(err, mc){
    if(err){
      console.log('ERROR: ', err);
    }else{
      mc.collection('log', function(err, loggingCollection){
        log = loggingCollection;
        //mq.start(function(err){
          if(err){
            console.log(err);
          }
        //});
      });
    }
  });
})();

talker.js

var MC = require('../../lib/MongoConnection');
var MQ = require('../../lib/MongoMQ');
var UUID = require('node-uuid');

var options = {databaseName: 'tests', queueCollection: 'capped_collection', autoStart: true};

var mq = module.exports = new MQ(options);

var recordNumber = 0;
var putRecord = function(){
  setTimeout(function(){
    console.log('emitting');
    mq.emit('greet', recordNumber, function(err, data){
      console.log('Response:');
      console.log(' err>', err);
      console.log(' dat>', data);
    });
    recordNumber++;
    putRecord();
  }, 1000);
};

(function(){
  var logger = new MC(options);
  logger.open(function(err, mc){
    if(err){
      console.log('ERROR: ', err);
    }else{
      mc.collection('log', function(err, loggingCollection){
        loggingCollection.remove({},  function(){
          //mq.start(function(err){
            putRecord();
          //});
        });
      });
    }
  });
})();
@jdarling
Owner

Forgot to mention, if it is a case of NULL instead of undefined I'll put in a test for NULL and undefined on the host and name properties.

@joscha

definitely neither null nor undefined - I really just dropped in the 0.3 version - changed no settings, etc. Once I reverted to 0.2 it worked again.

@jdarling
Owner

I think I found it, in 0.2 it accepted database or databaseName. Made the change in 0.3 and reuploaded the zip. Can you try again please.

I'm really at a loss with Coffee Script but if I copy/paste what you put above into a .coffee file and run it it doesn't throw any errors. Of course, it doesn't do anything either since it doesn't emit or consume anything. If you could expand the coffee/test.coffee example and post it back here to emit and consume messages it would really help.

@joscha

Sorry, the snippet is part of a bigger project all written in coffeescript - it is only syntactical sugar though, I will post the JS in future.

Regarding the new zip: now the connect works, but as soon as I want to put something into the queue:

TypeError: Object false has no method 'collection'
    at MongoConnection.collection (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:281:13)
    at MongoConnection.checkConnection (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:256:5)
    at MongoConnection.collection (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:280:8)
    at MongoMQ.emit (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoMQ.js:77:10)
    at MongoConnection.checkConnection (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoConnection.js:256:5)
    at MongoMQ.emit (/Users/joscha/Documents/workspace/x/node_modules/mongomq/lib/MongoMQ.js:76:8)
    at Promise.createMail (/Users/joscha/Documents/workspace/x/app/resources/mails.js:78:17)
    at Promise.addBack (/Users/joscha/Documents/workspace/x/node_modules/mongoose/lib/promise.js:128:8)
    at Promise.EventEmitter.emit (events.js:99:17)
    at Promise.emit (/Users/joscha/Documents/workspace/x/node_modules/mongoose/lib/promise.js:66:38)
@joscha

Seems as if the self.db is false instead of an object.

@jdarling
Owner

Actually, line 256 in MongoConnection is saying that your trying to do something before the connection with the server is established.

Your code doesn't by chance look something similar to below does it?

var mq = new MongoMQ({autoStart: true, ...})

...

mq.emit(...);

If so then I believe what is going on is that MongoMQ isn't stood up and listening yet so it hasn't started up fully and got a complete connection to the DB server itself. Instead try passing in your emit's in the start callback:

var mq = new MongoMQ({autoStart: true, ...}, function(err, mqInstance){
  mq.emit(...);
});

This assures that the connection is fully established and stable before you try and use it. The new drivers seem to be a bit more efficient in backgrounding while connecting. I can get a very similar result in JS part of the time if I don't use the callbacks.

I would try to add autoStart into the checkConnection handler but that causes issues of its own with spinning up (and down) multiple connections to the DB in the case where one connection is already in flight.

@jdarling
Owner

This does point out some issues with the current code though, I raise the errors but I never actually throw them :(. Will look at the best way to modify the code to handle this better.

@joscha

My code does look slightly similar to this, however the time between autoStart and my first emit is about one minute - care to add a helper like let's say:

mq.ready(function() {
   // this gets called right away is established or gets delayed until the connection is there
});

I am using the queue in three different places - two of them are workers on other CPUs, so I would like to reuse my initialization file and just include it, to have the same code base.

@jdarling
Owner

I had already started doing that inside the QueueManager class and will be adding similar to the MongoMQ and MongoConnection classes. A minute later should be more than enough to establish the connection.

I'll also try setting up a longer wait sample to see if something else is going wrong with the timeouts.

@joscha

I think emit should actually take care of that automatically, e.g. if I call emit while the connection still gets established, I shouldn't get an error, it should just be emitted once the connection is there...

@jdarling
Owner

I'll see what I can do, as I agree it does make more sense if the autoStart flag is true or if you have already connected that the system recover nicely. In your code you should still check for errors in the off chance I miss something. I updated all the calls earlier (will upload changes tomorrow) to properly pass back error objects if/when an error occurs.

If I may ask, what does your Mongo environment look like? ReplicaSet or stand alone? OS, # machines, etc... Could be useful trying to track this down.

@joscha

The environment is currently a deve machine - so one local MongoDB, standalone on OSX on one machine.

@jdarling jdarling was assigned Jan 10, 2013
@jdarling
Owner

I seem to have everything working in Windows at least. Sorry but I don't have a Mac to test on, so your the Mac/Coffee guy :). I've updated the zip file if you could test it again to see how things are doing.

I have updated the library to use a common lib.js so you will have to change code back to require('MongoMQ').MongoMQ

ready is available and it does queue up emits (using nextTick to wait) while things get established. I have tested a long wait period and it doesn't seem to timeout in that scenario.

@joscha

This version does not even start up any more:

/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/connection/server.js:485
        throw err;
              ^
ReferenceError: err is not defined
    at MongoConnection.close (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/lib/MongoConnection.js:179:33)
    at MongoMQ.start (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/lib/MongoMQ.js:79:18)
    at Eventer.checkNoError (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/lib/eventer.js:47:5)
    at MongoConnection.ensureCapped (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/lib/MongoConnection.js:275:31)
    at Collection.isCapped (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/collection.js:1440:7)
    at Collection.options (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/collection.js:1425:7)
    at Cursor.nextObject.commandHandler (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/cursor.js:513:48)
    at Db._executeQueryCommand (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/db.js:1804:12)
    at Cursor.nextObject (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/cursor.js:548:13)
    at Collection.options (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/collection.js:1424:12)
@joscha

Are there any settings I need to change?

@jdarling
Owner

Sorry copy/paste error on line 179 in MongoConnection.js:
self.emitter.call('closed', err);
should be:
self.emitter.call('closed');

Will update the zip and repost, but you should be able to just change it.

@joscha

I changed it and get this now:

/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/connection/server.js:485
        throw err;
              ^
Error: no open connections
    at Db._executeQueryCommand (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/db.js:1804:21)
    at Cursor.nextObject (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/cursor.js:548:13)
    at Collection.options (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/collection.js:1424:12)
    at Db.collectionsInfo (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/db.js:385:5)
    at Collection.options (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/collection.js:1422:11)
    at Collection.isCapped (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/collection.js:1438:8)
    at MongoConnection.ensureCapped (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/lib/MongoConnection.js:274:24)
    at Db.collection (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/node_modules/mongodb/lib/mongodb/db.js:495:44)
    at MongoConnection.collection (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/lib/MongoConnection.js:229:15)
    at MongoMQ.checkConnection (/Users/joscha/Documents/workspace/bla/node_modules/mongomq/lib/MongoMQ.js:56:5)
@joscha

I tried changing database to databaseName, but it does not seem to make a difference.

@jdarling
Owner

Are you calling close anywhere directly in your code? No open connections points to the underlying database connection being closed explicitly, or Mongo itself isn't up and running/reachable.

I could do a lot more to help if I had some simple example showing the issue, please put together a minimal example and post so I can run it locally. Right now I'm just guessing as the new code works in all of the old test cases and the new examples without any issues on all test systems I have available.

@joscha

I created a gist here. You can run it with coffee server in the main dir after adding a node_modules directory and adding the mongomq module to it.

@joscha

I just restarted MongoDB, its definitely up - it also works with the old version - and no, I don't call close explicitly.

@jdarling
Owner

Thanks this gets me into the right direction. Seems that when using a single host there is a bug where the collection object is saying it doesn't have a connection. I'm working to track this down now.

Can you try (or tell me how to) one more thing for me, change host: 'localhost' to an array called servers:
servers: ['localhost']

I don't run into this issue connecting to a replicaset or connecting as a replicaset, but when specifying the host directly as you do I can duplicate the issue. This may be a very simple workaround until I find a fix.

@jdarling
Owner

Nevermind that won't work, you get back an error about no primary :(

@jdarling
Owner

Note: if I remove your host option, basically forcing to connect to localhost, everything works fine with your Gist. This may be a temporary workaround.

@joscha

Hmm...maybe some wrong default values?

@joscha

If you branch the repo and check in your new code, I can help search :)

@jdarling
Owner

Got it :), Just uploaded a patched version. I'd branch but I'm still a github idiot on the proper merge. Now I'm properly building the host connection string. I'm still not sure why a single server doesn't work without a connection string, but I can come back to that. Please see if the updated zip fixes things for you.

@jdarling
Owner

Your url export should be "mongodb://#{exports.db.host}:{exports.db.port||27017}/#{exports.db.name}" to be a proper mongodb connection string btw.

@joscha

Yes! It works and no duplicates show up!! Great work!

@jdarling
Owner

WAHOO LOL, as soon as I get the documentation cleaned up I'll be pushing v0.3 to GitHub. Hopefully some time this weekend.

@joscha

👍

@jdarling
Owner

FYI While working on the Repl and Documentation I've found a few small typos (edge case usage things), I'll let you know in this thread when I post v0.3 so you can make sure and update just in case.

@joscha

ok

@jdarling
Owner

v0.3.0 has been pushed with all fixes applied. Only thing left is to update the readme.md file. Closing and thanks for the help :)

@jdarling jdarling closed this Jan 11, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment