Race condition for same ports #6

Closed
stolsma opened this Issue Aug 12, 2011 · 6 comments

Comments

Projects
None yet
2 participants
@stolsma
Contributor

stolsma commented Aug 12, 2011

Today I started 5 apps at the same time (all doing the same with the same code) with haibu, all trying to bind to port 8000. But I noticed strange things happening. Sometimes one or two apps had to be started twice by forever. I noticed it because of the logging msgs.

It all points to line 103 and further in net.js where, because of the process.nextTick, another 'process' (in my case another app) can bind first to the 'free' port. And the other app will fail with a EADDRINUSE error, which crashes the app...

Luckily I remembered that @bmeck also made a pull request for node (see joyent/node#1412) and I noticed that that code is different then the current code for _doListen in net.js. Especially the code between the 2nd try..catch is different where in the pull code there is another check for EADDRINUSE which will trigger a new _doListen call...

Implementing the pull code solved the problem so maybe can someone implement the new code in net.js ??

Thanks!!

@stolsma

This comment has been minimized.

Show comment Hide comment
@stolsma

stolsma Aug 12, 2011

Contributor

BTW the actual and desired in the carapace::port event is always giving the same port number (the new port number not the requested one). i.e. desired and port are always the same, even when _doListen sees the desired port in use and finds a higher free one.

Contributor

stolsma commented Aug 12, 2011

BTW the actual and desired in the carapace::port event is always giving the same port number (the new port number not the requested one). i.e. desired and port are always the same, even when _doListen sees the desired port in use and finds a higher free one.

@bmeck

This comment has been minimized.

Show comment Hide comment
@bmeck

bmeck Aug 12, 2011

Contributor

Removing the recursion should have fixed this. Let me know if you still have errors.

Contributor

bmeck commented Aug 12, 2011

Removing the recursion should have fixed this. Let me know if you still have errors.

@stolsma

This comment has been minimized.

Show comment Hide comment
@stolsma

stolsma Aug 12, 2011

Contributor

Nope it still crashes at:

Error: EADDRINUSE, Address already in use
    at Array.<anonymous> (/home/sander/Development/apiary/node_modules/haibu-carapace/lib/net.js:108:7)
    at EventEmitter._tickCallback (node.js:126:26)

I added a check for EADDRINUSE in the try... catch (line 131) and that solves it. It looks like:

     catch (err) {
        if (err.code === 'EADDRINUSE') {
            self._doListen(desired + 1, ip);
        }
        else {
            self.close();
            self.emit('error', err);
        }
        return
    }

See https://gist.github.com/1142394 for the total code I debugged together. I don't use your for loop!! Thats why I didn't want to make a pull of it... (the other reason for not making a pull is that I don't fully understand the why and what of the old code and I don't want to break things...)

Contributor

stolsma commented Aug 12, 2011

Nope it still crashes at:

Error: EADDRINUSE, Address already in use
    at Array.<anonymous> (/home/sander/Development/apiary/node_modules/haibu-carapace/lib/net.js:108:7)
    at EventEmitter._tickCallback (node.js:126:26)

I added a check for EADDRINUSE in the try... catch (line 131) and that solves it. It looks like:

     catch (err) {
        if (err.code === 'EADDRINUSE') {
            self._doListen(desired + 1, ip);
        }
        else {
            self.close();
            self.emit('error', err);
        }
        return
    }

See https://gist.github.com/1142394 for the total code I debugged together. I don't use your for loop!! Thats why I didn't want to make a pull of it... (the other reason for not making a pull is that I don't fully understand the why and what of the old code and I don't want to break things...)

@bmeck

This comment has been minimized.

Show comment Hide comment
@bmeck

bmeck Aug 12, 2011

Contributor

Ah I see. An ugly but working fix is to wrap in another call, recursion with a new port will mess w/ actual and desired when the events are emitted. Just use the original desired port, eventual consistency will take place.

Contributor

bmeck commented Aug 12, 2011

Ah I see. An ugly but working fix is to wrap in another call, recursion with a new port will mess w/ actual and desired when the events are emitted. Just use the original desired port, eventual consistency will take place.

stolsma added a commit to stolsma/haibu-carapace that referenced this issue Aug 12, 2011

@stolsma

This comment has been minimized.

Show comment Hide comment
@stolsma

stolsma Aug 12, 2011

Contributor

Your fix solved it almost only I got now:

Error: EINVAL, Invalid argument
    at Server._doListen (/home/sander/Development/apiary/node_modules/haibu-carapace/lib/net.js:73:15)
    at Array.<anonymous> (/home/sander/Development/apiary/node_modules/haibu-carapace/lib/net.js:137:14)
    at EventEmitter._tickCallback (node.js:126:26)

Putting in a self.fd = socket(self.type); line solved that.

In my pull request I also solved the event data (correct desired and port property data).

Contributor

stolsma commented Aug 12, 2011

Your fix solved it almost only I got now:

Error: EINVAL, Invalid argument
    at Server._doListen (/home/sander/Development/apiary/node_modules/haibu-carapace/lib/net.js:73:15)
    at Array.<anonymous> (/home/sander/Development/apiary/node_modules/haibu-carapace/lib/net.js:137:14)
    at EventEmitter._tickCallback (node.js:126:26)

Putting in a self.fd = socket(self.type); line solved that.

In my pull request I also solved the event data (correct desired and port property data).

@stolsma

This comment has been minimized.

Show comment Hide comment
@stolsma

stolsma Aug 13, 2011

Contributor

@bmeck and @indexzero

Ok, I created a test program for this issue. The repo is here: git@github.com:stolsma/haibu-carapace-test.git

Running with node test.js will create most of the times the following error:

Error: EADDRINUSE, Address already in use
    at Array.<anonymous> (/home/sander/Development/carapacetest/node_modules/haibu-carapace/lib/net.js:104:7)
    at EventEmitter._tickCallback (node.js:126:26)

Sometimes also the following error is raised but I think thats related to the crash of the connected child:

node.js:134
        throw e; // process.nextTick error, or 'error' event on first tick
        ^
Error: ECONNRESET, Connection reset by peer
    at Socket._writeImpl (net.js:159:14)
    at Socket._writeOut (net.js:450:25)
    at Socket.write (net.js:377:17)
    at EventEmitter.<anonymous> (/home/sander/Development/carapacetest/node_modules/hook.io/node_modules/dnode/index.js:234:20)
    at EventEmitter.emit (events.js:64:17)
    at EventEmitter.request (/home/sander/Development/carapacetest/node_modules/hook.io/node_modules/dnode/node_modules/dnode-protocol/index.js:51:14)
    at Object.message (/home/sander/Development/carapacetest/node_modules/hook.io/node_modules/dnode/node_modules/dnode-protocol/index.js:80:26)
    at [object Object].<anonymous> (/home/sander/Development/carapacetest/node_modules/hook.io/lib/hookio/hook.js:235:16)
    at [object Object].<anonymous> (/home/sander/Development/carapacetest/node_modules/hook.io/node_modules/eventemitter2/lib/eventemitter2.js:185:22)
    at [object Object].emit (/home/sander/Development/carapacetest/node_modules/hook.io/lib/hookio/hook.js:119:38)

Hope this is enough to crate the same errors at your computers...

Contributor

stolsma commented Aug 13, 2011

@bmeck and @indexzero

Ok, I created a test program for this issue. The repo is here: git@github.com:stolsma/haibu-carapace-test.git

Running with node test.js will create most of the times the following error:

Error: EADDRINUSE, Address already in use
    at Array.<anonymous> (/home/sander/Development/carapacetest/node_modules/haibu-carapace/lib/net.js:104:7)
    at EventEmitter._tickCallback (node.js:126:26)

Sometimes also the following error is raised but I think thats related to the crash of the connected child:

node.js:134
        throw e; // process.nextTick error, or 'error' event on first tick
        ^
Error: ECONNRESET, Connection reset by peer
    at Socket._writeImpl (net.js:159:14)
    at Socket._writeOut (net.js:450:25)
    at Socket.write (net.js:377:17)
    at EventEmitter.<anonymous> (/home/sander/Development/carapacetest/node_modules/hook.io/node_modules/dnode/index.js:234:20)
    at EventEmitter.emit (events.js:64:17)
    at EventEmitter.request (/home/sander/Development/carapacetest/node_modules/hook.io/node_modules/dnode/node_modules/dnode-protocol/index.js:51:14)
    at Object.message (/home/sander/Development/carapacetest/node_modules/hook.io/node_modules/dnode/node_modules/dnode-protocol/index.js:80:26)
    at [object Object].<anonymous> (/home/sander/Development/carapacetest/node_modules/hook.io/lib/hookio/hook.js:235:16)
    at [object Object].<anonymous> (/home/sander/Development/carapacetest/node_modules/hook.io/node_modules/eventemitter2/lib/eventemitter2.js:185:22)
    at [object Object].emit (/home/sander/Development/carapacetest/node_modules/hook.io/lib/hookio/hook.js:119:38)

Hope this is enough to crate the same errors at your computers...

bmeck added a commit that referenced this issue Aug 15, 2011

@bmeck bmeck closed this Aug 15, 2011

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment