Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EMS: Must declare number of nodes to use. Input:NaN #11

Closed
kessler opened this issue Jun 9, 2016 · 9 comments
Closed

EMS: Must declare number of nodes to use. Input:NaN #11

kessler opened this issue Jun 9, 2016 · 9 comments

Comments

@kessler
Copy link

kessler commented Jun 9, 2016

Hi,

This situation was mentioned in #10, here is the code, but this time around I don't have any shared state with the spawning process:

index.js:

const ems = require('ems')(4, false, 'fj')

ems.parallel(function () {
    var x = 1
        console.log(x)
})

process.exit(0)

Running node index.js yields:

1
EMS: Must declare number of nodes to use.  Input:NaN
EMS: Must declare number of nodes to use.  Input:NaN
EMS: Must declare number of nodes to use.  Input:NaN

Running node index.js 4 yields:

1
1
1
1

Node version is v5.5.0

Anything I missed?

@kessler
Copy link
Author

kessler commented Jun 9, 2016

Ok, I figured it out, EMS generates a script called EMSThreadStub.js:

// Automatically Generated EMS Slave Thread Script
// To edit this file, see ems.js:emsThreadStub()
var ems = require("ems")(parseInt(process.argv[2]));
process.on("message", function(msg) {
    eval("func = " + msg.func);
    func.apply(null, msg.args);
} );

I'm guessing that process.argv[2] is the reason for those messages.

Is there a way to go around this? (without modifying ems code)

@mogill
Copy link
Owner

mogill commented Jun 10, 2016

Hi,

The short answer is code modifications are required.

To execute EMS barriers and joins, every process needs to know how many processes there are and a way to determine which process started first so initialization only happens once. There is more than one way of solving this, and command-line arguments was a regrettable choice.

This problem also turned up in the Python port, which seems like a good reason to fix this now before Python is released.

I hope to remove command line arguments from EMS this weekend and have a fix by Monday. My apologies for delays getting back to you and implementing a better fork-join scheme.

             -Jace

@kessler
Copy link
Author

kessler commented Jun 12, 2016

Hi,
No apologies are necessary! this is an awesome project, way to go!
May I inquire as to what approach are you going to take to circumvent this?

@mogill
Copy link
Owner

mogill commented Jun 13, 2016

Please update to v1.3.6 (git/npm) which no longer depends on command line arguments and should fix the problem you ran into. That information is now communicated from the master to slave process via environment variables.

I was hoping to rework initialization for both Python and JS but wasn't able to finish this weekend, that version also gets rid of the thread stub program file created by EMS.

Let me know if this patch didn't help and/or there are other assumptions that are causing problems!

@kessler
Copy link
Author

kessler commented Jun 17, 2016

I might just be paranoid, but environment variables feels like a risky play... perhaps send the information to slaves via stdin?

@mogill
Copy link
Owner

mogill commented Jun 17, 2016

I didn't think there was much risk setting environment variables in the forked processes, was there a particular risk I hadn't considered?

I did experiment with using stdin/out for slave processes but Node detects the slave processes' stdin is not attached to a terminal and a naive implementation did not work. The version of this feature on the Python branch is, in fact, using stdin/out to set up the slave processes, but isn't complete yet. Another constraint is I'd like to keep the Python and Node implementations similar to the extent that's possible so they behave similarly for end-users and for development/debugging.

@kessler
Copy link
Author

kessler commented Jun 17, 2016

The only "risk" I can think of right now is the fact that the environment might be altered or changed somehow without your control.

Are you just passing msg.func and msg.args?

I did experiment with using stdin/out for slave processes but Node detects the slave processes' stdin is not attached to a terminal

Are you referring to this: https://github.com/SyntheticSemantics/ems/blob/master/nodejs/ems.js#L632 ?

@mogill
Copy link
Owner

mogill commented Jun 17, 2016

The environment variables are per-process in UNIX, they are copied from the parent process to the child process' environment when it is forked, and once executing neither process may see or modify another's environment variables.

The fork at ems.js:633 is creating the child processes whether Bulk Synchronous Parallelism (bsp) or Fork-Join (fj) is being used. For BSP, the script executed is the currently executing script, for FJ the script is the the thread stub declared a few lines above. In both cases, the original command line arguments are passed to the slave process, however the FJ thread stub ignores them.

In the Python branch this fork is a spawn. fork has additional semantics where Node sets up a socket communication channel, the child knows Node forked it, and it does something special with stdin. spawn is a simple process creation that doesn't have Node specific semantics, so Node behaves as if it were run from a bash script.

The msg.func and msg.args are used by fork-join -- they are the actual function code and arguments from the master process to be executed by the thread stub. BSP processes are all running the user's script and start execution at the top.

@mogill
Copy link
Owner

mogill commented Oct 24, 2016

The environment variable instead of command line argument is now working for Python as well, so I'm going to close this since it seems to work as a language independent solution.

v1.4 is now available, which supports object sharing with Python3. Py2 on the way.

@mogill mogill closed this as completed Oct 24, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants