Worker is not defined in Node v6 #27

I'm trying to run the example code as such: … but got this error: `Worker is not defined`.
FYI, using the library in Node v6 produces the `Worker is not defined` error.
Please take a look at #23. Node.js does not ship with a native web worker implementation, which is why you are seeing the error.
For some reason I was under the impression that on Node this library would not be using `Worker` but something else. Since I don't see Web Workers coming to Node any time soon, it would be very beneficial to state this dependency more clearly in the documentation. Pointing to a preferred `Worker` implementation, at least for Node users, would be even better. The current readme reads more like saying the use of "web workers" does not apply to Node.
The library does not make use of anything like that. As I mentioned in #23, the third-party library recommendation I have at the moment is this one: https://www.npmjs.com/package/webworker-threads. Just make sure you define a `Worker` implementation for the library to use.
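A minimal sketch of what that setup might look like (an assumption, not the library's documented API; the exact hook may differ by version):

```js
// Assumption: the library picks up a globally defined Worker constructor.
// Define it before loading Hamsters.js, since Node v6 ships no native
// web worker implementation.
global.Worker = require('webworker-threads').Worker;

const hamsters = require('hamsters.js');
```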
I've gone ahead and pushed out v4.1.1; please update to v4.1.1 and your issues should be resolved. https://github.com/austinksmith/Hamsters.js/releases/tag/v4.1.1
Same as above. There's an issue blocking installation with these modules:

```json
"dependencies": {
  "hamsters.js": "^4.1.0",
  "webworker-threads": "^0.7.11"
}
```

- Case 1: without using …
- Case 2: with …
- Case 3: with …

Do you know what might be going on? Since you are claiming it supports Node, the library must have worked for you. With what combination of packages and platforms did you successfully achieve parallelism?
Please create a new ticket for the new issue you're encountering; v4.1.1 solves the problem related to this ticket.
Reopening this; I mixed it up with the other one you had open.
I'm implementing a fix, to be released in 4.1.2, that will allow this to work with any third-party implementation; the solution I had previously isn't working currently. If you use 4.1.1 for now, you will be able to write your logic and have it in place for when 4.1.2 drops.
@Zodiase please take a look at this release and let me know if you encounter more issues. https://github.com/austinksmith/Hamsters.js/releases/tag/v4.1.2
@Zodiase this issue will be closed within 48 hours if there is no response.
The process still won't terminate or yield any result. Tested with Node.
@Zodiase I've gone ahead and used your example and resolved the issue you are mentioning; it looks like a change I made back in v3.9.* caused the recent incompatibilities. This is now resolved, and I've successfully tested your example using this release. Please give v4.1.3 a try and report back, thanks. https://github.com/austinksmith/Hamsters.js/releases/tag/v4.1.3
(Sorry for barging in; although I'm not Zodiase, I did encounter the same problem with previous builds, but since this issue was already open, I did not report a duplicate.)
That's great to hear @Hizoul, as I had only tested it with webworker-threads so far; I think the implementation should be fairly consistent across the board. In the future, don't worry about commenting on an open issue: the more people able to shed light on an issue, the better. If you don't mind me asking, what is your current use case for the library, and what performance are you seeing in Node? Node is the least tested and least performance-profiled platform for the library, as support has been a bit of a hassle to get right.
Well, my use case is rather an edge case for now: just playing around with isomorphic multi-core usage in JS for some existing point calculations. In the beginning I tried hamsters.run-ing everything, but realized very quickly that if you do not use TypedArrays, copy times are a HUGE bottleneck, meaning single-threaded Node is much faster than calling hamsters.run 1000x with huge objects that do not take advantage of TypedArrays. The code I now have could be achieved with Node's cluster as well, but the goal was portable code that could theoretically run on web/RN as well (which I actually don't need yet) :P Here's an abstract of my usage:

```js
const users = [] // ~2k users
const userActions = [] // ~20k action objects

hamsters.run({array: users, userActions}, () => {
  // `params` and `rtn` are injected into the thread scope by the library;
  // makeScoreObjectForUser is abstracted here and would need to be
  // available inside the thread function.
  for (let user of params.array) {
    rtn.data.push(makeScoreObjectForUser(user, params.userActions))
  }
}, (res) => {
  // synchronous sort + insert to db
}, hamsters.maxThreads, true)
```
Yes, serialization can be a pretty big bottleneck. You might be able to get around this by manually stringifying your data and parsing it within a thread; in some browsers that provides a pretty decent performance boost. In older versions I was manually creating array buffers to pass normal arrays using transferables, but I don't think that will get you past the userActions array, since it contains objects and I'm not sure it's as easy to simply pass a reference over. I might push a release today that restores that array buffer functionality, if you are willing to report back on whether or not it was a good improvement.

```js
const users = new Float32Array([]) // ~2k users
const userActions = [] // ~20k action objects

hamsters.run({array: users, userActions}, () => {
  for (let user of params.array) {
    rtn.data.push(makeScoreObjectForUser(user, params.userActions))
  }
}, (res) => {
  // synchronous sort + insert to db
}, hamsters.maxThreads, true, null, null, 'ascAlpha');
```

You might be able to get away with having the library do your sorting for you by making use of https://github.com/austinksmith/Hamsters.js/wiki/Sorting, and if it doesn't impact your input data, making any of your input arrays a typed array will bring some significant performance improvements even if you are not defining a `dataType`.

The library has some growing pains, and I'm working on improving the way it's used to make less easily parallelizable tasks more performant. On a side note, you might try switching your `for (let user of params.array)` into a native `for (var i = 0; ...)` loop, as a native for loop is an order of magnitude faster than the one you are using right now. You can also probably lower the number of threads you are scaling across, either in your function or in your initialization, to a reasonable count like 2 or 4, and see performance improvements: you will spend less time communicating between threads and more time executing logic within a thread, which helps maximize the improvement you can see.
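A rough sketch of the stringify/parse workaround suggested above, combined with the native loop and lower thread count (a hypothetical adaptation of the example; it assumes `params`/`rtn` injection and thread-scope availability of `makeScoreObjectForUser`, as in the earlier snippets):

```js
// Assumption: pre-serializing the big shared array once and parsing it
// once per thread can beat the per-message structured-clone cost.
const serializedActions = JSON.stringify(userActions);

hamsters.run({array: users, serializedActions}, () => {
  const actions = JSON.parse(params.serializedActions); // one parse per thread
  for (var i = 0; i < params.array.length; i++) {       // native loop, as suggested
    rtn.data.push(makeScoreObjectForUser(params.array[i], actions));
  }
}, (res) => {
  // synchronous sort + insert to db, as before
}, 4, true); // fewer threads -> less cross-thread serialization
```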
Just to provide some more clarity: in your params object, your input array is going to be split into smaller chunks, and each thread is going to receive a small chunk of that input array to work with. The problem is that your userActions array is not going to be broken into smaller pieces; it is instead going to be duplicated to every thread, so your serialization costs grow in direct proportion to the threads you add. Keeping it to 2 or 4 threads will keep those costs down to a reasonable level. So basically, with 1 thread, userActions incurs 1 serialization to the thread and 1 serialization back from the thread; multiply that by how many threads you have, and at 8 threads you are paying that serialization cost 16 times in total.
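An illustrative sketch of that cost model (hypothetical code, not the library's actual internals; `buildThreadMessages` is an invented name):

```js
// The input array is chunked per thread, while userActions is copied
// wholesale into every per-thread message.
function buildThreadMessages(users, userActions, threadCount) {
  const chunkSize = Math.ceil(users.length / threadCount);
  const messages = [];
  for (let t = 0; t < threadCount; t++) {
    messages.push({
      array: users.slice(t * chunkSize, (t + 1) * chunkSize), // shrinks as threads grow
      userActions, // full copy, serialized once per thread
    });
  }
  return messages; // total serialization work grows linearly with threadCount
}
```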
This limitation is only temporary: once JavaScript adds support for shared atomics, we will be able to completely avoid the cost of duplicating that data to every thread. It's an unfortunate side effect of just how poorly (in my opinion) the worker features were implemented in JavaScript, which likely stemmed from a small-minded belief system that is very much against the idea of threading in general. Why people like Douglas Crockford hate multithreading and made it as difficult as possible is beyond me, honestly. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics
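For reference, the shared-memory primitives behind that link look roughly like this (a minimal sketch of `SharedArrayBuffer` plus `Atomics`, independent of this library):

```js
// One buffer shared by every thread that receives it: no per-thread copy.
const shared = new SharedArrayBuffer(1024 * Int32Array.BYTES_PER_ELEMENT);
const counters = new Int32Array(shared);

// Any thread holding a view on `shared` can update it race-free:
Atomics.add(counters, 0, 1);            // atomic increment of slot 0
console.log(Atomics.load(counters, 0)); // atomic read
```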
Wow, thank you very much for the very detailed explanation and performance improvement suggestions. Very happy to see such detailed help, and I do agree with everything you say. Unfortunately, what makeScoreObjectForUser returns is a very deeply nested structure, {a: {b: {c: 3}}}, with varying keys per user. All those numbers (per deepest key) are then aggregated into an array at that deep key again, synchronously in the callback function; those small arrays are also sorted (and then tagged with a position value, which is why I sort), converted to db objects, and inserted. Since that last step in the callback only takes about 1 second, there's no need for improvement there; the initial filtering + calculation, which I'm now splitting across 4 rather than 1 "thread", is the slow part. I'm also very probably overcomplicating the data structuring. But it works now, and this is the fourth reimplementation, going from: … runtime for the full calculation (no. 2 was already directly against the db, and in no. 3 I made all MongoDB find calls in parallel with complicated queries, which is why it locked up), so for now there's no need for further improvements :/

Regarding the growing pains, I found the library quite easy to understand and easy to apply to existing code. I have worked with web workers directly before, so that probably helped. I guess some people might be confused about: …
And I do agree that JavaScript "threading" with web workers is unnecessarily complicated. It has always been a hassle, and Node's cluster also sucks because you're executing the same code twice and need to check cluster.isMaster x) (although I think I recently saw a web-worker-like API among Node's own APIs). And regarding atomics, thanks for the pointer; I was not aware that was coming, but it sounds like a nice feature, and I will keep an eye out for it. Yes, the web worker API is very crippled. I think that is also because it was invented at a time when browsers tried to limit their capabilities while still adding APIs, unlike now, when they try to let the web operate as a unified operating-system API (e.g. Battery Status 🤢).
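For context, the cluster pattern being complained about is the standard Node boilerplate below (the newer worker-like API mentioned is presumably `worker_threads`):

```js
const cluster = require('cluster');

if (cluster.isMaster) {
  // The same file is executed again for each fork, so every run
  // must first check which role it is playing.
  cluster.fork();
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} exited`);
  });
} else {
  // Worker branch: the actual computation lives here.
  console.log('worker doing the work');
  process.exit(0);
}
```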
@austinksmith Although I noticed one minor issue: in my example provided previously, …
@Zodiase The maximum threads you are defining is a global setting limiting the number of threads the library will make use of within its thread pool. If you define `hamsters.maxThreads` as your thread count, … If you do not define a thread count in your `hamsters.run` call, … This also allows the library to scale the work across any number of clients regardless of what their logical core count is, since `hamsters.maxThreads` reflects the logical cores of whatever machine the code is running on.
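A hypothetical illustration of the difference (`params`, `workFn`, and `onResults` are placeholder names; the parameter order follows the earlier examples):

```js
// Scale across the whole pool: 4 threads on a 4-core laptop,
// 16 on a 16-core server.
hamsters.run(params, workFn, onResults, hamsters.maxThreads, true);

// Pin the work to a fixed count regardless of the host's cores,
// trading parallelism for lower serialization overhead.
hamsters.run(params, workFn, onResults, 2, true);
```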