Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add queue.map() and tests #14

Closed
wants to merge 1 commit into from
Closed

Conversation

ricardobeat
Copy link

Using a loop to add tasks breaks chaining and mandates saving the queue to a variable, making it unnecessarily complex. I had to use an external helper:

asyncMap = (q, fn, items) -> q.defer(fn, item) for item in items; q

This pull is a more complete implementation, similar to .deferAll at #8. I've read the discussion and I don't think any magic needs to be done with the arguments, you can achieve anything using bind/apply/call.

Related question: why doesn't queue use a constructor/prototype? It's completely opaque ATM, impossible to extend. That might also bring performance gains.

@mbostock
Copy link
Member

mbostock commented Apr 6, 2013

Thanks for the pull request.

If I’m reading this correct, you’ve implemented a special case of queue.deferAll option B where only the first argument can vary across invocations, and other arguments must be constant. It seems like this captures some common use cases, but I don’t have a great sense of whether it’s sufficiently exhaustive. Given this limitation, I still prefer parsimony, although I agree it’s annoying to have to break chaining when deferring a dynamic number of tasks.

I suppose another way of doing this would be queue.deferAll(functions), where deferAll doesn’t take any arguments. Then if you want to pass arguments to your tasks, you have to create closures. Something like:

queue(1)
   .deferAll(files.map(function(f) {
     return function(callback) {
       fs.readFile(f, callback);
     };
    })
    .awaitAll(done);

That’s a bit verbose in the case where arguments are homogenous across tasks, but it makes the fewest assumptions.

Another way of enabling chaining directly is to provide a call method, which just calls the specified function, passing in the queue:

queue(1)
   .call(function(q) {
     files.forEach(function(f) {
       q.defer(fs.readFile, f);
     });
    })
    .awaitAll(done);

But I’m not a big fan of that either (although I found it useful in D3; see selection.call).

As for my preference for privileged methods, that’s simply because I prefer strict encapsulation to exposing private state with an underscore. That said I am about to commit a change so that the queue is defined in one go as an object literal, rather than assigning to an empty object; this will improve performance. If you want to implement some microbenchmarks that demonstrate the prototype approach being noticeably faster than my new commit, then I might be willing to sacrifice encapsulation.

As for extending the queue API, seems like the simplest approach is to fork or to write a wrapper class that does the same, rather than patching the prototype.

@mbostock mbostock closed this Apr 6, 2013
@ricardobeat
Copy link
Author

Hi Mike,

I think this is the most common case, not a special one. It's simplicity makes it very useful and understandable, in case you need different arguments for each object you can .defer them one by one as before. Here is the code I ended up using:

getFiles: (paths, cb) ->
    asyncMap(queue(), fs.stat, paths).awaitAll (err, stats) ->
        if stats then paths = stats.map (stat, i) ->
            path.join(paths[i], '*' if stat.isDirectory())
        asyncMap(queue(), glob, paths).awaitAll (err, files) ->
            cb Array::concat.apply [], files

With this patch it could become

    queue().map(fs.stat, paths).awaitAll (err, stats) ->
        if stats then paths = stats.map (stat, i) ->
            path.join(paths[i], '*' if stat.isDirectory())
        queue().map(glob, paths).awaitAll (err, files) ->
            cb Array::concat.apply [], files

The alternative is repetition:

    q = queue()
    q.defer(fs.stat, path) for path in paths
    q.awaitAll (err, stats) ->
        if stats then paths = stats.map (stat, i) ->
            path.join(paths[i], '*' if stat.isDirectory())
        q = queue()
        q.defer(glob, path) for path in paths
        q.awaitAll (err, files) ->
            cb Array::concat.apply [], files

A .call method (or just passing the active queue to task.apply()) would be interesting, but wouldn't do much for this situation, a lot more code for no evident gain:

queue().defer((paths, next) ->
    @defer(fs.stat, path) for path in paths
    next()
, paths).awaitAll (err, stats) ->
    queue().defer (paths, cb) ->
        @defer(glob, path) for path in paths
        next()
    , paths
, paths).awaitAll (err, files) ->

For comparison, this what it could look like if I used the async module:

async.map paths, fs.stat, (err, stats) ->
        if stats then paths = stats.map (stat, i) ->
            path.join(paths[i], '*' if stat.isDirectory())
        async.concat paths, glob, (err, files) ->
            cb files

Almost equal to the first method - queue.map() makes this possible using a ton less code underneath.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants