
Synchronous is evil? #1

Closed
MajorBreakfast opened this issue Feb 11, 2014 · 21 comments

Comments

@MajorBreakfast

What about going asynchronous using promises and stating multiple files at once?

@joliss
Owner

joliss commented Feb 11, 2014

For disk IO, synchronous is usually faster. There are some rare edge cases (cold cache due to low RAM, with a spinning magnetic disk) where async is faster, but I don't care about those. So it's intentionally synchronous-only.

@joliss joliss closed this as completed Feb 11, 2014
@stefanpenner
Collaborator

@joliss I wonder if node actually handles the async story better. Although the latency might be higher, I suspect the native concurrency would enable higher throughput, and as N grows it may become more appealing.

e.g. a sync walk might be faster when N=1, but when N=10 or N=100 I suspect letting node handle the concurrency natively would win.

These are all my own assumptions, based on my understanding of node and the heroics it goes through to simulate sync.

@joliss
Owner

joliss commented Feb 18, 2014

I would have thought so too, but it seems that unless you're actually blocking on a slow disk or network bus, sync is faster. For instance, globbing 100k files with node-glob, with warm cache, takes 10 seconds sync vs. ~20 seconds async, and the async timing has quite a bit of variance. I'm guessing that all the bookkeeping for the parallel requests slows it down.

@stefanpenner
Collaborator

crazy

@MajorBreakfast
Author

AAAAND: The results are out.

Sync: 60-63ms
Async with promises: 40-43ms

Both runs walked my desktop folder, which currently has clones of the emberjs and app-kit repos on it.

I guess async clearly wins :)
Btw. the code I wrote isn't optimized yet. I'm adding the for loops now :)

var fs = require('fs');
var RSVP = require('rsvp');
var stat = RSVP.denodeify(fs.stat);
var readdir = RSVP.denodeify(fs.readdir);

function walk(baseDir, relativePath) {
  return stat(baseDir + '/' + relativePath)
    .then(function (stats) {
      if (stats.isDirectory()) {
        return readdir(baseDir + '/' + relativePath)
          .then(function (entries) {
            return RSVP.all(entries.map(function (entry) {
              return walk(baseDir, relativePath + entry + '/')
            }))
            .then(function (entries) {
              return Array.prototype.concat.apply([relativePath], entries)
            });
          });
      } else {
        return [relativePath];
      }
    });
}

@joliss
Owner

joliss commented Feb 18, 2014

That implementation errors. I have a benchmark script; if you give me a functioning implementation I'll run it.

@MajorBreakfast
Author

https://github.com/MajorBreakfast/walk-as-promised

@stefanpenner is now also taking a look at it to see where the bottlenecks are.

Sync took 2245 ms
walkAsPromised took 1811 ms
walkAsPromisedCStyle took 1923 ms

@joliss
Owner

joliss commented Feb 18, 2014

On Linux, I'm getting

Sync took 6 ms
walkAsPromised took 15 ms
walkAsPromisedCStyle took 10 ms
Sync took 2 ms
walkAsPromised took 10 ms
walkAsPromisedCStyle took 10 ms

If I create 100k files, I get:

Sync took 1094 ms
walkAsPromised took 5662 ms
walkAsPromisedCStyle took 4887 ms
Sync took 1108 ms
walkAsPromised took 7154 ms
walkAsPromisedCStyle took 8305 ms

So walk-sync seems to be faster by an order of magnitude. Are you getting something different?

@MajorBreakfast
Author

On Windows, async is faster.

For 45909 files:

Sync took 2345 ms
walkAsPromised took 1726 ms
walkAsPromisedCStyle took 1800 ms
Sync took 2270 ms
walkAsPromised took 1723 ms
walkAsPromisedCStyle took 1753 ms

@MajorBreakfast
Author

On Mac OS X Mountain Lion (not the same machine):

For 58701 files:

Sync took 5327 ms
walkAsPromised took 4860 ms
walkAsPromisedCStyle took 4886 ms
Sync took 2533 ms
walkAsPromised took 4991 ms
walkAsPromisedCStyle took 4811 ms

@stefanpenner
Collaborator

In theory, hiding that behind an abstraction would enable async for Windows and sync for Linux.

I'm going to try to investigate some this evening.

@joliss
Owner

joliss commented Feb 18, 2014

In theory, hiding that behind an abstraction would enable async for Windows and sync for Linux.

Well, just don't send me a PR for that. ;-) I'm not going to be able to maintain custom code for Windows; certainly not for performance optimizations.

I'm going to try to investigate some this evening.

Let me know what you find out if you get around to it. It seems interesting that sync is slower on Windows.

@MajorBreakfast
Author

Okay new results:

Mac

Sync took 4599 ms for 58751 files
walkPromises took 4642 ms for 58752 files
walkAsPromisedCStyle took 4239 ms for 58752 files
walkFlattenAfterwards took 4818 ms for 58752 files
walkCallbacksFlattenAfterwards took 2056 ms for 58752 files
walkCallbacks took 2450 ms for 58752 files
Sync took 2048 ms for 58751 files
walkPromises took 6427 ms for 58752 files
walkAsPromisedCStyle took 5028 ms for 58752 files
walkFlattenAfterwards took 6181 ms for 58752 files
walkCallbacksFlattenAfterwards took 2262 ms for 58752 files
walkCallbacks took 2194 ms for 58752 files

Windows

Sync took 2691 ms for 45955 files
walkPromises took 2049 ms for 45956 files
walkAsPromisedCStyle took 1911 ms for 45956 files
walkFlattenAfterwards took 2076 ms for 45956 files
walkCallbacksFlattenAfterwards took 1070 ms for 45956 files
walkCallbacks took 1217 ms for 45956 files
Sync took 2556 ms for 45955 files
walkPromises took 1781 ms for 45956 files
walkAsPromisedCStyle took 1704 ms for 45956 files
walkFlattenAfterwards took 1780 ms for 45956 files
walkCallbacksFlattenAfterwards took 1085 ms for 45956 files
walkCallbacks took 1068 ms for 45956 files

The "flatten afterwards" versions depend on lodash. "Callbacks" means it doesn't use promises internally; it still returns a promise, though.

Btw. the additional entry is '/', in case you're wondering why the async versions' file count is +1.

@MajorBreakfast
Author

I updated the benchmark again. From what I see the results are:

  • It's about 10-20% slower on Mac, though with the advantage that it's non-blocking
  • It's about 50-60% faster on Windows

It depends on how it performs on Linux, but you could always abstract the actual implementation away if you decide to use both:

// walk.js
var Promise = require('rsvp').Promise;

if (/^win/.test(process.platform)) {
  module.exports = require('walk-as-promised')
} else {
  module.exports = function() { return Promise.resolve(require('walk-sync').apply(null, arguments)) }
}

@MajorBreakfast
Author

I wondered the whole time why you first read in all the folders and then stat every file again to extract the things you hash. Well, now it's clear: you don't use node-walk-sync itself in broccoli, just something similar.
Could you move the actual tree hashing into an external repository, too? Right now I can't see very well how it's all interconnected. If you do that, I can build an async version for Windows.

@joliss
Owner

joliss commented Feb 19, 2014

Very interesting. I'll have to investigate. Out of bandwidth at the moment, but will get back to this. Thanks @MajorBreakfast for your benchmark repos!

@ming-codes

Perhaps it has to do with a slow hard drive?

I noticed walk-sync uses lstatSync to stat each file sequentially. If the disk is slow, that can potentially make a difference. If the stat is async, you leave it to the disk to optimize batched reads.

@evanplaice

Are you using an HDD or an SSD for the benchmarks? The way SSDs access data is fundamentally different from how HDDs access it.

4. NAND-flash pages and blocks
Cells are grouped into a grid, called a block, and blocks are grouped into planes. The smallest unit through which a block can be read or written is a page. Pages cannot be erased individually, only whole blocks can be erased. The size of a NAND-flash page can vary, and most drives have pages of size 2 KB, 4 KB, 8 KB or 16 KB. Most SSDs have blocks of 128 or 256 pages, which means that the size of a block can vary between 256 KB and 4 MB. For example, the Samsung SSD 840 EVO has blocks of size 2048 KB, and each block contains 256 pages of 8 KB each.

5. Reads are aligned on page size
It is not possible to read less than one page at once. One can of course request just one byte from the operating system, but a full page will be retrieved in the SSD, forcing a lot more data to be read than necessary.

21. A large single-threaded read is better than many small concurrent reads
Concurrent random reads cannot fully make use of the readahead mechanism. In addition, multiple Logical Block Addresses may end up on the same chip, not taking advantage of the internal parallelism. A large read operation will access sequential addresses and will therefore be able to use the readahead buffer if present, and use the internal parallelism. Consequently, if the use case allows it, it is better to issue a large read request.

Source:
http://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/

Following that reasoning, it would make sense that read performance would be better on an SSD when handled in a synchronous manner.

@stefanpenner
Collaborator

@evanplaice I did some spelunking when @MajorBreakfast pointed out the Unix/Windows difference. If memory serves, it seemed to boil down to how libuv handles async and sync on each platform. On Windows the async path is simply more optimal. When comparing, I recall the Unix async path looked like it could be further optimized. This was quite some time ago, and the state of libuv may have changed.

@stefanpenner
Collaborator

@joliss I think we can close this.

@joliss joliss closed this as completed Aug 8, 2015