
Synchronous is evil? #1

Closed
MajorBreakfast opened this issue Feb 11, 2014 · 21 comments

Comments

@MajorBreakfast

What about going asynchronous using promises and stating multiple files at once?

@joliss
Owner

joliss commented Feb 11, 2014

For disk IO, synchronous is usually faster. There are some rare edge cases (cold cache due to low RAM, with a spinning magnetic disk) where async is faster, but I don't care about those. So it's intentionally synchronous-only.

@joliss joliss closed this as completed Feb 11, 2014
@stefanpenner
Collaborator

@joliss I wonder if node actually handles the async story better. Although the latency might be higher, I suspect the native concurrency would enable higher throughput, and as N grows it may become more appealing.

e.g. a sync walk might be faster when N=1, but when N=10 or N=100 I suspect letting node handle the concurrency natively would win.

These are all my own assumptions, based on my understanding of node and the heroics it goes through to simulate sync.

@joliss
Owner

joliss commented Feb 18, 2014

I would have thought so too, but it seems that unless you're actually blocking on a slow disk or network bus, sync is faster. For instance, globbing 100k files with node-glob, with warm cache, takes 10 seconds sync vs. ~20 seconds async, and the async timing has quite a bit of variance. I'm guessing that all the bookkeeping for the parallel requests slows it down.

@stefanpenner
Collaborator

crazy

@MajorBreakfast
Author

AAAAND: The results are out.

Sync: 60-63ms
Async with promises: 40-43ms

Both runs walked my desktop folder, which currently has clones of the emberjs and app-kit repos on it.

I guess async clearly wins :)
Btw. the code I wrote isn't optimized yet. I'm adding the for loops now :)

var fs = require('fs');
var RSVP = require('rsvp');
var stat = RSVP.denodeify(fs.stat);
var readdir = RSVP.denodeify(fs.readdir);

function walk(baseDir, relativePath) {
  return stat(baseDir + '/' + relativePath)
    .then(function (stats) {
      if (stats.isDirectory()) {
        return readdir(baseDir + '/' + relativePath)
          .then(function (entries) {
            return RSVP.all(entries.map(function (entry) {
              return walk(baseDir, relativePath + entry + '/')
            }))
            .then(function (entries) {
              return Array.prototype.concat.apply([relativePath], entries)
            });
          });
      } else {
        return [relativePath];
      }
    });
}

@joliss
Owner

joliss commented Feb 18, 2014

That implementation errors. I have a benchmark script; if you give me a functioning implementation I'll run it.

@MajorBreakfast
Author

https://github.com/MajorBreakfast/walk-as-promised

@stefanpenner is now also taking a look at it to see where the bottlenecks are.

Sync took 2245 ms
walkAsPromised took 1811 ms
walkAsPromisedCStyle took 1923 ms

@joliss
Owner

joliss commented Feb 18, 2014

On Linux, I'm getting

Sync took 6 ms
walkAsPromised took 15 ms
walkAsPromisedCStyle took 10 ms
Sync took 2 ms
walkAsPromised took 10 ms
walkAsPromisedCStyle took 10 ms

If I create 100k files, I get:

Sync took 1094 ms
walkAsPromised took 5662 ms
walkAsPromisedCStyle took 4887 ms
Sync took 1108 ms
walkAsPromised took 7154 ms
walkAsPromisedCStyle took 8305 ms

So walk-sync seems to be faster by an order of magnitude. Are you getting something different?

@MajorBreakfast
Author

On Windows, async is faster.

For 45909 files:

Sync took 2345 ms
walkAsPromised took 1726 ms
walkAsPromisedCStyle took 1800 ms
Sync took 2270 ms
walkAsPromised took 1723 ms
walkAsPromisedCStyle took 1753 ms

@MajorBreakfast
Author

On Mac OS X Mountain Lion (not the same machine):

For 58701 files:

Sync took 5327 ms
walkAsPromised took 4860 ms
walkAsPromisedCStyle took 4886 ms
Sync took 2533 ms
walkAsPromised took 4991 ms
walkAsPromisedCStyle took 4811 ms

@stefanpenner
Collaborator

In theory, hiding that behind an abstraction would enable async for Windows and sync for Linux.

I'm going to try to investigate some this evening.

@joliss
Owner

joliss commented Feb 18, 2014

In theory, hiding that behind an abstraction would enable async for Windows and sync for Linux.

Well, just don't send me a PR for that. ;-) I'm not going to be able to maintain custom code for Windows; certainly not for performance optimizations.

I'm going to try to investigate some this evening.

Let me know what you find out if you get around to it. It seems interesting that sync is slower on Windows.

@MajorBreakfast
Author

Okay new results:

Mac

Sync took 4599 ms for 58751 files
walkPromises took 4642 ms for 58752 files
walkAsPromisedCStyle took 4239 ms for 58752 files
walkFlattenAfterwards took 4818 ms for 58752 files
walkCallbacksFlattenAfterwards took 2056 ms for 58752 files
walkCallbacks took 2450 ms for 58752 files
Sync took 2048 ms for 58751 files
walkPromises took 6427 ms for 58752 files
walkAsPromisedCStyle took 5028 ms for 58752 files
walkFlattenAfterwards took 6181 ms for 58752 files
walkCallbacksFlattenAfterwards took 2262 ms for 58752 files
walkCallbacks took 2194 ms for 58752 files

Windows

Sync took 2691 ms for 45955 files
walkPromises took 2049 ms for 45956 files
walkAsPromisedCStyle took 1911 ms for 45956 files
walkFlattenAfterwards took 2076 ms for 45956 files
walkCallbacksFlattenAfterwards took 1070 ms for 45956 files
walkCallbacks took 1217 ms for 45956 files
Sync took 2556 ms for 45955 files
walkPromises took 1781 ms for 45956 files
walkAsPromisedCStyle took 1704 ms for 45956 files
walkFlattenAfterwards took 1780 ms for 45956 files
walkCallbacksFlattenAfterwards took 1085 ms for 45956 files
walkCallbacks took 1068 ms for 45956 files

The "flatten afterwards" versions depend on lodash. "Callbacks" means it doesn't use promises internally; it still returns a promise, though.

Btw. the additional entry is '/', in case you're wondering why the async versions' file count is +1.

@MajorBreakfast
Author

I updated the benchmark again. From what I see the results are:

  • It's about 10-20% slower on Mac, though with the advantage that it's non-blocking
  • It's about 50-60% faster on Windows

It depends on how it performs on Linux, but you could always abstract the actual implementation away if you decide to use both:

// walk.js
var Promise = require('rsvp').Promise;

if (/^win/.test(process.platform)) {
  module.exports = require('walk-as-promised')
} else {
  module.exports = function() { return Promise.resolve(require('walk-sync').apply(null, arguments)) }
}

@MajorBreakfast
Author

I wondered the whole time why you first read in all the folders and then stat every file again to extract the things you hash. Well, now it's clear: you don't use node-walk-sync itself in broccoli, just something similar.
Could you move the actual tree hashing into an external repository, too? Right now I can't see very well how it's all interconnected. If you do that, I can build an async version for Windows.

@joliss
Owner

joliss commented Feb 19, 2014

Very interesting. I'll have to investigate. Out of bandwidth at the moment, but will get back to this. Thanks @MajorBreakfast for your benchmark repos!

@ming-codes

Perhaps it has to do with a slow hard drive?

I noticed walk-sync uses lstatSync to stat each file sequentially. If the disk is slow, that can potentially make a difference. If the stat is async, you leave it to the disk to optimize batched reads.

@evanplaice

Are you using an HDD or an SSD for the benchmarks? The way SSDs access data is fundamentally different from how HDDs access it.

4. NAND-flash pages and blocks
Cells are grouped into a grid, called a block, and blocks are grouped into planes. The smallest unit through which a block can be read or written is a page. Pages cannot be erased individually, only whole blocks can be erased. The size of a NAND-flash page can vary, and most drives have pages of size 2 KB, 4 KB, 8 KB or 16 KB. Most SSDs have blocks of 128 or 256 pages, which means that the size of a block can vary between 256 KB and 4 MB. For example, the Samsung SSD 840 EVO has blocks of size 2048 KB, and each block contains 256 pages of 8 KB each.

5. Reads are aligned on page size
It is not possible to read less than one page at once. One can of course request just one byte from the operating system, but a full page will be retrieved in the SSD, forcing a lot more data to be read than necessary.

21. A large single-threaded read is better than many small concurrent reads
Concurrent random reads cannot fully make use of the readahead mechanism. In addition, multiple Logical Block Addresses may end up on the same chip, not taking advantage of the internal parallelism. A large read operation will access sequential addresses and will therefore be able to use the readahead buffer if present, and use the internal parallelism. Consequently, if the use case allows it, it is better to issue a large read request.

Source:
http://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/

Following that reasoning, it would make sense that read performance would be better on an SSD when handled in a synchronous manner.

@stefanpenner
Collaborator

@evanplaice I did some spelunking when @MajorBreakfast pointed out the Unix/Windows difference. If memory serves, it seemed to boil down to how libuv handles async and sync on each platform. On Windows the async path is simply more optimal. When comparing, I recall the Unix async path looked like it could be further optimized. This was quite some time ago, and the state of libuv may have changed.

@stefanpenner
Collaborator

@joliss I think we can close this.

@joliss joliss closed this as completed Aug 8, 2015