Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs: improve error performance for readSync #50033

Merged
merged 5 commits into from
Oct 21, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 57 additions & 0 deletions benchmark/fs/bench-readSync.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
'use strict';

const common = require('../common');
const fs = require('fs');
const tmpdir = require('../../test/common/tmpdir');
tmpdir.refresh();

const tmpfile = tmpdir.resolve(`.existing-file-${process.pid}`);
fs.writeFileSync(tmpfile, 'this-is-for-a-benchmark', 'utf8');

const bench = common.createBenchmark(main, {
type: ['existing', 'non-existing'],
paramType: ['offset-and-length', 'no-offset-and-length'],
n: [1e4],
});

function main({ n, type, paramType }) {
let fd;

switch (type) {
case 'existing':
fd = fs.openSync(tmpfile, 'r', 0o666);
break;
case 'non-existing':
fd = 1 << 30;
break;
default:
new Error('Invalid type');
}

bench.start();
switch (paramType) {
case 'offset-and-length':
for (let i = 0; i < n; i++) {
try {
fs.readSync(fd, Buffer.alloc(1), 0, 1, 0);
Copy link
Member

@joyeecheung joyeecheung Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the buffer should be reused, as what's normally done in the wild. Also "reading a file from offset 1 many many times, one character at a time" do not seem to be a very typical use case. It would be better to tweak the benchmark so that it reads a big file in a reused buffer (with a length of fs.statSync(file).blksize) and then read without offset until the file ends, which is the most common way to use this API. We probably do not need an additional case for the offset as reading a file repeatedly from the same offset is probably not common enough in the wild to optimize for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your good comments.
I tried modifying it, but I'm not sure if it worked as you said.
Would it be better to remove paramType and test it by passing only buffer?

} catch {
// Continue regardless of error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this fail if type is existing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't happen.
I tested only existing separately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't happen.
I tested only existing separately.

Do you mean you only run this benchmark with existing? But this would still be part of the benchmark when e.g. the entire fs benchmark is run. So we should still avoid invalid/redundant combinations to prevent the benchmark from spending too much time on those and increasing the total runtime of the benchmark.

For now it seems the offset-and-length case is redundant, the no-offset-and-length case is already enough. The ultimate difference between 'offset-and-length' and 'no-offset-and-length' actually comes down to the position parameter - both actually would have the same offset and length after argument handling. But 'offset-and-length' would have the position set to 0 while the 'no-offset-and-length' would use the default position null (which means it will continue to read from the current position). I don't think this difference is really worth a separate benchmark because by testing them both we are only going to see how the underlying file system and the actual disk perform in difference cases after we remove what's in Node.js's control, but that isn't really something we need to care about. I think we can just keep the offset-and-length one and remove no-offset-and-length.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark is performed on non-existing and existing.
What I meant was that I separately confirmed that existing did not fail.
no-offset-and-length was removed.

}
}
break;
case 'no-offset-and-length':
for (let i = 0; i < n; i++) {
Copy link
Member

@joyeecheung joyeecheung Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here either n needs to be equal or less than tmpfile.len/buffer.length, or tmpfile.len should be buffer.length * n, otherwise we'd keep reading the EOF after we get there and this depends a lot on how the file system is implemented, so we would get very different numbers from different OS/disks. It would be more appropriate to make the file length instead of n a configuration of the benchmark.

Copy link
Contributor Author

@pluris pluris Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks to you, it helped me a lot in understanding the API. :)
If you replace n with file length, it looks like buffersize should be 1.
As you said, how about keeping n, setting the appropriate buffer length value (ex: 1024), and set the file size to buffer.length * n?

try {
fs.readSync(fd, Buffer.alloc(1));
} catch {
// Continue regardless of error.
}
}
break;
default:
new Error('Invalid type');
}
bench.end(n);

if (fd === 'existing') fs.closeSync(fd);
anonrig marked this conversation as resolved.
Show resolved Hide resolved
}
6 changes: 1 addition & 5 deletions lib/fs.js
Original file line number Diff line number Diff line change
Expand Up @@ -751,11 +751,7 @@ function readSync(fd, buffer, offsetOrOptions, length, position) {
validatePosition(position, 'position', length);
}

const ctx = {};
const result = binding.read(fd, buffer, offset, length, position,
undefined, ctx);
handleErrorFromBinding(ctx);
return result;
return binding.read(fd, buffer, offset, length, position);
}

/**
Expand Down
14 changes: 7 additions & 7 deletions src/node_file.cc
Original file line number Diff line number Diff line change
Expand Up @@ -2364,17 +2364,17 @@ static void Read(const FunctionCallbackInfo<Value>& args) {
char* buf = buffer_data + off;
uv_buf_t uvbuf = uv_buf_init(buf, len);

FSReqBase* req_wrap_async = GetReqWrap(args, 5);
if (req_wrap_async != nullptr) { // read(fd, buffer, offset, len, pos, req)
if (argc > 5) { // read(fd, buffer, offset, len, pos, req)
FSReqBase* req_wrap_async = GetReqWrap(args, 5);
CHECK_NOT_NULL(req_wrap_async);
FS_ASYNC_TRACE_BEGIN0(UV_FS_READ, req_wrap_async)
AsyncCall(env, req_wrap_async, args, "read", UTF8, AfterInteger,
uv_fs_read, fd, &uvbuf, 1, pos);
} else { // read(fd, buffer, offset, len, pos, undefined, ctx)
CHECK_EQ(argc, 7);
FSReqWrapSync req_wrap_sync;
} else { // read(fd, buffer, offset, len, pos)
FSReqWrapSync req_wrap_sync("read");
FS_SYNC_TRACE_BEGIN(read);
const int bytesRead = SyncCall(env, args[6], &req_wrap_sync, "read",
uv_fs_read, fd, &uvbuf, 1, pos);
const int bytesRead = SyncCallAndThrowOnError(
anonrig marked this conversation as resolved.
Show resolved Hide resolved
env, &req_wrap_sync, uv_fs_read, fd, &uvbuf, 1, pos);
FS_SYNC_TRACE_END(read, "bytesRead", bytesRead);
args.GetReturnValue().Set(bytesRead);
}
Expand Down
1 change: 1 addition & 0 deletions typings/internalBinding/fs.d.ts
Original file line number Diff line number Diff line change
Expand Up @@ -157,6 +157,7 @@ declare namespace InternalFSBinding {
function read(fd: number, buffer: ArrayBufferView, offset: number, length: number, position: number, req: FSReqCallback<number>): void;
function read(fd: number, buffer: ArrayBufferView, offset: number, length: number, position: number, req: undefined, ctx: FSSyncContext): number;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function read(fd: number, buffer: ArrayBufferView, offset: number, length: number, position: number, req: undefined, ctx: FSSyncContext): number;

function read(fd: number, buffer: ArrayBufferView, offset: number, length: number, position: number, usePromises: typeof kUsePromises): Promise<number>;
function read(fd: number, buffer: ArrayBufferView, offset: number, length: number, position: number): number;

function readBuffers(fd: number, buffers: ArrayBufferView[], position: number, req: FSReqCallback<number>): void;
function readBuffers(fd: number, buffers: ArrayBufferView[], position: number, req: undefined, ctx: FSSyncContext): number;
Expand Down
Loading