Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Issue 14368 - rawRead performance #3127

Merged
merged 3 commits into from Apr 3, 2015

Conversation

charles-cooper
Copy link
Contributor

reduce performance gap between fread and rawRead
https://issues.dlang.org/show_bug.cgi?id=14368

Currently std.stdio rawRead is much slower than cstdio fread. Benchmark indicates that rawRead is roughly 2.5x slower than directly calling fread in a tight loop (not counting time in syscalls). The overhead comes from the compiler failing to inline calls to std.exception.enforce, calling errnoEnforce even when fread's return indicates success, and from buffer slicing overhead.

This patch fast paths the ordinary (no error) case, reducing the overhead to almost nil when compiled with gdc and about 1.35x when compiled with dmd.

@@ -725,7 +725,8 @@ $(D rawRead) always reads in binary mode on Windows.
{
import std.exception : enforce, errnoEnforce;

enforce(buffer.length, "rawRead must take a non-empty buffer");
if (!buffer.length)
enforce(false, "rawRead must take a non-empty buffer");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of calling enforce, you can just throw.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, will do. Any idea why enforce does not get inlined? The body just consists of if (!value) throw new Exception(msg)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dmd does not inline functions with lazy parameters. It's a big problem for enforce, been around forever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Would it make sense for enforce to be implemented as a mixin instead of a function with lazy parameters?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires using mixin at call site. e.g.:

mixin(enforce(cond, msg));

which is ugly and awkward.

The pity here is that in 99% of cases, the message passed to enforce does not involve any computation. The whole point of using the lazy parameter is to defer construction of the exception message until you have confirmed the condition is false. If the construction takes 0 time, then there is no point to defer. An inlineEnforce function would be a kludge, but may get the job done for now.

Personally, I find enforce quite useless. It's not difficult to do if(!cond) throw Exception(msg);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it's even more awkward than that, msg can't be a runtime string, it has to be a compile-time string that generates a string at runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

Or change enforce to take a strict parameter (and thus get auto inlined) and have a lazyEnforce which takes a lazy parameter for those cases where the benefit of laziness outweights the benefit of inlining.

Really though you're right, the compiler should be able to infer that there is no cost to strictly evaluating msg if it is immutable.

@schveiguy
Copy link
Member

Thanks, looks good to me.


enforce(buffer.length, "rawRead must take a non-empty buffer");
if (!buffer.length)
throw new Exception("rawRead must take a non-empty buffer");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this improving anything?

@andralex
Copy link
Member

Sorry, read the code before the comments. I need to see the code used for benchmarking. This looks really odd.

It seems to me we could get better improvements by eliminating a call to error() after each read.

@andralex
Copy link
Member

OK, I measured the code in the issue. Though the test code is contrived (hard to fine one who reads so little at a time yet cares about performance), it's worth improving things. Please eliminate fread_success (unnecessary and breaks naming convention) and let's pull this in. Thanks.

@schveiguy
Copy link
Member

It seems to me we could get better improvements by eliminating a call to error() after each read.

That is what this improvement does. If you read the documentation for fread, an error can only occur when the result of the call does not equal the requested read length. This code short circuits based on that principle. The only call to error should occur on a real error, or eof.

You may have already realized this, but it wasn't clear from your last note.

@charles-cooper
Copy link
Contributor Author

Sure, I can get rid of the once-used variable. But so far the original author of the code, two reviewers, and myself have initially been unclear about the semantics of fread's return value -- that ANY short read could be indicative of error, while buffer.length == freadResult indicates there was no error.

IMO having a one-off variable instead of a comment is self-documenting because a) it causes the maintainer to look twice, possibly checking man fread in the process and b) names the type of the result so it looks more like a crude form of pattern matching than an arbitrary check.

Taking into account the confusion so far, I propose that the below code is clearer because it communicates that slicing the buffer and checking errno makes semantic sense only when fread returns a short item count.

assert (freadResult <= buffer.length); // fread never returns result greater than nmemb
immutable possibleFailure = (freadResult != buffer.length);
if (possibleFailure)
{
    errnoEnforce(!error);
    auto safeSlice = buffer[0 .. freadResult]
    return safeSlice;
}
return buffer;

@schveiguy
Copy link
Member

MO having a one-off variable instead of a comment is self-documenting because a) it causes the maintainer to look twice, possibly checking man fread in the process and b) names the type of the result so it looks more like a crude form of pattern matching than an arbitrary check.

I disagree, the comment is sufficient. I'd rather just see a comment in there than give the compiler an excuse to waste cycles.

@charles-cooper
Copy link
Contributor Author

Hey, what is the status of this pull request? The most recent patch addresses the comments provided earlier.

@schveiguy
Copy link
Member

LGTM

FYI, updates to the branch do not get sent as a notification to email, so there's no way to know something like that changes unless you add a comment.

@schveiguy
Copy link
Member

Auto-merge toggled on

schveiguy added a commit that referenced this pull request Apr 3, 2015
Fix Issue 14368 - rawRead performance
@schveiguy schveiguy merged commit 4d30c1d into dlang:master Apr 3, 2015
@charles-cooper
Copy link
Contributor Author

I see -- Thank you!

@andralex
Copy link
Member

andralex commented Apr 3, 2015

@charles-cooper thanks for doing and arguing this work properly, and thx @schveiguy for reviewing and following up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants