-
-
Notifications
You must be signed in to change notification settings - Fork 740
use ScopeBuffer in std.file #2014
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
auto h = CreateFileW(std.utf.toUTF16z(name), defaults); | ||
|
||
wchar[64] tmpbuf = void; | ||
auto sz = ScopeBuffer!wchar(tmpbuf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does ScopeBuffer
need an explicit template parameter? This looks to me like we could use a helper function with IFTI, e.g.:
auto sz = tmpbuf.scopeBuffer();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually if it can't even be a function (due to enregistering?) we could also use pass-by-alias:
auto sz = ScopeBuffer!tmpbuf;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in the ScopeBuffer PR thread, putting the buffer as an alias parameter would cause a new ScopeBuffer to be instantiated for every use, rather than sharing a single instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kinda funny. IFTI works fine. Just use scopeBuffer factory function that is already present. This should have worked:
auto sz = tmpbuf.scopeBuffer();
If anything I'd mark explicit constructor as private as it simply adds more boilerplate in user code over 1-line helper function.
The code is a bit more verbose than I'd like, but I think it's fine in a low level IO lib, so worries there. The Or, if you judge the performance hit does not justify it, please at least leave in a comment about it in the code, for future reviewers (EG: Also, |
/* ******* | ||
* Reads from range t and writes as wchar's to s. Appends a 0. | ||
*/ | ||
private wchar* toUTFWz(T, S)(ref T t, ref S s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a good opportunity to enhance our existing toUTFz
from std.utf
to take an optional buffer (ok we can do this later).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into doing it, and it'll take some careful work. It can be deferred for the moment.
I don't like this, the number of lines is exploding 5x over. |
Explicit memory management takes more lines of code than automatic. |
@monarchdodra I had overlooked that toUTFWz can throw, which would screw things up. I tried to make it nothrow, then found that the foreach can throw. Looks like I'll need to revise the code. In doing this, I also discovered that stat() and friends were incorrectly considered throwable. Here's a PR to fix that: dlang/druntime#742 |
Of course, the 64 is completely arbitrary. Considerations are:
|
from.toUTFWz(sz); | ||
auto fromlen = sz.length; | ||
auto p = to.toUTFWz(sz); | ||
scope(exit) sz.free(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yah, missed that one, and the other one below.
foreach (wchar w; t) | ||
s.put(w); | ||
s.put(0); | ||
return s[].ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a rationale for this return? It's not generic at all, and it also makes very little sense in the context of this function: Why return a pointer to the beginning to the buffer?
toUTFWz
should have no knowledge of how the output range S is implemented (or if it even has []
at all).
Why not return just void?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. I see how you are using it now. Though I can't say I completely agree with it. Why not return s
by reference. Then, you could use:
SetFileAttributesW(name.toUTFWz(sz)[].ptr, attributes);
//or, with .ptr as a member property:
SetFileAttributesW(name.toUTFWz(sz).ptr, attributes);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it obvious that all improvements you propose end up worse? toUTFz converts to a zero-terminated UTF string, and that's what it returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toUTFz converts to a zero-terminated UTF string, and that's what it returns.
Actually, no, that's not what it returns. That's the point I'm making.
Isn't it obvious that all improvements you propose end up worse?
:/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toUTFz converts to a zero-terminated UTF string, and that's what it returns.
Actually, no, that's not what it returns. That's the point I'm making.
Wait, then I'm missing something. Doesn't it return a pointer to the beginning of the converter string?
Isn't it obvious that all improvements you propose end up worse?
:/
What I see is that you take the expression in the return statement and sprinkle it in all caller code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, then I'm missing something. Doesn't it return a pointer to the beginning of the converter string?
It returns a pointer to the beginning of the buffer into which the string is converted, not at the begining of the converted string.
from.toUTFWz(sz);
auto fromlen = sz.length;
auto p = to.toUTFWz(sz); // points to 0 terminated from[]
Why would to.toUTFWz(sz)
point to from[]
:s ???
Also, It makes the un-written assumption that the ouput range S is of type ScopeBuffer
. IT wouldn't compile with Appender
, for example.
As I said when I finished reviewing, this is "nit" material, so we don't have to blow it out of proportions here. I'm OK if we keep it that way. I'm just saying that if we want to have a public toUTFWz
that takes an output range, that design won't fly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks for the explanation. I'm also fine with it either way.
This looks good to go to me. I saw nothing that actually needed addressing. Will leave open 24h, and then toggle the auto-merge if no issues are raised by then. |
This makes it painfully obvious that we need a ScopeBuffer with a destructor. The ScopeBuffer idiom is highly repetitive. Just looking at the code in this diff and contemplating the spectrum of plopping it in a lot of places clarifies that making the idiom shorter is a big deal. We should, I think, rename struct ScopeBuffer
{
private ScopeBufferImpl impl;
alias impl this;
~this() { impl.free(); }
} and use it wherever we need automatic destruction. |
@andralex Long live the destructor! ;) |
@andralex, resurrecting this discussion: IIRC Walter's reasoning behind not giving ScopeBuffer a destructor is that it then would no longer be POD (in the ABI sense), forcing it to be passed on the stack. This seems to have been a performance bottleneck in one of Walter's applications. I agree that the manual lifetime management is a big downside of the design, though, avoidable or not. |
Would be nice to go ahead with @andralex suggestion and simplify repetitive code. Nothing is ever passed to functions here. |
So I'll be the one who have to say: I's unacceptable ugly and dangerous. Current approach: res = c_func(dStr.toStringz()); It's not just slow, it is also invalid as there is no guarantee Proposed approach: // code before
c_func_res res;
{
char[64] tmpbuf = void;
auto sz = ScopeBuffer!char(tmpbuf);
scope(exit) sz.free();
res = c_func(dStr.toUTFCz(sz);)
}
// code after Without enclosing additional scope For a long time I'm proposing this approach: res = c_func(dStr.tempCString()); But never ever had any response from you guys whether or not it is usable in Phobos. Can anybody explain why this short and almost safe (except really incorrect usage, see docs) proposal is ignored? |
Formalized my proposal in pull #2332. |
I agree with destructor part being a blocker. |
this can be closed, as far as I see it. reason #2332 |
Does #2332 include similar update to std.file? I don't think so. |
@Dicebot It seems to do, grep the diff for 'std/file' - 140+ LOCs |
Yes. That pull replaced all |
Ah, ok, sorry, missed that. |
This is an initial run at eliminating the generation of GC allocated garbage in std.file by replacing calls to toStringz and toUTF16z with use of ScopeBuffer.
Does not change the user facing API at all.