-
-
Notifications
You must be signed in to change notification settings - Fork 415
add scopebuffer #739
add scopebuffer #739
Conversation
I'll only leave this single message, since I've said plenty in the other thread. I think this is a mistake. It's basically sweeping something under the rug before it even appears, because the design is dangerous. I really think this would make a brilliant addition to phobos in I feel by now that if I haven't been heard (or have been heard, but failed to convince), then there's nothing more I can add. I'll just be quiet now and let the others have their say. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a pity that we have to compromise safety for performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but D has @System for good reason.
I concur with @monarchdodra, I've repeatedly asked for numbers that would back the hackish decisions to no avail. There is a chance for good generic primitive that does all of presented for any type and doesn't cut fingers of those who use it. It is then usable far beyond As this pull stands it doesn't have the chance of being a publicly usable primitive, IMO. |
src/core/internal/scopebuffer.d
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this import seems to be incorrect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
I'd love you to elaborate a bit more on this because I think placing the module in For my money, I made a thorough pass through the module just now. Walter has nicely improved it following the meaningful reviews gotten. I can see how it can be immediately used as is, as well as backend for other work. It's a simple, primitive, "no room left below" artifact that can be used in a variety of nicer abstractions. It did look out of place in the middle of phobos like a dirty greasy rivet among a neat Lego set, but here it fits quite snugly. What is the core objection to this? |
Are there any plans to actually use this anywhere specific in druntime? Otherwise I don't see why it should go in core.internal. |
Question: Why is it important that Same with why it has to be a POD type and not have a destructor. |
Please understand that my main gripe is not what it is, but what it could be. Yes, Walter created something that gives mind shattering speeds. I do think this is awesome, and could probably help a ton of projects where even the tiniest of speed increases is of the essence. The thing is that because of this, it is barely usable for anything other than concatenating built-in types (chars, bytes, ints). Even using it with pointers/slices is tricky business (do they point the GC?). Walter basically hailed the thing as "the savior of Phobos", when in reality, all it'll do is maybe help the path building functions (which is still good yes) :/ What I think though is that with the tiniest of consessions, his At this point, If Walter really thinks this mind blowing speed justifies these concessions, then fine. He's probably right anyways. Just don't expect wide-spread use in Phobos. On the other-hand, I don't think it will prevent the introduction of a similar, say And that's what I want. An eye-blindingly fast replacement for Appender, which |
At Andrei's suggestion, I also added a realloc template parameter so the user can choose how to allocate storage. |
@CyberShadow it needs to be POD because otherwise it will not be enregistered. Since it will be used in tight loops, getting it in registers is everything. Squeezing it into two registers leaves the max number of registers available for other variables in those tight loops. Although doing this must be done with care, I've used it by passing/returning ScopeBuffers from functions, so I still want them copyable. Also, when it fits in 2 registers, the call/return sequence is done in registers rather than using hidden pointer arguments. Essentially, it's designed to fit in with how the 64 bit ABI is specified. |
OK, that makes sense when the ScopeBuffer is used in the same function declaring it.
But it's currently marked as non-copyable ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert(lower <= upper && upper <= bufLen);
I know it's assertive code, but if this is meant to be used in a tight loop, we might as well not gratuitously slow it down in non-release.
Eh, I should remove that. |
Please remember that you marked it as such because it has both value and reference semantics. If you pass it around by value, and then the passed value gets modified, the original object will spontaneously corrupt (it may point to data that was removed). If Or at the very least, simply document that no more than 1 |
@monarchdodra the idea is that one can pass/return it to a function, temporarily transferring ownership, as in:
The call/return by register will be faster than by pointer. |
So like C++11's |
Yes, D has move semantics, and you can force it with std.algorithm.move in the rare case that it cannot be automatic. |
Why not put your ideas into practice and submit a PR for it? |
src/core/internal/scopebuffer.d
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete
awwight, who broke the autotester? |
I'll look forward to seeing it. Keep in mind that various features and the way they are used interact in strange ways and can have unexpected consequences in the code generated. It pays off to constantly look at the generated code while developing performance oriented code. And when I say "pays off", factors of 2 are not uncommon. |
I wonder why everyone is so obsessed with stdlib malloc/free. IMO the problem to solve is not the very unlikely collection that might be triggered during an allocation of a temporary buffer, but deterministic removal of that data because you know it didn't escape. GC.free offers just that, too. I've made a few measurements comparing the stdlib version with the GC using the example provided by blackwhale, with an argument of 250, which causes one allocation and one reallocation per iteration. On Win32, the GC version takes only 75% of the time of the stdlib version! There are some easy optimizations to the GC version which could make it par on Win64 aswell (removing the GC proxy, making the GC implementation nothrow, disabling setting memory to zero in malloc. etc). Maybe someone can make a similar measurement for linux... |
@WalterBright: I haven't been following this discussion so far, but are the performance results that drove the engineering decisions behind this available somewhere? It would be a shame for people wanting to compare a "safe" implementation to this to have to re-do the benchmarks all over again, especially as the detailed profile of your use case is not known. |
@rainers there are some C library functions that people have obsessed over making fast for 30 years. malloc/free is one, memcpy is another. Why not take advantage of that? I long ago gave up trying to compete with the C memcpy implementations. You're right that the malloc speed is less of an issue - ScopeBuffer is designed to try and minimize its use. The larger issue is the loop performance of adding items to the array. Besides, with the current iteration, the user can decide which allocator he wants to use. And, of course, there are the users for whom 'GC' is a trigger word for "I won't use it". By using malloc/free instead where the lifetime is known caters to that. |
@klickverbot as is normal for working with clients, the work is under NDA. I would have thought that it's non-controversial that getting variables into registers means they run faster. It doesn't really matter how much faster it is, just that it is faster. Individual results will also vary considerably based on which D compiler you use, which CPU, the data set, the surrounding code, etc. As for why making Phobos implementations run as fast as possible: users of libraries do not care in the slightest what is under the hood - they only care about what they see, which is the API, the size, and the speed. Smaller size and faster speed is always better. Getting the runtime library to be fast has high leverage in that it doesn't just speed up an app, it speeds up every app. It's second in leverage only to better compiler optimizations. I'd bet that hardly 1 programmer in 100,000 has ever even looked at the source code for their system's malloc/free/memcpy, and fewer than that would make much sense out of it. ScopeBuffer is a peach compared to those gears & wires :-) For another example, Andrei and I have worried about the performance implications of using ranges and algorithms, versus hand coding. If R&A turned out to be slower than hand coding, then R&A becomes a hard sell. D needs R&A to be zero cost abstractions. To my great pleasure, I have discovered that they really are zero cost abstractions, if (and only if) the person designing the range or algorithm takes care in their implementation. |
I'll pull this now. It won't appear in the documentation and will be strictly for internal use. After a few iterations on use, we can decide its future direction. |
Auto-merge toggled on |
Auto-merge toggled off |
Sorry, I agree with the pull, but let's not jump the gun. There still very useful review about the code coming in. Is this so urgent it can't even wait a day or two's worth of review? I mean, I made a comment where this doesn't even compile: ScopeBuffer!(int*) b;
int*[] s;
b.put(s); //Error: cannot implicitly convert expression (s[]) of type const(int*)[] to int*[] Let's at least give Walter the time to react to what was said in the actual review... |
Corrections. It doesn't even instanciate!. Just this: |
nice - @WalterBright could you please fix those. |
But I have posted innumerable replies here and in the Phobos thread on it! ScopeBuffer has been reviewed for a couple months now. I'll address the instantiation issue you discovered shortly. |
Also, where in druntime will this be used? |
I haven't looked. There are many places in Phobos where it can be fruitfully used, such as std.file. |
@monarchdodra fixed. |
As for why time is of the essence, there are a lot of much needed changes to Phobos that cannot be done without ScopeBuffer. I'd like to get these into the next release, so we can show major progress towards satisfying users who want to use Phobos with much less reliance on the GC. |
Why should it be in druntime if there are no use-cases for it there? (I don't doubt that there might be some, but as druntime is supposed to be somewhat minimal…) There is no reason why it couldn't go into an internal Phobos module, as far as I can see. |
As this is now for the internal package and apparently we're in a hurry, I no longer have objections to this even with the unsafe interface. We can always change things later more or less freely. As for right now, I still question whether it should be in druntime's or Phobos' internal package - Phobos use-cases have been mentioned, but druntime use-cases haven't. Since it can be moved anyway, maybe we should conservatively place it in Phobos until druntime use-cases are proven. |
This was discussed in dlang/phobos#1911, but the summary is people thought it was too dirty/hackish/unsafe for Phobos. I don't know if there are use cases in druntime or not, since I haven't looked. |
I'm really talking about the specific comments made in the code review. I said I was OK with the pull. Just that even then, it's standard procedure for something new/complex to be left open for a bit more than 2 days, to give anyone a chance to catch something.
I missed that, and I appologize. But I still think it's fair to give it an opportunity for a last review. BTW: @MartinNowak and I would be curious to have your reply regarding a variadic |
@WalterBright: As far as I can see, people objected to it becoming public API there, i.e. Also note that Phobos accessing a druntime-internal module would be a bit sketchy. We have been working hard to make Phobos "just another library", after all. |
Sorry, I had missed that. My thoughts are that although I used ScopeBuffer extensively, there was no use case for that. I tend to eschew adding things until there is a noticeable need for it, because it's much harder to take away unneeded features than it is to add them. I don't think it matters which internal package scopebuffer is in, after all, it is an internal package and users don't care about internal implementations. Phobos is dependent on druntime, and nothing about scopebuffer being in druntime changes that. |
This underwent significant scrutiny already, a whole lot more than a bunch of more significant work. It's gotten to the point it's unproductive to continue the stream of nits. I'm seeing potential use in druntime in core.sys.windows.generateSearchPath (far as I can tell it's incorrect at the moment), core.runtime.loadLibrary, and probably a few others, but neither has high impact. It is true there is no precedent of phobos using druntime internal code. So then let's move this to |
Yep, right decision. I wouldn't know of any performance sensitive code in druntime that would benefit from this. |
This is an alternative to dlang/phobos#1911 , adding it to druntime instead.