
Conversation

WalterBright
Member

This adds a package std.buffer, and the first entry in that package, std.buffer.scopebuffer.

ScopeBuffer is an OutputRange that sits on the stack and overflows to malloc/free. It's designed to help eliminate GC allocation by lower-level functions that return buffers, such as std.path.buildPath(). With some judicious user tuning of the initial stack size, it can virtually eliminate storage allocation.

Using it is @system, but the user of ScopeBuffer can be @trusted.

An output range like this is a precursor to eliminating the excessive GC use by functions such as buildPath().
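The intended usage pattern can be sketched roughly as follows. This is illustrative only: the module path follows the package name proposed in this PR, and the member names (`put`, `free`, slicing, `length`) are assumed from the description above, not taken from the final code.

```d
import std.buffer.scopebuffer; // package name as proposed in this PR

@trusted const(char)[] joinPath(const(char)[] dir, const(char)[] name)
{
    char[128] store = void;              // stack storage; size is user-tunable
    auto buf = ScopeBuffer!char(store);  // spills to malloc only on overflow
    scope(exit) buf.free();              // deterministic release, no GC

    buf.put(dir);                        // OutputRange primitive
    buf.put('/');
    buf.put(name);
    return buf[0 .. buf.length].idup;    // copy out before free() runs
}
```

The caller picks the stack size (128 here) to cover the common case, so the malloc path is only taken for unusually long inputs.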

Member

IMO make it a unittest example.

Member Author

Added static assert for it.

Contributor

I see no reason why the example below should not be a documented unit test.

Member Author

done

@WalterBright
Member Author

newsgroup discussion: http://forum.dlang.org/post/ld2586$17f6$1@digitalmars.com

Contributor

Why is this block indented?

Contributor

In general, do we want to document contracts?

Member Author

Why is this block indented?

Because I thought it looked nicer.

In general, do we want to document contracts?

I don't find trivia to be helpful - in this case, any additional documentation would be trivia.

Contributor

Because I thought it looked nicer.

I have not seen that anywhere else. It's not consistent with the rest of Phobos.

Member Author

Indenting it is also consistent with how constraints are used in templates, and is analogous in that they apply to the function parameters.

Contributor

Indenting it is also consistent with how constraints are used in templates, and is analogous in that they apply to the function parameters.

It's not consistent with any other in-block. I would say that template constraints are put on the same line as the function declaration if the length of the line allows that. Otherwise they're put on a new line and indented, following the standard style guide for splitting up a single line.

@WalterBright
Member Author

can you replace uint with size_t?

The idea here was to get the entire struct into two registers in 64-bit code, which significantly improves performance. size_t is not necessary because it's hard to conceive of a stack-allocated temporary buffer larger than 4 GB. So there is a very good reason why it is uint.
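The size argument can be made concrete with a sketch of the layout (field names here are illustrative, not the actual ones):

```d
// With uint fields the struct is 16 bytes on 64-bit targets, so it can be
// passed and returned in two registers; size_t fields would make it 24 bytes.
struct Layout(T)
{
    T* buf;      // 8 bytes on 64-bit
    uint used;   // 4 bytes: current length
    uint buflen; // 4 bytes: capacity (4 GB is plenty for a stack temporary)
}
static assert(Layout!char.sizeof == 16);
```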

Collaborator

Please add

/// ditto
size_t opDollar()
{
    return i;
}

here. "/// ditto" is optional

Contributor

Any reason why you didn't suggest alias opDollar = length; here, as is usually done?

Collaborator

Because most ranges are not containers, and only have size_t length(). Containers, on the other hand, have void length(size_t desired). I think alias opDollar = length; would work correctly in this context, but it could have some weird edge cases... On the other hand, I am 100% confident that re-writing opDollar is correct 100% of the time.
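The two options under discussion, side by side on a toy range (illustrative code, not the actual ScopeBuffer):

```d
struct Toy
{
    int[] data;
    size_t i;

    size_t length() { return i; }

    // Option A: a hand-written forwarder, correct no matter what
    // overloads `length` later grows:
    //     size_t opDollar() { return i; }

    // Option B: the alias, fine as long as `length` is a plain getter:
    alias opDollar = length;

    int[] opSlice(size_t a, size_t b) { return data[a .. b]; }
}

unittest
{
    auto t = Toy([1, 2, 3, 4], 4);
    assert(t[1 .. $] == [2, 3, 4]); // $ lowers to t.opDollar()
}
```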

Contributor

I didn't spot the below overload, but opDollar being an overload set still works, so I think it would only affect code that messes with .opDollar explicitly.

Member Author

added the alias. If that turns out to not work, will fix.

@monarchdodra
Collaborator

I like this: it can work as a nice deterministic but high-performance alternative to Appender or Array. It has less "functionality" than both, so it isn't bogged down by "non-features". The fact that it owns its buffer at all times should give it a real edge performance-wise. Such an object was requested before as a "Real Appender" by @denis-sh:
http://d.puremagic.com/issues/show_bug.cgi?id=11138

That said, I am concerned by the thing's complete lack of support for types with postblit, elaborate assignment, or constructors. This is sad because ScopeBuffer completely owns its elements, so it is ideally placed to (correctly) elide postblits and correctly manage life cycles.
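To make the concern concrete, here is a sketch of the kind of element type that a raw memcpy-style buffer mishandles (a hypothetical type, not from the PR):

```d
struct Counted
{
    int* refs;
    this(this) { if (refs) ++*refs; }  // postblit: must run on every copy in
    ~this()    { if (refs) --*refs; }  // must run (or be elided) on cleanup
}
// A buffer that grows by memcpy'ing elements into a new malloc block, and
// frees its storage without destroying the elements, silently corrupts
// Counted's reference count.
```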

Contributor

If this had been a documented unit test, you'd find this doesn't compile.

Member Author

yeah, my bad.

@JakobOvrum
Contributor

This burden of proof is ridiculous. Enregistering is known to help a lot in tight loops when performance is an issue. Those may or may not be common depending on the domain etc., but it's bizarre to now require showing the bottom-line effect of achieving a common and useful optimization.

It's important in this case because it radically changes the interface. It goes from being safe and encapsulated to a leaky abstraction.

@andralex
Member

@JakobOvrum: it is important but the proof has been made.

@DmitryOlshansky
Member

This burden of proof is ridiculous. Enregistering is known to help a lot in tight loops when performance is an issue. Those may or may not be common depending on the domain etc., but it's bizarre to now require showing the bottom-line effect of achieving a common and useful optimization.

Seems to me that I'm in the minority that has to prove things, while others don't. If it's such a well-known fact, the proof must be trivial, isn't it? When PERFORMANCE in any form comes as an argument for DETRACTING from USABILITY, numbers and test cases are a MUST HAVE.

I feel like you're asking others to do your homework and check whether it's correct, because, of course, you know it all and doing it yourself has no point.

Anyway, here is the simplest I've come up with:
https://gist.github.com/blackwhale/9569368

If anything, I'm seeing that ~this is consistently faster. You may point out any failures in this snippet.
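The two cleanup styles being compared can be sketched like this (my own reconstruction under assumptions; the actual benchmark is in the gist linked above):

```d
import core.stdc.stdlib : free, malloc;

struct DtorVersion
{
    char* p;
    ~this() { free(p); }    // cleanup in a destructor: the struct is no
}                           // longer a POD, which can block enregistering

void manualVersion()
{
    auto p = cast(char*) malloc(64);
    scope(exit) free(p);    // same cleanup, struct stays dtor-free, but the
    // ... work with p ...  // burden of calling it shifts to every caller
}
```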

Some sample runs (compiled both files in one go with dmd -O -inline -release):

D:\D>dtor 25
22000000 time: 624us

D:\D>dtor 25
22000000 time: 621us

D:\D>dtor 25
22000000 time: 618us

D:\D>dtor 25
22000000 time: 618us

D:\D>dtor 45
39600000 time: 1137us

D:\D>dtor 45
39600000 time: 1132us

D:\D>dtor 45
39600000 time: 1135us

D:\D>manual 25
22000000 time: 646us

D:\D>manual 25
22000000 time: 647us

D:\D>manual 25
22000000 time: 654us

D:\D>manual 45
39600000 time: 1153us

D:\D>manual 45
39600000 time: 1161us

D:\D>manual 45
39600000 time: 1233us

@andralex
Member

Wait, https://gist.github.com/blackwhale/9569368 uses scope(exit), not a destructor. Big difference.

@DmitryOlshansky
Member

@andralex I change it back and forth.

@DmitryOlshansky
Member

@andralex

@blackwhale oic. At any rate, the need for proof can stop safely at enregistering.

Then of course, File should also avoid having a destructor, right? We'd be able to enregister that tiny struct! Less code and faster ;) But you'd agree that it makes no sense, precisely because the gains (if any) are minuscule compared to the risk of a resource leak.

I believe that anything that aims to be faster should have quantifiable benefits, be it a very specially crafted benchmark or some metric derived with a profiler or other tool.

@andralex
Member

I agree we should add regular benchmarking as a matter of procedure to the toolchain. That said, this particular part of the discussion about proving performance improvements has run its course. I worked with Walter on said project. Yes, there are performance gains. Yes, in some cases they are spectacular. It's a memory buffer, which is very core. It doesn't take much expertise to figure that once enregistering happens, a lot of good stuff comes with it.

I understand there are material objections to this module, so let's better focus on those. We can make it private to Phobos, move it to druntime, undocument it and build a safer abstraction on top of it, etc.

My own objection is that a very non-Phobosy module claims front and center stage position in std.buffer.scopebuffer. It should hang out in some internal/private/bits module.

@DmitryOlshansky
Member

I agree we should add regular benchmarking as a matter of procedure to the toolchain.

Great.

Yes, there are performance gains. Yes, in some cases they are spectacular. It's a memory buffer, which is very core.

I see that I failed to deliver the message. Said spectacular gains should be easily testable: just compile a version with the destructor vs scope(exit) and tell us the difference in run time; if anything, I'm curious.

Is this request that hard to accommodate? What's wrong with you guys? Why the constant appeal to authority and "doesn't take much expertise" instead of simple facts?

(Especially as these versions must be just different commits on the same source tree)

That said, this particular part of the discussion about proving performance improvements has run its course

Nice. Well, uh-oh.

My own objection is that a very non-Phobosy module claims front and center stage position in std.buffer.scopebuffer. It should hang out in some internal/private/bits module.

Something I can agree with.

@MartinNowak
Member

Before drifting away into detailed implementation discussions please help to clarify a few things. @Dicebot summarized my concern very nicely (#1911 (comment)).
While I agree that this is a nice tool for certain tasks, this pull is a prime example of another uncoordinated and incomplete piece.

  • What problem are you trying to solve?
  • How does this relate to OutBuffer and Appender?
  • How will this help to avoid allocations in phobos functions?

It's just that I don't know how I could ever explain this mess to someone when advertising D.
I think a std.buffer package for optimized output ranges is a good start.

@andralex
Member

Is this request that hard to accommodate? What's wrong with you guys? Why the constant appeal to authority and "doesn't take much expertise" instead of simple facts?

No need to get agitated. The fact of the matter is that most everybody in our group, probably including yourself, has made changes to code that they argued improved performance, and there was seldom a request for (or providing of) hard proof.

The thing is, asking for hard proof without a systematic benchmarking framework is a tall order. One would need to build a synthetic benchmark that exercises code generation in ways similar to the real application, without pulling in a significant fraction of it. Once that's done, the few people on this review would be like, "mmkay, fine", and all that work goes to nothing.

All of that would change if we did have a systematic benchmarking framework. I'd say it's very productive to champion one and use this discussion as part of the motivating example. Asking for exhaustive proof here is, in my opinion, a bit much.

I think concerns along the lines of @MartinNowak are the ones we need to address here.

@DmitryOlshansky
Member

I think I should probably refrain from commenting on this, so these are my final remarks on the subject of low-level optimization at all costs.

The fact of the matter is that most everybody in our group, probably including yourself, has made changes to code that they argued improved performance, and there was seldom a request for (or providing of) hard proof.

We did, and I always ask for it. How informal the reported numbers are varies wildly from pull to pull, but none I recall have come through solely on being theoretically solid.

One would need to build a synthetic benchmark that exercises code generation in ways similar to the real application, without pulling in a significant fraction of it.

Since you reported spectacular gains, obviously they are easy to observe; else how would you confirm them in the first place? Since the application in question obviously was benchmarked on some real work, just use it as an indicator; that is all I asked. No exhaustive proof required, just tell us what gains you got (privately, on your own project).

It's just that I suspect (no proof, but test runs with my tiny snippet suggest) that the gains of having it w/o destructor are immeasurable.

Once that's done, the few people on this review would be like, "mmkay, fine", and all that work goes to nothing.

Which indeed happens, and no amount of benchmarking harness frees us from identifying the common use case we optimize for; that is what a benchmark is, after all.

@andralex
Member

Since you reported spectacular gains, obviously they are easy to observe; else how would you confirm them in the first place? Since the application in question obviously was benchmarked on some real work, just use it as an indicator; that is all I asked. No exhaustive proof required, just tell us what gains you got (privately, on your own project).

It's just that I suspect (no proof, but test runs with my tiny snippet suggest) that the gains of having it w/o destructor are immeasurable.

I don't understand this. Is it that Walter is lying if he doesn't tell you numbers? You refuse to take his word? He needs to sit down now, change code, and collect numbers to show you things? What's this putting the hounds on someone all of a sudden, whereas in all other cases it's been like "yeah, fine, merge"?

@andralex
Member

FWIW the project will be open sourced soon and available for scrutiny. I still think this obstinate asking for evidence is not a proportionate response.

@ghost

ghost commented Mar 15, 2014

OT: What project are you and Walter working on? Was it some kind of collaboration? Very interesting!

@DmitryOlshansky
Member

I don't understand this. Is it that Walter is lying if he doesn't tell you numbers? You refuse to take his word?

I trust the machinery was done right and the code generated must be looking awesome. I don't easily trust that sacrificing usability was justified.

Putting things into perspective: I started this rant because explicitly removing the destructor from a publicly advertised primitive "that saves Phobos from GC leaks" is, ehm, in need of a good reason.

Right now D has a large problem with Phobos leaking memory like a ship made out of a cheese grater. This problem is definitely putting people off from using D (rightly or wrongly). We must address it. ScopeBuffer is the answer to a lot of that, while delivering the best performance we can offer as well.

To put it simply, I don't know yet how much was gained in practice by trading away the destructor, and I failed to observe the gains myself. I do understand the significant loss in usability, however. I thought ScopeBuffer would be a different primitive with wider scope ;) and simpler usage for generic code, but I understand I can't affect that.

Now that it is core.internal.scopebuffer, much of my original motivation for getting the justification has evaporated. I really don't care about the interface or usability of it any more.

He needs to sit down now and change code and collect numbers that show you things?

So you don't trust that I naturally believed he did something like that before disabling the destructor? Well, alternatively he could have solely observed ASM listings and focused on the generated code. Telling us just that would help me understand things.

@WalterBright
Member Author

What problem are you trying to solve?

Using the stack for temporary buffers rather than storage allocation. Avoid generating garbage. Highest possible speed at doing things like string processing.

How does this relate to OutBuffer and Appender?

They're too slow.

How will this help to avoid allocations in phobos functions?

Many Phobos functions internally use GC allocations for temporary buffers, and then rely on a GC sweep to clean them up. These need to be replaced with ScopeBuffers as much as possible. std.file is a prime example.
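The replacement pattern described here, sketched on a hypothetical Phobos internal (the function, names, and sizes are illustrative; the module path reflects where ScopeBuffer ended up):

```d
import std.internal.scopebuffer;

// before: char[] tmp; foreach (p; pieces) tmp ~= p;  // GC garbage per call
const(char)[] buildTemp(const(char)[][] pieces)
{
    char[256] store = void;             // covers the common case on the stack
    auto tmp = ScopeBuffer!char(store);
    scope(exit) tmp.free();
    foreach (p; pieces)
        tmp.put(p);
    return tmp[0 .. tmp.length].idup;   // only this final copy touches the GC
}
```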

@andralex
Member

added to druntime

@andralex andralex closed this Mar 16, 2014
@WalterBright
Member Author

Now moved to std.internal.scopebuffer

@WalterBright WalterBright reopened this Mar 16, 2014
win64.mak Outdated
Member

please remove this and make sure you test

@andralex
Member

Auto-merge toggled on

andralex added a commit that referenced this pull request Mar 17, 2014
@andralex andralex merged commit 60e3c54 into dlang:master Mar 17, 2014
@monarchdodra
Collaborator

This is missing the void put(const(T)[] s)/ScopeBuffer!(int*) fix from druntime. Could someone write the patch?
