
fix issue 15293 #3802

Merged: 2 commits merged into dlang:stable on Nov 13, 2015

Conversation

@aG0aep6G (Contributor)

ReadlnAppender tried to claim the capacity of the passed buffer, calling
assumeSafeAppend on the result so that on the next call it again has a
capacity that can be claimed.

The obvious problem with that: readln would stomp over memory that it has
not been given.

There was also a subtler problem with it (which caused issue 15293):
When readln wasn't called with the previous line, but with the original
buffer (byLine does that), then the passed buffer had no capacity, so
ReadlnAppender would not assumeSafeAppend when slicing the new line from
it. But without a new assumeSafeAppend, the last one would still be in
effect, possibly on a sub slice of the new line.

https://issues.dlang.org/show_bug.cgi?id=15293
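The stale-assumeSafeAppend hazard described above can be reproduced in a few lines. This is a standalone sketch, not the Phobos code; it only illustrates why an assumeSafeAppend left "in effect" on a sub-slice is dangerous:

```d
// Standalone illustration: after assumeSafeAppend on a shorter slice,
// appending through that slice writes in place, over memory that is
// still referenced by the longer slice.
void main()
{
    char[] line = "hello world".dup;
    char[] head = line[0 .. 5];   // "hello"
    head.assumeSafeAppend();      // claims everything past index 5
    head ~= '!';                  // in-place append: overwrites line[5]
    assert(line == "hello!world");
}
```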

@dlang-bot (Contributor)

Fix for Bugzilla issue 15293: [REG2.069.0] std.stdio.readln(buffer) messes up buffer's capacity

@schveiguy (Member)

ping @rainers
I don't think this is the right solution, because I don't agree with the unit tests, but I can't get the error to happen on my macOS system, so I'm having trouble diagnosing the issue.

@aG0aep6G (Author)

I can't get the error to happen on my macOS system, so I'm having trouble diagnosing the issue.

ReadlnAppender is only used in version (DIGITAL_MARS_STDIO) and version (MICROSOFT_STDIO). OS X uses the version (HAS_GETDELIM) variant.

if (buf.length >= pos + n) // buf is already large enough
    return;

if (buf.capacity >= pos + n)
Member:

This is going to kill performance. A call to capacity means lookup of the metadata in the GC. reserve(1) is called for every character added.

Member:

I think you can fix this by doing this instead:

auto curCap = buf.capacity;
if (curCap >= pos + n)
{
    buf.length = curCap;
    newBuf = true;
}

aG0aep6G (Author):

Setting buf.length to the capacity makes sense to me.

But why set newBuf = true? newBuf is supposed to be true only if the buffer has been allocated by ReadlnAppender. But a buffer from outside may have capacity, too.

@schveiguy (Member)

ReadlnAppender is only used in [Windows]

Ah, ok. I forgot that.

char[] nbuf = new char[ncap];
memcpy(nbuf.ptr, buf.ptr, pos);
cap = nbuf.capacity;
buf = nbuf.ptr[0 .. buf.length]; // remember initial length
Member:

This isn't right. The original code ignored buf.length and used cap. You are using buf.length for what cap did. So this essentially allocates no new capacity.

I think you want buf = nbuf

@schveiguy (Member)

I don't think this is the right solution

I'm having second thoughts about this opinion. I didn't realize that the other versions (the non-ReadlnAppender ones) don't care about the returned buffer's appending status.

I'm thinking that this update (after issues noted above) will be acceptable, because readln isn't meant for looping (and when it is, you can use the same mechanism as byLine, which should be high performance).

immutable curCap = buf.capacity;
if (curCap >= pos + n)
{
buf.length = curCap;
aG0aep6G (Author):

@schveiguy: If you think newBuf = true; should be here, please elaborate.

Member:

Sure.

I look at it this way:

  1. buf is not big enough to hold the data, so we must expand
  2. Since buf.capacity is non-zero, we know (or we can safely assume) that the data after buf is unused.
  3. When we expand buf to its capacity, it's likely we will not use all the data, so the extra data we claim here for expanded capacity will be garbage if we don't assumeSafeAppend.

So if you set newBuf = true, then assumeSafeAppend will be called at the end, and the data we claimed but didn't use is still available for appending. Effectively it's the same thing as your original implementation, but without the capacity check in the inner loop.
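The three steps above could be condensed into a sketch like this. It is hypothetical (the names pos, newBuf, reserve, and data mirror the discussion, not the merged Phobos code), but it shows claiming spare capacity in place on expansion and handing back the unused tail via assumeSafeAppend at the end:

```d
// Hypothetical condensation of the scheme discussed; not the Phobos code.
struct ReadlnAppenderSketch
{
    char[] buf;
    size_t pos;
    bool newBuf;   // set whenever we claimed capacity we may not fill

    void reserve(size_t n)
    {
        import core.stdc.string : memcpy;
        if (buf.length >= pos + n)        // already big enough, no GC query
            return;
        immutable curCap = buf.capacity;  // spare room after buf? (step 2)
        if (curCap >= pos + n)
        {
            buf.length = curCap;          // claim the whole block in place
            newBuf = true;                // release the tail later (step 3)
        }
        else
        {
            auto nbuf = new char[pos + n];
            if (pos)
                memcpy(nbuf.ptr, buf.ptr, pos);
            buf = nbuf;
            newBuf = true;
        }
    }

    char[] data()
    {
        auto line = buf[0 .. pos];
        if (newBuf)
            line.assumeSafeAppend();      // unused capacity stays appendable
        return line;
    }
}
```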

aG0aep6G (Author):

Ok, I think I got it. In this scenario, the slice that's returned from data is guaranteed to be longer than the originally passed buffer. So it's safe to say that any capacity we didn't use is free for all. Amending.

@aG0aep6G (Author)

Amended fixes for the noted issues.

@aG0aep6G (Author)

Amended the newBuf thing. Calling it safeAppend now, though. Thanks @schveiguy

@schveiguy (Member)

LGTM. I think this is actually going to end up making byLine faster too, because assumeSafeAppend/capacity are only called on expansion (they need to dig into the GC metadata, which can be expensive).

@rainers please review, this changes the memory usage semantics of readln, but I think for the better.

 assert(pos == 0);   // assume this is the only put call
-if (b.length > cap)
+if (b.length > max(buf.length, buf.capacity))
Member:

I think you should take advantage of the case b.length <= buf.length to avoid getting the capacity at all. This probably boils down to just calling putbuf unconditionally.

aG0aep6G (Author):

How about extracting a reserveWithoutAllocating from reserve that returns false instead of allocating a new buffer. And then change putonly to this:

    void putonly(char[] b)
    {
        assert(pos == 0);   // assume this is the only put call
        if (reserveWithoutAllocating(b.length))
            memcpy(buf.ptr + pos, b.ptr, b.length);
        else
            buf = b.dup;
        pos = b.length;
    }

No .capacity call when buf.length is sufficient, and at most one .capacity call otherwise. Tight allocation with putonly (not sure if this is important, but it seems to be the point of putonly), spacious allocation with putchar.
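A sketch of what such a reserveWithoutAllocating could look like (hypothetical, following the proposal above; the real function lives inside ReadlnAppender in std.stdio and its exact signature may differ):

```d
// Hypothetical sketch of reserveWithoutAllocating per the proposal:
// grow buf in place when possible; return false instead of allocating.
bool reserveWithoutAllocating(ref char[] buf, ref bool safeAppend,
                              size_t pos, size_t n)
{
    if (buf.length >= pos + n)       // cheap check first: no GC metadata lookup
        return true;
    immutable curCap = buf.capacity; // at most one .capacity call
    if (curCap >= pos + n)
    {
        buf.length = curCap;         // claim the block's spare room in place
        safeAppend = true;           // release the unused tail at the end
        return true;
    }
    return false;                    // caller allocates (e.g. buf = b.dup)
}
```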

aG0aep6G (Author):

Done that.

@rainers (Member)

rainers commented Nov 12, 2015

I agree the test case is valid. With a slightly more elaborate GC that reclaims memory "freed" by assumeSafeAppend, memory still referenced by other slices might get reused.

It took 8 months for #2794 to be merged, and that was a more obvious memory corruption. It wasn't me who insisted on performance over correctness, but performance was a concern to others, so you might want to check the benchmark given there to see the actual change in performance from this PR.

@aG0aep6G (Author)

you might want to check the benchmark given there to see the actual change in performance by this PR.

no optimization flags:
readln1: 129 ms -> 158 ms
readln2: 148 ms -> 124 ms
byLine: 178 ms -> 161 ms

-release -O -inline:
readln1: 94 ms -> 125 ms
readln2: 110 ms -> 90 ms
byLine: 87 ms -> 77 ms

Base is 2.069.0. Tested in wine, not proper Windows.

@rainers (Member)

rainers commented Nov 13, 2015

Here are some numbers from my Windows system

GIT head                                              this PR
win32:
readln1: 117 ms                                       readln1: 139 ms
readln2: 133 ms                                       readln2: 94 ms
byLine: 104 ms                                        byLine: 86 ms
2192 bytes used, 1048576 bytes pool mem               941040 bytes used, 1048576 bytes pool mem
GC summary:    1 MB,    1 GC    0 ms                  GC summary:    1 MB,   10 GC    4 ms

win64:
readln1: 333 ms                                       readln1: 397 ms
readln2: 341 ms                                       readln2: 335 ms
byLine: 320 ms                                        byLine: 329 ms
5776 bytes used, 1048576 bytes pool mem               970112 bytes used, 1048576 bytes pool mem
GC summary:    1 MB,    1 GC    0 ms                  GC summary:    1 MB,   27 GC    4 ms

win32mscoff:
readln1: 277 ms                                       readln1: 355 ms
readln2: 290 ms                                       readln2: 287 ms
byLine: 263 ms                                        byLine: 278 ms
5328 bytes used, 1048576 bytes pool mem               178128 bytes used, 1048576 bytes pool mem
GC summary:    1 MB,    1 GC    0 ms                  GC summary:    1 MB,   28 GC    6 ms

GC activity happens in the readln1 test case only. We can estimate the amount of actually allocated memory by multiplying the number of GC runs (minus 1 for the final run by the runtime) by the pool size (1 MB); for example, the win64 run's 27 GCs suggest roughly 26 MB allocated. My test file has a size of 14 MB.

@schveiguy (Member)

  1. I think the readln1 test is not optimal, and we are discouraging that method. Since byLine does something better, and we have documented that, I wouldn't put any stock in the lower performance there.
  2. The readln2 test can be made more optimal:
void testReadln2()
{
    auto f = File(filename);
    size_t n = 0;

    StopWatch sw;
    sw.start();
    char[200] storage;
    char[] buf = storage[];
    while (!f.eof())
    {
        char[] ln = buf;
        f.readln(ln);
        n += ln.length;
        if (ln.length > buf.length) buf = ln;  // keep the bigger buffer
    }
    auto dur = sw.peek();
    writefln("readln2: %d ms", dur.to!("msecs", int));
}

I agree with the changes made, still LGTM.

@schveiguy (Member)

but that was a concern to others

Note that the most obvious problem was byLine performance, which is now a moot point.

@rainers (Member)

rainers commented Nov 13, 2015

readln2 test can be made more optimal

Sure, but you have to know pretty much the internals of readln to do it. The test cases rather try the "obvious" approaches that were considered necessary to be fast.
As we have even added an example to the documentation showing how to get the performance of byLine, I don't object to merging this PR. The DigitalMars runtime version even benefits in the more common cases, while the Microsoft runtime version has never been optimized for performance anyway.

@schveiguy (Member)

Sure, but you have to know pretty much the internals of readln to do it.

Not really. If readln gives you a bigger buffer, then why wouldn't you prefer it over the stack-allocated smaller one?

Seems like we agree it's worth merging, so I'll do so.

@schveiguy (Member)

Auto-merge toggled on

schveiguy added a commit that referenced this pull request Nov 13, 2015
@schveiguy schveiguy merged commit fc77dbb into dlang:stable Nov 13, 2015
@aG0aep6G aG0aep6G deleted the 15293 branch January 17, 2016 22:28