New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
4926314: Optimize Reader.read(CharBuffer) #1915
Changes from 5 commits
967b314
a88cd93
8d40558
d247b63
a8531c1
f2390ec
2915e13
275fd48
08948f9
8875c5d
fc29f3e
6886954
6dac38d
5fa832b
c4c859e
bbc4274
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
/* | ||
* Copyright (c) 1996, 2020, Oracle and/or its affiliates. All rights reserved. | ||
* Copyright (c) 1996, 2021, Oracle and/or its affiliates. All rights reserved. | ||
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. | ||
* | ||
* This code is free software; you can redistribute it and/or modify it | ||
|
@@ -184,12 +184,33 @@ protected Reader(Object lock) { | |
* @since 1.5 | ||
*/ | ||
public int read(CharBuffer target) throws IOException { | ||
int len = target.remaining(); | ||
char[] cbuf = new char[len]; | ||
int n = read(cbuf, 0, len); | ||
if (n > 0) | ||
target.put(cbuf, 0, n); | ||
return n; | ||
int nread; | ||
if (target.hasArray()) { | ||
char cbuf[] = target.array(); | ||
int off = target.arrayOffset() + target.position(); | ||
int len = target.remaining(); | ||
nread = this.read(cbuf, off, len); | ||
if (nread > 0) | ||
target.position(target.position() + nread); | ||
} else { | ||
int remaining = target.remaining(); | ||
char cbuf[] = new char[Math.min(remaining, TRANSFER_BUFFER_SIZE)]; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That would be possible. It would help in cases where a large Reader is read into one or several relatively small off-heap CharBuffers, requiring multiple #read calls. This can only be done when the caller is able to work with only a partial input. I don't know how common this case is. We could re-purpose #skipBuffer, it has the same maximum size (8192) but determined by a different constant (#maxSkipBufferSize instead of #TRANSFER_BUFFER_SIZE). That would likely require it to be renamed and maybe we should even remove #maxSkipBufferSize. We could also do the reallocation and growing similar as is currently done in #skip. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps a static final Lastly you mentioned C-style array declarations like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. I left #transferTo(Writer) untouched for now. Firstly it is not already behind a synchronized. Secondly it writes so there is no need for repeated calls. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would there be value in making this a (lazily created) member variable? That would allow a single instance to be reused. It seems likely that, if one call is made with a direct CharBuffer, subsequent calls will also be made with direct instances (probably same instance?). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure. It would help to reduce the allocation rate when reading a large amount of data into a small direct CharBuffer. I don't know how common that is. We would introduce an instance variable for one path in one method. |
||
nread = 0; | ||
synchronized (lock) { | ||
int n; | ||
do { | ||
// read to EOF which may read more or less than buffer size | ||
if ((n = read(cbuf, 0, Math.min(remaining, cbuf.length))) > 0) { | ||
target.put(cbuf, 0, n); | ||
nread += n; | ||
remaining -= n; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wouldn't there be a possibility for target.put(cbuf, 0, n) to throw BufferOverflowException ?
You have to limit the amount read in each iteration to be Math.min(remaining, cbuf.length) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're correct. I need to expand the unit tests. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed |
||
} | ||
// if the last call to read returned -1 or the number of bytes | ||
// requested have been read then break | ||
} while (n >= 0 && remaining > 0); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The code for case that the char buffer has a backing array looks okay but I'm not sure about the direct buffer/other cases. One concern is that this is a read method, not a transferXXX method so we shouldn't be calling the underlying read several times. You'll see what I mean if you consider the scenario where you read < rem, then read again and the second read blocks or throws. I'l also concerned about "workBuffer" adding more per-stream footprint for cases where skip or read(CB) is used. Objects such as InputStreamReader are already a problem due to the underlying stream decoder. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right. So you propose to revert the off-heap path to the current master? That would be fine with me. The original bug and my motivation was only about the backing array case, the rest crept in. That would certainly keep the risk and impact lower. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that's what @AlanBateman intended. The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, let's bring it back to just eliminating the intermediate array when the buffer has a backing array. The other case that been examined separated if needed but we can't use the approach proposed in the current PR because it changes the semantics of read when there is a short-read followed by a blocking or exception throwing read. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
} | ||
} | ||
return nread; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for bringing this back to just the heap buffer case. This part looks good now. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed. |
||
} | ||
|
||
/** | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
/* | ||
* Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved. | ||
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. | ||
* | ||
* This code is free software; you can redistribute it and/or modify it | ||
* under the terms of the GNU General Public License version 2 only, as | ||
* published by the Free Software Foundation. | ||
* | ||
* This code is distributed in the hope that it will be useful, but WITHOUT | ||
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or | ||
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License | ||
* version 2 for more details (a copy is included in the LICENSE file that | ||
* accompanied this code). | ||
* | ||
* You should have received a copy of the GNU General Public License version | ||
* 2 along with this work; if not, write to the Free Software Foundation, | ||
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. | ||
* | ||
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA | ||
* or visit www.oracle.com if you need additional information or have any | ||
* questions. | ||
*/ | ||
|
||
/* | ||
* @test | ||
* @bug 4926314 | ||
* @summary Test for InputStreamReader#read(CharBuffer). | ||
* @run testng ReadCharBuffer | ||
*/ | ||
|
||
import org.testng.annotations.DataProvider; | ||
import org.testng.annotations.Test; | ||
|
||
import java.io.*; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's generally better not to use a wild card. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
import java.nio.ByteBuffer; | ||
import java.nio.CharBuffer; | ||
import java.util.Arrays; | ||
|
||
import static java.nio.charset.StandardCharsets.US_ASCII; | ||
import static org.testng.Assert.assertEquals; | ||
|
||
public class ReadCharBuffer { | ||
|
||
private static final int BUFFER_SIZE = 24; | ||
|
||
@DataProvider(name = "buffers") | ||
public Object[][] createBuffers() { | ||
// test both on-heap and off-heap buffers has they make use different code paths | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "as they may" |
||
return new Object[][]{ | ||
new Object[]{CharBuffer.allocate(BUFFER_SIZE)}, | ||
new Object[]{ByteBuffer.allocateDirect(BUFFER_SIZE * 2).asCharBuffer()} | ||
}; | ||
} | ||
|
||
@Test(dataProvider = "buffers") | ||
public void read(CharBuffer buffer) throws IOException { | ||
fillBuffer(buffer); | ||
|
||
try (Reader reader = new InputStreamReader(new ByteArrayInputStream("ABCDEFGHIJKLMNOPQRTUVWXYZ".getBytes(US_ASCII)), US_ASCII)) { | ||
buffer.limit(7); | ||
buffer.position(1); | ||
assertEquals(reader.read(buffer), 6); | ||
assertEquals(buffer.position(), 7); | ||
assertEquals(buffer.limit(), 7); | ||
|
||
buffer.limit(16); | ||
buffer.position(8); | ||
assertEquals(reader.read(buffer), 8); | ||
assertEquals(buffer.position(), 16); | ||
assertEquals(buffer.limit(), 16); | ||
} | ||
|
||
buffer.clear(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
assertEquals(buffer.toString(), "xABCDEFxGHIJKLMNxxxxxxxx"); | ||
} | ||
|
||
private void fillBuffer(CharBuffer buffer) { | ||
char[] filler = new char[BUFFER_SIZE]; | ||
Arrays.fill(filler, 'x'); | ||
buffer.put(filler); | ||
buffer.clear(); | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As
target
is mutable, I think you would do better to change lines 189-194 to something like:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done