Skip to content

IO-429: Check for long streams in IOUtils.toByteArray#175

Closed
leskin-in wants to merge 2 commits intoapache:masterfrom
leskin-in:IO-429-IOUtils-toByteArray-Overflow
Closed

IO-429: Check for long streams in IOUtils.toByteArray#175
leskin-in wants to merge 2 commits intoapache:masterfrom
leskin-in:IO-429-IOUtils-toByteArray-Overflow

Conversation

@leskin-in
Copy link

Throw an IllegalArgumentException when an InputStream provided to IOUtils.toByteArray() is longer than Integer.MAX_VALUE bytes. Processing of such long arrays is not possible, as arrays with long indices are forbidden by Java language specification.

@leskin-in
Copy link
Author

Adding a test for this change (e.g. to IOUtilsTestCase) seems quite complicated: ByteArrayInputStream, and BufferedInputStream both use common Java arrays, whose length is not enough for the test.

A solution I can think of is to use FileInputStream; but this would require the test case to create a 2Gb file, which IMHO does not look good.

It is also possible to implement a custom InputStream. It may turn to be useful in other tests.

@garydgregory
Copy link
Member

Why would you use a real file? Just test with a InfiniteCircularInputStream.

@leskin-in
Copy link
Author

@garydgregory thank you for your comment.

I have implemented a test using CircularInputStream; IOUtils.copyLarge() does not return if provided InifiniteCircularInputStream as input.

Note the new test case requires lots of memory, so heap size limit is increased. I do not know whether this is appropriate. The chosen value is the lowest possible, which I obtained empirically on my laptop.

IO-161 introduced heap size limit, and it has remained unchanged since then. But the reasons for this (and why 25M was chosen) are not clear to me.

pom.xml Outdated
<reuseForks>false</reuseForks>
<!-- limit memory size see IO-161 -->
<argLine>${argLine} -Xmx25M</argLine>
<argLine>${argLine} -Xmx4223M</argLine>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not acceptable, leave as is for now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left unchanged in cc28d3d.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try (CircularInputStream cin = new CircularInputStream(new byte[]{65, 65, 65}, ((long)Integer.MAX_VALUE) + 1L)) {
IOUtils.toByteArray(cin);
fail("IllegalArgumentException expected");
} catch (final IllegalArgumentException exc) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use assertThrows()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed in cc28d3d.

try (final ByteArrayOutputStream output = new ByteArrayOutputStream()) {
copy(input, output);
if (copy(input, output) == -1) {
throw new IllegalArgumentException("Stream cannot be longer than Integer max value bytes");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leskin-in
This condition will not happen for a ByteArrayInputStream, instead of you'll get a IndexOutOfBoundsException or am I missing something?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct if an InputStream is a ByteArrayInputStream. However, this method accepts generic InputStream, which may wrap more data. In that case, the implementation of copy() would return -1.

However, ByteArrayOutputStream, which is created in this method as an intermediate buffer, does not check for its underlying buffer overflow at write(). The checks in write() only ensure sanity of arguments passed to it; they are valid in case of copyLarge() (ultimately called by copy() mentioned above).

The ByteArrayOutputStream, however, can store more than Integer.MAX_VALUE bytes because it can use multiple underlying byte arrays.

@garydgregory
Copy link
Member

@leskin-in
May you please rebase on master and see if you can get green builds?

Throw an IllegalArgumentException when an InputStream provided to
IOUtils.toByteArray() is longer than Integer.MAX_VALUE bytes. Conversion
of such long arrays is not possible, as arrays with long indices are
forbidden by Java language specification.
@leskin-in
Copy link
Author

@leskin-in
May you please rebase on master and see if you can get green builds?

@garydgregory, thank you for reminding about this PR. I have rebased it against the current master and took into account the notes from your review.

Add a test case testToByteArray_InputStreamTooLong() to IOUtilsTestCase
to test how InputStreams with more than Integer.MAX_VALUE bytes are
processed by IOUtils.toByteArray().
@garydgregory
Copy link
Member

@leskin-in
Please see git master. I've solved this differently which also has the benefit of NOT consuming the whole input stream when the next read would result in the byte array being larger than Integer.MAX_VALUE.

@garydgregory
Copy link
Member

@leskin-in Note that GitHub Actions are currently down.

@leskin-in
Copy link
Author

@garydgregory , that is a nice solution.

As ThresholdingOutputStream.written is long, ThresholdingOutputStream.checkThreshold() works fine for Integer.MAX_VALUE.

Should this PR be closed then?

@garydgregory
Copy link
Member

A different solution is in git master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants