8321053: Use ByteArrayInputStream.buf directly when parameter of transferTo() is trusted #16893

Closed

Conversation

bplb
Member

@bplb bplb commented Nov 30, 2023

Pass ByteArrayInputStream.buf directly to the OutputStream parameter of BAIS.transferTo only if the target stream is in the java.io package.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8321053: Use ByteArrayInputStream.buf directly when parameter of transferTo() is trusted (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16893/head:pull/16893
$ git checkout pull/16893

Update a local copy of the PR:
$ git checkout pull/16893
$ git pull https://git.openjdk.org/jdk.git pull/16893/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16893

View PR using the GUI difftool:
$ git pr show -t 16893

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16893.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Nov 30, 2023

👋 Welcome back bpb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 30, 2023
@openjdk

openjdk bot commented Nov 30, 2023

@bplb The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Nov 30, 2023
@mlbridge

mlbridge bot commented Nov 30, 2023

@@ -207,10 +207,20 @@ public int readNBytes(byte[] b, int off, int len) {
    public synchronized long transferTo(OutputStream out) throws IOException {
        int len = count - pos;
        if (len > 0) {
            byte[] tmp;
            if ("java.io".equals(out.getClass().getPackageName()))

Isn't this protection defeated with:

ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
UntrustedOutputStream uos = new UntrustedOutputStream();
bais.transferTo(new java.io.DataOutputStream(uos)); 

Or am I missing something?
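
The bypass described above can be reproduced outside the JDK. The following is a minimal sketch, not code from this PR: `WrapperLeakDemo` and `CapturingOutputStream` are hypothetical names standing in for the `UntrustedOutputStream` of the comment. It relies on `DataOutputStream.write(byte[], int, int)` forwarding the caller's array to the wrapped stream, so a check on the wrapper's package alone says nothing about who finally receives the array.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class WrapperLeakDemo {
    /** Hypothetical untrusted stream that remembers the array it was handed. */
    static final class CapturingOutputStream extends OutputStream {
        byte[] captured;

        @Override
        public void write(int b) {
            // single-byte writes are ignored in this sketch
        }

        @Override
        public void write(byte[] b, int off, int len) {
            captured = b; // keeps a reference to the caller's array
        }
    }

    /** Package of the wrapper class, which is what a package-name check would look at. */
    public static String wrapperPackage() {
        OutputStream wrapper = new DataOutputStream(new CapturingOutputStream());
        return wrapper.getClass().getPackageName();
    }

    /** Returns true if the wrapped stream received the very same array instance. */
    public static boolean leaksThroughWrapper(byte[] data) {
        CapturingOutputStream untrusted = new CapturingOutputStream();
        OutputStream wrapper = new DataOutputStream(untrusted);
        try {
            wrapper.write(data, 0, data.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return untrusted.captured == data;
    }

    public static void main(String[] args) {
        System.out.println(wrapperPackage());
        System.out.println(leaksThroughWrapper(new byte[] {1, 2, 3}));
    }
}
```

The wrapper reports the `java.io` package while the inner stream still ends up holding the original array, which is the situation the comment warns about.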

Member Author

Good catch: that in fact defeats the protection.

Member Author

Changed in 176d516 not to trust FilterOutputStreams.

@jmehrens jmehrens Dec 1, 2023

The only other alternative would be to walk ((FilterOutputStream) out).out; if every stream in the chain is in a "java." package, then out can be trusted.

byte[] tmp = null;
for (OutputStream os = out; os != null;) {
    if (os.getClass().getPackageName().startsWith("java.")) {
        if (os instanceof FilterOutputStream fos) {
            // A cycle in this chain would make this loop never terminate:
            // a self reference A -> A, or a transitive one A -> B -> C -> A.
            os = fos.out;
            continue;
        }
        break; // non-filter stream in a "java." package: the whole chain is trusted
    }

    // Untrusted stream found: fall back to a scratch copy.
    tmp = new byte[Integer.min(len, MAX_TRANSFER_SIZE)];
    break;
}

I don't like the deny-list approach of walking the chain, as (subjectively) it seems too fragile.

Also I think I can break this version of the code with ChannelOutputStream. I didn't run this through a compiler nor test it but the idea is that ChannelOutputStream calls ByteBuffer.wrap(bs) and doesn't call ByteBuffer.asReadOnlyBuffer. So a malicious WritableByteChannel should be able to gain access to the original array:

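WritableByteChannel wolf = new WritableByteChannel() {
    public int write(ByteBuffer src) throws IOException {
        int rem = src.remaining();
        src.array()[0] = '0'; // oh no!
        src.position(src.limit()); // report the bytes as consumed
        return rem;
    }
    public boolean isOpen() { return true; }
    public void close() { }
};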

ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
OutputStream wolfInSheepSuitAndTie = Channels.newOutputStream(wolf);
bais.transferTo(wolfInSheepSuitAndTie);

However, ChannelOutputStream is in sun.nio.ch, so on second thought this shouldn't break the check. The same pattern is repeated in Channels.newOutputStream(AsynchronousByteChannel ch), and that stream should defeat the check because its class is in the "java." namespace.

I think an allow list would be safer but that brings all the drawbacks that Alan was talking about before.
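
The point about `ByteBuffer.wrap` versus `asReadOnlyBuffer` can be checked in isolation, without any channel involved. This is a small sketch with hypothetical names (`WrapExposureDemo` and its two helpers); it only uses `ByteBuffer.wrap`, `array`, `asReadOnlyBuffer` and `hasArray` from `java.nio`.

```java
import java.nio.ByteBuffer;

public class WrapExposureDemo {
    /** A wrapped buffer exposes the backing array, so writes through it reach the caller's data. */
    public static boolean wrapSharesArray(byte[] bs) {
        ByteBuffer src = ByteBuffer.wrap(bs);
        src.array()[0] = (byte) '0'; // same mutation as in the "wolf" channel
        return src.array() == bs && bs[0] == (byte) '0';
    }

    /** A read-only view reports no accessible backing array. */
    public static boolean readOnlyHidesArray(byte[] bs) {
        ByteBuffer readOnly = ByteBuffer.wrap(bs).asReadOnlyBuffer();
        return !readOnly.hasArray();
    }

    public static void main(String[] args) {
        System.out.println(wrapSharesArray(new byte[] {1, 2, 3}));
        System.out.println(readOnlyHidesArray(new byte[] {1, 2, 3}));
    }
}
```

Whether a given stream returned by `Channels.newOutputStream` actually hands such a writable buffer to the channel depends on the JDK implementation, so this sketch only illustrates the buffer behaviour itself.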

Member Author

I might have done this incorrectly, but with this version of the above wolf I do not see any corruption:

        java.nio.channels.WritableByteChannel wolf =
            new java.nio.channels.WritableByteChannel() {
                private boolean closed = false;
                
                public int write(java.nio.ByteBuffer src) throws IOException {
                    int rem = src.remaining();
                    Arrays.fill(src.array(), src.arrayOffset() + src.position(),
                                src.arrayOffset() + src.limit(),
                                (byte)'0');
                    src.position(src.limit());
                    return rem;
                }

                public boolean isOpen() {
                    return !closed;
                }

                public void close() throws IOException {
                    closed = true;
                }
            };

Contributor

I see the problem that unless we have an explicit whitelist, we run the risk of accidentally adding another wrapper stream somewhere in the JDK in the future and forgetting to add it to the blacklist. So for safety, I would plead for not using .startsWith() but explicitly listing only the classes that have been actively proven safe. That way, the code might be slower (sad but true) but inherently future-proof.

Member Author

The case of Channels.newOutputStream(AsynchronousByteChannel) could be handled by changing the return value of that method. For example, sun.nio.ch.Streams could have a method OutputStream of(AsynchronousByteChannel) added to it which returned something like an AsynChannelOutputStream and we could use that.

That said, it is true that a deny list is not inherently future-proof like an allow list, as stated.

Member Author

@bplb bplb Dec 1, 2023

I think that a sufficiently future-proof deny list could be had by changing

            if (out.getClass().getPackageName().startsWith("java.") &&

back to

            if ("java.io".equals(out.getClass().getPackageName()) &&

That would, for example, dispense with the problematic Channels.newOutputStream(AsynchronousByteChannel) case:

jshell> AsynchronousSocketChannel asc = AsynchronousSocketChannel.open()
asc ==> sun.nio.ch.UnixAsynchronousSocketChannelImpl[unconnected]

jshell> OutputStream out = Channels.newOutputStream(asc)
out ==> java.nio.channels.Channels$2@58c1670b

jshell> Class outClass = out.getClass()
outClass ==> class java.nio.channels.Channels$2

jshell> outClass.getPackageName()
$5 ==> "java.nio.channels"
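
For comparison, the two candidate predicates can be put side by side. This is only an illustrative sketch: `PackageCheckDemo`, `prefixCheck` and `exactCheck` are hypothetical names, and `Channels.class` is used simply because it lives in the same `java.nio.channels` package as the `Channels$2` class shown in the jshell session.

```java
import java.io.ByteArrayOutputStream;
import java.nio.channels.Channels;

public class PackageCheckDemo {
    /** The broader check: any class in a "java." package passes. */
    public static boolean prefixCheck(Class<?> c) {
        return c.getPackageName().startsWith("java.");
    }

    /** The narrower check: only classes in exactly "java.io" pass. */
    public static boolean exactCheck(Class<?> c) {
        return "java.io".equals(c.getPackageName());
    }

    public static void main(String[] args) {
        // java.io.ByteArrayOutputStream passes both checks.
        System.out.println(prefixCheck(ByteArrayOutputStream.class) + " " + exactCheck(ByteArrayOutputStream.class));
        // Classes in java.nio.channels pass only the prefix check.
        System.out.println(prefixCheck(Channels.class) + " " + exactCheck(Channels.class));
    }
}
```

Note that the narrower check still accepts wrapper classes that live in `java.io` itself, so it does not by itself address the wrapper concern raised earlier in the thread.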

@jmehrens jmehrens Dec 2, 2023

Even if the scope is limited to java.io, you have to deal with FilterOutputStream and ObjectOutputStream. I still haven't done a complete search, so there could be other adapters I've yet to review.

Thinking of a different approach: what if ByteArrayInputStream actually recorded and used the readlimit argument of the mark method? That would let us safely leak or poison 'this.data', because once transferTo is called we can hand over ownership of the byte array if we know this stream is allowed to forget it existed. Effectively you could do optimizations like this (I didn't test or compile this):

public synchronized long transferTo(OutputStream out) throws IOException {
    int len = count - pos;
    if (len > 0) {
        byte[] data = this.data;
        byte[] tmp = null;
        if (this.readLimit == 0) { // recorded by mark(); initial value on construction would be zero
            // Hand over ownership of the bytes: this stream forgets the array.
            this.data = new byte[0];
            Arrays.fill(data, 0, pos, (byte) 0); // hide out-of-bounds data
            Arrays.fill(data, count, data.length, (byte) 0);
        } else {
            tmp = new byte[Integer.min(len, MAX_TRANSFER_SIZE)];
        }

        int nwritten = 0;
        while (nwritten < len) {
            int nbyte = Integer.min(len - nwritten, MAX_TRANSFER_SIZE);
            if (tmp != null) {
                System.arraycopy(data, pos, tmp, 0, nbyte);
                out.write(tmp, 0, nbyte);
            } else {
                out.write(data, pos, nbyte);
            }
            pos += nbyte;
            nwritten += nbyte;
        }
        assert pos == count;
        if (this.data.length == 0) { // uphold the invariants of the class
            pos = count = mark = 0;
        }
    }
    return len;
}

This approach avoids having to maintain an allow or deny list. The downside is that the constructor of ByteArrayInputStream doesn't copy the byte[] parameter. The caller is warned about this in the JavaDocs, but it might be shocking to have data escape ByteArrayInputStream. Maybe that is a deal breaker? There is also an obvious compatibility issue with recording readLimit in the mark method, since the documentation states that the limit does nothing.

Thoughts?

Member Author

I think that this is getting too complicated. For the time being, I think it would be better simply to have a conservative allow-list and trust only the classes in it. The approach can always be broadened at a later date, but at least for now there would be protection against untrustworthy OutputStreams.
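
A conservative allow-list of this kind might look like the sketch below. It is not the committed code: the helper name `isTrusted` and the class `AllowListDemo` are assumptions, and the three listed classes are the ones named elsewhere in this review (ByteArrayOutputStream, FileOutputStream, PipedOutputStream). The comparison is on exact class identity rather than `instanceof`, so subclasses and wrappers are not trusted.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PipedOutputStream;

public class AllowListDemo {
    /** True only for exact instances of the listed classes. */
    public static boolean isTrusted(OutputStream out) {
        Class<?> outClass = out.getClass();
        return outClass == ByteArrayOutputStream.class
            || outClass == FileOutputStream.class
            || outClass == PipedOutputStream.class;
    }

    public static void main(String[] args) {
        // Exact class: trusted.
        System.out.println(isTrusted(new ByteArrayOutputStream()));
        // Anonymous subclass could override write(byte[], int, int): not trusted.
        System.out.println(isTrusted(new ByteArrayOutputStream() { }));
        // Wrapper in java.io around a trusted stream: still not on the list.
        System.out.println(isTrusted(new DataOutputStream(new ByteArrayOutputStream())));
    }
}
```

Using `==` on the class means a user-defined subclass that overrides `write` cannot inherit the trust of its parent, at the cost of also excluding harmless subclasses.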

@@ -207,10 +207,20 @@ public int readNBytes(byte[] b, int off, int len) {
    public synchronized long transferTo(OutputStream out) throws IOException {
        int len = count - pos;
        if (len > 0) {
            byte[] tmp;
            if ("java.io".equals(out.getClass().getPackageName()))

Contributor

I think we should trust all classes in java.* packages, i.e. the check should be

out.getClass().getPackageName().startsWith("java.")

Member Author

Changed in 176d516 to use java. instead of java.io.

            int nwritten = 0;
            while (nwritten < len) {
                int nbyte = Integer.min(len - nwritten, MAX_TRANSFER_SIZE);
                out.write(buf, pos, nbyte);
                if (tmp != null) {
                    System.arraycopy(buf, pos, tmp, 0, nbyte);

Contributor

I assume the overall performance of transferTo will be faster if we use System.arraycopy only once in line 215 to create a safe copy of the complete buf instead of calling it multiple times in a loop to create copies per slice. In that case we can omit the tmp == null case but simply use tmp = buf, making the code in the loop if-free.

Member Author

There is a tradeoff here between number of invocations of arraycopy and amount of memory allocated for tmp. (We have seen this before in #14981 which I have allowed to languish.) The allocation limit is MAX_TRANSFER_SIZE which is presently 128 kB, so any transfer of size less than this will invoke arraycopy only once already.
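
The tradeoff can be made concrete with a standalone sketch of the copying path. This is an illustration under stated assumptions, not the JDK source: `ChunkedCopyDemo`, `copyingTransfer` and `ScribblingOutputStream` are hypothetical names, and `MAX_TRANSFER_SIZE` is set to 128 * 1024 to match the limit mentioned above. The helper returns how many times `System.arraycopy` ran, and the hostile sink shows that the source array stays intact because only the scratch buffer is handed out.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class ChunkedCopyDemo {
    // Assumed to match the 128 kB limit mentioned in the review.
    static final int MAX_TRANSFER_SIZE = 128 * 1024;

    /** Writes buf[pos, count) to out via a bounded scratch buffer; returns the number of arraycopy calls. */
    public static int copyingTransfer(byte[] buf, int pos, int count, OutputStream out) {
        int len = count - pos;
        int copies = 0;
        if (len > 0) {
            byte[] tmp = new byte[Math.min(len, MAX_TRANSFER_SIZE)];
            int nwritten = 0;
            try {
                while (nwritten < len) {
                    int nbyte = Math.min(len - nwritten, MAX_TRANSFER_SIZE);
                    System.arraycopy(buf, pos, tmp, 0, nbyte);
                    copies++;
                    out.write(tmp, 0, nbyte); // the sink only ever sees the scratch buffer
                    pos += nbyte;
                    nwritten += nbyte;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return copies;
    }

    /** Hypothetical hostile sink that overwrites whatever array it is handed. */
    static final class ScribblingOutputStream extends OutputStream {
        @Override
        public void write(int b) {
            // single-byte writes are ignored in this sketch
        }

        @Override
        public void write(byte[] b, int off, int len) {
            Arrays.fill(b, (byte) 0);
        }
    }

    /** Returns true if the source array is unchanged after a transfer to the hostile sink. */
    public static boolean sourceSurvives(int size) {
        byte[] source = new byte[size];
        Arrays.fill(source, (byte) 7);
        byte[] expected = source.clone();
        copyingTransfer(source, 0, source.length, new ScribblingOutputStream());
        return Arrays.equals(source, expected);
    }

    public static void main(String[] args) {
        OutputStream sink = new ScribblingOutputStream();
        System.out.println(copyingTransfer(new byte[100 * 1024], 0, 100 * 1024, sink)); // one copy
        System.out.println(copyingTransfer(new byte[300 * 1024], 0, 300 * 1024, sink)); // three copies
        System.out.println(sourceSurvives(300 * 1024));
    }
}
```

A single up-front copy of the whole range would reduce the copy count to one for large transfers, but would allocate a scratch array as large as the remaining data instead of at most `MAX_TRANSFER_SIZE`.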

@@ -207,10 +207,21 @@ public int readNBytes(byte[] b, int off, int len) {
    public synchronized long transferTo(OutputStream out) throws IOException {
        int len = count - pos;
        if (len > 0) {
            byte[] tmp;
            if (out.getClass().getPackageName().startsWith("java.") &&

Contributor

Has anybody actually estimated or measured whether such an exception is useful or needed, given that System.arraycopy is fast native code and most buffers used by java.io-located streams are just a few KB? Just asking, as it could be the case that interpreting this Java bytecode is slower than executing some ASM ops to create a few-KB copy, and we might be doing a "premature optimization" here.

            // 'tmp' is null if and only if 'out' is trusted
            byte[] tmp;
            Class<?> outClass = out.getClass();
            if (outClass.getPackageName().equals("java.io") &&

Contributor

What do we need this string-based check for here?

Contributor

I suspect it's left over from a previous iteration. In any case, limiting it to a small number of output streams makes this easier to look at. BAOS and FOS seem okay, POP seems okay too but legacy and not interesting.

Contributor

I suspect it's left over from a previous iteration. In any case, limiting it to a small number of output streams makes this easier to look at. BAOS and FOS seem okay, POP seems okay too but legacy and not interesting.

Agreed for a rather short list of explicitly whitelisted implementations. We should get rid of the package check.

Member Author

I checked all the OutputStreams in the list for trustworthiness. The package check is vestigial; will remove. It could be useful if multiple packages were involved with multiple trusted classes in each.

                 outClass == PipedOutputStream.class)
                tmp = null;
            else
                tmp = new byte[Integer.min(len, MAX_TRANSFER_SIZE)];

Contributor

Looks okay, I'd probably rename tmp to something better, maybe tmpbuf.

@openjdk

openjdk bot commented Dec 5, 2023

@bplb This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8321053: Use ByteArrayInputStream.buf directly when parameter of transferTo() is trusted

Reviewed-by: alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 88 new commits pushed to the master branch:

  • db5613a: 8317288: [macos] java/awt/Window/Grab/GrabTest.java: Press on the outside area didn't cause ungrab
  • b1cb374: 8320349: Simplify FileChooserSymLinkTest.java by using single-window testUI
  • 18c7922: 8321224: ct.sym for JDK 22 contains references to internal modules
  • 83ffc1a: 8320303: Allow PassFailJFrame to accept single window creator
  • fd31f6a: 8321183: Incorrect warning from cds about the modules file
  • 027b5db: 8321215: Incorrect x86 instruction encoding for VSIB addressing mode
  • 61d0db3: 8318468: compiler/tiered/LevelTransitionTest.java fails with -XX:CompileThreshold=100 -XX:TieredStopAtLevel=1
  • 87516e2: 8320943: Files/probeContentType/Basic.java fails on latest Windows 11 - content type mismatch
  • 800f347: 8321216: SerialGC attempts to access the card table beyond the end of the heap during card table scan
  • a1fe16b: 8321300: Cleanup TestHFA
  • ... and 78 more: https://git.openjdk.org/jdk/compare/c86431767e6802317dc2be6221a5d0990b976ddc...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 5, 2023
@bplb
Member Author

bplb commented Dec 5, 2023

/integrate

@openjdk

openjdk bot commented Dec 5, 2023

Going to push as commit b0d1450.
Since your change was applied there have been 92 commits pushed to the master branch:

  • acaf2c8: 8318590: JButton ignores margin when painting HTML text
  • d3df3eb: 8294699: Launcher causes lingering busy cursor
  • fddc02e: 8321225: [JVMCI] HotSpotResolvedObjectTypeImpl.isLeafClass shouldn't create strong references
  • 640d7f3: 8314327: Issues with JShell when using "local" execution engine
  • db5613a: 8317288: [macos] java/awt/Window/Grab/GrabTest.java: Press on the outside area didn't cause ungrab
  • b1cb374: 8320349: Simplify FileChooserSymLinkTest.java by using single-window testUI
  • 18c7922: 8321224: ct.sym for JDK 22 contains references to internal modules
  • 83ffc1a: 8320303: Allow PassFailJFrame to accept single window creator
  • fd31f6a: 8321183: Incorrect warning from cds about the modules file
  • 027b5db: 8321215: Incorrect x86 instruction encoding for VSIB addressing mode
  • ... and 82 more: https://git.openjdk.org/jdk/compare/c86431767e6802317dc2be6221a5d0990b976ddc...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 5, 2023
@openjdk openjdk bot closed this Dec 5, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 5, 2023
@openjdk

openjdk bot commented Dec 5, 2023

@bplb Pushed as commit b0d1450.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@bplb bplb deleted the ByteArrayInputStream-transferTo-8321053 branch December 5, 2023 20:33
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
5 participants