8247614: java/nio/channels/DatagramChannel/Connect.java timed out #679

Closed
wants to merge 5 commits into from

Conversation

c-cleary
Contributor

@c-cleary c-cleary commented Oct 15, 2020

Occasional failures of this test have been observed. However, the precise nature of the failure was unclear because the test logged very little and exercised multiple features of DatagramChannel in one run. Another issue is that when either the Writer or the Reader thread fails, the thread that did not fail waits until the test itself times out.

To attempt to mitigate these factors, the test was modified in the following manner:

  • Additional logging was added to help locate future points of failure in the test.
  • On L95, guards were added to make sure the test for a DatagramChannel throwing an AlreadyConnectedException does not end up using the same port as the one used for the connection on L86-87.
  • wait(CompletableFuture<?>... futures) was implemented to throw a CompletionException if either the Reader or Writer fails, rather than waiting for the test to time out.

Seeking review in particular on the implementation of the wait(CompletableFuture<?>... futures) function. As it currently stands, the wait function waits until one of the given futures completes exceptionally. If that doesn't happen, it waits for all futures to complete successfully.
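
The behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the class name WaitSketch and the method name waitFor are placeholders (the PR calls its helper wait), and the chaining through exceptionally is an assumption based on the description.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class WaitSketch {

    // Returns once all futures have succeeded, or throws as soon as any fails.
    static void waitFor(CompletableFuture<?>... futures) throws CompletionException {
        CompletableFuture<Void> all = CompletableFuture.allOf(futures);
        for (CompletableFuture<?> f : futures) {
            // Dependent action: push the first failure into the combined
            // future straight away instead of waiting for the other tasks.
            f.exceptionally(ex -> {
                all.completeExceptionally(ex);
                return null;
            });
        }
        all.join(); // throws CompletionException if any future failed
    }

    public static void main(String[] args) {
        // Stands in for a task that is blocked and would never finish.
        CompletableFuture<Void> neverEnds = new CompletableFuture<>();
        CompletableFuture<Void> fails = CompletableFuture.runAsync(() -> {
            throw new IllegalStateException("writer failed");
        });
        try {
            waitFor(neverEnds, fails);
        } catch (CompletionException e) {
            System.out.println("failed fast: " + e.getCause());
        }
    }
}
```

Without the dependent action, allOf alone would not complete until every future had completed, so a blocked task would still hold the test until the timeout.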


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Testing

Linux x64 Windows x64 macOS x64
Build ❌ (1/5 failed) ✔️ (2/2 passed) ✔️ (2/2 passed)
Test (tier1) ✔️ (9/9 passed) ✔️ (9/9 passed)

Failed test task

Issue

  • JDK-8247614: java/nio/channels/DatagramChannel/Connect.java timed out

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/679/head:pull/679
$ git checkout pull/679

@bridgekeeper

bridgekeeper bot commented Oct 15, 2020

👋 Welcome back ccleary! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 15, 2020

@ccleary-oracle The following label will be automatically applied to this pull request:

  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the nio nio-dev@openjdk.org label Oct 15, 2020
@c-cleary c-cleary marked this pull request as ready for review October 15, 2020 10:36
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 15, 2020
@openjdk

openjdk bot commented Oct 15, 2020

@ccleary-oracle The label rfr is not a valid label. These labels are valid:

  • serviceability
  • hotspot
  • sound
  • hotspot-compiler
  • kulla
  • i18n
  • shenandoah
  • jdk
  • javadoc
  • 2d
  • security
  • swing
  • hotspot-runtime
  • jmx
  • build
  • nio
  • beans
  • core-libs
  • compiler
  • net
  • hotspot-gc
  • hotspot-jfr
  • awt

@mlbridge

mlbridge bot commented Oct 15, 2020

Webrevs

Member

@dfuch dfuch left a comment

Hi Conor,

Good work.
Chris made a good remark to me privately. The test is running in same VM mode, and no longer waits for all threads (that is Actor & Reactor) to complete if one of them throws an exception.
We want that, because we don't want the test to wait forever and time out with no output (and no stack trace) if that happens.
However, this may leave one of the Actor/Reactor in a state where it will live forever in the running agent VM, and may keep its DatagramChannel open - and that's bad.

One way to avoid that would be to make Actor & Reactor Closeable, and have them both create their DatagramChannel in their constructor.

Then you could modify the test method in this way:

static void test() throws Exception {
    try (Reactor r = new Reactor();
         Actor a = new Actor(r.port())) {
        invoke(a, r);
    }
}

which would ensure that both DatagramChannels are closed when the test terminates, and would ensure that the other actor/reactor is unblocked and terminates too.
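
To illustrate why closing helps, here is a small sketch of a Reactor-like class that opens its DatagramChannel in the constructor and releases it in close(). The class layout, the loopback bind and the 200 ms pause are assumptions made for the example, not details taken from the PR.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

public class CloseableSketch {

    // Hypothetical stand-in for the test's Reactor: the channel is opened in
    // the constructor so that close() can always release it.
    static class Reactor implements AutoCloseable, Runnable {
        final DatagramChannel dc;

        Reactor() throws IOException {
            dc = DatagramChannel.open()
                    .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        }

        int port() throws IOException {
            return ((InetSocketAddress) dc.getLocalAddress()).getPort();
        }

        @Override
        public void run() {
            try {
                dc.receive(ByteBuffer.allocate(64)); // blocks until data or close
            } catch (IOException e) {
                // Closing the channel from another thread lands here.
                System.out.println("receive ended: " + e);
            }
        }

        @Override
        public void close() throws IOException {
            dc.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Thread t;
        try (Reactor r = new Reactor()) {
            System.out.println("reactor bound to port " + r.port());
            t = new Thread(r);
            t.start();
            Thread.sleep(200); // give the receive a moment to block
        } // close() runs here and unblocks the receive
        t.join(5000);
        System.out.println("reactor thread alive: " + t.isAlive());
    }
}
```

Because the try-with-resources block closes the channel even when the body throws, a thread blocked in receive is released rather than lingering in the agent VM.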

}

public interface Sprintable extends Runnable {
public void throwException() throws Exception;
// This method waits until one of the given CompletableFutures completes exceptionally. In which case, it stops waiting for the other futures and
Member

Formatting: could you split this long line after "exceptionally."?

Contributor

Yes, wildly long lines are annoying when looking at side-by-side diffs.

reader.throwException();
writer.throwException();
static void invoke(Runnable reader, Runnable writer) throws CompletionException {
CompletableFuture<Void> f1 = CompletableFuture.runAsync(writer);
Contributor

These run the reader and writer in the common pool; are you sure that is what you want?

Member

These run the reader and writer in the common pool; are you sure that is what you want?

It shouldn't matter since there's no security manager. Are you worried that the common pool may not have enough threads available to run these two tasks concurrently?

Contributor

These are blocking tasks. It's easy to create a thread pool in this test.

return null;
});
});
future.join();
Contributor

Would ExecutorService invokeAny help you here?

Member

@dfuch dfuch Oct 15, 2020

I don't think so - we kind of want to do the opposite: what we want is to stop the test if either of the two tasks throws an exception. The previous version of the test might have been blocked in the first call to Thread::join and not even notice that the other thread had exited - which meant that the test would fail with a timeout and not even report the exception that made the other thread terminate. Here the allOf future will be completed as soon as one task throws an exception (thanks to the dependent action), or the two tasks complete successfully.
There are many ways to do that - but I find the solution that Conor came up with rather elegant.

Contributor

Okay, there might be new combinators here in the future that make this a bit easier.


// Reply to sender
dc.connect(sa);
bb.flip();
log.println("Reactor attempting to write: " + dc.getRemoteAddress().toString());
dc.write(bb);

// Clean up
dc.disconnect();
dc.close();
Contributor

Can you move the dc.close() call to the finally block so the socket is closed when an exception is thrown?
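
The request above amounts to something like the following sketch. The helper name reply and the simulated failure are hypothetical and only serve to show that the channel is closed on the exceptional path too.

```java
import java.io.IOException;
import java.nio.channels.DatagramChannel;

public class FinallyCloseSketch {

    // Hypothetical helper: does the reply work, then always closes dc.
    static void reply(DatagramChannel dc, boolean fail) throws IOException {
        try {
            if (fail) {
                throw new IOException("simulated failure while replying");
            }
            // connect, write and disconnect would go here
        } finally {
            dc.close(); // reached on both the normal and the exceptional path
        }
    }

    public static void main(String[] args) throws IOException {
        DatagramChannel dc = DatagramChannel.open();
        try {
            reply(dc, true);
        } catch (IOException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
        System.out.println("open after failure: " + dc.isOpen());
    }
}
```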

@c-cleary
Contributor Author

@dfuch Actor & Reactor now implement AutoCloseable and are instantiated in try-with-resource arguments.

Contributor Author

@c-cleary c-cleary left a comment

These run the reader and writer in the common pool; are you sure that is what you want?

@AlanBateman seeking review on the use of the ExecutorService on L56-63. A newCachedThreadPool() executor is passed to both futures, which are then joined in wait(f1, f2) with CompletableFuture.allOf(futures).

Member

@dfuch dfuch left a comment

Good work altogether! One or two things to change yet and we'll get there :-)

Comment on lines 56 to 64
static void invoke(Runnable reader, Runnable writer) throws CompletionException {
ExecutorService threadPool = Executors.newCachedThreadPool();
try {
CompletableFuture<Void> f1 = CompletableFuture.runAsync(writer, threadPool);
CompletableFuture<Void> f2 = CompletableFuture.runAsync(reader, threadPool);
wait(f1, f2);
} finally {
threadPool.shutdown();
}
Member

@dfuch dfuch Oct 20, 2020

I believe it would be better to call threadPool.shutdown() after having closed the Actor/Reactor: closing the Actor/Reactor makes sure that both tasks terminate - and shutdown() will wait for the tasks to terminate. Therefore I'd suggest creating the threadPool in the test method instead, and passing it as the first parameter to the invoke method.

Contributor Author

As you suggested, I moved the threadPool.shutdown() call to the main test method instead of having it in invoke. While it seems that shutdown() does conduct an orderly shutdown of the pool, I agree that it's best to be safe and to make absolutely certain that Actor/Reactor close properly.
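
The ordering discussed in this thread might look roughly like the sketch below. Blocker is a hypothetical stand-in for Actor/Reactor, and invoke is simplified to return once either task finishes. One caveat from the ExecutorService documentation: shutdown() only stops new submissions and does not block, so the sketch adds awaitTermination to confirm that the tasks really ended. That call is an addition for illustration and is not something this thread says the PR uses.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolSketch {

    // Hypothetical stand-in for Actor/Reactor: run() blocks until close().
    static class Blocker implements Runnable, AutoCloseable {
        private final CountDownLatch closed = new CountDownLatch(1);

        @Override
        public void run() {
            try {
                closed.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        @Override
        public void close() {
            closed.countDown();
        }
    }

    // The pool is passed in as the first parameter; invoke no longer owns it.
    static void invoke(ExecutorService pool, Runnable reader, Runnable writer) {
        CompletableFuture<Void> f1 = CompletableFuture.runAsync(writer, pool);
        CompletableFuture<Void> f2 = CompletableFuture.runAsync(reader, pool);
        // Simplification for the sketch: return once either task finishes.
        CompletableFuture.anyOf(f1, f2).join();
    }

    // Returns true if every pool thread finished within the timeout.
    static boolean test() throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            try (Blocker reader = new Blocker()) {
                invoke(pool, reader, () -> { });
            } // close() unblocks the reader while the pool is still running
        } finally {
            pool.shutdown(); // stops accepting new tasks; does not block
        }
        return pool.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool terminated: " + test());
    }
}
```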

@openjdk

openjdk bot commented Oct 21, 2020

@ccleary-oracle This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8247614: java/nio/channels/DatagramChannel/Connect.java timed out

Reviewed-by: dfuchs, alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 230 new commits pushed to the master branch:

  • 3f20612: 8255555: Bad copyright headers in SocketChannelCompare.java SocketChannelConnectionSetup.java UnixSocketChannelReadWrite.java
  • 42fc158: 8253939: [TESTBUG] Increase coverage of the cgroups detection code
  • 01eb690: 8255554: Bad copyright header in AbstractFileSystemProvider.java
  • 1215b1a: 8255457: Shenandoah: cleanup ShenandoahMarkTask
  • af33e16: 8255441: Cleanup ciEnv/jvmciEnv::lookup_method-s
  • 8ad7f38: 8255014: Record Classes javax.lang.model changes, follow-up
  • 6bb7e45: 8245194: Unix domain socket channel implementation
  • 8bde2f4: 8255013: implement Record Classes as a standard feature in Java, follow-up
  • 0425889: 8255429: Remove C2-based profiling
  • aaf4f69: 8255233: InterpreterRuntime::at_unwind should be a JRT_LEAF
  • ... and 220 more: https://git.openjdk.java.net/jdk/compare/abe51377373673a039099b89ee60b5724159b67d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dfuch, @AlanBateman) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 21, 2020
Contributor

@AlanBateman AlanBateman left a comment

No other comments, it's a good re-write of the test and the new version should be a lot more robust. I would be tempted to remove the author tag as the test is significantly replaced.


Actor(int port) {
Actor(int port) throws IOException {
Contributor

I think it would be clearer to create Actor with a SocketAddress rather than a port. It didn't matter in the old test because the Reactor was bound to the wildcard address but here there is nothing to tell the Actor which address to connect to.
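
A sketch of that suggestion, assuming the Reactor binds to the loopback address: the Reactor exposes its full local address and the Actor connects to it directly. The accessor name address() and the class layout are placeholders, not names from the PR.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.DatagramChannel;

public class AddressSketch {

    static class Reactor implements AutoCloseable {
        final DatagramChannel dc;

        Reactor() throws IOException {
            // Bound to a specific address, so a port alone is ambiguous.
            dc = DatagramChannel.open()
                    .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        }

        // Hypothetical accessor: hands out address and port together.
        SocketAddress address() throws IOException {
            return dc.getLocalAddress();
        }

        @Override
        public void close() throws IOException {
            dc.close();
        }
    }

    static class Actor implements AutoCloseable {
        final DatagramChannel dc;

        Actor(SocketAddress target) throws IOException {
            dc = DatagramChannel.open().connect(target);
        }

        @Override
        public void close() throws IOException {
            dc.close();
        }
    }

    public static void main(String[] args) throws IOException {
        try (Reactor r = new Reactor();
             Actor a = new Actor(r.address())) {
            System.out.println("actor connected to " + a.dc.getRemoteAddress());
        }
    }
}
```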

Contributor Author

Thanks for the feedback, Alan! Will look into these comments presently.

@c-cleary
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Oct 28, 2020
@openjdk

openjdk bot commented Oct 28, 2020

@ccleary-oracle
Your change (at version cab5452) is now ready to be sponsored by a Committer.

@dfuch
Member

dfuch commented Oct 29, 2020

/sponsor

@openjdk openjdk bot closed this Oct 29, 2020
@openjdk openjdk bot added integrated Pull request has been integrated and removed sponsor Pull request is ready to be sponsored labels Oct 29, 2020
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Oct 29, 2020
@openjdk

openjdk bot commented Oct 29, 2020

@dfuch @ccleary-oracle Since your change was applied there have been 240 commits pushed to the master branch:

  • 38574d5: 8255298: Remove SurvivorAlignmentInBytes functionality
  • 4031cb4: 8254189: Improve comments for StackOverFlow and fix in_xxx() functions
  • caec8d2: 8233560: [TESTBUG] ToolTipManager/Test6256140.java is failing on macos
  • a5b42ec: 8233570: [TESTBUG] HTMLEditorKit test bug5043626.java is failing on macos
  • 7e305ad: 8255405: sun/net/ftp/imp/FtpClient uses SimpleDateFormat in not thread-safe manner
  • d82a6dc: 8255438: [Vector API] More instructs in x86.ad should use legacy mode for code-gen
  • 1a5e6c9: 8253101: Clean up CallStaticJavaNode EA flags
  • a7595b2: 8250669: Running JMH micros is broken after JDK-8248135
  • edd1988: 8255530: Additional cleanup after JDK-8235710 (elliptic curve removal)
  • 790d6e2: 8255533: Incorrect javadoc in DateTimeFormatterBuilder.appendPattern() for 'uu'/'yy'
  • ... and 230 more: https://git.openjdk.java.net/jdk/compare/abe51377373673a039099b89ee60b5724159b67d...master

Your commit was automatically rebased without conflicts.

Pushed as commit ea26ff1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@openjdk openjdk bot removed the rfr Pull request is ready for review label Oct 29, 2020
Labels
integrated Pull request has been integrated nio nio-dev@openjdk.org
3 participants