Skip to content

Commit

Permalink
[SPARK-48218][CORE] TransportClientFactory.createClient may NPE cause…
Browse files Browse the repository at this point in the history
… FetchFailedException

### What changes were proposed in this pull request?
This PR aims to add a check for `TransportChannelHandler` to be non-null in the `TransportClientFactory.createClient` method.

### Why are the changes needed?

Line 178 synchronized (handler) , handler == null

org.apache.spark.network.client.TransportClientFactory#createClient(java.lang.String, int, boolean)
```java
      TransportChannelHandler handler = cachedClient.getChannel().pipeline()
        .get(TransportChannelHandler.class);
      synchronized (handler) {
        handler.getResponseHandler().updateTimeOfLastRequest();
      }
```

```java
org.apache.spark.shuffle.FetchFailedException
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:1180)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:913)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:84)
	at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:29)

Caused by: java.lang.NullPointerException
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:178)
	at org.apache.spark.network.shuffle.ExternalBlockStoreClient.lambda$fetchBlocks$0(ExternalBlockStoreClient.java:128)
	at org.apache.spark.network.shuffle.RetryingBlockTransferor.transferAllOutstanding(RetryingBlockTransferor.java:154)
	at org.apache.spark.network.shuffle.RetryingBlockTransferor.start(RetryingBlockTransferor.java:133)
	at org.apache.spark.network.shuffle.ExternalBlockStoreClient.fetchBlocks(ExternalBlockStoreClient.java:139)
```

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #46506 from cxzl25/SPARK-48218.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
  • Loading branch information
cxzl25 authored and dongjoon-hyun committed May 15, 2024
1 parent 12820e1 commit 6767053
Showing 1 changed file with 4 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -171,8 +171,10 @@ public TransportClient createClient(String remoteHost, int remotePort, boolean f
// this code was able to update things.
TransportChannelHandler handler = cachedClient.getChannel().pipeline()
.get(TransportChannelHandler.class);
synchronized (handler) {
handler.getResponseHandler().updateTimeOfLastRequest();
if (handler != null) {
synchronized (handler) {
handler.getResponseHandler().updateTimeOfLastRequest();
}
}

if (cachedClient.isActive()) {
Expand Down

0 comments on commit 6767053

Please sign in to comment.