[SPARK-37060][CORE][3.1] Handle driver status response from backup masters

### What changes were proposed in this pull request?
After the improvement in SPARK-31486, the client uses the 'asyncSendToMasterAndForwardReply' method instead of 'activeMasterEndpoint.askSync' to get the status of the driver. The driver's status is only available from the active master, but 'asyncSendToMasterAndForwardReply' iterates over all of the masters, so the client has to handle the responses from the backup masters as well, which the SPARK-31486 change did not consider. As a result, drivers running in cluster mode on a cluster with multiple masters are affected by this bug.

### Why are the changes needed?

The client needs to detect whether a status response came from a backup master and, if so, ignore it.
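
The three-way decision the client has to make can be sketched in isolation as follows. This is a minimal model, not Spark's code: `StatusReply`, `Action`, `triage`, and the `BackupPrefix` string are hypothetical names and values chosen for illustration, and the local `responseFromBackup` only stands in for the `Utils.responseFromBackup` call used in the patch. It also leaves out the handling of an exception on a reply where the driver was found.

```scala
// Hypothetical stand-ins for illustration; these are not Spark's classes.
object DriverStatusTriage {
  // Assumed marker text that a backup master puts at the start of its error message.
  val BackupPrefix = "Current state is not alive"

  // Simplified status reply: whether the driver was found, plus an optional error.
  final case class StatusReply(found: Boolean, exception: Option[Exception])

  sealed trait Action
  case object Report extends Action // the master knows the driver: report its state
  case object Ignore extends Action // the reply came from a backup master: skip it
  case object Fail extends Action   // no master-side record of the driver: give up

  // Local stand-in for the backup check; null-safe because getMessage may return null.
  def responseFromBackup(msg: String): Boolean =
    msg != null && msg.startsWith(BackupPrefix)

  // Mirrors the if / else if / else shape of the patched branch.
  def triage(reply: StatusReply): Action =
    if (reply.found) Report
    else if (reply.exception.exists(e => responseFromBackup(e.getMessage))) Ignore
    else Fail
}
```

Without the middle branch, a reply from a backup master would fall through to `Fail`, which corresponds to the `System.exit(-1)` path that broke cluster-mode submission on multi-master clusters.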

### Does this PR introduce _any_ user-facing change?

No. It only fixes a bug and restores the ability to deploy in cluster mode on multi-master clusters.

### How was this patch tested?

Closes apache#34911 from mohamadrezarostami/fix-a-bug-in-report-driver-status.

Authored-by: Mohamadreza Rostami <mohamadrezarostami2@gmail.com>
Signed-off-by: yi.wu <yi.wu@databricks.com>
testsgmr authored and fishcus committed Jan 12, 2022
1 parent a9e99bd commit 4c01c47
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions core/src/main/scala/org/apache/spark/deploy/Client.scala
@@ -190,13 +190,15 @@ private class ClientEndpoint(
 logDebug(s"State of driver $submittedDriverID is ${state.get}, " +
 s"continue monitoring driver status.")
 }
-}
-}
-} else {
+}
+}
+} else if (exception.exists(e => Utils.responseFromBackup(e.getMessage))) {
+logDebug(s"The status response is reported from a backup spark instance. So, ignored.")
+} else {
 logError(s"ERROR: Cluster master did not recognize $submittedDriverID")
 System.exit(-1)
-}
+}
 }
 override def receive: PartialFunction[Any, Unit] = {
 
 case SubmitDriverResponse(master, success, driverId, message) =>