[SPARK-46344][CORE] Warn properly when a driver exits successfully but master is disconnected #44278

Closed

dongjoon-hyun (Member) commented Dec 9, 2023

What changes were proposed in this pull request?

This PR aims to warn properly when a driver exits successfully while the master is disconnected.

Why are the changes needed?

In this case, the Master eventually considers such drivers to be in the Error state, so the Worker should not log the exit as an ordinary success.

Screenshot 2023-12-09 at 3 05 27 PM

Worker Log

23/12/09 15:13:21 INFO Worker: Driver driver-20231209151301-0003 exited successfully
=== Master is disconnected here ===
23/12/09 15:13:53 WARN Worker: Driver driver-20231209151332-0004 exited successfully while master is disconnected.
=== A new master starts and is connected here ===
23/12/09 15:17:10 INFO Worker: Driver driver-20231209151707-0000 exited successfully
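
The log lines above suggest a simple rule: the same successful exit is logged at INFO when the master is connected and at WARN when it is not. The following is a minimal Python sketch of that rule only. It is not the Worker's actual code; the function name `successful_exit_log` and the `master_connected` flag are hypothetical names chosen for illustration, and only the message texts are taken from the log above.

```python
import logging
from typing import Tuple


def successful_exit_log(driver_id: str, master_connected: bool) -> Tuple[int, str]:
    """Pick the log level and message for a driver that exited successfully.

    Hypothetical helper: it models the behavior shown in the Worker log,
    not the real implementation.
    """
    if master_connected:
        # Normal case: the master can be told about the successful exit.
        return logging.INFO, f"Driver {driver_id} exited successfully"
    # The master is unreachable, so it may later treat this driver as an error.
    return (
        logging.WARNING,
        f"Driver {driver_id} exited successfully while master is disconnected.",
    )


if __name__ == "__main__":
    logging.basicConfig(format="%(levelname)s Worker: %(message)s", level=logging.INFO)
    log = logging.getLogger("Worker")
    # Replay the three driver exits from the log above.
    for driver_id, connected in [
        ("driver-20231209151301-0003", True),
        ("driver-20231209151332-0004", False),
        ("driver-20231209151707-0000", True),
    ]:
        level, message = successful_exit_log(driver_id, connected)
        log.log(level, message)
```

Returning the level and message rather than logging directly keeps the decision easy to check in isolation.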

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the CORE label Dec 9, 2023
dongjoon-hyun (Member, Author) commented

Could you review this PR, @viirya ?

viirya (Member) commented Dec 9, 2023

In this case, Master considers them Error eventually.

Hmm, I think the new warning log looks good. But does this change the Master's behavior in this case? Even with this warning, the Master still considers such drivers to be in the Error state.

dongjoon-hyun (Member, Author) commented Dec 9, 2023

Yes, you're right. This is a Worker-only logging change. The Master still considers these drivers to be in the Error state and has a separate way to handle them based on the spark.driver.supervise configuration.

dongjoon-hyun (Member, Author) commented

Thank you, @viirya ! Since this is a log-only change, let me merge this~ Merged to master for Apache Spark 4.0.0.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-46344 branch December 9, 2023 23:34
dbatomic pushed a commit to dbatomic/spark that referenced this pull request Dec 11, 2023
…ut master is disconnected

### What changes were proposed in this pull request?

This PR aims to warn properly when a driver exits successfully while the master is disconnected.

### Why are the changes needed?

In this case, `Master` considers them `Error` eventually.

![Screenshot 2023-12-09 at 3 05 27 PM](https://github.com/apache/spark/assets/9700541/1323819b-4a0c-466d-afaa-845f507a905e)

**Worker Log**
```
23/12/09 15:13:21 INFO Worker: Driver driver-20231209151301-0003 exited successfully
=== Master is disconnected here ===
23/12/09 15:13:53 WARN Worker: Driver driver-20231209151332-0004 exited successfully while master is disconnected.
=== A new master starts and is connected here ===
23/12/09 15:17:10 INFO Worker: Driver driver-20231209151707-0000 exited successfully
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#44278 from dongjoon-hyun/SPARK-46344.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024
…ut master is disconnected
