Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-40902] [MESOS][TESTS] Fix issue with mesos tests failing due to quick submission of drivers #38378

Closed
wants to merge 2 commits into from

Conversation

mridulm
Copy link
Contributor

@mridulm mridulm commented Oct 24, 2022

What changes were proposed in this pull request?

Quick submission of drivers in tests to mesos scheduler results in dropping drivers

Queued drivers in MesosClusterScheduler are ordered based on MesosDriverDescription - and the ordering used checks for priority (if different), followed by comparison of submission time.
For two driver submissions with same priority, if made in quick succession (such that submission time is same due to millisecond granularity of Date), this results in dropping the second MesosDriverDescription from queuedDrivers (since driverOrdering returns 0 when comparing the descriptions).

This PR fixes the more immediate issue with tests.

Why are the changes needed?

Flakey tests, see here for an example.

Does this PR introduce any user-facing change?

No.
Fixing only tests for now - as mesos support is deprecated, not changing scheduler itself to address this.

How was this patch tested?

Fixes unit tests

@mridulm
Copy link
Contributor Author

mridulm commented Oct 24, 2022

+CC @dongjoon-hyun, @wangyum

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM (Pending CIs)
Thank you for investigating and fixing this long standing issue, @mridulm .

@mridulm
Copy link
Contributor Author

mridulm commented Oct 24, 2022

Thanks for quick review @dongjoon-hyun !

@dongjoon-hyun
Copy link
Member

I manually tested this test suite fix. Merged to master/3.3/3.2. Thank you, @mridulm .

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-40902] [MESOS] Fix issue with mesos tests failing due to quick submission of drivers [SPARK-40902] [MESOS][TESTS] Fix issue with mesos tests failing due to quick submission of drivers Oct 24, 2022
dongjoon-hyun pushed a commit that referenced this pull request Oct 24, 2022
… quick submission of drivers

### What changes were proposed in this pull request?

##### Quick submission of drivers in tests to mesos scheduler results in dropping drivers

Queued drivers in `MesosClusterScheduler` are ordered based on `MesosDriverDescription` - and the ordering used checks for priority (if different), followed by comparison of submission time.
For two driver submissions with same priority, if made in quick succession (such that submission time is same due to millisecond granularity of Date), this results in dropping the second `MesosDriverDescription` from `queuedDrivers` (since `driverOrdering` returns `0` when comparing the descriptions).

This PR fixes the more immediate issue with tests.

### Why are the changes needed?

Flakey tests, [see here](https://lists.apache.org/thread/jof098qxp0s6qqmt9qwv52f9665b1pjg) for an example.

### Does this PR introduce _any_ user-facing change?

No.
Fixing only tests for now - as mesos support is deprecated, not changing scheduler itself to address this.

### How was this patch tested?

Fixes unit tests

Closes #38378 from mridulm/fix_MesosClusterSchedulerSuite.

Authored-by: Mridul <mridulatgmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 60b1056)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request Oct 24, 2022
… quick submission of drivers

### What changes were proposed in this pull request?

##### Quick submission of drivers in tests to mesos scheduler results in dropping drivers

Queued drivers in `MesosClusterScheduler` are ordered based on `MesosDriverDescription` - and the ordering used checks for priority (if different), followed by comparison of submission time.
For two driver submissions with same priority, if made in quick succession (such that submission time is same due to millisecond granularity of Date), this results in dropping the second `MesosDriverDescription` from `queuedDrivers` (since `driverOrdering` returns `0` when comparing the descriptions).

This PR fixes the more immediate issue with tests.

### Why are the changes needed?

Flakey tests, [see here](https://lists.apache.org/thread/jof098qxp0s6qqmt9qwv52f9665b1pjg) for an example.

### Does this PR introduce _any_ user-facing change?

No.
Fixing only tests for now - as mesos support is deprecated, not changing scheduler itself to address this.

### How was this patch tested?

Fixes unit tests

Closes #38378 from mridulm/fix_MesosClusterSchedulerSuite.

Authored-by: Mridul <mridulatgmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 60b1056)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@mridulm
Copy link
Contributor Author

mridulm commented Oct 24, 2022

Thanks for merging it @dongjoon-hyun !

@mridulm mridulm deleted the fix_MesosClusterSchedulerSuite branch October 24, 2022 18:08
@github-actions github-actions bot added the MESOS label Oct 24, 2022
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
… quick submission of drivers

### What changes were proposed in this pull request?

##### Quick submission of drivers in tests to mesos scheduler results in dropping drivers

Queued drivers in `MesosClusterScheduler` are ordered based on `MesosDriverDescription` - and the ordering used checks for priority (if different), followed by comparison of submission time.
For two driver submissions with same priority, if made in quick succession (such that submission time is same due to millisecond granularity of Date), this results in dropping the second `MesosDriverDescription` from `queuedDrivers` (since `driverOrdering` returns `0` when comparing the descriptions).

This PR fixes the more immediate issue with tests.

### Why are the changes needed?

Flakey tests, [see here](https://lists.apache.org/thread/jof098qxp0s6qqmt9qwv52f9665b1pjg) for an example.

### Does this PR introduce _any_ user-facing change?

No.
Fixing only tests for now - as mesos support is deprecated, not changing scheduler itself to address this.

### How was this patch tested?

Fixes unit tests

Closes apache#38378 from mridulm/fix_MesosClusterSchedulerSuite.

Authored-by: Mridul <mridulatgmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
… quick submission of drivers

### What changes were proposed in this pull request?

##### Quick submission of drivers in tests to mesos scheduler results in dropping drivers

Queued drivers in `MesosClusterScheduler` are ordered based on `MesosDriverDescription` - and the ordering used checks for priority (if different), followed by comparison of submission time.
For two driver submissions with same priority, if made in quick succession (such that submission time is same due to millisecond granularity of Date), this results in dropping the second `MesosDriverDescription` from `queuedDrivers` (since `driverOrdering` returns `0` when comparing the descriptions).

This PR fixes the more immediate issue with tests.

### Why are the changes needed?

Flakey tests, [see here](https://lists.apache.org/thread/jof098qxp0s6qqmt9qwv52f9665b1pjg) for an example.

### Does this PR introduce _any_ user-facing change?

No.
Fixing only tests for now - as mesos support is deprecated, not changing scheduler itself to address this.

### How was this patch tested?

Fixes unit tests

Closes apache#38378 from mridulm/fix_MesosClusterSchedulerSuite.

Authored-by: Mridul <mridulatgmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 60b1056)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit e5744b9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants