[SPARK-27413][SS] keep the same epoch pace between driver and executor. by uncleGen · Pull Request #24323 · apache/spark

uncleGen · 2019-04-09T10:42:21Z

What changes were proposed in this pull request?

The driver generates epochs at one pace while executors pull them at another. When the epoch pulling interval is larger than the epoch generation interval, partitions end up with many empty epochs.
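To make the effect concrete, here is a minimal toy model of one partition. It is only a sketch under stated assumptions, not Spark code: the function name `simulate`, its parameters, and the fixed record rate are all hypothetical. It assumes the driver's epoch is `t // gen_interval_ms`, and that the executor closes every epoch it skipped when a poll shows the driver has moved ahead.

```python
def simulate(gen_interval_ms, poll_interval_ms, duration_ms, record_interval_ms=10):
    """Toy model: return {epoch: record_count} for the epochs one partition closes."""
    closed = {}
    local_epoch = 0   # epoch the executor currently believes is open
    pending = 0       # records read since the last epoch boundary it saw
    for t in range(1, duration_ms + 1):
        if t % record_interval_ms == 0:
            pending += 1
        if t % poll_interval_ms == 0:
            # Assumption: the driver bumps the epoch every gen_interval_ms.
            driver_epoch = t // gen_interval_ms
            if driver_epoch > local_epoch:
                # All pending records land in the epoch that was open locally.
                closed[local_epoch] = pending
                # Every epoch skipped between two polls closes with no records.
                for skipped in range(local_epoch + 1, driver_epoch):
                    closed[skipped] = 0
                local_epoch = driver_epoch
                pending = 0
    return closed


def empty_epochs(closed):
    """Epoch ids that closed without any records."""
    return sorted(e for e, n in closed.items() if n == 0)
```

In this model, polling every 300 ms against a 100 ms generation interval leaves two of every three closed epochs empty, while polling every 100 ms leaves none.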

How was this patch tested?

Updated existing unit tests.

uncleGen · 2019-04-09T10:46:20Z

cc @jose-torres

SparkQA · 2019-04-09T14:00:32Z

Test build #104427 has finished for PR 24323 at commit d624637.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

gaborgsomogyi · 2019-04-09T14:43:07Z

I agree it would reduce unused computation. On the other hand, I think a better approach would be to send push notifications to executors instead of using a pull mechanism. It would end up wasting fewer resources.

shaneknapp · 2019-04-09T17:12:20Z

test this please

SparkQA · 2019-04-09T17:20:28Z

Test build #104441 has started for PR 24323 at commit d624637.

uncleGen · 2019-04-10T01:56:29Z

@gaborgsomogyi I agree with you. Let's think about refactoring the code later, OK?

uncleGen · 2019-04-10T01:58:29Z

retest this please.

SparkQA · 2019-04-10T06:16:50Z

Test build #104463 has finished for PR 24323 at commit d624637.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gaborgsomogyi · 2019-04-10T07:13:48Z

Spark can be configured in such a way as to reach this without code modification. Refactoring means changing the code without modifying its behavior. What I mean is that a different approach can be used here and now.

uncleGen · 2019-04-10T08:11:07Z

@gaborgsomogyi Hmm, I misunderstood what you meant. From my point of view, using a push mechanism doesn't help much and makes the code more complex. The main concern of this PR is to avoid producing unexpected empty epochs. Both pulling and pushing can achieve that. ~~There is no strong reason to change the actual implementation. So IMHO, keep the actual implementation.~~

jose-torres · 2019-04-10T13:51:01Z

I'm not sure I understand the motivation here. It's true that setting the poll interval larger than the generation interval will generate a bunch of empty epochs, but why does that imply that we shouldn't allow it to be configured at all? (And even if we shouldn't, why is "equal to the epoch duration" the right value?)

uncleGen · 2019-04-11T03:02:45Z

@jose-torres Thanks for your reply.

> I'm not sure I understand the motivation here. It's true that setting the poll interval larger than the generation interval will generate a bunch of empty epochs,

You have got the motivation. As mentioned above, the main concern of this PR is to avoid producing empty epochs.

> but why does that imply that we shouldn't allow it to be configured at all?

We can indeed allow it to be configured. But first, IMO, it is not a good idea to expose this config to users and make them set it carefully. Second, the epoch pulling interval is an internal config, and we could address this issue with a better approach rather than by adding a config.

> And even if we shouldn't, why is "equal to the epoch duration" the right value?

Hmm... After thinking it over more carefully, "equal to the epoch duration" is not a very good choice. In some corner cases, an executor will pull the epoch from the driver later than the "epoch duration". So, as @gaborgsomogyi mentioned, "push notifications to executors" may be a better approach.

To reiterate, the main concern of this PR is to avoid producing empty epochs and late epochs. Do you have any doubts about this motivation?

Any suggestion is appreciated.
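The "late epoch" corner case can be illustrated with a little arithmetic. This is a hedged sketch, not Spark's implementation: `observation_delay_ms` and its parameters are hypothetical names, and the model assumes the driver changes epoch exactly at multiples of the epoch interval while the executor polls at a fixed phase offset.

```python
def observation_delay_ms(epoch_interval_ms, poll_interval_ms, poll_offset_ms, epoch):
    """Toy model: how long after the driver starts `epoch` the executor first sees it."""
    change_at = epoch * epoch_interval_ms
    # Polls happen at poll_offset_ms, poll_offset_ms + poll_interval_ms, ...
    # Integer ceiling division finds the first poll at or after the epoch change.
    polls_needed = max(0, -(-(change_at - poll_offset_ms) // poll_interval_ms))
    first_poll_at = poll_offset_ms + polls_needed * poll_interval_ms
    return first_poll_at - change_at
```

Even with the poll interval equal to the epoch interval, a poll that is out of phase by 99 ms sees each epoch 99 ms late in this model, which is why matching the two intervals alone does not rule out late epochs.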

uncleGen · 2019-04-23T04:41:36Z

Send epoch updates to executors instead of using the pull mechanism.
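For readers following the thread, the push idea can be sketched as a plain observer pattern. This is a toy illustration only: `EpochDriver`, `PartitionReader`, and their methods are hypothetical names and are not the classes or RPC calls changed in this PR.

```python
class EpochDriver:
    """Toy driver that pushes each new epoch to registered listeners."""

    def __init__(self):
        self.epoch = 0
        self._listeners = []

    def register(self, listener):
        # `listener` is any callable that accepts the new epoch id.
        self._listeners.append(listener)

    def advance_epoch(self):
        self.epoch += 1
        # Push: every listener is told about every epoch exactly once.
        for listener in self._listeners:
            listener(self.epoch)
        return self.epoch


class PartitionReader:
    """Toy executor-side reader that records the epochs it is notified of."""

    def __init__(self):
        self.seen = []

    def on_epoch(self, epoch):
        self.seen.append(epoch)
```

Because the driver initiates every notification, a reader in this sketch never skips an epoch and has no polling interval to tune. The sketch ignores delivery latency and failures, which a real RPC-based version would have to handle.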

SparkQA · 2019-04-23T07:05:02Z

Test build #104822 has finished for PR 24323 at commit ef4238a.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-04-23T07:05:02Z

Test build #104821 has finished for PR 24323 at commit cf37736.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

uncleGen · 2019-04-24T01:45:37Z

retest this please

SparkQA · 2019-04-24T04:02:31Z

Test build #104848 has finished for PR 24323 at commit ef4238a.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-04-24T07:05:01Z

Test build #104859 has finished for PR 24323 at commit 02b2b2b.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

uncleGen · 2019-04-24T07:54:37Z

retest this please

SparkQA · 2019-04-24T11:54:46Z

Test build #104866 has finished for PR 24323 at commit 02b2b2b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

uncleGen · 2019-05-06T13:39:01Z

cc @tdas

SparkQA · 2019-05-06T13:45:06Z

Test build #105154 has finished for PR 24323 at commit 4676c64.

This patch fails to build.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-05-07T06:13:07Z

Test build #105195 has finished for PR 24323 at commit 4676c64.

This patch fails to build.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-05-07T07:05:01Z

Test build #105196 has finished for PR 24323 at commit ad14cd2.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-05-07T08:24:18Z

Test build #105200 has finished for PR 24323 at commit ad14cd2.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

uncleGen · 2019-05-08T01:51:28Z

retest this please

SparkQA · 2019-05-08T05:53:51Z

Test build #105242 has finished for PR 24323 at commit ad14cd2.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

github-actions · 2019-12-31T00:06:43Z

We're closing this PR because it hasn't been updated in a while.
This isn't a judgement on the merit of the PR in any way. It's just
a way of keeping the PR queue manageable.

If you'd like to revive this PR, please reopen it!

keep the same epoch pace between driver and executor.

d624637

uncleGen changed the title ~~keep the same epoch pace between driver and executor.~~ [SPARK-27413][SS] keep the same epoch pace between driver and executor. Apr 9, 2019

uncleGen added 2 commits April 23, 2019 12:26

epoch pull mechanism

48d319e

add comments

cf37736

update

02b2b2b

uncleGen force-pushed the SPARK-27413 branch from ef4238a to 02b2b2b Compare April 24, 2019 06:34

Merge branch 'master' into SPARK-27413

4676c64

bugfix

ad14cd2

dongjoon-hyun added the STRUCTURED STREAMING label Jun 14, 2019

github-actions bot added the Stale label Dec 31, 2019

github-actions bot closed this Jan 1, 2020


6 participants
