Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-22243][DStreams]spark.yarn.jars reload from config when Checkpoint recovery #19469

Closed
wants to merge 1 commit into from

Conversation

ChenjunZou
Copy link

add spark.yarn.jars to the checkpoint reload configs.

@zsxwing
Copy link
Member

zsxwing commented Oct 11, 2017

ok to test

@zsxwing
Copy link
Member

zsxwing commented Oct 11, 2017

LGTM. cc @jerryshao to take a look.

@jerryshao
Copy link
Contributor

There's a similar PR #19427 , I was wondering if we can provide a general solution for such issues, like using a configuration to specify all the confs which needs to be reloaded, spark.streaming.confsToReload = spark.yarn.jars,spark.xx.xx. So that we don't need to fix related issues again and again. What do you think?

@SparkQA
Copy link

SparkQA commented Oct 11, 2017

Test build #82621 has finished for PR 19469 at commit 7a03073.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Copy link
Contributor

@ChenjunZou did you get a chance to look at my left comment?

@ssaavedra
Copy link
Contributor

I think that may be a good idea. I'd say this can depend on the scheduler. Should that be discussed under a different JIRA number?

@jerryshao
Copy link
Contributor

@ssaavedra , yes I think so. with the pull-in of k8s support, I would guess more configurations need to be added to exclusion rule. With current solution, one by one PR doesn't make so sense. We should either figure out a general solution or refactor this part.

Besides, as we moved to structured streaming, do we need to pay more efforts on these issues? @zsxwing

@felixcheung
Copy link
Member

@jerryshao I think it's a good idea to have a design to work with different resource manager.

As of now though there is only one additional config needed for k8s, and from this PR one for yarn.
What's next?

@ssaavedra
Copy link
Contributor

The one I submitted here https://issues.apache.org/jira/browse/SPARK-22294 is not kubernetes-related as such, although it does affect deployments in Kubernetes. It should affect any spark-submit done with a custom bindAddress.

But yes, there might be a k8s-related config at some point, just like there is specific configuration for YARN already.

But the currently proposed PRs are just bugs caught at some point by other PRs. Maybe they should be merged and then a more general architecture can be proposed? This way we'll have all the properties known to need reloading already in the code before refactoring.

Is there anything we should know about structured streaming in regards to checkpoints?

@jerryshao
Copy link
Contributor

@felixcheung As you can see there's bunch of configurations needs to be added here in apache-spark-on-k8s#516, that's why I'm asking a general solutions for such related issue.

I'm OK to merge this PR. But I would suspect similar PRs will still be created in future, since those issues are quite scenario specific, users may have different scenarios and can touch different issues regarding to this. So I'm just wondering if we could have a better solution for this.

@felixcheung
Copy link
Member

felixcheung commented Oct 20, 2017 via email

@ChenjunZou ChenjunZou closed this Oct 27, 2017
@ChenjunZou ChenjunZou deleted the checkpoint-yarn-jars branch October 27, 2017 09:37
@ChenjunZou ChenjunZou restored the checkpoint-yarn-jars branch October 27, 2017 09:37
@ChenjunZou ChenjunZou deleted the checkpoint-yarn-jars branch October 27, 2017 09:41
@zsxwing
Copy link
Member

zsxwing commented Oct 27, 2017

Sorry for the delay. I think it's not worth to design a new feature that's only for DStream. Instead, I would encourage people to use Structured Streaming in the new Spark versions.

I'm okey to just merge this PR and future minor PRs (I don't think it will be a lot) for the similar issues. @ChenjunZou could you reopen this PR, please?

@felixcheung
Copy link
Member

sure, @zsxwing please beaware of apache-spark-on-k8s#516 and #19427

ghost pushed a commit to dbtsai/spark that referenced this pull request Nov 2, 2017
… checkpoint recovery

## What changes were proposed in this pull request?
the previous [PR](apache#19469) is deleted by mistake.
the solution is straight forward.
adding  "spark.yarn.jars" to propertiesToReload so this property will load from config.

## How was this patch tested?

manual tests

Author: ZouChenjun <zouchenjun@youzan.com>

Closes apache#19637 from ChenjunZou/checkpoint-yarn-jars.
@zsxwing
Copy link
Member

zsxwing commented Nov 2, 2017

sure, @zsxwing please beaware of apache-spark-on-k8s#516 and #19427

Yeah. I'm aware of them. I will review #19427.

@zsxwing
Copy link
Member

zsxwing commented Nov 2, 2017

FYI, looks like @ChenjunZou opened a new PR #19637 rather than reopening this. Since the content is the same, I just merged #19637.

asfgit pushed a commit that referenced this pull request Nov 10, 2017
… checkpoint recovery

## What changes were proposed in this pull request?
the previous [PR](#19469) is deleted by mistake.
the solution is straight forward.
adding  "spark.yarn.jars" to propertiesToReload so this property will load from config.

## How was this patch tested?

manual tests

Author: ZouChenjun <zouchenjun@youzan.com>

Closes #19637 from ChenjunZou/checkpoint-yarn-jars.
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
… checkpoint recovery

## What changes were proposed in this pull request?
the previous [PR](apache#19469) is deleted by mistake.
the solution is straight forward.
adding  "spark.yarn.jars" to propertiesToReload so this property will load from config.

## How was this patch tested?

manual tests

Author: ZouChenjun <zouchenjun@youzan.com>

Closes apache#19637 from ChenjunZou/checkpoint-yarn-jars.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants