New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-27989] [Core] Added retries on the connection to the driver for k8s #24702
Conversation
|
||
val driver = retry(3) { | ||
fetcher.setupEndpointRefByURI(arguments.driverUrl) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this changes not just for k8s, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @felixcheung
It's the executor used for kubernetes but It canbe called for different resource managers.
I Labeled it as K8s because it also modifies the docker images used in k8s. Not sure what would be the best component. For instance YarnCoarseGrainedExecutorBackend extends this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's what I mean. in case of shared class, the PR should tag all of them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @felixcheung
Then this patch belongs to core and Kubernetes due to the java property in the docker image. Looking at https://spark-prs.appspot.com/ for the categories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @felixcheung any feedback?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually If we don't add the conditions int the for loop, it will try to get many connections, even if they are succeeding, so I fixed the other way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops @jlpedrosa I meant to retain the condition in the loop, but otherwise I find the code right above simpler. However I like retaining the original exception. Just please pay attention to the style. driver =
indent is off and needs to say if (i == nTries - 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen
Fixed the indentation. Please double check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks the same; did you push?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen I think it's in the right place, I'm rewriting history to clean the multiple commits. I think it is in the right place...
16e3c99
to
61d2d1e
Compare
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
Outdated
Show resolved
Hide resolved
@@ -51,6 +51,8 @@ ENV SPARK_HOME /opt/spark | |||
|
|||
WORKDIR /opt/spark/work-dir | |||
RUN chmod g+w /opt/spark/work-dir | |||
#Disable negative dns reslolution https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html | |||
RUN sed -i -e 's/networkaddress.cache.negative.ttl=10/networkaddress.cache.negative.ttl=0/g' /usr/lib/jvm/java-1.8-openjdk/jre/lib/security/java.security |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more flexible if that could be set along with networkaddress.cache.ttl via an env variable from within the entrypoint script. This way it would be configurable.
Cannot this be set via the extraJavaOptions config option of the driver or the executor using -Djava.security.networkaddress.cache.negative.ttl=value
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I read in the docs, no those values can't be overriden via properties. It can be just replaced the whole file.
https://stackoverflow.com/questions/4521119/java-argument-to-specify-java-security-file-for-jvm
You can use this code to test.
import java.security.Security;
public class HelloWorld {
public static void main(String[] args) {
String prop = Security.getProperty("networkaddress.cache.negative.ttl");
System.out.println("java.security.networkaddress.cache.negative.ttl: " + prop);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jlpedrosa @srowen I still find it better to be able to load your own file eg. via java.security.properties
. I dont really agree with this being hardcoded, at I least I would add it to the entrypoint script and do the sed
there so it is configurable.
Btw user may want to add a file to the image where he overrides the sec properties as follows:
java -Djava.security.properties=./properties.sec HelloWorld
java.security.networkaddress.cache.negative.ttl: 12
$ cat properties.sec
networkaddress.cache.negative.ttl=12
I find the latter more clean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we disable negative resolution caching anywhere. It has a potential negative impact, though, don't know if on the whole it's right. I'd leave it out of this otherwise minor change.
But, isn't the property just networkaddress.cache.negative.ttl
?
https://docs.oracle.com/javase/8/docs/api/java/net/InetAddress.html
At least that should be settable as a simple system property. Not sure if it's indeed different when trying to go this route.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @srowen @skonto
The problem with Java and negative resolution is the moment you have ANY timeout in DNS resolution, you'll never be able to resolve that name. Which in some Kubernetes scenarios happens (pods network taking time to get fully operational). Basically no matter how many retries it will never work if you have a timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jlpedrosa I voted for making it configurable and extendable ;) Should we also make a note of this eg. docs? Let's see what @srowen has to say and has the final call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wouldn't make this change here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm neutral on that. It seems like a non-trivial change to the whole JVM to work around an env-specific DNS issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the option as you considered it was acceptable. Please have a look and let me know what you think. This at least would enable to set it up via spark operator, but I can open a different PR if you want that I modify the code to manage that env variable in the scala side as a managed option for 2.4. In 3.0 it can be handled via pod templates, have not tested it, but I think all scala is doing append variables, so defining that variable in a template could enable it
core/src/test/scala/org/apache/spark/executor/CoarseGrainedExecutorBackendSuite.scala
Outdated
Show resolved
Hide resolved
common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
Outdated
Show resolved
Hide resolved
61d2d1e
to
1131db7
Compare
Please note that I've squashed the commits and rebased it. |
@jlpedrosa pls relate your PR with a jira ticket (https://spark.apache.org/contributing.html). |
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
Outdated
Show resolved
Hide resolved
common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java
Outdated
Show resolved
Hide resolved
@@ -51,6 +51,8 @@ ENV SPARK_HOME /opt/spark | |||
|
|||
WORKDIR /opt/spark/work-dir | |||
RUN chmod g+w /opt/spark/work-dir | |||
#Disable negative dns reslolution https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html | |||
RUN sed -i -e 's/networkaddress.cache.negative.ttl=10/networkaddress.cache.negative.ttl=0/g' /usr/lib/jvm/java-1.8-openjdk/jre/lib/security/java.security |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we disable negative resolution caching anywhere. It has a potential negative impact, though, don't know if on the whole it's right. I'd leave it out of this otherwise minor change.
But, isn't the property just networkaddress.cache.negative.ttl
?
https://docs.oracle.com/javase/8/docs/api/java/net/InetAddress.html
At least that should be settable as a simple system property. Not sure if it's indeed different when trying to go this route.
|
||
val driver = retry(3) { | ||
fetcher.setupEndpointRefByURI(arguments.driverUrl) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops @jlpedrosa I meant to retain the condition in the loop, but otherwise I find the code right above simpler. However I like retaining the original exception. Just please pay attention to the style. driver =
indent is off and needs to say if (i == nTries - 1)
@@ -51,6 +51,8 @@ ENV SPARK_HOME /opt/spark | |||
|
|||
WORKDIR /opt/spark/work-dir | |||
RUN chmod g+w /opt/spark/work-dir | |||
#Disable negative dns reslolution https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html | |||
RUN sed -i -e 's/networkaddress.cache.negative.ttl=10/networkaddress.cache.negative.ttl=0/g' /usr/lib/jvm/java-1.8-openjdk/jre/lib/security/java.security |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wouldn't make this change here.
b81ece8
to
44e6db6
Compare
try { | ||
driver = fetcher.setupEndpointRefByURI(arguments.driverUrl) | ||
} catch { | ||
case e: Throwable => if (i == nTries -1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: -1 => - 1 (I think this might fail the style checker)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @srowen Ah, Ok. Didn't saw that space, if there's a way to run the style checker would be excelent! I followed the official docs but that tried to refactor hundreds of files. I've pushed the change you requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Run ./dev/lint-scala
to just run the checks
if (hostResolveTimeMs > 2000) { | ||
logger.warn("DNS resolution for {} took {} ms", resolvedAddress, hostResolveTimeMs); | ||
logger.warn("DNS resolution {} for {} took {} ms", | ||
resolvMsg, resolvedAddress, hostResolveTimeMs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indent is too deep; can it just do on the previous line? if not continuation indent is at most 4 spaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @srowen
I thought the line length was limited too. Pushed the fix.
@@ -78,6 +78,16 @@ case "$1" in | |||
-Xms$SPARK_EXECUTOR_MEMORY | |||
-Xmx$SPARK_EXECUTOR_MEMORY | |||
-cp "$SPARK_CLASSPATH" | |||
) | |||
|
|||
#Disable negative dns reslolution https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html if that property is defined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I just wouldn't do this here. It's pretty unrelated. I am neutral on adding another flag for this. You're really working around a pretty specific failure mode here on the JVM side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen @skonto
I am afraid I am not following you. @skonto pointed out that he preferred the fix instead on Docker file, in the entrypoint.sh script with a configurable variable (which is also specific for k8s) so this is what this patch has. As you said "I wouldn't do this here" I though you also meant the Dockerfile and you agreed.
So I am not sure what do you mean by "wouldn't do this here", where is here and where is the place where you want it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that a) I'm not sure we should make this change but if we do, it's b) not directly related to the JIRA/PR purpose, which is just to introduce retries. I think you're saying there is not much point in retrying unless this change happens too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @srowen
Now I understand what you mean, "here" means this PR or jira ticket (please correct me if I miss understood you). Then no worries, I'll keep the retries only in this one and then start a new PR/Jira process to discuss the DNS, and see what and where we can do.
In my opinion retries make sense, avoid rescheduling something for a transient error that is recoverable is helpful, this is not multi layered system so it can't storm out.
I've re-written history into a single commit to make it cleaner.
JL
d01be1e
to
edbc135
Compare
Test build #4803 has finished for PR 24702 at commit
|
Merged to master. (The 'does not merge' message is spurious.) |
if (hostResolveTimeMs > 2000) { | ||
logger.warn("DNS resolution for {} took {} ms", resolvedAddress, hostResolveTimeMs); | ||
logger.warn("DNS resolution {} for {} took {} ms", resolvMsg, resolvedAddress, hostResolveTimeMs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arg, the line is too long and fails Java style checks. I'll fix. I don't see why the last test run didn't catch it ... the 'does not merge' message isn't related.
… k8s Disabled negative dns caching for docker images Improved logging on DNS resolution, convenient for slow k8s clusters ## What changes were proposed in this pull request? Added retries when building the connection to the driver in K8s. In some scenarios DNS reslution can take more than the timeout. Also openjdk-8 by default has negative dns caching enabled, which means even retries may not help depending on the times. ## How was this patch tested? This patch was tested agains an specific k8s cluster with slow response time in DNS to ensure it woks. Closes apache#24702 from jlpedrosa/feature/kuberetries. Authored-by: Jose Luis Pedrosa <jlpedrosa@gmail.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>
Disabled negative dns caching for docker images
Improved logging on DNS resolution, convenient for slow k8s clusters
What changes were proposed in this pull request?
Added retries when building the connection to the driver in K8s.
In some scenarios DNS reslution can take more than the timeout.
Also openjdk-8 by default has negative dns caching enabled, which means even retries may not help depending on the times.
How was this patch tested?
This patch was tested agains an specific k8s cluster with slow response time in DNS to ensure it woks.