[SPARK-29233][K8S] Add regex expression checks for executorEnv… #25920

Closed
wants to merge 1 commit into from

Conversation

merrily01
Contributor

@merrily01 merrily01 commented Sep 24, 2019

What changes were proposed in this pull request?

Kubernetes restricts the names of pod environment variables with a regular expression, and the rule differs by version:

  • In Kubernetes release-1.7 and earlier, pod environment variable names must match the regular expression: [A-Za-z_][A-Za-z0-9_]*
  • In Kubernetes release-1.8 and later, pod environment variable names must match the regular expression: [-._a-zA-Z][-._a-zA-Z0-9]*

However, Spark on K8s does not yet check environment variable names when it creates executorEnv, and it should.

In addition, the check should use the regular expression of the newer Kubernetes versions.

Otherwise, the pod is not created properly and the Spark application hangs.

To solve this problem, this PR adds a regex validation for executorEnv names.
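The validation described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the object name `ExecutorEnvCheck` and both method names are hypothetical, and the pattern assumes the release-1.8 format is `[-._a-zA-Z][-._a-zA-Z0-9]*`.

```scala
object ExecutorEnvCheck {
  // Assumed name format for Kubernetes release-1.8 and later.
  private val envNamePattern = "[-._a-zA-Z][-._a-zA-Z0-9]*".r.pattern

  /** Returns true when the whole name matches the assumed format. */
  def isValidEnvName(name: String): Boolean =
    envNamePattern.matcher(name).matches()

  /** Keeps only entries whose name is acceptable; values are not inspected. */
  def validExecutorEnv(env: Map[String, String]): Map[String, String] =
    env.filter { case (name, _) => isValidEnvName(name) }
}
```

Filtering out invalid names up front, rather than passing them through, is one way to avoid handing Kubernetes a pod spec it would reject.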

Why are the changes needed?

Without validation, an environment variable name that violates the rules prevents the pod from being created properly, and the application hangs.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added unit tests and ran them manually.

Member

@dongjoon-hyun dongjoon-hyun left a comment

Hi, @merrily01. Thank you for making this PR. Can we have an additional integration test for the following?

Otherwise, it will lead to the problem that the pod can not be created normally and the tasks will be suspended.

Member

@dongjoon-hyun dongjoon-hyun left a comment

BTW, did you run this PR locally, @merrily01? This doesn't compile yet.

@merrily01 merrily01 force-pushed the SPARK-29233 branch 2 times, most recently from 1371532 to ca729a4 on September 25, 2019 10:43
@merrily01
Contributor Author

@srowen @dongjoon-hyun
Hi, thank you for your quick review.
I'm a little confused.
For the Spark community, when the checking rules differ between older and newer versions of k8s, should I base the check on the newer versions or on the stricter rules?
My own inclination is to follow the newer versions of k8s; does that sound right?

If not, I might need to modify the rules and code.

Member

@srowen srowen left a comment

What versions would work, not work, with or without this change?

@merrily01
Contributor Author

What versions would work, not work, with or without this change?

@srowen

  • In Kubernetes release-1.7 and earlier, pod environment variable names must match the regular expression: [A-Za-z_][A-Za-z0-9_]*
  • In Kubernetes release-1.8 and later, pod environment variable names must match the regular expression: [-._a-zA-Z][-._a-zA-Z0-9]*

Kubernetes made this change in commit 604dfb3, which is part of release-1.8.

The current code follows the rule for k8s release-1.8 and later.

@srowen
Member

srowen commented Sep 25, 2019

How about just using a more lenient regex, then, one that would allow both? This is more of a preemptive check than something that must match exactly.
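To illustrate why a lenient regex "allows both", here is a small sketch comparing the two formats. It assumes the stricter format is `[A-Za-z_][A-Za-z0-9_]*` and the looser one is `[-._a-zA-Z][-._a-zA-Z0-9]*`; the object `EnvNameRules` and its members are hypothetical names used only for this example.

```scala
import scala.util.matching.Regex

object EnvNameRules {
  // Assumed stricter format (release-1.7 and earlier): C-identifier style.
  val strict: Regex = "[A-Za-z_][A-Za-z0-9_]*".r
  // Assumed looser format (release-1.8 and later): also allows '-' and '.'.
  val loose: Regex = "[-._a-zA-Z][-._a-zA-Z0-9]*".r

  private def fullyMatches(rule: Regex, name: String): Boolean =
    rule.pattern.matcher(name).matches()

  /** Classifies a name by which of the two assumed formats accept it. */
  def classify(name: String): String =
    if (fullyMatches(strict, name)) "valid under both formats"
    else if (fullyMatches(loose, name)) "valid under the looser format only"
    else "invalid under both formats"
}
```

Every name the stricter format accepts is also accepted by the looser one, so a check based on the looser format never rejects a name that some supported Kubernetes version would take; it only catches names that fail everywhere.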

@merrily01
Contributor Author

That's a good idea! I'll try it later.
Thank you very much for your careful review. (^__^) @srowen

@dongjoon-hyun
Member

ok to test

@SparkQA

SparkQA commented Sep 25, 2019

Test build #111370 has finished for PR 25920 at commit 562b367.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 25, 2019

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/16413/

@SparkQA

SparkQA commented Sep 25, 2019

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/16413/

@merrily01
Contributor Author

merrily01 commented Sep 26, 2019

Hey @srowen,

I was very sleepy last night and not thinking clearly. After careful consideration, here is my view:

  1. This validation is necessary; otherwise an executor environment variable name that does not conform to the specification leads to pod creation errors.

  2. As you know, this rule differs between older and newer versions of k8s (the older versions are stricter than the newer ones).
    That means a name that is valid on an older version stays valid on a newer one, but a name that is valid on a newer version may be rejected by an older one; k8s itself has the same problem when moving from a newer version to an older one.

  3. Compatibility can be achieved with the older, stricter regex, but that runs against the intent of the change made in the newer versions of k8s.

  4. I prefer that the validation here be consistent with the behavior of the newer versions of k8s, rather than treating this as a compatibility issue.

  5. What do you think if I state this in the comments and the log message? For example:

image

@SparkQA

SparkQA commented Sep 26, 2019

Test build #111421 has finished for PR 25920 at commit 81a30fc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@merrily01
Contributor Author

I wonder why it failed? @AmplabJenkins

@srowen
Member

srowen commented Sep 26, 2019

Unless there's an easy way to know what version of K8S will be in use, I think you have to keep the looser check here. It may let through invalid configs, and something will fail, but you get a failure either way. This change just tries to bring it forward.

AmplabJenkins is a bot. Click the link to see why the build failed. It may be unrelated though: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/16460/console

@merrily01
Contributor Author

Sorry for closing and reopening the PR by mistake. It now uses the looser check; could you please review it? I'm so sorry. @srowen @dongjoon-hyun

Member

@srowen srowen left a comment

Looks OK pending tests

@SparkQA

SparkQA commented Sep 28, 2019

Test build #4888 has finished for PR 25920 at commit c64fbd2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@merrily01
Contributor Author

merrily01 commented Sep 28, 2019

Thank you so much, @srowen and @dongjoon-hyun. Please recheck: I have added an additional test, and the patch now compiles and can be tested. Thanks a lot!

@srowen
Member

srowen commented Sep 29, 2019

I'm going to try retriggering the K8S tests by closing and reopening

@srowen srowen closed this Sep 29, 2019
@srowen srowen reopened this Sep 29, 2019
@dongjoon-hyun
Member

Retest this please.

private val EXECUTOR_ENV_VARS = Map(
  "spark.executorEnv.1executorEnvVars1/var1" -> "executorEnvVars1",
  "spark.executorEnv.executorEnvVars2*var2" -> "executorEnvVars2",
  "spark.executorEnv.executorEnvVars3_var3" -> "executorEnvVars3")
Member

Shall we move this into the test case? Do we have a plan to reuse this in another test case?
Never mind. This existing one looks good, too.

Member

BTW, please add another test case like "spark.executorEnv.4executorEnvVars4/var4" -> "executorEnvVars4" in order to show clearly that this constrains only the key.

Contributor Author

@dongjoon-hyun "spark.executorEnv.4executorEnvVars4/var4" -> "executorEnvVars4" is equivalent to "spark.executorEnv.1executorEnvVars1/var1" -> "executorEnvVars1". BTW, I added two test cases, which I think show that this constrains only the key.
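A rough sketch of what "constrains only the key" means is given below. It is self-contained and does not use Spark: the object `KeyOnlyConstraintSketch` and its methods are hypothetical, the pattern is assumed to be the looser `[-._a-zA-Z][-._a-zA-Z0-9]*` format, and the prefix handling only imitates how `spark.executorEnv.*` settings map to environment variable names.

```scala
object KeyOnlyConstraintSketch {
  private val prefix = "spark.executorEnv."
  // Assumed looser name format; only the name after the prefix is checked.
  private val envNamePattern = "[-._a-zA-Z][-._a-zA-Z0-9]*".r.pattern

  private def isValidName(name: String): Boolean =
    envNamePattern.matcher(name).matches()

  /** Strips the prefix and keeps entries with a valid name; values pass through unchecked. */
  def executorEnv(conf: Map[String, String]): Map[String, String] =
    conf.collect {
      case (key, value) if key.startsWith(prefix) && isValidName(key.stripPrefix(prefix)) =>
        key.stripPrefix(prefix) -> value
    }
}
```

In this sketch, a value such as "some/value*with.symbols" is kept as long as its key is a valid name, while a key such as "1executorEnvVars1/var1" is dropped regardless of its value.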

Member

@dongjoon-hyun dongjoon-hyun left a comment

Thanks, @merrily01. Looks fine except for a few minor comments.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-29233][KUBERNETES] Add regex expression checks for executorEnv… [SPARK-29233][K8S] Add regex expression checks for executorEnv… Sep 29, 2019
@SparkQA

SparkQA commented Sep 29, 2019

Test build #111577 has finished for PR 25920 at commit c64fbd2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 29, 2019

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/16583/

@SparkQA

SparkQA commented Sep 29, 2019

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/16583/

@srowen
Member

srowen commented Oct 3, 2019

@merrily01 do you want to make these last updates per above?

@merrily01
Contributor Author

merrily01 commented Oct 3, 2019

@merrily01 do you want to make these last updates per above?
Yes, I will update later today. I am on the way to England.

@srowen @dongjoon-hyun Done, thanks a lot!

… in K8S mode

Kubernetes restricts the names of pod environment variables with a regular expression, and the rule differs by version:

- In Kubernetes release-1.7 and earlier, pod environment variable names must match the regular expression: [[A-Za-z_][A-Za-z0-9_]*](https://github.com/kubernetes/kubernetes/blob/release-1.7/staging/src/k8s.io/apimachinery/pkg/util/validation/validation.go#L169)

- In Kubernetes release-1.8 and later, pod environment variable names must match the regular expression: [[-._a-zA-Z][-._a-zA-Z0-9]*](https://github.com/kubernetes/kubernetes/blob/release-1.8/staging/src/k8s.io/apimachinery/pkg/util/validation/validation.go#L305)

However, Spark on K8s does not yet check environment variable names when it creates executorEnv, and it should.

In addition, the check should use the regular expression of the newer Kubernetes versions.

Otherwise, the pod is not created properly and the Spark application hangs.

To solve this problem, this commit adds a regex validation for executorEnv names.

Unit tests have been added.
@SparkQA

SparkQA commented Oct 4, 2019

Test build #4891 has finished for PR 25920 at commit 799d87f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Oct 6, 2019

Merged to master

@merrily01
Contributor Author

I forgot to say thank you.
Thanks a lot, @srowen and @dongjoon-hyun.
