[SPARK-31235][YARN] Separates different categories of applications #28009

Closed
wang-zhun wants to merge 8 commits into apache:master from wang-zhun:SPARK-31235

@wang-zhun wang-zhun commented Mar 25, 2020

What changes were proposed in this pull request?

This PR adds the configuration spark.yarn.applicationType to identify the application type.

Why are the changes needed?

Currently, every application defaults to the SPARK type.
In practice, different kinds of applications have different characteristics and suit different scenarios, for example SPARK-SQL and SPARK-STREAMING.
I recommend distinguishing them with the parameter spark.yarn.applicationType so that different types of applications are easier to manage and maintain.

How was this patch tested?

1. Added unit tests.
2. Tested by verifying the ApplicationType shown in the YARN UI in the following cases:

  • client and cluster mode

Additional explanation:
The value should not exceed 20 characters, because YARN truncates anything longer; it can be empty or contain a space.
The reason is the following YARN code:

// org.apache.hadoop.yarn.server.resourcemanager.submitApplication.
if (submissionContext.getApplicationType() == null) {
  submissionContext
    .setApplicationType(YarnConfiguration.DEFAULT_APPLICATION_TYPE);
} else {
  // APPLICATION_TYPE_LENGTH = 20
  if (submissionContext.getApplicationType().length() > YarnConfiguration.APPLICATION_TYPE_LENGTH) {
    submissionContext.setApplicationType(submissionContext
      .getApplicationType().substring(0,
        YarnConfiguration.APPLICATION_TYPE_LENGTH));
  }
}
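
To make the effect of the YARN code above easier to see, here is a minimal standalone sketch of the same normalization. It is an illustration only, not the YARN or Spark implementation: the class `ApplicationTypeSketch` and the method `normalize` are hypothetical names, the limit of 20 is taken from the snippet above, and the default value "YARN" is an assumption about `YarnConfiguration.DEFAULT_APPLICATION_TYPE`.

```java
// Hypothetical sketch: mirrors the null check and truncation from the YARN
// snippet above without depending on Hadoop classes.
final class ApplicationTypeSketch {
    // Assumption: stands in for YarnConfiguration.DEFAULT_APPLICATION_TYPE.
    static final String DEFAULT_APPLICATION_TYPE = "YARN";
    // Taken from the snippet above: APPLICATION_TYPE_LENGTH = 20.
    static final int APPLICATION_TYPE_LENGTH = 20;

    // Returns the application type as it would look after submission:
    // null becomes the default, long values are cut to 20 characters,
    // and everything else (including "" and " ") is kept unchanged.
    static String normalize(String applicationType) {
        if (applicationType == null) {
            return DEFAULT_APPLICATION_TYPE;
        }
        if (applicationType.length() > APPLICATION_TYPE_LENGTH) {
            return applicationType.substring(0, APPLICATION_TYPE_LENGTH);
        }
        return applicationType;
    }

    public static void main(String[] args) {
        // The cases discussed in this PR: a normal value, an empty string,
        // a single space, and a value longer than 20 characters.
        String[] samples = {"SPARK-SQL", "", " ", "SPARK-STREAMING-LONG-NAME"};
        for (String sample : samples) {
            System.out.println("[" + sample + "] -> [" + normalize(sample) + "]");
        }
    }
}
```

The sketch only covers what happens on the YARN side. When spark.yarn.applicationType is not set at all, Spark still submits the application with the SPARK type, as described above.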

"--archives", "archive1.txt,archive2.txt",
"--num-executors", "6",
"--name", "trill",
"--conf", "spark.yarn.applicationType=SPARK-SQL",
Contributor

Should you also test the case when spark.yarn.applicationType is not set?

Contributor

I agree. What happens with an empty string? Are there any characters that are not allowed in it? Add tests and documentation to cover this.

Contributor

You didn't add tests for the other cases, such as an empty string?
Also, I believe the limit on that string is 20 characters; we should test and document that. Can you also test a space in the type?

@tgravescs tgravescs left a comment

You also need to update the docs/running-on-yarn.md document for the config.

@tgravescs
Contributor

ok to test

SparkQA commented Mar 25, 2020

Test build #120371 has finished for PR 28009 at commit ef705dc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

test this please

SparkQA commented Mar 26, 2020

Test build #120377 has finished for PR 28009 at commit ef705dc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wang-zhun
Contributor Author

@jiangxb1987 @tgravescs Thanks for your responses and suggestions

SparkQA commented Mar 26, 2020

Test build #120390 has finished for PR 28009 at commit 2064ce1.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wang-zhun
Contributor Author

test this please

SparkQA commented Mar 26, 2020

Test build #120413 has finished for PR 28009 at commit c3aa425.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wang-zhun wang-zhun requested a review from tgravescs March 27, 2020 04:56
@wang-zhun
Contributor Author

Hi @jiangxb1987 @tgravescs , could you help to review this?

@jiangxb1987
Contributor

Please add test case as Thomas suggested, thanks!

@wang-zhun
Contributor Author

> Please add test case as Thomas suggested, thanks!

OK, sorry, I didn't notice.

SparkQA commented Mar 29, 2020

Test build #120548 has finished for PR 28009 at commit 1744c48.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

My comment was to add both unit test cases and documentation. Please add the test cases: one with an empty string, one with a space in the string, and one exceeding the 20-character limit.

SparkQA commented Apr 7, 2020

Test build #120926 has finished for PR 28009 at commit 76e9a2a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wang-zhun
Contributor Author

@tgravescs, could you help take a look at this PR?

@tgravescs
Contributor

test this please

SparkQA commented Apr 28, 2020

Test build #122000 has finished for PR 28009 at commit 76e9a2a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

thanks for pinging me, I'll try to look by the end of the week

@tgravescs
Contributor

This is weird, I can't rerun the checks. @dongjoon-hyun, do you have permissions to tell this to rerun checks?

SparkQA commented May 5, 2020

Test build #122298 has finished for PR 28009 at commit 4599e18.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 5, 2020

Test build #122299 has finished for PR 28009 at commit b762753.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs tgravescs left a comment

changes look good, thanks @wang-zhun

@asfgit asfgit closed this in f3891e3 May 5, 2020
@tgravescs
Contributor

merged to master

appContext.getPriority.getPriority should be (1)
}

test("specify a more specific type for the application") {
Member

Is this test flaky? It failed twice in #26624.

@tgravescs
Contributor

Yeah, it looks like it's having issues with the YARN authorization provider for some reason.

@wang-zhun, can we change the test to just create the RMAppManager once and then reuse it?

@wang-zhun
Contributor Author

@tgravescs ok, #28456

@dongjoon-hyun
Member

Hi, all. I made a follow-up PR. So far, it looks promising.
