
HADOOP-19835 Make MapReduce Application Master class configurable in YARNRunner #8331

Open
lewismc wants to merge 1 commit into apache:trunk from lewismc:HADOOP-19835

@lewismc
Member

@lewismc lewismc commented Mar 9, 2026

Description of PR

See HADOOP-19835

How was this patch tested?

Tested with Hadoop 3.4.3 and Celeborn 0.6.2 and Nutch 1.23-SNAPSHOT. Nutch MapReduce smoke tests were run.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

AI Tooling

If an AI tool was used:

}

vargs.add(MRJobConfig.APPLICATION_MASTER_CLASS);
String amClass = jobConf.get("yarn.app.mapreduce.am",
Member

Should it be "yarn.app.mapreduce.am.class"?

Adding a new config requires changing the constant, docs, default config file, etc.

Member Author

Thanks for looking at this patch, @pan3793. I have no issue changing it to yarn.app.mapreduce.am.class. I'd really like to hear from @RexXiong how Celeborn has been integrated without this change, as I could only get it to work with this patched YARNRunner.

It seems the Celeborn documentation presents another method: -Dyarn.app.mapreduce.am.command-opts=org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn, see https://github.com/apache/celeborn?tab=readme-ov-file#deploy-mapreduce-client

Member Author

@lewismc lewismc Mar 11, 2026

Hi @RexXiong thanks.
I wasn't able to get that working because of how Hadoop builds the AM container command and what yarn.app.mapreduce.am.command-opts is used for.

How the AM is launched

In YARNRunner, the command for the AM container is built in two separate steps:

  1. Main class – One place in code does:
     • vargs.add(MRJobConfig.APPLICATION_MASTER_CLASS); So the main class is always org.apache.hadoop.mapreduce.v2.app.MRAppMaster. That value is hardcoded; no config key is read for it.
  2. Command opts – yarn.app.mapreduce.am.command-opts is used elsewhere for JVM options or extra arguments. Those are merged into the same command, but they are not used as the main class. So they end up either:
     • as JVM args (e.g. -Xmx...), or
     • as arguments passed to the main class (i.e. to MRAppMaster).

So the actual process looks like:

java [options from command-opts] org.apache.hadoop.mapreduce.v2.app.MRAppMaster [any extra args] 1>... 2>...

If you set:

-Dyarn.app.mapreduce.am.command-opts=org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn

then that string is treated as part of “options” or “extra args”. It does not replace the main class, so:

  • JVM still runs MRAppMaster as main.
  • MRAppMasterWithCeleborn is at best an argument to MRAppMaster, not the entry point.

The JVM never executes MRAppMasterWithCeleborn as the main class; it always runs MRAppMaster. I wasn't able to get the method from the Celeborn docs running without this patch: the main class is fixed in code, and command-opts never controls it.

Unless I am mistaken, to actually run Celeborn’s AM, the main class in that launch command must be MRAppMasterWithCeleborn. The only way to do that with the current design is to change the code that builds the command so it takes the main class from config (e.g. yarn.app.mapreduce.am or as proposed by @pan3793 yarn.app.mapreduce.am.class) instead of always using APPLICATION_MASTER_CLASS. That’s what this patch for YARNRunner does; command-opts alone can’t do it.
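A minimal sketch of the lookup described above, assuming the key ends up being yarn.app.mapreduce.am.class. It uses a plain Map in place of JobConf so it runs standalone. AmCommandSketch, resolveAmClass and buildAmCommand are illustrative names, not Hadoop APIs, and the command YARNRunner actually builds contains more parts than shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AmCommandSketch {
  // Hypothetical key name taken from the review suggestion; not an existing Hadoop constant.
  static final String AM_CLASS_KEY = "yarn.app.mapreduce.am.class";
  static final String AM_COMMAND_OPTS_KEY = "yarn.app.mapreduce.am.command-opts";
  // The stock AM main class named in this thread.
  static final String DEFAULT_AM_CLASS = "org.apache.hadoop.mapreduce.v2.app.MRAppMaster";

  /** Returns the configured AM main class, or the stock one when the key is unset or blank. */
  static String resolveAmClass(Map<String, String> conf) {
    String configured = conf.get(AM_CLASS_KEY);
    if (configured == null || configured.trim().isEmpty()) {
      return DEFAULT_AM_CLASS;
    }
    return configured.trim();
  }

  /** Builds a simplified launch command: java, then command-opts (if any), then the main class. */
  static List<String> buildAmCommand(Map<String, String> conf) {
    List<String> vargs = new ArrayList<>();
    vargs.add("java");
    String opts = conf.get(AM_COMMAND_OPTS_KEY);
    if (opts != null && !opts.trim().isEmpty()) {
      vargs.add(opts.trim());
    }
    // Main class comes from config instead of a hardcoded constant.
    vargs.add(resolveAmClass(conf));
    return vargs;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(AM_COMMAND_OPTS_KEY, "-Xmx1024m");
    // Key unset: falls back to the stock MRAppMaster.
    System.out.println(String.join(" ", buildAmCommand(conf)));

    conf.put(AM_CLASS_KEY, "org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn");
    // Key set: the Celeborn AM becomes the main class.
    System.out.println(String.join(" ", buildAmCommand(conf)));
  }
}
```

Falling back to the stock class when the key is unset keeps existing jobs unchanged, which is the behaviour I would expect reviewers to ask for.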

Thanks for any feedback.

Member Author

@lewismc lewismc Mar 11, 2026

A simple example command to reproduce and test

docker compose -f docker-compose.yml -f docker-compose.celeborn.yml exec -u hadoop namenode hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.4.3.jar pi 2 4

If you need access to the Docker composition to test, please let me know. Thank you.

Member

$ export CLASSPATH=...
$ java -Xmx1g HackMain Main foo bar

@lewismc, I guess in the above command, HackMain will run as the entrypoint?
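For reference, a simplified model of how the java launcher reads such a command line: the first token that is not an option is taken as the main class, and everything after it is passed to main as arguments. The sketch assumes no option takes a separate value (such as -cp followed by a path), which the real launcher does handle. LauncherSketch and its methods are illustrative names only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LauncherSketch {
  /** Index of the first token that does not start with '-', or -1 if there is none. */
  static int mainClassIndex(List<String> tokens) {
    for (int i = 0; i < tokens.size(); i++) {
      if (!tokens.get(i).startsWith("-")) {
        return i;
      }
    }
    return -1;
  }

  /** The token this simplified model treats as the main class, or null if none. */
  static String mainClass(List<String> tokens) {
    int i = mainClassIndex(tokens);
    return i < 0 ? null : tokens.get(i);
  }

  /** Everything after the main class, i.e. what would be passed to main(String[]). */
  static List<String> programArgs(List<String> tokens) {
    int i = mainClassIndex(tokens);
    return i < 0 ? Collections.<String>emptyList() : tokens.subList(i + 1, tokens.size());
  }

  public static void main(String[] args) {
    // Tokens following "java" in the command above.
    List<String> tokens = Arrays.asList("-Xmx1g", "HackMain", "Main", "foo", "bar");
    System.out.println("main class: " + mainClass(tokens));
    System.out.println("program args: " + programArgs(tokens));
  }
}
```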

Member

If so, this is a hack of the MR framework ... I think we should make the AM class configurable, as you proposed.

@lewismc I agree with your proposal; making the AM class configurable seems more reasonable.

Member Author

@RexXiong thanks for the feedback. Can you please provide guidance on expanding this PR? Is anything needed in addition to

  1. changing yarn.app.mapreduce.am to yarn.app.mapreduce.am.class, and
  2. changing the constant, docs, default config file, etc.? How do I do this?

Thank you.

@pan3793
Member

pan3793 commented Mar 10, 2026

cc @RexXiong, how was MR on Celeborn used without this change?

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 32s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 41m 20s | | trunk passed |
| +1 💚 | compile | 0m 55s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | compile | 0m 58s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | checkstyle | 0m 53s | | trunk passed |
| +1 💚 | mvnsite | 0m 57s | | trunk passed |
| +1 💚 | javadoc | 0m 49s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 48s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 1m 14s | | trunk passed |
| +1 💚 | shadedclient | 29m 5s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 32s | | the patch passed |
| +1 💚 | compile | 0m 29s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javac | 0m 29s | | the patch passed |
| +1 💚 | compile | 0m 30s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | javac | 0m 30s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 28s | | the patch passed |
| +1 💚 | mvnsite | 0m 32s | | the patch passed |
| +1 💚 | javadoc | 0m 22s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 21s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 0m 56s | | the patch passed |
| +1 💚 | shadedclient | 28m 58s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 114m 10s | | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 💚 | asflicense | 0m 45s | | The patch does not generate ASF License warnings. |
| | | 228m 4s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8331/1/artifact/out/Dockerfile |
| GITHUB PR | #8331 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 4cce92c76378 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 8f3cfd1 |
| Default Java | Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8331/1/testReport/ |
| Max. process+thread count | 1258 (vs. ulimit of 5500) |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8331/1/console |
| versions | git=2.43.0 maven=3.9.11 spotbugs=4.9.7 |
| Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |

This message was automatically generated.
