YARN-11195. Adding document to enable numa #4501
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
```
<property>
<name>yarn.nodemanager.numa-awareness.<NODE_ID>.cpus</name>
<value>8192</value>
```
8192 does not look like a right example value for cpus.
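As a rough illustration of what more plausible values could look like, the sketch below renders the two per-node properties from a node id, a memory size and a CPU count. The helper name `numa_node_properties` is hypothetical, and reading `.memory` as a size in MB and `.cpus` as a CPU count is an assumption made for illustration, not something stated in this thread.

```shell
# Hypothetical helper (not part of the patch): prints the two per-node
# properties for one NUMA node.
# Arguments: NODE_ID MEMORY_MB CPU_COUNT
# Assumption: .memory is a size in MB and .cpus is a count of CPUs.
numa_node_properties() {
  local id=$1 mem_mb=$2 cpus=$3
  printf '<property>\n  <name>yarn.nodemanager.numa-awareness.%s.memory</name>\n  <value>%s</value>\n</property>\n' "$id" "$mem_mb"
  printf '<property>\n  <name>yarn.nodemanager.numa-awareness.%s.cpus</name>\n  <value>%s</value>\n</property>\n' "$id" "$cpus"
}
```

For example, `numa_node_properties 0 8192 4` would print a memory property and a cpus property for node 0; the figures here are illustrative and not taken from the patch.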
# Verify

**1) NameNode log**
NodeManager instead of NameNode
**1) NameNode log**

In any of the namenode, grep the logs with command
NodeManager instead of namenode.
"grep the logs" could be changed to "grep the NodeManager log file using the command below".
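One possible shape for that check is sketched below. It searches one or more NodeManager log files, plain or gzip-compressed, for the initialization message quoted in the patch. The function name `numa_allocator_enabled` is hypothetical, and the log file paths are whatever the cluster actually uses.

```shell
# Hypothetical helper (not part of the patch): prints the first log file that
# contains the NUMA initialization message and returns 0; returns 1 otherwise.
# Arguments: LOG_FILE...
numa_allocator_enabled() {
  local msg='NUMA resources allocation is enabled'
  local f
  for f in "$@"; do
    # gzip -cdf decompresses .gz input and passes plain text through unchanged.
    if gzip -cdf -- "$f" | grep -qF -- "$msg"; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}
```

It could be called with a glob over the NodeManager log directory, which is an assumption about the deployment layout.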
```
<nodemanager_ip>.log.2022-06-24-19.gz:2022-06-24 19:16:40,178 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.numa.NumaResourceHandlerImpl (main): NUMA resources allocation is enabled, initializing NUMA resources allocator.
```

**2) Container Log**
To check whether a container is assigned a NUMA node, the approach below can be used. The NodeManager log can also be searched with the following grep command:
grep "NUMA node" | grep <container_id>
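Spelled out with an explicit log file argument, the suggested pipeline might look like the sketch below. The function name `container_numa_lines` is hypothetical; only the two grep patterns come from the comment above.

```shell
# Hypothetical helper (not part of the patch): prints NodeManager log lines
# that mention a NUMA node for the given container.
# Arguments: CONTAINER_ID LOG_FILE...
container_numa_lines() {
  local container_id=$1
  shift
  # -h suppresses file name prefixes; -F treats the container id literally.
  grep -h "NUMA node" -- "$@" | grep -F -- "$container_id"
}
```

The function exits non-zero when no line matches, so it can also be used as a yes/no check in a script.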
(memory local to another processor or memory shared between processors).
Yarn Containers can make benefit of this NUMA design to get better performance by binding to a
specific NUMA node and all subsequent memory allocations will be served by the same node,
reducing remote memory accesses.
Can we also mention that NUMA support for YARN containers should be enabled only if the worker node machines have NUMA support?
```
</property>
```

**4) NUMA nodes id’s**
Can we share sample NUMA hardware output showing node ids 0 and 1 with their memory and CPU configuration, from which the user derives the configs below?
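One way to go from hardware output to per-node figures is sketched below, assuming the output comes from `numactl --hardware` and follows its usual `node <id> cpus: ...` and `node <id> size: <n> MB` line layout. The function name `numa_summary` and the output format are hypothetical.

```shell
# Hypothetical helper (not part of the patch): reads `numactl --hardware`
# output on stdin and prints one summary line per NUMA node.
numa_summary() {
  awk '
    # "node 0 cpus: 0 1 2 3" -> CPU count is the number of fields after "cpus:"
    $1 == "node" && $3 == "cpus:" { cpus[$2] = NF - 3 }
    # "node 0 size: 8192 MB" -> memory size in MB
    $1 == "node" && $3 == "size:" { mem[$2] = $4 }
    END {
      for (id in mem) printf "node=%s memory_mb=%s cpus=%s\n", id, mem[id], cpus[id]
    }
  ' | sort -t= -k2,2n
}
```

Usage would be `numactl --hardware | numa_summary` on a worker node; the resulting node ids, memory sizes and CPU counts are the inputs a user would need for the per-node properties.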
```
<property>
<name>yarn.nodemanager.numa-awareness.<NODE_ID>.memory</name>
<value>8192</value>
```
Can we describe why we set 8192 in this example?
```
</property>
```

**7) Passing java_opts for map/reduce**
Spark, Tez and other YARN Applications also need to set the container JVM Opts to leverage NUMA Support.
Thanks @Samrat002 for the patch. I have left a few comments.
🎊 +1 overall
This message was automatically generated.
@PrabhuJoseph updated with the changes.
YARN-11195. Fix markdown path
YARN-11195. Add optional numa balancing step
YARN-11195. Fix indentation
YARN-11195. Fix indentation
YARN-11195. Numa Enabling document
YARN-11195. Numa Enabling document
🎊 +1 overall
This message was automatically generated.
Thanks @Samrat002. Latest patch LGTM, +1.
Contributed by Samrat Deb.
Description of PR
This PR documents the steps to enable NUMA for a cluster running on instances such as m5.24xlarge, which internally have 2 chips.
Relevant JIRA: YARN-11195
How was this patch tested?
These steps have been tested on an EMR cluster with 1 master node and 5 core nodes of instance type m5.24xlarge.
Alignment
For code changes:
LICENSE, LICENSE-binary, NOTICE-binary files?