Introduce -tune command line option to set env vars and mca params from ... #474
|
|
Refer to this link for build results (access rights to CI server needed): |
|
I'm not sure I fully understand this proposed change. We already have the aggregate MCA file with option --amca, and we have the MCA param for passing envars. So isn't this just renaming -amca to -tune? |
|
This is not just a rename: the new option supports setting not only MCA parameters from the file but environment variables as well. It simplifies writing a profile with the best options (mca/env) for a specific application. Users can just copy the required mca and env parameters from the command line and paste them into the conf file. This is what Mike just mentioned in the thread related to direct launch. |
|
The proposed format is more user-friendly than the existing one.
With the existing format, -mca var val has to be rewritten as var=val. That is very hard to explain to end users: they cannot complete the conversion without errors, and they cannot copy and paste from the mpirun command line as-is into a recipe file.
It is very easy to make mistakes, and hard to support and maintain:
% cat hcoll_amca.conf
mca_base_env_list = HCOLL_BCOL=basesmuma,ptpcoll;HCOLL_SBGP=basesmuma,p2p;HCOLL_ML_USE_KNOMIAL_ALLREDUCE=1
%
Example of the proposed format:
% cat imb.conf
-x MXM_TLS=rc,self,shm -x HCOLL_ALG=bruck
-mca opal_rmaps_policy dist:mlx5_1:span
-mca fca_enable_caching 1
% mpirun -tune imb.conf imb.exe
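To make the proposed format concrete, here is a rough Python sketch of how lines like those in imb.conf could be split into environment variables and MCA parameters. It is not the parser from this PR (which extends the existing C amca parser); the function name `parse_tune_text` is hypothetical, and the sketch only handles `#` comments, not `//` or `/**/`.

```python
import shlex


def parse_tune_text(text):
    """Split tune-file text into (env, mca) dicts. Illustrative only."""
    env, mca = {}, {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0]  # assumption: only '#' comments handled here
        tokens = shlex.split(line)   # honours quotes, e.g. --mca var "val"
        i = 0
        while i < len(tokens):
            tok = tokens[i]
            if tok == "-x" and i + 1 < len(tokens):
                # "-x NAME=VALUE", or "-x NAME" with no value (stored as None)
                name, sep, value = tokens[i + 1].partition("=")
                env[name] = value if sep else None
                i += 2
            elif tok in ("-mca", "--mca") and i + 2 < len(tokens):
                # "-mca NAME VALUE"
                mca[tokens[i + 1]] = tokens[i + 2]
                i += 3
            else:
                raise ValueError(f"unexpected token: {tok!r}")
    return env, mca


imb_conf = """\
-x MXM_TLS=rc,self,shm -x HCOLL_ALG=bruck
-mca opal_rmaps_policy dist:mlx5_1:span
-mca fca_enable_caching 1
"""
env, mca = parse_tune_text(imb_conf)
print(env)  # {'MXM_TLS': 'rc,self,shm', 'HCOLL_ALG': 'bruck'}
print(mca)  # {'opal_rmaps_policy': 'dist:mlx5_1:span', 'fca_enable_caching': '1'}
```

The point of the sketch is that each token group is exactly what a user would type after mpirun, so nothing has to be converted by hand.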
|
|
So what you are proposing is to create a new variation of the current -amca file that has a different syntax, but does essentially the same thing? Then to distinguish it, you would create a new cmd line option to orterun so we know which file type to expect? I'm just trying to grok what you are proposing here. Since we wouldn't backport something like this to the 1.8 series, one question that springs to mind is: why not just modify the amca parser to handle this new syntax? |
|
You are right. The existing amca parser was modified to support the new syntax, and all of the amca infrastructure was reused. We added the "-tune" option to keep "-amca" backward compatible and not mess with existing concepts (amca accepts a colon as the file list separator, tune accepts a comma). |
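The separator difference can be illustrated with a small Python sketch. The helper `split_file_list` and its `option` argument are hypothetical names used only for this illustration; the real code is C and is not shown in this thread.

```python
def split_file_list(value, option):
    """Split a file-list argument the way each option is described above.

    Assumption for illustration: "tune" lists are comma-separated,
    "am" lists are colon-separated.
    """
    sep = "," if option == "tune" else ":"
    return [part for part in value.split(sep) if part]


print(split_file_list("app1.conf,app2.conf", "tune"))  # ['app1.conf', 'app2.conf']
print(split_file_list("a.conf:b.conf", "am"))          # ['a.conf', 'b.conf']
```

Because the two options split on different characters, a single value cannot be reinterpreted safely, which is one reason to keep them as separate options during the deprecation period.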
|
Thanks - I now grok your intent! I'll take a closer look. |
|
@rhc54 - could you please review? thanks |
|
Jeff and others that were occupied this week asked for a chance to consider it, will discuss at next week's telecon. I think we're leaning towards just replacing the current amca parser with this one so we only have one such method. |
|
We talked about this today on the call.
|
|
Nice! As for comments, all kinds are still supported (#, //, /**/). I just added new patterns to the existing parser, so it should be backwards compatible. |
|
@elenash Great -- thank you! Can you put some kind of deprecated notice on the |
|
@jsquyres Sure. In which release should amca be deprecated? |
|
v1.9. We'll kill -amca in v2.1. (i.e., we have to let it be deprecated for a whole series, and then we can kill it in the next series) |
|
@jsquyres Could you tell me which man page I should update? I'm a bit confused looking at so many files in ompi/mpi/man/man3/ |
|
I believe it's orte/tools/orterun/orterun.1in (note the "in" suffix -- orterun.1 is generated from orterun.1in). |
|
OK, thanks, I will add it there. But I see no information about the -amca option there; perhaps another man file exists. |
|
Well, that's disappointing. I don't see it documented on any man page. Oh well. Add docs for -tune and we'll be good. |
|
It's definitely documented on the web site: http://www.open-mpi.org/faq/?category=tuning#amca-param-files |
|
Ah, good. Knew it had to be documented somewhere. We should probably update that FAQ page, too (maybe hint that it will be deprecated starting with v1.9, to be replaced with -tune, etc.) |
|
There's some code for -amca option in orte/tools/orte-restart/orte-restart.c. I'm not sure I understand when it is used. Should I duplicate it for -tune? |
|
Yes, probably so. Anywhere that handles -amca should probably handle -tune. Can you put the deprecation notice there, too? Probably an opal_show_help() kind of message about the deprecation. |
|
@jsquyres As far as I understand, orte-restart must eventually trigger the same flow as orte_init, so my warning message in orte/mca/plm/base/plm_base_launch_support.c will be hit. Am I wrong? |
|
|
Refer to this link for build results (access rights to CI server needed): |
|
I updated the man page and added a warning message for amca. Please take a look. |
|
@elenash - could you please update FAQ with -tune examples? |
|
It looks like I don't have permissions to work with git@github.com:open-mpi/ompi-www.git |
|
@elenash You do now. :-) |
|
Thanks! |
|
I think so. We've been fairly consistent about using "," for lists and ":" for paths, right? I.e., is the -am option a list or a path?
Jeff Squyres |
|
The -am option takes a list of paths. They are split on OPAL_ENV_SEP, which is a colon. That's why I'm asking you :-) |
|
all set, FAQ will be following as well. |
|
Thanks! In the FAQ, @elenash please be sure to mention that this is for v1.9 and beyond. |
|
Updated FAQ: |
|
When I tried to pass environment variables via mca_base_envar_file_prefix, the remote process gave the following warning message (Process 4732 Unable to locate the variable file ....), but the variable "name" seemed to have been passed correctly. Below is the background info:
|
...file
This commit introduces new mca parameter -mca mca_base_envar_file_prefix and command line option -tune to specify a single file or list of them separated by "," to set mca parameters and environment variables with the following syntax:
Examples:
cat app.conf
-x b=2 -x a=3
cmd: -x a=1 -tune app.conf -> a=1 b=2
cat app.conf
-mca btl ^tcp
cmd: -tune app.conf -> -mca btl ^tcp
cat app.conf
--mca btl ^tcp
cmd: --mca btl tcp,self -tune app.conf -> --mca btl tcp,self
cat app.conf
-mca btl ^tcp
cat mca.conf
btl=^openib
cmd: -mca btl tcp,self -tune app.conf -am mca.conf -> -mca btl tcp,self
cat app.conf
-mca btl ^tcp
cat mca.conf
btl=^openib
cmd: -tune app.conf -am mca.conf -> -mca btl ^tcp
cat app1.conf
-x a=1 -x b=2 -x c=3
cat app2.conf
-x d=4 -x e -x c=8
cmd: -tune app1.conf,app2.conf -> a=1 b=2 c=8 d=4 e=5 (assuming e=5 is already set in the environment)
A conf file can be specified by absolute path, relative path, or just its name; as with the -am option, MCA variables exist to specify the paths to search.
This feature works properly only when the job is launched under mpirun; direct launch is not supported.
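The priorities implied by the examples in the commit message (command line over -tune file, -tune file over -am file, later -tune file over earlier one) can be sketched in Python as below. This is only a model inferred from those examples, not the actual implementation: `resolve` is a hypothetical helper, and for brevity it keeps env vars and MCA params in one flat dict per source.

```python
def resolve(cmdline, tune_files, am=None, environ=None):
    """Merge settings from lowest to highest priority (illustrative model).

    cmdline    -- settings given directly on the mpirun command line
    tune_files -- one dict per -tune file, in the order they were listed
    am         -- settings from an -am file, if any
    environ    -- caller's environment, used for names given without a value
    """
    environ = environ or {}
    merged = dict(am or {})       # lowest priority: -am file
    for settings in tune_files:   # later -tune files override earlier ones
        merged.update(settings)
    merged.update(cmdline)        # highest priority: command line
    # A name with no value (such as "-x e") is looked up in the environment.
    return {k: (environ.get(k) if v is None else v) for k, v in merged.items()}


# Command line beats the tune file:
print(resolve({"a": "1"}, [{"b": "2", "a": "3"}]))
# {'b': '2', 'a': '1'}

# Tune file beats the -am file:
print(resolve({}, [{"btl": "^tcp"}], am={"btl": "^openib"}))
# {'btl': '^tcp'}

# Two tune files, with e taken from the environment:
app1 = {"a": "1", "b": "2", "c": "3"}
app2 = {"d": "4", "e": None, "c": "8"}
print(resolve({}, [app1, app2], environ={"e": "5"}))
# {'a': '1', 'b': '2', 'c': '8', 'd': '4', 'e': '5'}
```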