-
Notifications
You must be signed in to change notification settings - Fork 935
changed mindist mapping policy specifier #494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
Refer to this link for build results (access rights to CI server needed): |
|
May one ask why? In general, I'd prefer not to expand the command line yet again |
|
currently, the same command line specifies device and map in one sentence in scripts, usually one have device selector variable and mapper selector (with mca params) |
|
+1 |
|
Personally, I'd rather this not come into the code base. We continually here complaints about the large and constantly growing number of MCA params in this system. This new one doesn't do anything new at all - it just duplicates existing syntax. The argument that it is easier in a script doesn't make sense to me, as scripts have no issue with putting $vars in lists. So a 👎 from me barring some more convincing argument. |
|
from usability perspective - having one parameter to specify many different things is too complicated. -mca btl_tcp_if_include eth0,span is not a user-friendly. Split into two (and keep the current syntax as well) is more user-friendly |
|
I'm confused - why would you put span on a BTL include param??? It's a modifier to the mapping directive. |
|
it is just an example of mixing unrelated configurables in one parameter. |
|
I'm still not fully understanding your point. The current parameter specifies all mapping-related directives in one place. We went to that because people were unhappy with having to specify so many separate things. How is this not a step backward? I'd really prefer to have only one way to specify things, not several ways of specifying the same thing. |
|
I've been following this, and I confess that I'm a bit confused as well. Here's my questions:
|
|
the only reason for this PR is to simplify current form. At many customers and our internal scripts we see common usage pattern for running MPI:
These are two different unrelated flows, using mix of MCA or SHELL params -mca btl_openib_if_include $hca -x MXM_IB_DEVICE $hca ....and different flow for selecting mapping method: --map-by $mapThere is not correlation between two flows. Also, in mtt, we have a well-known param "alloc=..." which is device independent and should stay this way. Mixing device and mapping selection in one sentence is not user friendly. Now that we have 10k MCA params, we can add another two w/o hitting the overflow. |
|
But there never will be a correlation between the two flows. Even in your proposed change, there is no correlation. You still specify the mapping policy separate from the selection of a particular interface. The current system doesn't force any correlation. You specify your interfaces using the appropriate param. You then specify a mapping policy. If you want it to be "dist", then you have to tell us "distance from what?". If you then modify the policy, well you include that modification in the param. This isn't a case of adding just one or two params. We'd have to add a bunch: n_per_node, n_per_socket, n_per_numa, etc. Like I said, I refuse to have multiple ways to do the same thing. So if you are going to move one modifier to a separate param, then you have to move all of them. I just don't think that makes any sense. |
|
Mike have there been specific customer requests for this change to the way procs are mapped to the resource being specified? I understand your reasoning for suggesting a separation in the way the mapping resource is specified, from the way the procs are mapped to them, but given that currently there are 93 (with one deprecated) mca parameters for the rmaps framework, it might be better to consider |
|
Where did you get this 93 number??? Running ompi_info --all thru "grep maps | wc -l" yields 51, which includes a bunch of priority, verbosity, and other non-germane things that are specific to seldom-used components.. There are exactly two MCA params today that are used by the RMAPS framework: map-by and rank-by |
|
look at the mtt configuration example below. see alloc= part and suggest a nice way to exercise mindist alg w/o entering ugly zone of carrying device name (selected in different part of mtt) as part of specifying allocation policy? Same for customer scripts. We keep the old format as well, having two more mca params will not break any flow. Having two new params will let mtt to exercise mindist gracefully and will make many customer scripts happy and logical. The param name can be shorter. [MPI Details: OMPI]
# Check &test_alloc() for byslot or bynode
exec = mpirun @alloc@ -np &test_np() @mca@ &test_executable() &test_argv()
parameters = &MPI::OMPI::find_mpirun_params(&test_command_line(), \
&test_executable())
network = &MPI::OMPI::find_network(&test_command_line(), &test_executable())
alloc = &if(&eq(&test_alloc(), "node"), "--bynode", "--byslot")
|
|
Sounds to me like this is an MTT issue you are having, not an OMPI issue. Perhaps we should get together with Jeff and discuss the MTT issue separately? |
|
mtt suffers from the same problem. it uses same rationale as a customer scripts: select communication device flow is unrelated to the process mapping and happening in different parts of ini/scripts. |
|
Well, I'll reiterate - let's talk to Jeff about the MTT issue and see where that leads us. I still don't understand what is so hard about substituting a script variable into the MCA param (on the cmd line or in a setenv script cmd). I do it all the time. |
|
I've got a few minutes between meetings, so let me try again to explain the problem I see with this proposal. If we set this up as a separate MCA param, and leave the existing params alone, then we have two ways of specifying mapping modifiers: in the map-by param, and in the "modifier" param. The problem now is: how do I rationalize conflicts and precedence between those two params? The MCA param system cannot help me because they are two different params, so I now have to create an entirely separate logic path for dealing with it. Okay, let's try. What do I do if the modifier param is set (e.g., to "span"), but the map-by param given in the cmd line has no modifier in it? Do I assume that the user didn't want this particular mpirun to use the modifier, and thus left it out? Or do I assume that the lack of a modifier means that the modifier param is to be used? Will the user be expecting that behavior, say if the modifier param is set out of his immediate view in the default MCA param file? How do they then indicate that they don't want the modifier for this specific run? We would have to define every one of these conflict use-cases and write logic for dealing with it. This is a non-trivial problem, and your PR doesn't deal with it. This is why we don't have MCA params that contain overlapping functions - resolving the precedence rules becomes really hard. HTH |
|
The 93 came from the output of orte_info --mca rmaps all. The point was just to show there are indeed already a lot of parameters already for that framework. So hence a reluctance to add more too arbitrarily. |
|
I guess I still don't understand the use case for this PR, either. @miked-mellanox you've cited MTT (and user scripts) -- but I think I'm failing to grok the issue: it seems like you need to specify the $device in multiple places on the command line. Is that the problem? |
|
@miked-mellanox Ah, ok. Is there a reason you can't assign the value to a variable and use that variable multiple times on the command line? |
|
looking for user-friendly, elegant and visually simple solution which will not lead to re-write many existing scripts and mtt.ini files to adopt. (select device and mapper are unrelated flows, where some of them are loaded from outside like modules or functions.sh files or mtt.ini) Also, --map-by dist:nic,modifier is too overloaded way to express mapping. Do you think it can have simple form, like; --map-by-dev $hca --map-by node anyway, Ralph is right in his argument, will close it. |
|
What if you split the device out into a separate MCA param for that component? In other words, something like this: --map-by dist:modifier -mca rmaps_dist_device $hca |
|
this could work too! thanks! we will go for it. |
|
@elenash - could you please rework this PR to support: |
|
Done. But this change should be documented somehow, I guess. Users used to set device after the colon, now modifiers are being set there, it can lead to misunderstanding what is going wrong. However, there would be an error message if user tries to set device name into modifier field as previously like: The mapping request contains an unrecognized modifier: Request: dist:mlx4_0 Please check your request and try again. |
|
PR for jenkins to cope with this change |
|
can we support both? detect if well-known modifier goes after dist or treat it as HCA if unknown? |
|
|
Refer to this link for build results (access rights to CI server needed): |
|
I don't think it's a good idea because when a user just put a wrong specifier name, it will be treated as hca device and user will never know what was the reason of the error. |
|
yep, sounds right |
|
updated jenkins with test for mindist new flags |
|
bot:retest |
|
|
Refer to this link for build results (access rights to CI server needed): |
…fiers to --map-by dist:modifiers -mca rmaps_dist_device device
|
|
Refer to this link for build results (access rights to CI server needed): |
|
👍 |
changed mindist mapping policy specifier
…59b9 Silence a warning by converting the bitmap to a string prior to print…
Сhanged mindist mapping policy specifier from map-bt dist:device,modifiers to --map-by dist:modifiers -mca rmaps_dist_device device