Conversation

@alex-mikheev
Contributor

@miked-mellanox @shamisp @yosefe @jladd-mlnx
please take a look

@shamisp
Contributor

shamisp commented Oct 11, 2015

Adding @bosilca

@jladd-mlnx
Member

We need to ensure this works with the new add_procs() behavior. We need to be able to support direct modex. As far as I can tell, there is only support for bulk add_procs().

Member

@bosilca Are you ok with adding this member to the ompi_datatype_t?

Member

This addition only affects the OMPI datatypes; everything below (and the real datatype engine is down in OPAL) remains unaffected.

@shamisp
Contributor

shamisp commented Oct 12, 2015

@jladd-mlnx UCX itself does not make any assumptions about the wire-up protocols used by the layer above it. Upper layers may use direct, indirect, or bulk wire-ups.

@jsquyres
Member

@shamisp I think @jladd-mlnx was referring to the recent change in the PML add_procs() behavior to allow partial wireup. His question is not related to wire-line protocols.

@shamisp
Contributor

shamisp commented Oct 12, 2015

@jsquyres - I reviewed this code as well as the existing code in master. Our add_procs() behavior is aligned with the add_procs() behavior in master.

@jsquyres
Member

@shamisp Ok, sweet.

@jsquyres
Member

FWIW, the "OSHMEM/SPML/IKRIT: fixes typo in .h file" commit should be squashed down into the commit that it is fixing before this is merged in.

@shamisp
Contributor

shamisp commented Oct 13, 2015

@jsquyres - sure, we will squash it once we address all the comments.

@lanl-ompi
Contributor

Test FAILed.

@hppritcha
Member

Which version of ucx do I need to compile this code?

@shamisp
Contributor

shamisp commented Oct 14, 2015

@hppritcha https://github.com/yosefe/ucx/tree/topic/nb-tag-matching
(once Yossi is back from travel, it will go into master)

@shamisp
Contributor

shamisp commented Oct 14, 2015

I'm not sure if LANL-disable-dlopen-check is related to our code.

@jsquyres
Member

hello_c timed out after 3 minutes. Does that happen if you configure with --disable-dlopen and UCX?

@hppritcha
Member

Hi Jeff. That may be spurious (the hang).

@hppritcha
Member

bot:retest

@rolfv

rolfv commented Oct 19, 2015

Has anyone configured with --enable-picky? I get lots of warnings with it configured that way.

@yosefe
Contributor

yosefe commented Oct 20, 2015

@rolfv I've fixed the warnings in the OMPI code. Some warnings come from UCX headers; those are addressed by openucx/ucx#448.

@rolfv

rolfv commented Oct 20, 2015

Thanks, things look much better.

@rolfv

rolfv commented Oct 20, 2015

I updated to the latest UCX and then recompiled Open MPI, and I now see this:

[rvandevaart@drossetti-ivy4 dbg]$ gmake -j 20 > make.log
../../../../../ompi/mca/pml/ucx/pml_ucx.c: In function 'mca_pml_ucx_iprobe':
../../../../../ompi/mca/pml/ucx/pml_ucx.c:512: warning: implicit declaration of function 'ucp_tag_probe_nb'
../../../../../ompi/mca/pml/ucx/pml_ucx.c: In function 'mca_pml_ucx_improbe':
../../../../../ompi/mca/pml/ucx/pml_ucx.c:564: warning: implicit declaration of function 'ucp_tag_msg_probe_nb'
../../../../../ompi/mca/pml/ucx/pml_ucx.c:565: warning: assignment makes pointer from integer without a cast
../../../../../ompi/mca/pml/ucx/pml_ucx.c: In function 'mca_pml_ucx_mprobe':
../../../../../ompi/mca/pml/ucx/pml_ucx.c:595: warning: assignment makes pointer from integer without a cast
../../../../../ompi/mca/pml/ucx/pml_ucx.c: In function 'mca_pml_ucx_imrecv':
../../../../../ompi/mca/pml/ucx/pml_ucx.c:617: warning: implicit declaration of function 'ucp_tag_msg_recv_nb'
../../../../../ompi/mca/pml/ucx/pml_ucx.c:620: warning: cast to pointer from integer of different size
../../../../../ompi/mca/pml/ucx/pml_ucx.c: In function 'mca_pml_ucx_mrecv':
../../../../../ompi/mca/pml/ucx/pml_ucx.c:643: warning: cast to pointer from integer of different size
[rvandevaart@drossetti-ivy4 dbg]$ 

@yosefe
Contributor

yosefe commented Oct 20, 2015

The UCX probe support is not merged yet; we just need to finalize the API.

@jsquyres
Member

@yosefe Did you commit something to Open MPI that does not yet exist in the downstream UCX API? That does not seem like a good idea.

@yosefe
Contributor

yosefe commented Oct 20, 2015

@jsquyres Yes, I opened this PR before all the code was merged to UCX, to get comments and address them as early as possible. Obviously, this will not be merged to Open MPI before all the UCX code is in place.

@jsquyres
Member

@yosefe Ah, got it -- the code in question is just here on the PR, not already in master. Thanks for the clarification.

alex-mikheev and others added 4 commits October 20, 2015 19:45
Fast-path lookup is done in an inline function.
PML UCX will use it to cache a handle for the UCX datatype.
@yosefe yosefe force-pushed the topic/ucx_support branch from a17394e to a313588 Compare October 20, 2015 17:00
@yosefe
Contributor

yosefe commented Oct 20, 2015

squashed the various fixes into the commits.

@shamisp
Contributor

shamisp commented Oct 21, 2015

👍

@mike-dubman
Member

@yosefe - please integrate this PR into v2.x and v1.10.1

mike-dubman added a commit that referenced this pull request Oct 21, 2015
@mike-dubman mike-dubman merged commit 4ea13f1 into open-mpi:master Oct 21, 2015
@yosefe yosefe deleted the topic/ucx_support branch October 21, 2015 13:15
jsquyres pushed a commit to jsquyres/ompi that referenced this pull request Sep 19, 2016
osc/pt2pt: do not use OPAL_THREAD_ADD64 for lock serial number