
munmap is not being intercepted for cache refresh #299

Closed
jsquyres opened this issue Dec 2, 2014 · 9 comments
jsquyres commented Dec 2, 2014

Per the thread starting here:

http://www.open-mpi.org/community/lists/users/2014/11/25730.php

munmap is not being intercepted properly. Late in the thread, it appears this is happening because the wrapper compiler only links -lmpi and does not explicitly bring in libopen-pal (where the munmap intercept lives). As a result, the user's munmap calls resolve to the libc munmap, and the OMPI munmap intercept is never seen.

Hence, we're not intercepting munmap, and Badness occurs.

One thought on how to fix this is to re-introduce linking to libopen-rte and libopen-pal in the wrapper compilers. We distinctly took this behavior out at one point (and deliberately just linked against libmpi), but I confess to not remembering the exact reason why. It may have been taken out just because "it's the right thing -- we can have implicit dependencies pull in the rest", or it may have been so that we could support external ORTE and OPAL installations. Not sure. Someone will need to spelunk into the history to find out why. This may give insight into whether we can put the -lopen-rte -lopen-pal back in the wrappers.

@jsquyres jsquyres added the bug label Dec 2, 2014
@jsquyres jsquyres added this to the Open MPI 1.8.5 milestone Dec 2, 2014
@jsquyres jsquyres modified the milestones: Open MPI 1.8.6, Open MPI 1.8.5 Feb 23, 2015
@emmanuelthome

The rationale for why the direct dependency on open-rte and open-pal was dropped is here:

https://svn.open-mpi.org/trac/ompi/ticket/2092

E.

@jsquyres

@emmanuelthome Excellent spelunking! Many thanks for reminding me that Past Jeff wrote out the whole reason why we no longer explicitly -lopen-rte -lopen-pal.

@markalle @gpaulsen This might be an excellent argument to bring in the memory hook replacements we discussed this past week (but we'll need to solve the UCX-has-the-same-memory-hooks issue).

@emmanuelthome

@jsquyres You're welcome ;-)

Speaking of this, I got bitten again this week by some bug which smells like a similar "hook doesn't get called" issue (not 100% sure yet, but at least an openib-specific segfault which disappears with mpi_leave_pinned 0...). I briefly considered trying out ummunotify; it seems to be dead and buried, unfortunately. #429 hints in that direction, at least. Can you confirm?

@jsquyres

@emmanuelthome #429 has been on my to-do list for quite a while, and I haven't gotten to it. :-( Mellanox saw some failures that looked like the Open MPI ummunotify code paths were broken, but they didn't investigate deeply -- that's what #429 is about.

There was a conversation at the Open MPI dev meeting this past week (https://github.com/open-mpi/ompi/wiki/Meeting-2016-02) about using the Platform MPI method of memory hooks. That might end up moving forward, which could solve both this issue and (maybe?) obviate the need for ummunotify support...?

That being said, the Platform MPI method is apparently identical to the UCX method, and therefore the two can conflict with each other in userspace. Hence, ummunotify may well still be the One True Answer (i.e., kernel-level support). If you have a little time, could you verify whether #429 is actually due to the ummunotify code paths in OMPI being broken? That would be most helpful.

@gpaulsen

gpaulsen commented Mar 3, 2016

It SHOULD be possible, while we're setting up the memory hooks at the symbol level, to determine whether a hook is ALREADY installed, and to chain the hooks. If ALL of the hooks do this (i.e., UCX and us), then this could work, analogous to how signal handlers chain together. Of course, the signal handler approach is well documented by the OS, while this is undocumented and pretty hacky.

I feel like this is edging out further on a thin tree limb that should have broken long ago.

@markalle

markalle commented Mar 7, 2016

Looking at the code, it's "probably" possible to detect a prior usage of this trick, and even save the function pointer that the previous product had registered. Then inside our interception we could either:

1. call syscall(SYS_munmap, ...) if there was no previous instance of the trick in place, or
2. call their function pointer if there was one.

But that seems fragile, both in detection and in the case where multiple products use the same trick (including saving the previously registered product's function pointer): they could get set up into an infinite loop of calling each other back and forth.

I'm more inclined to say we look for a small coordination between the two projects. Open MPI could offer an

    ompi_memory_register(functionptr)

where we'd make a weak guarantee of service that IF our product manages to detect memory going away, we'll call functionptr(addr, len). UCX could offer a similar function guaranteeing the same.

Then we each register our deregistration callback with each other, and we each use the patch_symbol() trick. We wouldn't necessarily know which product would win, but whichever it is, that product calls the other product's deregistration for them.

Mark
     
     

@gpaulsen

I will ask Mark to contribute the Platform MPI solution for this fix ASAP, possibly with a hook that also allows other hooks to fire during any hook event (similar to signal handlers).

@gpaulsen gpaulsen self-assigned this Mar 22, 2016
@jsquyres

@hjelmn has posted a PR with this functionality (after a bunch of off-list discussions with @markalle and @gpaulsen). See #1495.

@jsquyres

jsquyres commented May 2, 2016

This was closed via #1495.

@jsquyres jsquyres closed this as completed May 2, 2016
jsquyres pushed a commit to jsquyres/ompi that referenced this issue Sep 19, 2016
dong0321 pushed a commit to dong0321/ompi that referenced this issue Feb 17, 2020