Don't use RTLD_GLOBAL to load _C. #31162

Closed
wants to merge 26 commits into from

Commits on Dec 12, 2019

  1. Don't use RTLD_GLOBAL to load _C.

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 12, 2019
    beb3b40
  2. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 12, 2019
    a0484e6
  3. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 12, 2019
    3231d91
  4. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 12, 2019
    a08fa45

Commits on Dec 13, 2019

  1. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 13, 2019
    0d0f466
  2. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 13, 2019
    1060202
  3. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 13, 2019
    22a565c
  4. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 13, 2019
    4fc627b
  5. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 13, 2019
    b796cb3
  6. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 13, 2019
    417faae
  7. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 13, 2019
    28c0ef5
  8. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 13, 2019
    25f8d0c

Commits on Dec 14, 2019

  1. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Dec 14, 2019
    95ebee5

Commits on Jan 2, 2020

  1. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    [ghstack-poisoned]
    ezyang committed Jan 2, 2020
    53461ff

Commits on Jan 6, 2020

  1. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 6, 2020
    edfc4ea
  2. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 6, 2020
    a77da74
  3. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 6, 2020
    4170672

Commits on Jan 7, 2020

  1. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 7, 2020
    9812a35
  2. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 7, 2020
    70dd543
  3. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 7, 2020
    993c810
  4. Update on "Don't use RTLD_GLOBAL to load _C."

    This should help us resolve a multitude of weird segfaults and crashes
    when PyTorch is imported along with other packages. Those would often
    happen because libtorch symbols were exposed globally and could be used
    as a source of relocations in shared libraries loaded after libtorch.
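
For readers unfamiliar with the loading modes involved: on Unix, CPython passes the value of `sys.getdlopenflags()` to `dlopen()` when it imports an extension module, and the `os.RTLD_GLOBAL` bit is what makes that module's symbols available to libraries loaded later. The sketch below is not code from this patch; `dlopen_flags` and `exposes_symbols_globally` are hypothetical helper names used only to illustrate how the flags are switched and restored around an import.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def dlopen_flags(flags):
    """Temporarily change the flags CPython hands to dlopen() for extension imports."""
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        yield
    finally:
        # Always restore, so unrelated imports keep the previous behavior.
        sys.setdlopenflags(old_flags)


def exposes_symbols_globally(flags):
    """True if a module imported under these flags would export its symbols process-wide."""
    return bool(flags & os.RTLD_GLOBAL)


# Global loading: symbols of the imported extension can satisfy relocations
# in any shared library loaded afterwards.
with dlopen_flags(os.RTLD_GLOBAL | os.RTLD_LAZY):
    global_inside = exposes_symbols_globally(sys.getdlopenflags())

# Local loading: symbols stay private to the extension and its dependencies.
with dlopen_flags(os.RTLD_LOCAL | os.RTLD_NOW):
    local_inside = exposes_symbols_globally(sys.getdlopenflags())
```

An actual extension import would go inside the `with` block; it is left out here so the sketch runs without any compiled module.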
    
    Fixes #3059.
    
    Some of the subtleties in preparing this patch:
    
    * Getting ASAN to play ball was a pain in the ass. The basic problem is that when we load with `RTLD_LOCAL`, we now may load a library multiple times into the address space; this happens when we have custom C++ extensions. Since the libraries are usually identical, this is usually benign, but it is technically undefined behavior and UBSAN hates it. I sprayed a few ways of getting things to "work" correctly: I preload libstdc++ (so that it is seen consistently over all library loads) and turned off vptr checks entirely. Another possibility is we should have a mode where we use `RTLD_GLOBAL` to load `_C`, which would be acceptable in environments where you're sure C++ lines up correctly. There's a long comment in the test script going into more detail about this.
    * Making some of our shared library dependencies load with `RTLD_LOCAL` breaks them. OpenMPI and MKL don't work; they play linker shenanigans to look up their symbols, which doesn't work when they are loaded locally, and if we load a library with `RTLD_LOCAL` we aren't able to subsequently see it with `ctypes`. To solve this problem, we employ a clever device invented by apaszke: we create a dummy library `torch_global_deps` with dependencies on all of the libraries which need to be loaded globally, and then load that with `RTLD_GLOBAL`. As long as none of these libraries have C++ symbols, we can avoid confusion about the C++ standard library.
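
The `torch_global_deps` device from the second bullet can be sketched roughly as follows. This is an illustration under assumptions, not the code in this patch: `load_global_deps` is a hypothetical name, and the system math library stands in for the dummy library so that the example runs without a PyTorch build. The idea is that one `ctypes.CDLL(..., mode=ctypes.RTLD_GLOBAL)` call makes the helper library and everything it depends on globally visible, while the extension module itself is still imported with the default local flags.

```python
import ctypes
import ctypes.util


def load_global_deps(lib_path):
    """Load a helper library with RTLD_GLOBAL so that it and its
    dependencies are visible to libraries loaded afterwards."""
    return ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)


# Stand-in for the dummy library (assumption: a math library, or a libc
# that provides cos, is available on this system).
stand_in_path = ctypes.util.find_library("m")
handle = load_global_deps(stand_in_path)

# Symbols of a globally loaded library can be looked up through the
# process-wide handle, which is what dlopen(NULL) returns.
process = ctypes.CDLL(None)
cos = process.cos
cos.restype = ctypes.c_double
cos.argtypes = [ctypes.c_double]
result = cos(0.0)
```

Keeping the globally loaded set down to a single small library limits what leaks into the global namespace, which is the point of loading `_C` itself locally.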
    
    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 7, 2020
    20872e7
  5. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 7, 2020
    2081c67
  6. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 7, 2020
    8948282

Commits on Jan 8, 2020

  1. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 8, 2020
    593b661
  2. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 8, 2020
    d79fe49
  3. Update on "Don't use RTLD_GLOBAL to load _C."

    Signed-off-by: Edward Z. Yang <ezyang@fb.com>
    
    Differential Revision: [D19262579](https://our.internmc.facebook.com/intern/diff/D19262579)
    
    [ghstack-poisoned]
    ezyang committed Jan 8, 2020
    2771d72