
Don't use RTLD_GLOBAL to load _C. #31162

Status: Closed. Wants to merge 26 commits.
beb3b40  Don't use RTLD_GLOBAL to load _C. (ezyang, Dec 12, 2019)
a0484e6  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 12, 2019)
3231d91  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 12, 2019)
a08fa45  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 12, 2019)
0d0f466  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 13, 2019)
1060202  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 13, 2019)
22a565c  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 13, 2019)
4fc627b  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 13, 2019)
b796cb3  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 13, 2019)
417faae  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 13, 2019)
28c0ef5  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 13, 2019)
25f8d0c  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 13, 2019)
95ebee5  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Dec 14, 2019)
53461ff  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 2, 2020)
edfc4ea  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 6, 2020)
a77da74  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 6, 2020)
4170672  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 6, 2020)
9812a35  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 7, 2020)
70dd543  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 7, 2020)
993c810  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 7, 2020)
20872e7  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 7, 2020)
2081c67  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 7, 2020)
8948282  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 7, 2020)
593b661  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 8, 2020)
d79fe49  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 8, 2020)
2771d72  Update on "Don't use RTLD_GLOBAL to load _C." (ezyang, Jan 8, 2020)
42 changes: 32 additions & 10 deletions .jenkins/pytorch/test.sh
@@ -75,22 +75,44 @@ fi
# if you're not careful. Check this if you made some changes and the
# ASAN test is not working
if [[ "$BUILD_ENVIRONMENT" == *asan* ]]; then
# Suppress vptr violations arising from multiple copies of pybind11
export ASAN_OPTIONS=detect_leaks=0:symbolize=1:strict_init_order=true
# We suppress the vptr violation, since we have separate copies of
# libprotobuf in both libtorch.so and libcaffe2.so, and it causes
# the following problem:
# test_cse (__main__.TestJit) ... torch/csrc/jit/export.cpp:622:38:
# runtime error: member call on address ... which does not point
# to an object of type 'google::protobuf::MessageLite'
# ...: note: object is of type 'onnx_torch::ModelProto'
#
# This problem should be solved when libtorch.so and libcaffe2.so are
# merged.
export UBSAN_OPTIONS=print_stacktrace=1:suppressions=$PWD/ubsan.supp
export PYTORCH_TEST_WITH_ASAN=1
export PYTORCH_TEST_WITH_UBSAN=1
# TODO: Figure out how to avoid hard-coding these paths
export ASAN_SYMBOLIZER_PATH=/usr/lib/llvm-5.0/bin/llvm-symbolizer
export TORCH_USE_RTLD_GLOBAL=1
# NB: We load libtorch.so with RTLD_GLOBAL for UBSAN, unlike our
# default behavior.
#
# The reason for this is that without RTLD_GLOBAL, if we load multiple
# libraries that depend on libtorch (as is the case with C++ extensions), we
# will get multiple copies of libtorch in our address space. When UBSAN is
# turned on, it will do a bunch of virtual pointer consistency checks which
# won't work correctly. When this happens, you get a violation like:
#
# member call on address XXXXXX which does not point to an object of
# type 'std::_Sp_counted_base<__gnu_cxx::_Lock_policy::_S_atomic>'
# XXXXXX note: object is of type
# 'std::_Sp_counted_ptr<torch::nn::LinearImpl*, (__gnu_cxx::_Lock_policy)2>'
#
# (NB: the textual types of the objects here are misleading, because
# they actually line up; it just so happens that there are two copies
# of the type info floating around in the address space, so they
# don't pointer-compare equal. See also
# https://github.com/google/sanitizers/issues/1175)
#
# UBSAN is kind of right here: if we relied on RTTI across C++ extension
# modules, they would indeed do the wrong thing; but in our codebase, we
# don't use RTTI (because it doesn't work on mobile). To appease
# UBSAN, however, it's better if we ensure all the copies agree!
#
# By the way, an earlier version of this code attempted to load
# libtorch_python.so with LD_PRELOAD, which has a similar effect of causing
# it to be loaded globally. This isn't really a good idea, though, because
# it depends on a ton of dynamic libraries that most programs won't
# have, and it applies to child processes.
export LD_PRELOAD=/usr/lib/llvm-5.0/lib/clang/5.0.0/lib/linux/libclang_rt.asan-x86_64.so
# Increase stack size, because ASAN red zones use more stack
ulimit -s 81920
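The flag dance that `TORCH_USE_RTLD_GLOBAL=1` triggers in `torch/__init__.py` can be illustrated in isolation. Below is a minimal sketch (POSIX-only, since `sys.setdlopenflags` does not exist on Windows); `import_with_rtld_global` is a hypothetical helper name for illustration, not part of PyTorch:

```python
import os
import sys

def import_with_rtld_global(import_fn):
    # Temporarily switch the interpreter's dlopen() flags so the next
    # C extension import exports its symbols globally, then restore the
    # previous flags (normally RTLD_LOCAL semantics).
    old_flags = sys.getdlopenflags()
    try:
        sys.setdlopenflags(os.RTLD_GLOBAL | os.RTLD_LAZY)
        return import_fn()
    finally:
        sys.setdlopenflags(old_flags)

# Any C extension works for demonstration; math is always available.
mod = import_with_rtld_global(lambda: __import__('math'))
print(mod.sqrt(4.0))  # 2.0
```

This save/restore pattern is why `torch/__init__.py` brackets `from torch._C import *` with `sys.getdlopenflags()` and `sys.setdlopenflags(old_flags)` when the RTLD_GLOBAL path is taken.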
26 changes: 26 additions & 0 deletions caffe2/CMakeLists.txt
@@ -1085,6 +1085,32 @@ if(USE_CUDA)
endif()


# Note [Global dependencies]
# Some libraries (e.g. OpenMPI) like to dlopen plugins after they're initialized,
# and they assume that all of their symbols will be available in the global namespace.
# On the other hand, we try to be good citizens and avoid polluting the symbol
# namespaces, so libtorch is loaded with all of its dependencies in a local scope.
# That usually leads to missing-symbol errors at run time, so to avoid this
# situation we have to preload those libs in a global namespace.
add_library(torch_global_deps SHARED ${TORCH_SRC_DIR}/csrc/empty.c)
set_target_properties(torch_global_deps PROPERTIES LINKER_LANGUAGE C)
if (USE_MPI)
target_link_libraries(torch_global_deps ${MPI_CXX_LIBRARIES})
endif()
target_link_libraries(torch_global_deps ${MKL_LIBRARIES})
# The CUDA libraries are linked here for a different reason: in some
# cases we load these libraries with ctypes, and if they weren't opened
# with RTLD_GLOBAL, we'll do the "normal" search process again (and
# not find them, because they're usually in non-standard locations)
if (USE_CUDA)
target_link_libraries(torch_global_deps ${TORCH_CUDA_LIBRARIES})
target_link_libraries(torch_global_deps ${Caffe2_PUBLIC_CUDA_DEPENDENCY_LIBS})
target_link_libraries(torch_global_deps torch::cudart)
endif()

install(TARGETS torch_global_deps DESTINATION "${TORCH_INSTALL_LIB_DIR}")
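The runtime counterpart of this `torch_global_deps` shim is a plain `ctypes.CDLL(..., mode=ctypes.RTLD_GLOBAL)` call, per Note [Global dependencies]. A sketch of the preload trick, with `libm` standing in for the shim library (the library name is an assumption, POSIX-only):

```python
import ctypes
import ctypes.util

# Opening a library with RTLD_GLOBAL makes the symbols of everything it
# links against visible to libraries dlopen'd later, while the main
# library itself can safely stay RTLD_LOCAL.
lib_name = ctypes.util.find_library('m') or 'libm.so.6'
handle = ctypes.CDLL(lib_name, mode=ctypes.RTLD_GLOBAL)

# The handle still behaves like an ordinary ctypes library object:
handle.cos.restype = ctypes.c_double
handle.cos.argtypes = [ctypes.c_double]
print(handle.cos(0.0))  # 1.0
```

In the real shim, the interesting part is not the symbols of the (empty) library itself but those of the MPI/MKL/CUDA libraries it links against, which become globally resolvable for any plugin dlopen'd afterwards.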


# ---[ Caffe2 HIP sources.
if(USE_ROCM)
# Call again since Caffe2_HIP_INCLUDE is extended with ATen include dirs.
90 changes: 56 additions & 34 deletions torch/__init__.py
@@ -13,8 +13,10 @@
import os
import sys
import platform
import ctypes
from ._utils import _import_dotted_name
from ._utils_internal import get_file_path, prepare_multiprocessing_environment
from ._utils_internal import get_file_path, prepare_multiprocessing_environment, \
USE_RTLD_GLOBAL_WITH_LIBTORCH
from .version import __version__
from ._six import string_classes as _string_classes

@@ -33,61 +35,81 @@
# Load the extension module
################################################################################

# Loading the extension with the RTLD_GLOBAL option allows extension
# modules to avoid linking against the _C shared object. Their missing
# THP symbols will be automatically filled in by the dynamic loader.
import os as _dl_flags

# if we have numpy, it *must* be imported before the call to setdlopenflags()
# or there is a risk that later C modules will segfault when importing numpy
try:
import numpy as _np
except ImportError:
pass

if platform.system() == 'Windows':
# first get nvToolsExt PATH
def get_nvToolsExt_path():
NVTOOLEXT_HOME = _dl_flags.getenv('NVTOOLSEXT_PATH', 'C:\\Program Files\\NVIDIA Corporation\\NvToolsExt')
NVTOOLSEXT_PATH = os.getenv('NVTOOLSEXT_PATH', 'C:\\Program Files\\NVIDIA Corporation\\NvToolsExt')

if _dl_flags.path.exists(NVTOOLEXT_HOME):
return _dl_flags.path.join(NVTOOLEXT_HOME, 'bin', 'x64')
else:
return ''
if os.path.exists(NVTOOLSEXT_PATH):
nvtoolsext_lib_path = os.path.join(NVTOOLSEXT_PATH, 'bin', 'x64')
else:
nvtoolsext_lib_path = ''
[Review comment, Collaborator]
A tip that may not be directly related to this PR: we missed the path of CUDA here. To make it compatible with Python 3.8, we have to do that.

[Reply, Contributor Author]
I'm happy to fix this, but I'm not exactly sure what the suggestion is here. What's the other CUDA path we missed?

[Reply, Collaborator]
Do we have a variable that records the version of CUDA used during the build? If yes, then the answer is quite simple. For example, for CUDA 9.2, just add the following path:

cuda_path = os.path.join(os.environ.get('CUDA_PATH_V9_2', r'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2'), 'bin')


py_dll_path = _dl_flags.path.join(sys.exec_prefix, 'Library', 'bin')
th_dll_path = _dl_flags.path.join(_dl_flags.path.dirname(__file__), 'lib')
py_dll_path = os.path.join(sys.exec_prefix, 'Library', 'bin')
th_dll_path = os.path.join(os.path.dirname(__file__), 'lib')

dll_paths = [th_dll_path, py_dll_path, get_nvToolsExt_path(), _dl_flags.environ['PATH']]
dll_paths = [th_dll_path, py_dll_path, nvtoolsext_lib_path, os.environ['PATH']]

# then add the path to env
_dl_flags.environ['PATH'] = ';'.join(dll_paths)
os.environ['PATH'] = ';'.join(dll_paths)

else:
# first check if the os package has the required flags

# See Note [Global dependencies]
def _load_global_deps():
if platform.system() == 'Windows':
[Review comment, Collaborator]
What about defining a global variable IS_WINDOWS, since it's used multiple times in this file?

[Reply, Contributor Author]
I'll do this in a follow-up.

return

lib_name = 'libtorch_global_deps' + ('.dylib' if platform.system() == 'Darwin' else '.so')
here = os.path.abspath(__file__)
lib_path = os.path.join(os.path.dirname(here), 'lib', lib_name)

ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
[Review comment, Contributor]
I've been going through this out of curiosity, and it got me wondering: doesn't this lead to an eventual dlclose? Don't we have to stash this library handle somewhere?



if (USE_RTLD_GLOBAL_WITH_LIBTORCH or os.getenv('TORCH_USE_RTLD_GLOBAL')) and \
platform.system() != 'Windows':
# Do it the hard way. You might want to load libtorch with RTLD_GLOBAL in a
# few circumstances:
#
# 1. You're in a build environment (e.g., fbcode) where
# libtorch_global_deps is not available, but you still need
# to get mkl to link in with RTLD_GLOBAL or it will just
# not work.
#
# 2. You're trying to run PyTorch under UBSAN and you need
# to ensure that only one copy of libtorch is loaded, so
# vptr checks work properly
#
# If you're using this setting, you must verify that all the libraries
# you load consistently use the same libstdc++, or you may have
# mysterious segfaults.
#
import os as _dl_flags
if not hasattr(_dl_flags, 'RTLD_GLOBAL') or not hasattr(_dl_flags, 'RTLD_LAZY'):
try:
# next try if DLFCN exists
import DLFCN as _dl_flags
except ImportError:
# as a last attempt, use compile-time constants
import torch._dl as _dl_flags

old_flags = sys.getdlopenflags()
sys.setdlopenflags(_dl_flags.RTLD_GLOBAL | _dl_flags.RTLD_LAZY)
from torch._C import *
sys.setdlopenflags(old_flags)
del old_flags
del _dl_flags

del _dl_flags

from torch._C import *
else:
# Easy way. You want this most of the time, because it will prevent
# C++ symbols from libtorch clobbering C++ symbols from other
# libraries, leading to mysterious segfaults.
#
# See Note [Global dependencies]
_load_global_deps()
from torch._C import *

__all__ += [name for name in dir(_C)
if name[0] != '_' and
not name.endswith('Base')]

if platform.system() != 'Windows':
sys.setdlopenflags(old_flags)
del old_flags

################################################################################
# Define basic utilities
################################################################################
1 change: 1 addition & 0 deletions torch/_utils_internal.py
@@ -54,3 +54,4 @@ def get_source_lines_and_file(obj):

TEST_MASTER_ADDR = '127.0.0.1'
TEST_MASTER_PORT = 29500
USE_RTLD_GLOBAL_WITH_LIBTORCH = False
[Review comment, Contributor]
Why do we have this constant if it's always false? Is this so that you can patch it in fbcode?

[Reply, Contributor Author]
Yep, fbcode shenanigans.

Empty file added: torch/csrc/empty.c
2 changes: 0 additions & 2 deletions ubsan.supp
@@ -1,3 +1 @@
vptr:libtorch.so
vptr:libtorch_python.so
vptr:libcaffe2.so