Closed
Labels
bug: Something isn't working
Description
System Info
- CPU Architecture x86_64
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
I'm trying to build the backend using the commands below.
export TRT_VERSION=10.0.1.6
export TRT_ROOT=/usr/local/TensorRT-10.2.0.19
python3 ../tensorrt_llm/scripts/build_wheel.py --trt_root ${TRT_ROOT} \
--cpp_only \
-D "CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.3/" -D "ENABLE_MULTI_DEVICE=1"
cmake "-DTRT_VERSION=${TRT_VERSION}" \
"-DCMAKE_TOOLCHAIN_FILE=${CMAKE_TOOLCHAIN_FILE}" \
"-DVCPKG_TARGET_TRIPLET=${VCPKG_TARGET_TRIPLET}" \
"-DTRT_LIB_DIR=${TRT_ROOT}/targets/${ARCH}-linux-gnu/lib" \
"-DTRT_INCLUDE_DIR=${TRT_ROOT}/include" \
"-DUSE_CXX11_ABI:BOOL=ON" \
"-DCMAKE_BUILD_TYPE=Release" \
"-DCMAKE_INSTALL_PREFIX:PATH=/tmp/tritonbuild/tensorrtllm/install" \
"-DTRITON_REPO_ORGANIZATION:STRING=https://github.com/triton-inference-server" \
"-DTRITON_COMMON_REPO_TAG:STRING=r24.05" \
"-DTRITON_CORE_REPO_TAG:STRING=r24.05" \
"-DTRITON_BACKEND_REPO_TAG:STRING=r24.05" \
"-DTRITON_ENABLE_GPU:BOOL=ON" \
"-DTRITON_ENABLE_MALI_GPU:BOOL=OFF" \
"-DTRITON_ENABLE_STATS:BOOL=ON" \
"-DTRITON_ENABLE_METRICS:BOOL=ON" \
"-DTRITON_ENABLE_MEMORY_TRACKER:BOOL=ON" \
-S \
../inflight_batcher_llm \
-B \
.
cmake --build . --config Release -j20 -t install
The final cmake --build step (the install target) fails at link time with undefined references:
[ 97%] Linking CXX shared library libtriton_tensorrtllm.so
/usr/bin/cmake -E cmake_link_script CMakeFiles/triton-tensorrt-llm-backend.dir/link.txt --verbose=1
/usr/bin/c++ -fPIC -O2 -Wall -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -march=x86-64-v2 -mtune=broadwell -O3 -DNDEBUG -Wl,--version-script libtriton_tensorrtllm.ldscript -Wl,-rpath,'$ORIGIN' -Wl,--no-undefined -Wl,--as-needed,-O1,--sort-common -Wl,-z,relro,-z,now,-z,noexecstack -shared -Wl,-soname,libtriton_tensorrtllm.so -o libtriton_tensorrtllm.so "CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o" -L/usr/local/cuda-12.3/lib64/stubs -L/usr/local/cuda-12.3/lib64 -Wl,-rpath,/home/build/backend/build:/home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm:/home/build/backend/build/_deps/repo-core-build:/usr/local/cuda-12.3/lib64:/usr/local/cuda-12.3/lib64/stubs:/usr/local/TensorRT-10.2.0.19/targets/x86_64-linux-gnu/lib:/home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm/plugins libtriton_tensorrtllm_common.so /home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm/libtensorrt_llm.so _deps/repo-core-build/libtritonserver.so _deps/repo-backend-build/libtritonbackendutils.a _deps/repo-common-build/src/libtritonasyncworkqueue.a /usr/local/cuda-12.3/lib64/libcudart.so -ldl /usr/lib/librt.a _deps/repo-backend-build/libkernel_library_new.a /usr/local/cuda-12.3/lib64/libcupti.so /usr/lib64/libmpi.so /usr/local/cuda-12.3/lib64/libcudart.so /usr/local/cuda-12.3/lib64/stubs/libnvidia-ml.so /usr/local/TensorRT-10.2.0.19/targets/x86_64-linux-gnu/lib/libnvinfer.so /home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm/plugins/libnvinfer_plugin_tensorrt_llm.so -lcudadevrt -lcudart_static -lrt -lpthread -ldl
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<float>(float const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIfEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIfEEvPKT_iiib]+0x1e7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIfEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIfEEvPKT_iiib]+0x1fe): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<__half>(__half const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixI6__halfEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixI6__halfEEvPKT_iiib]+0x27a): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixI6__halfEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixI6__halfEEvPKT_iiib]+0x291): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<unsigned int>(unsigned int const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIjEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIjEEvPKT_iiib]+0x1d7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIjEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIjEEvPKT_iiib]+0x1ee): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<unsigned long>(unsigned long const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixImEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixImEEvPKT_iiib]+0x1d7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixImEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixImEEvPKT_iiib]+0x1ee): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<int>(int const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIiEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIiEEvPKT_iiib]+0x1d7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIiEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIiEEvPKT_iiib]+0x1ee): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/triton-tensorrt-llm-backend.dir/build.make:112: libtriton_tensorrtllm.so] Error 1
make[2]: Leaving directory '/home/build/backend/build'
make[1]: *** [CMakeFiles/Makefile2:303: CMakeFiles/triton-tensorrt-llm-backend.dir/all] Error 2
make[1]: Leaving directory '/home/build/backend/build'
make: *** [Makefile:136: all] Error 2
Those symbols seem to be defined in libtensorrt_llm.so, which is included in the link command:
nm -D /home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm/libtensorrt_llm.so | c++filt | grep TllmException
00000000007c4a10 T tensorrt_llm::common::TllmException::demangle(char const*)
00000000007c5280 T tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
00000000007c5280 T tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
00000000007c49e0 T tensorrt_llm::common::TllmException::~TllmException()
00000000007c49c0 T tensorrt_llm::common::TllmException::~TllmException()
00000000007c49c0 T tensorrt_llm::common::TllmException::~TllmException()
00000000007c4b50 T tensorrt_llm::common::TllmException::getTrace() const
000000006eb66e90 V typeinfo for tensorrt_llm::common::TllmException
00000000028d2a80 V typeinfo name for tensorrt_llm::common::TllmException
000000006eb66ea8 V vtable for tensorrt_llm::common::TllmException
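One detail stands out when the two outputs are compared: the linker is asking for a TllmException constructor that takes std::__cxx11::basic_string (and for fmtstr[abi:cxx11]), while nm lists a constructor that takes plain std::basic_string. That pattern usually points to a libstdc++ dual-ABI mismatch between the two builds, although this is an inference from the symbol names and not something confirmed in this report. Below is a minimal sketch of a helper for telling the two spellings apart; classify_abi is a hypothetical name, and it only inspects the demangled text it is given.

```shell
# Classify a demangled C++ symbol by the libstdc++ std::string ABI it mentions.
#   cxx11  -> contains std::__cxx11:: or an [abi:cxx11] tag
#   legacy -> mentions std::basic_string without the __cxx11 namespace
#   n/a    -> no std::string in the signature at all
# classify_abi is a hypothetical helper, not part of any toolchain.
classify_abi() {
  case "$1" in
    *'std::__cxx11::'*|*'[abi:cxx11]'*) echo cxx11 ;;
    *'std::basic_string'*)              echo legacy ;;
    *)                                  echo n/a ;;
  esac
}

# Symbol the linker could not resolve (copied from the error output):
classify_abi 'tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'

# Symbol exported by libtensorrt_llm.so (copied from the nm output):
classify_abi 'tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
```

The first call prints cxx11 and the second prints legacy. If that reading is right, the backend objects (configured with USE_CXX11_ABI ON) and libtensorrt_llm.so would have been compiled with different ABI settings, which would explain why the linker cannot match symbols that look identical at first glance.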
I came across an earlier issue, NVIDIA/TensorRT-LLM#277, which reported a linking problem when building only the C++ bindings, but that issue says the problem was fixed in TensorRT-LLM.
I'm only building the C++ bindings because of NVIDIA/TensorRT-LLM#1989.
Expected behavior
I expected it to build.
Actual behavior
The build fails with the linking error shown above.
Additional notes
Is an issue similar to NVIDIA/TensorRT-LLM#277 affecting this repository?