Unable to build tensorrt_llm backend; problems with CXX11 ABI  #542

@jlewi

System Info

  • CPU architecture: x86_64

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I'm trying to build the backend using the commands below.

      export TRT_VERSION=10.0.1.6
      export TRT_ROOT=/usr/local/TensorRT-10.2.0.19
      python3 ../tensorrt_llm/scripts/build_wheel.py --trt_root ${TRT_ROOT} \
        --cpp_only \
        -D "CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.3/" -D "ENABLE_MULTI_DEVICE=1"

      cmake "-DTRT_VERSION=${TRT_VERSION}" \
      "-DCMAKE_TOOLCHAIN_FILE=${CMAKE_TOOLCHAIN_FILE}" \
      "-DVCPKG_TARGET_TRIPLET=${VCPKG_TARGET_TRIPLET}" \
      "-DTRT_LIB_DIR=${TRT_ROOT}/targets/${ARCH}-linux-gnu/lib" \
      "-DTRT_INCLUDE_DIR=${TRT_ROOT}/include" \
      "-DUSE_CXX11_ABI:BOOL=ON" \
      "-DCMAKE_BUILD_TYPE=Release" \
      "-DCMAKE_INSTALL_PREFIX:PATH=/tmp/tritonbuild/tensorrtllm/install" \
      "-DTRITON_REPO_ORGANIZATION:STRING=https://github.com/triton-inference-server" \
      "-DTRITON_COMMON_REPO_TAG:STRING=r24.05" \
      "-DTRITON_CORE_REPO_TAG:STRING=r24.05" \
      "-DTRITON_BACKEND_REPO_TAG:STRING=r24.05" \
      "-DTRITON_ENABLE_GPU:BOOL=ON" \
      "-DTRITON_ENABLE_MALI_GPU:BOOL=OFF" \
      "-DTRITON_ENABLE_STATS:BOOL=ON" \
      "-DTRITON_ENABLE_METRICS:BOOL=ON" \
      "-DTRITON_ENABLE_MEMORY_TRACKER:BOOL=ON" \
      -S \
      ../inflight_batcher_llm \
      -B \
      .
      
      cmake --build . --config Release -j20  -t install
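
One detail in the commands above: TRT_VERSION is set to 10.0.1.6, while TRT_ROOT points at a directory named TensorRT-10.2.0.19. Whether this contributes to the link failure is unknown. The following is a minimal sketch of a pre-build sanity check, assuming the install directory is named TensorRT-<version>; the helper name trt_version_from_root is hypothetical.

```shell
# Hypothetical helper: derive the version from a TensorRT install path.
# Assumes the directory is named "TensorRT-<version>".
trt_version_from_root() {
  basename "$1" | sed -n 's/^TensorRT-//p'
}

# Values copied from the commands in this report.
TRT_VERSION=10.0.1.6
TRT_ROOT=/usr/local/TensorRT-10.2.0.19

root_version=$(trt_version_from_root "$TRT_ROOT")
if [ "$root_version" != "$TRT_VERSION" ]; then
  echo "mismatch: TRT_VERSION=$TRT_VERSION but TRT_ROOT looks like $root_version"
fi
```

With the values from this report, the check prints a mismatch message.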

The final cmake --build step fails while linking libtriton_tensorrtllm.so, with undefined references to tensorrt_llm::common::fmtstr and the tensorrt_llm::common::TllmException constructor.

[ 97%] Linking CXX shared library libtriton_tensorrtllm.so
/usr/bin/cmake -E cmake_link_script CMakeFiles/triton-tensorrt-llm-backend.dir/link.txt --verbose=1
/usr/bin/c++ -fPIC -O2 -Wall -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -march=x86-64-v2 -mtune=broadwell -O3 -DNDEBUG -Wl,--version-script libtriton_tensorrtllm.ldscript -Wl,-rpath,'$ORIGIN' -Wl,--no-undefined -Wl,--as-needed,-O1,--sort-common -Wl,-z,relro,-z,now,-z,noexecstack -shared -Wl,-soname,libtriton_tensorrtllm.so -o libtriton_tensorrtllm.so "CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o"   -L/usr/local/cuda-12.3/lib64/stubs  -L/usr/local/cuda-12.3/lib64  -Wl,-rpath,/home/build/backend/build:/home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm:/home/build/backend/build/_deps/repo-core-build:/usr/local/cuda-12.3/lib64:/usr/local/cuda-12.3/lib64/stubs:/usr/local/TensorRT-10.2.0.19/targets/x86_64-linux-gnu/lib:/home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm/plugins libtriton_tensorrtllm_common.so /home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm/libtensorrt_llm.so _deps/repo-core-build/libtritonserver.so _deps/repo-backend-build/libtritonbackendutils.a _deps/repo-common-build/src/libtritonasyncworkqueue.a /usr/local/cuda-12.3/lib64/libcudart.so -ldl /usr/lib/librt.a _deps/repo-backend-build/libkernel_library_new.a /usr/local/cuda-12.3/lib64/libcupti.so /usr/lib64/libmpi.so /usr/local/cuda-12.3/lib64/libcudart.so /usr/local/cuda-12.3/lib64/stubs/libnvidia-ml.so /usr/local/TensorRT-10.2.0.19/targets/x86_64-linux-gnu/lib/libnvinfer.so /home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm/plugins/libnvinfer_plugin_tensorrt_llm.so -lcudadevrt -lcudart_static -lrt -lpthread -ldl
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<float>(float const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIfEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIfEEvPKT_iiib]+0x1e7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIfEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIfEEvPKT_iiib]+0x1fe): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<__half>(__half const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixI6__halfEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixI6__halfEEvPKT_iiib]+0x27a): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixI6__halfEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixI6__halfEEvPKT_iiib]+0x291): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<unsigned int>(unsigned int const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIjEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIjEEvPKT_iiib]+0x1d7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIjEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIjEEvPKT_iiib]+0x1ee): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<unsigned long>(unsigned long const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixImEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixImEEvPKT_iiib]+0x1d7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixImEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixImEEvPKT_iiib]+0x1ee): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: CMakeFiles/triton-tensorrt-llm-backend.dir/src/libtensorrtllm.cc.o: in function `void tensorrt_llm::common::printMatrix<int>(int const*, int, int, int, bool)':
libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIiEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIiEEvPKT_iiib]+0x1d7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
/usr/lib/gcc/x86_64-pc-linux-gnu/12.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: libtensorrtllm.cc:(.text._ZN12tensorrt_llm6common11printMatrixIiEEvPKT_iiib[_ZN12tensorrt_llm6common11printMatrixIiEEvPKT_iiib]+0x1ee): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/triton-tensorrt-llm-backend.dir/build.make:112: libtriton_tensorrtllm.so] Error 1
make[2]: Leaving directory '/home/build/backend/build'
make[1]: *** [CMakeFiles/Makefile2:303: CMakeFiles/triton-tensorrt-llm-backend.dir/all] Error 2
make[1]: Leaving directory '/home/build/backend/build'
make: *** [Makefile:136: all] Error 2
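
The log above repeats the same two missing symbols once per printMatrix instantiation. As a rough sketch, the distinct undefined references can be pulled out of a linker log with grep and sort. The function name unique_undefined_refs is hypothetical, and the sample input is abbreviated from the log above (section names shortened).

```shell
# Hypothetical helper: read a linker log on stdin and print each distinct
# "undefined reference to ..." message once.
unique_undefined_refs() {
  grep -o 'undefined reference to .*' | sort -u
}

# Sample input abbreviated from the log in this report.
unique_undefined_refs <<'EOF'
libtensorrtllm.cc:(.text+0x1e7): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
libtensorrtllm.cc:(.text+0x27a): undefined reference to `tensorrt_llm::common::fmtstr[abi:cxx11](char const*, ...)'
libtensorrtllm.cc:(.text+0x1fe): undefined reference to `tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
EOF
```

For the sample input this leaves two distinct symbols, both of which carry C++11 ABI markers ([abi:cxx11] or std::__cxx11::).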

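Symbols with the same names are exported by libtensorrt_llm.so, which is on the link line. Note, however, that the exported constructor takes std::basic_string rather than std::__cxx11::basic_string, which suggests the library was built with the old libstdc++ ABI while the backend object file expects the C++11 ABI: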

 nm -D /home/build/backend/inflight_batcher_llm/../tensorrt_llm/cpp/build/tensorrt_llm/libtensorrt_llm.so  | c++filt | grep TllmException
00000000007c4a10 T tensorrt_llm::common::TllmException::demangle(char const*)
00000000007c5280 T tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
00000000007c5280 T tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
00000000007c49e0 T tensorrt_llm::common::TllmException::~TllmException()
00000000007c49c0 T tensorrt_llm::common::TllmException::~TllmException()
00000000007c49c0 T tensorrt_llm::common::TllmException::~TllmException()
00000000007c4b50 T tensorrt_llm::common::TllmException::getTrace() const
000000006eb66e90 V typeinfo for tensorrt_llm::common::TllmException
00000000028d2a80 V typeinfo name for tensorrt_llm::common::TllmException
000000006eb66ea8 V vtable for tensorrt_llm::common::TllmException
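
To make the comparison between the linker's request and the nm output explicit, here is a small sketch that classifies a demangled signature by its libstdc++ string ABI markers. The function name classify_abi is hypothetical, and the check is a plain text match on the demangled name, not a full ABI analysis.

```shell
# Hypothetical helper: classify a demangled symbol by libstdc++ string ABI.
# "cxx11" -> carries std::__cxx11:: or the [abi:cxx11] tag
# "old"   -> mentions std::basic_string / std::string without those markers
# "n/a"   -> signature does not involve std::string at all
classify_abi() {
  case "$1" in
    *'std::__cxx11::'*|*'[abi:cxx11]'*) echo "cxx11" ;;
    *'std::basic_string<'*|*'std::string'*) echo "old" ;;
    *) echo "n/a" ;;
  esac
}

# Signature the linker asks for (from the error log in this report).
wanted='tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
# Signature libtensorrt_llm.so exports (from the nm output in this report).
exported='tensorrt_llm::common::TllmException::TllmException(char const*, unsigned long, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'

echo "linker wants:    $(classify_abi "$wanted")"
echo "library exports: $(classify_abi "$exported")"
```

The two signatures classify differently (cxx11 versus old), which is consistent with the two build steps having used different values of _GLIBCXX_USE_CXX11_ABI.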

I came across the earlier issue NVIDIA/TensorRT-LLM#277, which reported a linking problem when building only the C++ bindings, but that issue says the problem was fixed in TensorRT-LLM.

I'm building only the C++ bindings because of NVIDIA/TensorRT-LLM#1989.

Expected behavior

I expected the backend to build and link successfully.

Actual behavior

The build fails with the linking error shown above.

Additional notes

Is an issue similar to NVIDIA/TensorRT-LLM#277 affecting this repository?

Labels: bug