This repository was archived by the owner on Jul 4, 2025. It is now read-only.
forked from NVIDIA/TensorRT-LLM
Cortex.Tensorrt-LLM is a C++ inference library that any server can load at runtime. It includes NVIDIA’s TensorRT-LLM as a submodule for GPU-accelerated inference on NVIDIA GPUs.
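To illustrate what "loaded by any server at runtime" can mean in practice, here is a minimal sketch of runtime loading on Linux using the POSIX `dlopen`/`dlsym` calls. This page does not document the library's exported entry points, so the sketch uses the system math library (`libm.so.6`) and its `cos` function as a stand-in. The `DynamicLibrary` class and `call_cos` function are hypothetical names introduced for this example and are not part of Cortex.Tensorrt-LLM.

```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper around a shared library handle (illustration only;
// not an API of Cortex.Tensorrt-LLM).
class DynamicLibrary {
 public:
  explicit DynamicLibrary(const std::string& path)
      : handle_(dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL)) {
    if (!handle_) {
      const char* err = dlerror();
      throw std::runtime_error(err ? err : "dlopen failed");
    }
  }

  ~DynamicLibrary() {
    if (handle_) dlclose(handle_);
  }

  // The handle is owned exclusively, so copying is disabled.
  DynamicLibrary(const DynamicLibrary&) = delete;
  DynamicLibrary& operator=(const DynamicLibrary&) = delete;

  // Resolve an exported symbol and cast it to the requested function pointer type.
  template <typename Fn>
  Fn symbol(const char* name) const {
    dlerror();  // clear any stale error state before the lookup
    void* addr = dlsym(handle_, name);
    if (const char* err = dlerror()) {
      throw std::runtime_error(err);
    }
    return reinterpret_cast<Fn>(addr);
  }

 private:
  void* handle_;
};

// Stand-in for a server calling into a library it loaded at runtime:
// loads libm and calls its `cos` function through the resolved pointer.
double call_cos(double x) {
  DynamicLibrary libm("libm.so.6");
  auto fn = libm.symbol<double (*)(double)>("cos");
  return fn(x);
}
```

A host server would follow the same pattern with the engine's shared object path and whatever entry point that library actually exports. On older glibc versions the program may need to be linked with `-ldl`.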
menloresearch/cortex.tensorrt-llm
Languages
- C++ 99.3%
- Python 0.5%
- Cuda 0.2%
- CMake 0.0%
- Shell 0.0%
- Smarty 0.0%