🔥 Hands-on large-model deployment: TensorRT-LLM, Triton Inference Server, vLLM

DataXujing/TensorRT-LLM-ChatGLM3

Accelerated large-model deployment: TensorRT-LLM, Triton Inference Server, vLLM, LangChain

Based on ChatGLM3

  • Model walkthrough of ChatGLM3-6B and Hugging Face deployment (streaming and non-streaming)
  • TensorRT-LLM features, installation, and large-model deployment (streaming and non-streaming)
  • Deploying the trtllm backend and vllm backend of Triton Inference Server
  • vLLM features, installation, and large-model deployment
  • RAG with LangChain (ChatGLM3-6B)
  • RAG with LangChain + TensorRT-LLM
  • RAG with LangChain + Triton Inference Server
  • RAG with LangChain + vLLM
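Both the vLLM and Triton deployments above can expose an OpenAI-compatible HTTP API, which is what the LangChain RAG pieces then call into. As a minimal sketch, the request payload for such an endpoint (streaming or non-streaming) can be built as below; the endpoint URL and served model name are assumptions, and the actual HTTP call is left commented out so the snippet runs without a live server:

```python
import json

# Assumed endpoint of an OpenAI-compatible server, e.g. one started with
# `python -m vllm.entrypoints.openai.api_server --model THUDM/chatglm3-6b --trust-remote-code`
API_URL = "http://localhost:8000/v1/chat/completions"  # assumption, adjust to your deployment


def build_chat_request(prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-style chat-completion payload.

    stream=True asks the server for token-by-token (streaming) output;
    stream=False returns the whole completion at once.
    """
    return {
        "model": "THUDM/chatglm3-6b",  # served model name, assumed
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "stream": stream,
    }


payload = build_chat_request("介绍一下TensorRT-LLM", stream=True)
print(json.dumps(payload, ensure_ascii=False, indent=2))
# To actually send it: requests.post(API_URL, json=payload, stream=payload["stream"])
```

The same payload shape works against either backend, which is why the LangChain layers in the list above can swap vLLM and Triton without changing the client code.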

For the detailed slide deck, please request it in an issue!
