Currently, the only way to set world_size is via the --world_size argument of the ./scripts/launch_triton_server.py script.
So I can't use tritonserver directly to launch a tensorrtllm model with tp_size=4 and pp_size=1, because mpiSize is always 1. This means tensorrtllm_backend can't be launched with a plain tritonserver command.
Moreover, a single Triton server may host many tensorrtllm models with different tp_size and pp_size values, each requiring a different world_size.
So the best way to solve this would be to set it in a config file in the model repository, such as config.pbtxt or <version>/model.json, right?
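For illustration, if such an option existed, it could follow the existing `parameters` convention already used in the backend's config.pbtxt. Note this is purely a hypothetical sketch: "world_size" is not a parameter tensorrtllm_backend currently recognizes; it is the shape the requested feature might take.

```
# config.pbtxt (hypothetical sketch -- "world_size" is NOT an existing
# tensorrtllm_backend parameter; shown only to illustrate the request)
parameters: {
  key: "world_size"
  value: {
    string_value: "4"
  }
}
```

With something like this, each model in the repository could declare its own world size instead of relying on a single launcher-wide --world_size value.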
Could tensorrtllm_backend support such an option in its config file?
Thanks.