Simplify remaining autofill logic in Triton core #3567
Conversation
config->platform().empty() &&
    config->default_model_filename().empty() && has_version) {
  bool is_dir = false;
  if (version_dir_content.find(kPyTorchLibTorchFilename) !=
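For context, the quoted fragment is part of a guard that only runs server-side autofill when the user left the relevant config fields empty. A minimal hypothetical sketch of that guard in C++ (the `Config` stand-in and the function name are illustrative, not the actual Triton source; `kPyTorchLibTorchFilename` mirrors the real constant and is assumed here to be `"model.pt"`):

```cpp
#include <set>
#include <string>

// Simplified stand-in for Triton's model config; the real type is the
// generated ModelConfig protobuf. Illustrative sketch only.
struct Config {
  std::string platform_;
  std::string default_model_filename_;
  const std::string& platform() const { return platform_; }
  const std::string& default_model_filename() const {
    return default_model_filename_;
  }
};

// Assumed value of Triton's kPyTorchLibTorchFilename constant.
const std::string kPyTorchLibTorchFilename = "model.pt";

// Server-side autofill only inspects version-directory filenames when the
// user left both `platform` and `default_model_filename` empty and at
// least one version directory exists.
bool ShouldAutofillPyTorch(
    const Config& config, const std::set<std::string>& version_dir_content,
    bool has_version)
{
  if (config.platform().empty() && config.default_model_filename().empty() &&
      has_version) {
    return version_dir_content.find(kPyTorchLibTorchFilename) !=
           version_dir_content.end();
  }
  return false;
}
```

Under these assumptions, a version directory containing an Inferentia-compiled file named model.pt would satisfy the check even though the model is not a libtorch model, which is the failure mode this thread discusses.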
@tanmayv25 the input for the Python backend is going to be a model.pt for Inferentia, right? Maybe we should add a comment or note clarifying that if the model file is named model.pt and the user runs autofill without specifying a backend, the PyTorch backend will be autofilled.
I believe the same holds for TF + Inferentia.
Isn't Inferentia a special "Python model"? Won't it have its own file structure inside the version directory, even if it loads a PyTorch / TF model on its own?
@tanmayv25 can you comment on the above?
The Triton core will not be aware of whether the final model executed within the Inferentia model.py is a model.pt or a model.savedmodel. Triton will execute the Python script in model.py like any other. The model.pt is only accessed inside the Python subprocess, so it is not tied to the core; the model.pt for Inferentia is hidden within model.py.
We have tried to keep the Inferentia implementation decoupled from the core, and the implementation is not related to the PyTorch backend at all. We would not like to impose such a dependency in the autofiller, at least for now.
FYI, the Triton core autofill exists for backward compatibility, so it only tries to fill in the backends that were supported before the backend API (you can even make Triton recognize fewer backends with the TRITON_ENABLE_XXX CMake flags). For backends implemented after the backend API, you must provide at least the "backend" field in the config, even if you want the backend to perform further autofilling; then this part of the logic won't be reached.
So if you want to deploy an Inferentia model, then backend: "inferentia" ("python"?)
would have to be specified, and the Triton core autofill will not attempt to look at the filename. I think the case @CoderHam mentioned only happens when someone wants Triton core to misconfigure the backend.
I think the case @CoderHam mentioned only happens when someone wants Triton core to misconfigure the backend.
I see. So you mean a user can place their Inferentia-compiled model files in the version directory and name them model.savedmodel or model.pt. Then they may try to use the autofiller, which will incorrectly detect the model as belonging to the TF or libtorch backend. Is this the problem being discussed here?
The script we provide for generating config.pbtxt for Inferentia models does set backend: "python", so this logic should ideally not be touched. But I am not against adding additional checks; I just wonder what those would be. If you detect model.py in the version directory, then don't populate the backend field?
Am I right? Or have I completely misunderstood the discussion here?
Yes, that is what my concern was, @tanmayv25. If the user will always have the backend specified in the model config for Inferentia models, we should be good with the current checks plus checking for model.py.
If you detect model.py in version directory then don't populate backend field?
I think we should autofill the backend to be python if there is a model.py in the version directory. That said, it could be a follow-up PR.
@GuanLuo approving these changes; we can address them in the future, but I just wanted to know what path we will take in such cases.
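The follow-up check proposed above (autofill the backend to "python" when model.py is present) could look roughly like this hypothetical sketch. The function and the `kPythonFilename` constant are illustrative assumptions, not the actual Triton implementation:

```cpp
#include <set>
#include <string>

// Hypothetical constant, mirroring the naming convention of Triton's
// kPyTorchLibTorchFilename; assumed to be "model.py" for the Python backend.
const std::string kPythonFilename = "model.py";

// Sketch of the proposed follow-up: check for model.py before the existing
// PyTorch / TensorFlow filename checks, and return the backend name to
// autofill, or an empty string to leave the `backend` field unset.
std::string AutofillBackendFromFilenames(
    const std::set<std::string>& version_dir_content)
{
  if (version_dir_content.count(kPythonFilename) > 0) {
    return "python";
  }
  if (version_dir_content.count("model.pt") > 0) {
    return "pytorch";  // libtorch backend
  }
  if (version_dir_content.count("model.savedmodel") > 0) {
    return "tensorflow";
  }
  return "";  // unknown layout: leave `backend` unset
}
```

Checking model.py first means an Inferentia model directory, which always contains model.py, would never be misdetected as a libtorch or TF model even if it also contains a model.pt.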
I think we should keep the Triton core autofill working just for backward compatibility. We already decouple the backends from the Triton core, and we should keep it this way: we don't want to add backend-specific handling into the core every time we add a new backend. Autofill with the backend API will all be done inside the backend, and Triton will require the "backend" field to be set regardless, so it knows which backend it should pass control to.
If you detect model.py in version directory then don't populate backend field?
Triton will not recognize default model filenames other than those of the backends we had before the backend API, so it will ignore them and look for the other filenames.
src/core/model_config_utils.cc
Outdated
#endif  // TRITON_ENABLE_PYTORCH
}
// Server side autofill only tries to determine the backend of the model, and
// it is only for baackends known by Triton. Because extracting more detailed
baackends -> backends
The comment could also be improved, along these lines:
Server-side autofill only sets backend fields for models that belong to a limited set of backends, for backwards compatibility. See <link to the known backends>. Extracting more detailed information is delegated to the backend implementation to auto-complete.
849285e
…rver#3567)
* Remove backend specific autofill class
* Remove platform checking
* Fix u
* Improve inline comment
No description provided.