
Simplify remaining autofill logic in Triton core #3567

Merged: 4 commits merged into main from gluo-autofill_cleanup, Nov 15, 2021
Conversation

GuanLuo (Contributor) commented Nov 12, 2021

No description provided.

```cpp
config->platform().empty() &&
config->default_model_filename().empty() && has_version) {
bool is_dir = false;
if (version_dir_content.find(kPyTorchLibTorchFilename) !=
```
CoderHam (Contributor):

@tanmayv25 the input for the python backend is going to be a model.pt for Inferentia, right? Maybe we should add a comment or note clarifying that if the model file is named model.pt and the user uses autofill without specifying a backend, the pytorch backend will be autofilled.

I believe the same holds for TF + Inferentia.
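
For readers skimming the thread: the behavior under discussion boils down to a filename scan of the version directory. A minimal sketch of that idea, assuming a set-like `version_dir_content` (illustrative only, not the exact Triton source):

```cpp
#include <set>
#include <string>

// Minimal sketch of the legacy filename-based detection discussed here.
// In Triton's constants, kPyTorchLibTorchFilename is "model.pt"; the
// string literals are written out for clarity.
std::string
GuessBackendFromVersionDir(const std::set<std::string>& version_dir_content)
{
  // A bare model.pt makes the legacy autofill pick the libtorch backend...
  if (version_dir_content.count("model.pt") > 0) {
    return "pytorch";
  }
  // ...and a model.savedmodel makes it pick the TensorFlow backend.
  if (version_dir_content.count("model.savedmodel") > 0) {
    return "tensorflow";
  }
  // Anything else (e.g. an Inferentia model.py) is not recognized here.
  return "";
}
```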

GuanLuo (Contributor, Author):

Isn't Inferentia a special "Python model"? Won't it have its own file structure inside the version directory even if it loads a PyTorch / TF model on its own?

CoderHam (Contributor):

@tanmayv25 can you comment on the above?

tanmayv25 (Contributor):

The Triton core will not be aware of whether the final model executed within the Inferentia model.py is a model.pt or a model.savedmodel. Triton executes the Python script in model.py like any other Python model. The model.pt is only accessed inside the Python subprocess, so it is not tied to the core; the Inferentia model.pt is hidden within model.py.

We have tried to keep the Inferentia implementation decoupled from the core, and the implementation is not related to the pytorch backend at all. We would not like to impose such a dependency in the autofiller, at least for now.
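
To make the layout concrete, a hypothetical Inferentia model directory might look like this (the model name is made up; the point is that the core only ever sees model.py):

```
my_inferentia_model/
├── config.pbtxt      # generated by the provided script; sets backend: "python"
└── 1/
    ├── model.py      # the script Triton executes via the python backend
    └── model.pt      # compiled model, opened only inside model.py's subprocess
```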

GuanLuo (Contributor, Author):

FYI, the Triton core autofill exists for backward compatibility, so it only tries to fill in the backends that were supported before the backend API (you can even make Triton recognize fewer backends with the TRITON_ENABLE_XXX CMake flags). For a backend implemented after the backend API, you must provide at least the "backend" field in the config even if you want the backend to perform further autofilling, so this part of the logic won't be reached.

So if you want to deploy an Inferentia model, then backend: "inferentia" ("python"?) would have to be specified, and the Triton core autofill will not attempt to look at the filename. I think the case @CoderHam mentioned only happens when someone wants Triton core to misconfigure the backend.
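
As a concrete illustration, a minimal config.pbtxt along these lines keeps a model out of the legacy autofill path entirely (a sketch; the name and max_batch_size are made-up values, the explicit backend field is the point):

```
name: "my_inferentia_model"  # hypothetical model name
backend: "python"            # explicit backend => core never inspects filenames
max_batch_size: 8            # illustrative only
```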

tanmayv25 (Contributor) commented Nov 12, 2021:

> I think the case @CoderHam mentioned only happens when someone wants Triton core to misconfigure the backend.

I see. So you mean a user can place their Inferentia-compiled model files in the version directory and call them model.savedmodel or model.pt, and then try to use the autofiller, which will incorrectly detect the model as the TF or libtorch backend. Is this the problem being discussed here?

The script we provide for generating config.pbtxt for Inferentia models does set backend: "python", so ideally this logic should never be reached. I am not against adding additional checks, but I wonder what those would be. If you detect model.py in the version directory, then don't populate the backend field?

Am I right? Or have I completely misunderstood the discussion here?

CoderHam (Contributor):

Yes, that is what my concern was, @tanmayv25. If the user will always have the backend specified in the model config for Inferentia models, we should be good with the current checks plus checking for model.py.

> If you detect model.py in the version directory, then don't populate the backend field?

I think we should autofill the backend to be python if there is a model.py in the version directory. That said, it could be a follow-up PR.
@GuanLuo I am approving these changes; we can address this in the future, but I just wanted to know what path we will take in such cases.

GuanLuo (Contributor, Author):

I think we should keep the Triton core autofill working just for backward compatibility; we have already decoupled the backends from the Triton core and we should keep it this way. We don't want to add backend-specific handling into the core every time we add a new backend. With the backend API, autofill is all done inside the backend, and Triton requires the "backend" field to be set regardless, so it knows which backend to pass control over to.

> If you detect model.py in the version directory, then don't populate the backend field?

Triton will not recognize default model filenames other than those of the backends that existed before the backend API, so it will ignore model.py and look for the other filenames.
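
For reference, the pre-backend-API default filenames the core autofill does recognize are, to the best of my knowledge (from Triton's constants; treat this as an assumption rather than documentation):

* model.graphdef / model.savedmodel (TensorFlow)
* model.plan (TensorRT)
* model.onnx (ONNX Runtime)
* model.pt (PyTorch / libtorch)

model.py is not in this set, which is why the autofiller simply skips over it.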

deadeyegoodwin previously approved these changes Nov 12, 2021
```cpp
#endif  // TRITON_ENABLE_PYTORCH
}
// Server side autofill only tries to determine the backend of the model, and
// it is only for baackends known by Triton. Because extracting more detailed
```
tanmayv25 (Contributor) commented Nov 12, 2021:

baackends -> backends

The comment can also be improved, e.g.:

Server-side autofill only sets the backend field, and only for models that belong to the limited set of backends supported for backwards compatibility. See <link to the known backends>. Extracting more detailed information is delegated to the backend implementation's auto-complete.


CoderHam previously approved these changes Nov 13, 2021
GuanLuo merged commit 2d432ca into main Nov 15, 2021
GuanLuo deleted the gluo-autofill_cleanup branch November 15, 2021 20:08
nealvaidya pushed a commit to nealvaidya/server that referenced this pull request on Jan 13, 2022 (…rver#3567):

* Remove backend specific autofill class
* Remove platform checking
* Fix u
* Improve inline comment