Not loaded: No model version was found #7420

Open
jadhosn opened this issue Jul 5, 2024 · 0 comments

jadhosn commented Jul 5, 2024

Description

Triton Information
Running Triton 24.04-py3 with a mix of models on the Python, TensorFlow, and ONNX Runtime backends.
The Triton container is pulled as-is from NGC.

To Reproduce
We have not been able to identify the root cause of some models failing to load on restart. Different models fail to load on each start-up, even though the same models load and execute fine when model-control-mode is set to explicit.

We have not found a pattern in which models fail (for example, a model type or a specific backend), although most of our models run on the ONNX Runtime backend. The failures appear random: a different set of models fails on each restart.

Because we cannot reproduce the failure consistently, we cannot provide a minimal reproducible example.
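
To make the "different models fail on each restart" observation concrete, one option might be to snapshot the model repository index after each start-up and compare the runs. The sketch below is only an illustration under assumptions: it assumes the HTTP endpoint is reachable at `http://localhost:8000` and that the server stays up despite load failures (for example, when started with `--exit-on-error=false`). The helper names `fetch_repository_index`, `unready_models`, and `diff_failures` are hypothetical and not part of Triton.

```python
import json
import urllib.request


def fetch_repository_index(base_url="http://localhost:8000"):
    """Query the model repository index (POST /v2/repository/index).

    base_url is an assumption; adjust it to the actual deployment.
    """
    req = urllib.request.Request(
        f"{base_url}/v2/repository/index",
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def unready_models(index):
    """Map "name:version" to the reported reason for every non-READY entry."""
    failed = {}
    for entry in index:
        if entry.get("state") != "READY":
            key = f"{entry.get('name')}:{entry.get('version')}"
            failed[key] = entry.get("reason", "")
    return failed


def diff_failures(first_run, second_run):
    """Compare the failed sets of two restarts."""
    a, b = set(first_run), set(second_run)
    return {
        "both": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
    }
```

Calling `unready_models(fetch_repository_index())` after each restart and passing two such results to `diff_failures` would show whether any model fails repeatedly or whether the failing set really does change every time.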

Expected behavior
All models should load successfully on server start-up if they have been tested and validated.

Traceback

I0705 14:49:55.104296 1 python_be.cc:2383] TRITONBACKEND_ModelFinalize: delete model state
I0705 14:49:55.104363 1 model_lifecycle.cc:620] successfully unloaded 'MYMODEL' version 1
I0705 14:49:55.718565 1 server.cc:347] Timeout 26: Found 0 live models and 0 in-flight non-inference requests
I0705 14:49:55.778384 1 backend_manager.cc:138] unloading backend 'onnxruntime'
I0705 14:49:55.779157 1 backend_manager.cc:138] unloading backend 'tensorflow'
I0705 14:49:55.779181 1 backend_manager.cc:138] unloading backend 'python'
I0705 14:49:55.779189 1 python_be.cc:2340] TRITONBACKEND_Finalize: Start
I0705 14:50:01.208390 1 python_be.cc:2345] TRITONBACKEND_Finalize: End
error: creating server: Internal - failed to load all models
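
The excerpt above covers only the shutdown phase, and every prefixed line in it is at severity `I` (info). The reason a given model failed is presumably logged earlier, at warning or error severity. As a rough sketch, assuming the full start-up log is still available as text, the lines worth attaching could be filtered like this. The regular expression is inferred from the line format shown above, and the helper names `parse_line` and `problem_lines` are hypothetical.

```python
import re

# Prefix format inferred from the excerpt:
# <severity><MMDD> <HH:MM:SS.micros> <thread id> <file>:<line>] <message>
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^\s:]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_line(raw):
    """Return the fields of one prefixed log line, or None if it has no prefix."""
    match = GLOG_LINE.match(raw.strip())
    return match.groupdict() if match else None


def problem_lines(log_text):
    """Keep warning/error/fatal lines plus unprefixed lines starting with 'error:'."""
    hits = []
    for raw in log_text.splitlines():
        line = raw.strip()
        record = parse_line(line)
        if record is not None:
            if record["severity"] in ("W", "E", "F"):
                hits.append(line)
        elif line.lower().startswith("error:"):
            hits.append(line)
    return hits
```

Run against the excerpt above, this keeps only the final `error: creating server` line, which suggests the informative entries sit earlier in the log than what was posted.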

Failures at timestamp 0
[image]

Failures at timestamp 1
[image]
