We need to check whether we can optimize the performance of ML models deployed on ODAHU as a web API,
because our customers report very poor performance on batch predictions (millions of rows) in comparison with local prediction (direct class invocation).
As a result we should have:
an optimized ODAHU stack without the model invoke API,
or
a proposal to change or add a new API for batch processing.
Generally we have the following layers on top of the original model:
original model – a Python class that is usually provided by an ML framework (scikit-learn, TensorFlow, etc.)
MLFlow flavor – our MLFlow toolchain works with original models wrapped in the MLFlow pyfunc flavor (https://github.com/odahu/odahu-trainer/blob/6c1b4d33f4bc755402f42e8b989ca5c6b811cdcc/mlflow/odahuflow/mlflowrunner/templates/entrypoint.py#L95)
GPPI wrapper – our MLFlow toolchain adds a layer on top of the MLFlow pyfunc flavor (https://github.com/odahu/odahu-trainer/blob/6c1b4d33f4bc755402f42e8b989ca5c6b811cdcc/mlflow/odahuflow/mlflowrunner/templates/entrypoint.py#L99)
HTTP API – the ODAHU packager adds an HTTP API + JSON parser in front of model invocation (https://github.com/odahu/odahu-packager/blob/058056694c3a71cdf1961bdd5f0dddc02e341050/packagers/docker/odahuflow/packager/rest/resources/odahuflow_handler.py#L231)
Docker network – the ODAHU packager packs the model into a Docker image, so we add Docker network overhead
Kubernetes network – additional network overhead after the Docker image is deployed into Kubernetes Knative
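To locate where the overhead sits, each layer can be timed in isolation before any network is involved. The sketch below is a minimal, hypothetical micro-benchmark: `DummyModel` stands in for the original model class, and the JSON round-trip approximates the serialization work the HTTP API layer performs per request (it is not the actual ODAHU handler code).

```python
import json
import time

# Hypothetical stand-in for the original model class; in ODAHU this would be
# e.g. a scikit-learn estimator wrapped in the MLFlow pyfunc flavor.
class DummyModel:
    def predict(self, rows):
        # Trivial per-row computation so the model itself is cheap.
        return [sum(row) for row in rows]

def time_direct(model, rows):
    """Layer 0: direct class invocation (the local-prediction baseline)."""
    start = time.perf_counter()
    model.predict(rows)
    return time.perf_counter() - start

def time_json_roundtrip(model, rows):
    """Approximates the HTTP API layer: serialize the request body to JSON,
    parse it back, predict, then serialize the response."""
    start = time.perf_counter()
    body = json.dumps({"columns": ["a", "b"], "data": rows})
    parsed = json.loads(body)
    result = model.predict(parsed["data"])
    json.dumps({"prediction": result})
    return time.perf_counter() - start

rows = [[float(i), float(i + 1)] for i in range(100_000)]
model = DummyModel()
t_direct = time_direct(model, rows)
t_json = time_json_roundtrip(model, rows)
print(f"direct: {t_direct:.4f}s, json round-trip: {t_json:.4f}s")
```

Running the same comparison against the real GPPI wrapper and the packager's JSON handler would show how much of the batch-prediction slowdown is serialization rather than network.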
==
Task:
Find where most of the overhead is located
Decide whether we can optimize (in case of suboptimal code, etc.)
Optimize, or report a proposal for API changes
Pre-conditions: we assume that the user should find a reasonable balance between the amount of data in the body of each API request and the number of such requests, in order to decrease network latency.
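The balance described above can be sketched as a chunked batch client: split the millions of rows into request-sized chunks, so each request body stays bounded while the request count stays manageable. The endpoint transport is injected as a function here (the real ODAHU model-invoke URL and payload schema are assumptions, so a fake in-process sender is used instead).

```python
import json

def chunked(rows, chunk_size):
    """Split a large batch into request-sized chunks."""
    for i in range(0, len(rows), chunk_size):
        yield rows[i:i + chunk_size]

def batch_predict(rows, send_request, chunk_size=10_000):
    """Send `rows` in chunks; `send_request` posts one JSON body to the
    model endpoint (injected so this sketch stays network-free)."""
    predictions = []
    for chunk in chunked(rows, chunk_size):
        body = json.dumps({"data": chunk})
        predictions.extend(send_request(body))
    return predictions

# Fake transport standing in for an HTTP POST to the model-invoke API.
def fake_send(body):
    rows = json.loads(body)["data"]
    return [sum(r) for r in rows]

rows = [[i, i + 1] for i in range(25_000)]
preds = batch_predict(rows, fake_send, chunk_size=10_000)
print(len(preds))
```

With `chunk_size=10_000`, the 25,000 rows go out as 3 requests; tuning that one parameter is exactly the request-size vs. request-count trade-off the pre-condition describes.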