Online endpoint deployment failing for custom models #2760

Open
obiii opened this issue Oct 25, 2023 · 1 comment
obiii commented Oct 25, 2023

Operating System

Linux

Version Information

Python Version: 3.10
SDK: V2
azure-ai-ml package version: 1.8.0

Steps to reproduce

Hi,

I am following the notebook to deploy a model to an online endpoint.

While deploying using:

from azure.ai.ml.entities import (
    CodeConfiguration,
    Environment,
    ManagedOnlineDeployment,
    Model,
)

model = Model(path="../model-1/model/sklearn_regression_model.pkl")
env = Environment(
    image="acrestmlopsdev.azurecr.io/reg_env_dd2",
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=online_endpoint_name,
    model=model,
    environment=env,
    instance_type="Standard_F4s_v2",
    code_configuration=CodeConfiguration(
        code="../model-1/onlinescoring", scoring_script="score.py"
    ),
    instance_count=1,
    egress_public_network_access="disabled"
)
ml_client.online_deployments.begin_create_or_update(blue_deployment).result()

the following error occurs: A required package azureml-inference-server-http is missing.

The environment we are using is registered in the AzureML workspace.

The Dockerfile and conda dependency file used to create the Docker image in ACR are as follows:

Dockerfile:

# Base image: AzureML OpenMPI 4.1.0 / Ubuntu 20.04
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04

COPY deps.yml conda_env.yml

RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN echo "source /opt/miniconda/etc/profile.d/conda.sh && conda activate" >> ~/.bashrc

RUN cat conda_env.yml

RUN source /opt/miniconda/etc/profile.d/conda.sh && \
    conda activate && \
    conda install conda && \
    pip install cmake && \
    conda env update -f conda_env.yml

deps.yml:

name: model-env
channels:
  - conda-forge
dependencies:
  - python=3.7
  - numpy=1.21.2
  - pip=21.2.4
  - scikit-learn=0.24.2
  - scipy=1.7.1
  - pip:
    - inference-schema[numpy-support]==1.5
    - joblib==1.0.1
    - azureml-inference-server-http

deps.yml lists azureml-inference-server-http as a dependency, the Docker image builds fine, and the AzureML environment built from that image is created without errors.
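For completeness, a variant of the Dockerfile that installs the inference server with explicit interpreter paths may help rule out environment ambiguity. This is an untested sketch; it assumes the base miniconda env at /opt/miniconda is the Python the container serves from (that path is inferred from the conda.sh path used above and may differ in other images):

```dockerfile
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04

COPY deps.yml conda_env.yml

# Update the *base* conda env by name, then install the AzureML inference
# server with that same interpreter's pip, so there is no ambiguity about
# which Python receives the package.
RUN /opt/miniconda/bin/conda env update -n base -f conda_env.yml && \
    /opt/miniconda/bin/pip install azureml-inference-server-http
```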

Expected behavior

The expected behavior is that the online endpoint deploys successfully.

Actual behavior

The deployment fails with the following error:

2023-10-25T15:41:55,383013296+00:00 | gunicorn/run |
2023-10-25T15:41:55,384232095+00:00 | gunicorn/run | Entry script directory: /var/azureml-app/onlinescoring/.
2023-10-25T15:41:55,385439495+00:00 | gunicorn/run |
2023-10-25T15:41:55,386724694+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,387960893+00:00 | gunicorn/run | Dynamic Python Package Installation
2023-10-25T15:41:55,389318393+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,390611292+00:00 | gunicorn/run |
2023-10-25T15:41:55,392044492+00:00 | gunicorn/run | Dynamic Python package installation is disabled.
2023-10-25T15:41:55,393430091+00:00 | gunicorn/run |
2023-10-25T15:41:55,394692890+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,395941190+00:00 | gunicorn/run | Checking if the Python package azureml-inference-server-http is installed
2023-10-25T15:41:55,397200089+00:00 | gunicorn/run | ###############################################
2023-10-25T15:41:55,398420089+00:00 | gunicorn/run |
2023-10-25T15:41:55,663463169+00:00 | gunicorn/run | A required package azureml-inference-server-http is missing. Please install azureml-inference-server-http before trying again
2023-10-25T15:41:55,666521767+00:00 - gunicorn/finish 100 0
2023-10-25T15:41:55,667702367+00:00 - Exit code 100 is not normal. Killing image

Additional information

No response

@obiii obiii added the bug label Oct 25, 2023

obiii commented Oct 26, 2023

An update:

We have resolved the deployment issue; however, the resolution raised an interesting observation. Despite including all packages in the deps.yml file, we encountered deployment failures indicating that the package needed to be installed, even though the Docker image built successfully and the environment was created from that image.

When loading the Azure ML Environment, we specified the conda_file parameter and provided the path to the deps.yml file. For instance:


env = Environment(
    conda_file="deps.yml",
    image="acrestmlopsdev.azurecr.io/reg_env_dd2",
)

This approach resolved the problem, although it remains somewhat unclear why the dependency had to be specified again at environment load time when it was already baked into the image.
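One stdlib-only way to sanity-check which interpreter actually has the serving packages, run locally or inside the container (e.g. via docker run). This is a minimal sketch; the listed import names are assumptions about what the serving stack needs, and azureml-inference-server-http imports as azureml_inference_server_http:

```python
import importlib.util
import sys

def missing_packages(names):
    """Return the names that the current interpreter cannot import."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    # Assumed import names for the serving stack; adjust for your env.
    required = ["azureml_inference_server_http", "joblib", "sklearn"]
    gaps = missing_packages(required)
    print(f"python: {sys.executable}")
    print("missing:", gaps if gaps else "none")
```

Running this with the container's Python (rather than the host's) shows immediately whether the package landed in the interpreter the serving process uses.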

I would be thankful if anyone could shed some light here.

Thanks,
OR
