---
title: Run OpenAI models in batch endpoints
titleSuffix: Azure Machine Learning
description: In this article, learn how to use batch endpoints with OpenAI models.
services: machine-learning
ms.service: machine-learning
ms.subservice: inferencing
ms.topic: conceptual
author: santiagxf
ms.author: fasantia
ms.reviewer: mopeakande
ms.date: 11/04/2023
ms.custom: how-to, devplatv2, update-code
---

# Run OpenAI models in batch endpoints to compute embeddings

[!INCLUDE cli v2]

Batch endpoints can deploy models to run inference over large amounts of data, including OpenAI models. In this example, you learn how to create a batch endpoint that deploys the ADA-002 model from OpenAI to compute embeddings at scale, but you can use the same approach for completions and chat completions models. The example uses Microsoft Entra authentication to grant access to the Azure OpenAI resource.

## About this example

In this example, we're going to compute embeddings over a dataset using the ADA-002 model from OpenAI. We register the model in MLflow format using the OpenAI flavor, which supports orchestrating all the calls to the OpenAI service at scale.

[!INCLUDE machine-learning-batch-clone]

The files for this example are in:

```azurecli
cd endpoints/batch/deploy-models/openai-embeddings
```

### Follow along in Jupyter Notebooks

You can follow along with this sample in the following notebooks. In the cloned repository, open the notebook deploy-and-test.ipynb.

## Prerequisites

[!INCLUDE machine-learning-batch-prereqs]

### Ensure you have an OpenAI deployment

The example shows how to run OpenAI models hosted in Azure OpenAI Service. To do so successfully, you need an Azure OpenAI resource correctly deployed in Azure and a deployment for the model you want to use.

:::image type="content" source="./media/how-to-use-batch-model-openai-embeddings/aoai-deployments.png" alt-text="A screenshot showing the Azure OpenAI studio with the list of model deployments available.":::
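
If you still need to create a model deployment, one way is with the Azure CLI. The following is a sketch only; the deployment name, model version, and SKU values are assumptions you should adapt to your resource:

```azurecli
az cognitiveservices account deployment create \
    --resource-group <openai-resource-group-name> \
    --name <your-azure-openai-resource-name> \
    --deployment-name text-embedding-ada-002 \
    --model-name text-embedding-ada-002 \
    --model-version "2" \
    --model-format OpenAI \
    --sku-name Standard \
    --sku-capacity 1
```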

Take note of the OpenAI resource being used. We use the name to construct the URL of the resource. Save the URL for later use in this tutorial.

```azurecli
OPENAI_API_BASE="https://<your-azure-openai-resource-name>.openai.azure.com"
```

```python
openai_api_base = "https://<your-azure-openai-resource-name>.openai.azure.com"
```

### Ensure you have a compute cluster to deploy the endpoint

Batch endpoints use a compute cluster to run the models. In this example, we use a compute cluster called batch-cluster. We create the compute cluster here, but you can skip this step if you already have one:

```azurecli
COMPUTE_NAME="batch-cluster"
az ml compute create -n $COMPUTE_NAME --type amlcompute --min-instances 0 --max-instances 5
```

[!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=create_compute)]


### Decide on the authentication mode

You can access the Azure OpenAI resource in two ways:

* Using Microsoft Entra authentication (recommended).
* Using an access key.

Using Microsoft Entra is recommended because it helps you avoid managing secrets in the deployments.

You can configure the identity of the compute to have access to the Azure OpenAI deployment to get predictions. In this way, you don't need to manage permissions for each user of the endpoint. To give the identity of the compute cluster access to the Azure OpenAI resource, follow these steps:

  1. Ensure or assign an identity to the compute cluster your deployment uses. In this example, we use a compute cluster called batch-cluster and assign a system-assigned managed identity, but you can use other alternatives.

    ```azurecli
    COMPUTE_NAME="batch-cluster"
    az ml compute update --name $COMPUTE_NAME --identity-type system_assigned
    ```

  2. Get the managed identity principal ID assigned to the compute cluster you plan to use:

    ```azurecli
    PRINCIPAL_ID=$(az ml compute show -n $COMPUTE_NAME --query identity.principal_id -o tsv)
    ```

  3. Get the unique ID of the resource group where the Azure OpenAI resource is deployed:

    ```azurecli
    RG="<openai-resource-group-name>"
    RESOURCE_ID=$(az group show -g $RG --query "id" -o tsv)
    ```

  4. Grant the role Cognitive Services User to the managed identity:

    ```azurecli
    az role assignment create --role "Cognitive Services User" --assignee $PRINCIPAL_ID --scope $RESOURCE_ID
    ```
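
    Optionally, you can verify that the role assignment is in place:

    ```azurecli
    az role assignment list --assignee $PRINCIPAL_ID --scope $RESOURCE_ID -o table
    ```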
    

Alternatively, you can get an access key and configure the batch deployment to use it to get predictions. Grab the access key from your account and keep it for later use in this tutorial.
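
For example, you can retrieve the keys with the Azure CLI, assuming the resource and group names used earlier in this tutorial:

```azurecli
az cognitiveservices account keys list \
    --name <your-azure-openai-resource-name> \
    --resource-group <openai-resource-group-name>
```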


## Register the OpenAI model

Model deployments in batch endpoints can only deploy registered models. You can use MLflow models with the OpenAI flavor to create a model in your workspace that references a deployment in Azure OpenAI.

  1. Create an MLflow model in the workspace's model registry pointing to your OpenAI deployment with the model you want to use. Use the MLflow SDK to create the model:

    > [!TIP]
    > The cloned repository already includes an MLflow model that generates embeddings based on the ADA-002 model in the folder model, in case you want to skip this step.

    ```python
    import mlflow
    import openai

    engine = openai.Model.retrieve("text-embedding-ada-002")

    model_info = mlflow.openai.save_model(
        path="model",
        model="text-embedding-ada-002",
        engine=engine.id,
        task=openai.Embedding,
    )
    ```
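
    As a quick local sanity check, you can load the saved model through MLflow's generic pyfunc interface; this sketch assumes your OpenAI credentials are already configured as environment variables for the OpenAI SDK:

    ```python
    import mlflow

    # Load the model saved in the "model" folder and embed a sample string.
    model = mlflow.pyfunc.load_model("model")
    print(model.predict(["Azure Machine Learning"]))
    ```
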
  2. Register the model in the workspace:

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deploy-and-run.sh" ID="register_model" :::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=register_model)]

## Create a deployment for an OpenAI model

  1. First, let's create the endpoint that hosts the model. Decide on the name of the endpoint:

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deploy-and-run.sh" ID="name_endpoint" :::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=name_endpoint)]

  2. Configure the endpoint:

    The following YAML file defines a batch endpoint:

    endpoint.yml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/endpoint.yml":::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=configure_endpoint)]
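
    For reference, a minimal batch endpoint definition looks roughly like the following sketch; the name is a placeholder, and the endpoint.yml file in the repository is authoritative:

    ```yaml
    $schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
    name: text-embeddings-batch    # hypothetical endpoint name
    description: Batch endpoint to compute embeddings with an OpenAI model.
    auth_mode: aad_token
    ```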

  3. Create the endpoint resource:

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deploy-and-run.sh" ID="create_endpoint" :::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=create_endpoint)]

  4. Our scoring script uses some specific libraries that aren't part of the standard OpenAI SDK, so we need to create an environment that has them. Here, we configure an environment with a base image and a conda YAML file.

    environment/environment.yml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/environment/environment.yml":::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=configure_environment)]


    The conda YAML looks as follows:

    conda.yaml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/environment/conda.yaml":::

  5. Let's create a scoring script that performs the execution. In batch endpoints, MLflow models don't require a scoring script. However, in this case we want to extend the capabilities of batch endpoints to:

    > [!div class="checklist"]
    > * Allow the endpoint to read multiple data types, including csv, tsv, parquet, json, jsonl, arrow, and txt.
    > * Add some validations to ensure the MLflow model used has an OpenAI flavor on it.
    > * Format the output in jsonl format.
    > * Add an environment variable AZUREML_BI_TEXT_COLUMN to optionally control which input field you want to generate embeddings for.

    > [!TIP]
    > By default, MLflow uses the first text column available in the input data to generate embeddings from. Use the environment variable AZUREML_BI_TEXT_COLUMN with the name of an existing column in the input dataset to change the column if needed. Leave it blank if the default behavior works for you.
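
    For example, the override could be set in the deployment's environment_variables section, as in this sketch (the column name text is an assumption about your input data):

    ```yaml
    environment_variables:
      AZUREML_BI_TEXT_COLUMN: "text"    # hypothetical column name
    ```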

    The scoring script looks as follows:

    code/batch_driver.py

    :::code language="python" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/code/batch_driver.py" :::

  6. Once the scoring script is created, it's time to create a batch deployment for it. We use environment variables to configure the OpenAI deployment. Specifically, we use the following keys:

    * OPENAI_API_BASE, the URL of the Azure OpenAI resource to use.
    * OPENAI_API_VERSION, the version of the API you plan to use.
    * OPENAI_API_TYPE, the type of API and authentication you want to use.

    The environment variable OPENAI_API_TYPE="azure_ad" instructs OpenAI to use Microsoft Entra authentication, so no key is required to invoke the OpenAI deployment. The identity of the cluster is used instead.

    To use access keys instead of Microsoft Entra authentication, set the following environment variables, as sketched after this list:

    * Use OPENAI_API_TYPE="azure"
    * Use OPENAI_API_KEY="<YOUR_AZURE_OPENAI_KEY>"
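
    A sketch of how the key-based configuration might appear in the deployment YAML; the API version is an assumption, and in practice you should avoid committing real keys to source control (source them from a secret store instead):

    ```yaml
    environment_variables:
      OPENAI_API_TYPE: azure
      OPENAI_API_BASE: https://<your-azure-openai-resource-name>.openai.azure.com
      OPENAI_API_VERSION: 2023-03-15-preview    # hypothetical API version
      OPENAI_API_KEY: <YOUR_AZURE_OPENAI_KEY>   # placeholder; don't hardcode real keys
    ```
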
  7. Once we've decided on the authentication and the environment variables, we can use them in the deployment. The following example shows how to use Microsoft Entra authentication:

    deployment.yml

    :::code language="yaml" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deployment.yml" highlight="26-28":::

    > [!TIP]
    > Notice the environment_variables section, where we indicate the configuration for the OpenAI deployment. The value for OPENAI_API_BASE is set later in the creation command so you don't have to edit the YAML configuration file.

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=configure_deployment)]


  8. Now, let's create the deployment.

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deploy-and-run.sh" ID="create_deployment" :::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=create_deployment)]

    Finally, set the new deployment as the default one:

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=set_default_deployment)]

  9. At this point, our batch endpoint is ready to be used.

## Test the deployment

To test the endpoint, we use a sample of the dataset BillSum: A Corpus for Automatic Summarization of US Legislation. This sample is included in the repository, in the folder data.

  1. Create a data input for this model:

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/imagenet-classifier/deploy-and-run.sh" ID="show_job_in_studio" :::

    ml_client.jobs.get(job.name)
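    A minimal sketch with the Azure Machine Learning Python SDK v2, assuming the sample files live in the repository's local data folder:

    ```python
    from azure.ai.ml import Input
    from azure.ai.ml.constants import AssetTypes

    # Point the job input at the folder that contains the sample files.
    input = Input(type=AssetTypes.URI_FOLDER, path="data")
    ```
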
  2. Invoke the endpoint:

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deploy-and-run.sh" ID="start_batch_scoring_job" :::

    > [!TIP]
    > [!INCLUDE batch-endpoint-invoke-inputs-sdk]

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=start_batch_scoring_job)]

  3. Track the progress:

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deploy-and-run.sh" ID="show_job_in_studio" :::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=get_job)]

  4. Once the job is finished, we can download the predictions. Use the following command:

    :::code language="azurecli" source="~/azureml-examples-main/cli/endpoints/batch/deploy-models/openai-embeddings/deploy-and-run.sh" ID="download_outputs" :::

    [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/openai-embeddings/deploy-and-test.ipynb?name=download_outputs)]

  5. The output predictions look like the following:

    ```python
    import pandas as pd

    embeddings = pd.read_json("named-outputs/score/embeddings.jsonl", lines=True)
    embeddings
    ```

    embeddings.jsonl

    ```json
    {
        "file": "billsum-0.csv",
        "row": 0,
        "embeddings": [
            [0, 0, 0, 0, 0, 0, 0]
        ]
    }
    {
        "file": "billsum-0.csv",
        "row": 1,
        "embeddings": [
            [0, 0, 0, 0, 0, 0, 0]
        ]
    }
    ```
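
    As a usage sketch, you could compare two of the returned vectors with cosine similarity; the indexing assumes the jsonl structure shown above:

    ```python
    import numpy as np

    # Cosine similarity between the embeddings of the first two rows.
    a = np.array(embeddings.loc[0, "embeddings"][0])
    b = np.array(embeddings.loc[1, "embeddings"][0])
    print(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ```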

## Next steps