## Importing and Registering Hugging Face Models into Azure ML

This notebook demonstrates the process of importing models from the [Hugging Face hub](https://huggingface.co/models) and registering them into Azure Machine Learning (AML) for further use in various machine learning tasks.

### Why Hugging Face Models in PyRIT?
The primary goal of PyRIT is to assess the robustness of LLM endpoints against different harm categories such as fabrication/ungrounded content (e.g., hallucination), misuse (e.g., bias), and prohibited content (e.g., harassment). Hugging Face serves as a comprehensive repository of LLMs, many of which can generate diverse and complex prompts when given an appropriate system prompt. Models such as:
- ["TheBloke/llama2_70b_chat_uncensored-GGML"](https://huggingface.co/TheBloke/llama2_70b_chat_uncensored-GGML)
- ["cognitivecomputations/Wizard-Vicuna-30B-Uncensored"](https://huggingface.co/cognitivecomputations/Wizard-Vicuna-30B-Uncensored)
- ["lmsys/vicuna-13b-v1.1"](https://huggingface.co/lmsys/vicuna-13b-v1.1)

are particularly useful for generating prompts or scenarios without content moderation. These can be configured as part of a `RedTeamingBot` in PyRIT to create challenging and uncensored prompts/scenarios. These prompts are then submitted to the target chat bot, helping assess its ability to handle potentially unsafe, unexpected, or adversarial inputs.

### Important Note on Deploying Quantized Models
When deploying quantized models, especially those suffixed with GGML, FP16, or GPTQ, it's crucial to have GPU support. These models are optimized for performance but require the computational capabilities of GPUs to run. Ensure your deployment environment is equipped with the necessary GPU resources to handle these models.

### Supported Tasks
The import process supports a variety of tasks, including but not limited to:
- Text classification
- Text generation
- Question answering
- Summarization

### Import Process
The process involves downloading models from the Hugging Face hub, converting them to MLflow format for compatibility with Azure ML, and then registering them for easy access and deployment.

### Prerequisites
- An Azure account with an active subscription. [Create one for free](https://azure.microsoft.com/free/).
- An Azure ML workspace set up. [Learn how to set up a workspace](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-manage-workspace).
- Install the Azure ML client library for Python and the Azure Identity library with pip.
  ```bash
  pip install azure-ai-ml
  pip install azure-identity
  ```
- Execute the `az login` command to sign in to your Azure subscription. For detailed instructions, refer to the "Authenticate with Azure Subscription" section in the notebook provided [here](../setup/azure_openai_setup.ipynb).

## 1. Connect to Azure Machine Learning Workspace

Before we start, we need to connect to our Azure ML workspace. The workspace is the top-level resource for Azure ML, providing a centralized place to work with all the artifacts you create.

### Steps:
1. **Import Required Libraries**: We'll start by importing the necessary libraries from the Azure ML SDK.
2. **Set Up Credentials**: We'll use `DefaultAzureCredential` or `InteractiveBrowserCredential` for authentication.
3. **Access Workspace and Registry**: We'll obtain handles to our AML workspace and the model registry.


### 1.1 Import Required Libraries

Import the Azure ML SDK components required for workspace connection and model management.

In [None]:
# Import necessary libraries for Azure ML operations and authentication
from azure.ai.ml import MLClient, UserIdentityConfiguration
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from azure.ai.ml.dsl import pipeline
from azure.ai.ml.entities import AmlCompute
from azure.core.exceptions import ResourceNotFoundError

### 1.2 Load Environment Variables

Load necessary environment variables from an `.env` file.

To run the following job on an AML compute cluster, set `AML_COMPUTE_TYPE` to `amlcompute` and `AML_INSTANCE_SIZE` to `STANDARD_D4_V2` (or another size that suits your model). When using the model import component, set `AML_REGISTRY_NAME` to `azureml`; `AML_MODEL_IMPORT_VERSION` can be either `latest` or a specific version such as `0.0.22`. For Hugging Face models, `TASK_NAME` might be `text-generation`. For default values and further guidance, see the `.env_example` file.
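
A variable that is missing from the `.env` file only surfaces later, as a `None` passed to the SDK, so it can help to check the values up front. The sketch below is one minimal way to do that and is not part of the notebook's required flow. The helper name `find_missing_settings` and the decision to treat every listed variable as required are assumptions made for illustration.

```python
from typing import Iterable, List, Mapping

# Variable names taken from this notebook; treating all of them as required is an assumption.
REQUIRED_SETTINGS = (
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AML_WORKSPACE_NAME",
    "AML_REGISTRY_NAME",
    "AML_MODEL_IMPORT_VERSION",
    "HF_MODEL_ID",
    "TASK_NAME",
    "AML_COMPUTE_TYPE",
    "AML_INSTANCE_SIZE",
    "AML_COMPUTE_NAME",
)


def find_missing_settings(
    env: Mapping[str, str], required: Iterable[str] = REQUIRED_SETTINGS
) -> List[str]:
    """Return the names in `required` that are absent or blank in `env` (hypothetical helper)."""
    return [name for name in required if not (env.get(name) or "").strip()]


# Example with a hand-written mapping; pass os.environ after load_dotenv() in real use.
example_env = {"HF_MODEL_ID": "lmsys/vicuna-13b-v1.1", "TASK_NAME": " "}
missing = find_missing_settings(
    example_env, required=("HF_MODEL_ID", "TASK_NAME", "AML_COMPUTE_NAME")
)
print(missing)  # ['TASK_NAME', 'AML_COMPUTE_NAME']
```

Taking the mapping as an argument, rather than reading `os.environ` inside the function, keeps the check easy to try with sample data.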


In [None]:
# Install the dotenv package if you haven't already
# !pip install python-dotenv

from dotenv import load_dotenv
import os

# Load the environment variables from the .env file
load_dotenv()

subscription_id = os.getenv('AZURE_SUBSCRIPTION_ID')
resource_group = os.getenv('AZURE_RESOURCE_GROUP')
workspace_name = os.getenv('AML_WORKSPACE_NAME')
registry_name = os.getenv('AML_REGISTRY_NAME')
aml_import_model_version = os.getenv('AML_MODEL_IMPORT_VERSION') # values could be 'latest' or any version

# Model and Compute Configuration
model_id = os.getenv('HF_MODEL_ID')
task_name = os.getenv('TASK_NAME')
aml_compute_type = os.getenv("AML_COMPUTE_TYPE")
instance_size = os.getenv('AML_INSTANCE_SIZE')
compute_name = os.getenv('AML_COMPUTE_NAME')
experiment_name = f"Import Model Pipeline Hugging Face model {model_id}"
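# AmlCompute expects integers, but os.getenv returns strings.
# The fallback values of 0 and 1 are assumptions; adjust them to match your .env file.
min_instances = int(os.getenv("AML_MIN_INSTANCES", "0"))
max_instances = int(os.getenv("AML_MAX_INSTANCES", "1"))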

### 1.3 Configure Credentials

Set up `DefaultAzureCredential` for authentication with Azure services. This credential handles most authentication scenarios; if it cannot obtain a token, the cell below falls back to `InteractiveBrowserCredential`. If you still encounter issues, refer to the [Azure Identity documentation](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python) for alternative credentials.


In [None]:
# Setup Azure credentials, preferring DefaultAzureCredential and falling back to InteractiveBrowserCredential if necessary
try:
    credential = DefaultAzureCredential()
    # Verify if the default credential can fetch a token successfully
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    print("DefaultAzureCredential failed, falling back to InteractiveBrowserCredential:", ex)
    credential = InteractiveBrowserCredential()


### 1.4 Access Azure ML Workspace and Registry

Using the Azure ML SDK, we'll connect to our workspace. This requires having a configuration file or setting up the workspace parameters directly in the code. Ensure your workspace is configured with a compute instance or cluster for running the jobs.


In [None]:
# Initialize MLClient for AML workspace and registry access
try:
    # Attempt to create MLClient using configuration file
    ml_client_ws = MLClient.from_config(credential=credential)
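except Exception:
    # No usable configuration file: fall back to the workspace parameters loaded from the environment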
    ml_client_ws = MLClient(
        credential,
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )
# Initialize MLClient for AML model registry access
ml_client_registry = MLClient(credential, registry_name=registry_name)

### 1.5 Compute Target Setup

For model operations, we need a compute target. Here, we'll either attach an existing AmlCompute or create a new one. Note that creating a new AmlCompute can take approximately 5 minutes.

- **Existing AmlCompute**: If an AmlCompute with the specified name exists, we'll use it.
- **New AmlCompute**: If it doesn't exist, we'll create a new one. Be aware of the [resource limits](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-manage-quotas) in Azure ML.
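
`AmlCompute` takes integer instance counts, while values read with `os.getenv` arrive as strings. The following sketch shows one way to convert and sanity-check them before building the compute configuration. The helper name `parse_instance_bounds` and the fallback values of 0 and 1 are assumptions for illustration, not defaults defined by this notebook or by Azure ML.

```python
from typing import Mapping, Tuple


def parse_instance_bounds(
    env: Mapping[str, str], default_min: int = 0, default_max: int = 1
) -> Tuple[int, int]:
    """Read AML_MIN_INSTANCES / AML_MAX_INSTANCES as integers (hypothetical helper)."""
    raw_min = env.get("AML_MIN_INSTANCES")
    raw_max = env.get("AML_MAX_INSTANCES")
    # Fall back to the assumed defaults when a variable is unset or blank.
    min_nodes = int(raw_min) if raw_min and raw_min.strip() else default_min
    max_nodes = int(raw_max) if raw_max and raw_max.strip() else default_max
    if min_nodes < 0 or max_nodes < min_nodes:
        raise ValueError(f"Invalid instance bounds: min={min_nodes}, max={max_nodes}")
    return min_nodes, max_nodes


print(parse_instance_bounds({"AML_MIN_INSTANCES": "0", "AML_MAX_INSTANCES": "4"}))  # (0, 4)
```

A minimum of 0 lets the cluster scale down to no nodes when idle, which is often the cheaper choice for an occasional import job.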


In [None]:
# Set up or retrieve the compute target for the model import job
try:
    # Check if the compute target already exists
    _ = ml_client_ws.compute.get(compute_name)
    print("Found existing compute target.")
except ResourceNotFoundError:
    # If not found, create a new compute target
    print("Creating a new compute target...")
    compute_config = AmlCompute(
        name=compute_name,
        type=aml_compute_type,
        size=instance_size,
        min_instances=min_instances,
        max_instances=max_instances,
    )
    ml_client_ws.begin_create_or_update(compute_config).result()


## 2. Create an AML Pipeline for Hugging Face Models

In this section, we'll set up a pipeline to import and register Hugging Face models into Azure ML.

### Steps:
1. **Load Pipeline Component**: We'll load the necessary pipeline component from the Azure ML registry.
2. **Define Pipeline Parameters**: We'll specify parameters such as the Hugging Face model ID and compute target.
3. **Create Pipeline**: Using the loaded component and parameters, we'll define the pipeline.
4. **Execute Pipeline**: We'll submit the pipeline job to Azure ML and monitor its progress.


### 2.1 Load Pipeline Component

Load the `import_model` pipeline component from the Azure ML registry. This component is responsible for downloading the Hugging Face model, converting it to MLflow format, and registering it in Azure ML.


In [None]:
import_model = ml_client_registry.components.get(name="import_model", version=aml_import_model_version)
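
`components.get` accepts either a `version` or a `label` argument. One way to route the `AML_MODEL_IMPORT_VERSION` setting is to translate the value `latest` into a label lookup and pass anything else through as an explicit version. This is a sketch under that assumption: the helper name `component_lookup_kwargs` is hypothetical, and whether `latest` is also accepted directly as a `version` may depend on the SDK release.

```python
from typing import Dict, Optional


def component_lookup_kwargs(requested: Optional[str]) -> Dict[str, str]:
    """Map a version setting to keyword arguments for components.get (hypothetical helper)."""
    value = (requested or "").strip()
    # Treating an unset value the same as 'latest' is an assumption of this sketch.
    if not value or value.lower() == "latest":
        return {"label": "latest"}
    return {"version": value}


print(component_lookup_kwargs("latest"))  # {'label': 'latest'}
print(component_lookup_kwargs("0.0.22"))  # {'version': '0.0.22'}

# Intended use with the registry client from this notebook (not executed here):
# import_model = ml_client_registry.components.get(
#     name="import_model", **component_lookup_kwargs(aml_import_model_version)
# )
```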

In [None]:
# Check if Hugging Face model exists in the Azure ML model registry

huggingface_model_exists_in_aml_registry = False
# Replace '/' with '-' for AML registry compatibility; stays None if HF_MODEL_ID is unset
registered_model_id = model_id.replace("/", "-") if model_id else None
try:
    # models.list returns a lazy iterable, so materialize it before checking for emptiness
    models = list(ml_client_registry.models.list(name=registered_model_id)) if registered_model_id else []
    if models:
        # Compare versions numerically rather than as strings
        model_version = max(models, key=lambda x: int(x.version)).version
        print(f"Model already exists in Azure ML model catalog with name {registered_model_id} and version {model_version}")
        huggingface_model_exists_in_aml_registry = True
    else:
        print(f"Model {registered_model_id} not found in registry. Please continue importing the model.")
except Exception as e:
    print(f"Model {registered_model_id} not found in registry ({e}). Please continue importing the model.")
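
The existence check relies on two small details: the registry name is the Hugging Face ID with `/` replaced by `-`, and versions are compared as integers rather than as strings. The sketch below isolates both steps in plain Python so they can be tried without an Azure connection. The function names are hypothetical, and the version list is made-up sample data.

```python
from typing import Iterable, Optional


def to_registry_name(hf_model_id: str) -> str:
    """Replace '/' with '-' so the Hugging Face ID can be used as a registry model name."""
    return hf_model_id.replace("/", "-")


def latest_version(versions: Iterable[str]) -> Optional[str]:
    """Return the numerically largest version string, or None if there are none."""
    candidates = list(versions)
    if not candidates:
        return None
    return max(candidates, key=int)


print(to_registry_name("lmsys/vicuna-13b-v1.1"))  # lmsys-vicuna-13b-v1.1

sample_versions = ["2", "10", "9"]  # made-up sample data
print(latest_version(sample_versions))  # 10 (numeric comparison)
print(max(sample_versions))  # 9 (string comparison picks the wrong entry)
```

The last two lines show why the `int` conversion matters: a plain string comparison ranks `"9"` above `"10"`.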


### 2.2 Create and Configure the Pipeline

Define the pipeline using the `import_model` component and the specified parameters. We'll also set up the User Identity Configuration for the pipeline, allowing individual components to access identity credentials if required.


In [None]:
# Define an AML pipeline for importing models into Azure ML
@pipeline
def model_import_pipeline(model_id, compute, task_name, instance_type):
    """
    Pipeline to import a model into Azure ML.

    Parameters:
    - model_id: The ID of the model to import.
    - compute: The compute resource to use for the import job.
    - task_name: The task associated with the model.
    - instance_type: The type of instance to use for the job.

    Returns:
    - A dictionary containing model registration details.
    """
    import_model_job = import_model(
        model_id=model_id, compute=compute, task_name=task_name, instance_type=instance_type
    )
    import_model_job.settings.continue_on_step_failure = False  # Do not continue on failure

    return {"model_registration_details": import_model_job.outputs.model_registration_details}


In [None]:
# Configure the pipeline object with necessary parameters and identity
pipeline_object = model_import_pipeline(
    model_id=model_id,
    compute=compute_name,
    task_name=task_name,
    instance_type=instance_size
)
pipeline_object.identity = UserIdentityConfiguration()
pipeline_object.settings.force_rerun = True
pipeline_object.settings.default_compute = compute_name

In [None]:
# Determine if the pipeline needs to be scheduled for model import
schedule_huggingface_model_import = (
    not huggingface_model_exists_in_aml_registry and model_id not in [None, "None"] and len(model_id) > 1
)
print(f"Need to schedule run for importing {model_id}: {schedule_huggingface_model_import}")

### 2.3 Submit the Pipeline Job

Submit the pipeline job to Azure ML for execution. The job will import the specified Hugging Face model and register it in Azure ML. We'll monitor the job's progress and output.


In [None]:
# Submit and monitor the pipeline job if model import is scheduled
if schedule_huggingface_model_import:
    huggingface_pipeline_job = ml_client_ws.jobs.create_or_update(
        pipeline_object, experiment_name=experiment_name
    )
    ml_client_ws.jobs.stream(huggingface_pipeline_job.name)  # Stream logs to monitor the job
