# 01 - Deploy FLUX Schnell model to a Managed Online Endpoint via Azure Machine Learning

The notebook below deploys the Black Forest Labs (BFL) FLUX Schnell model to a managed online endpoint backed by GPU compute.

The deployment builds a custom inferencing environment on top of a baseline Azure ML container image, using the dependencies declared in the conda YAML file (`conda_dependencies.yaml`).

A custom scoring script (`score.py`) is used to load the model [via the diffusers library](https://huggingface.co/black-forest-labs/FLUX.1-schnell#diffusers) and fulfill user requests. 
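
The actual `score.py` is not reproduced in this notebook. As a rough sketch of the `init()`/`run()` contract that Azure ML scoring scripts follow, it might look like the code below. The request schema (`prompt`, `num_inference_steps`), the `image` response key, and the helper names are assumptions made for illustration. The sketch also loads the weights straight from the Hugging Face Hub, whereas the real script may read them from a local path (for example one derived from the `MODEL_BASE_PATH` and `MODEL_NAME` environment variables set on the deployment).

```python
import base64
import io
import json

_pipeline = None  # populated once by init(); module-level state is the usual pattern


def parse_request(raw_data: str) -> dict:
    """Validate the JSON request body and fill in defaults (hypothetical schema)."""
    payload = json.loads(raw_data)
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("request must contain a non-empty 'prompt' string")
    return {
        "prompt": prompt,
        "num_inference_steps": int(payload.get("num_inference_steps", 4)),
    }


def image_to_base64(image) -> str:
    """Serialize a PIL-style image (anything with .save(fp, format=...)) as base64 PNG."""
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return base64.b64encode(buffer.getvalue()).decode("ascii")


def init():
    """Called once when the container starts: load the model onto the GPU."""
    global _pipeline
    # Heavy imports are deferred so the helpers above can be exercised without a GPU.
    import torch
    from diffusers import FluxPipeline

    # Assumption: weights are pulled from the Hugging Face Hub.
    _pipeline = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    ).to("cuda")


def run(raw_data: str) -> dict:
    """Called per request: generate one image and return it base64-encoded."""
    args = parse_request(raw_data)
    result = _pipeline(
        args["prompt"],
        guidance_scale=0.0,
        num_inference_steps=args["num_inference_steps"],
        max_sequence_length=256,
    )
    return {"image": image_to_base64(result.images[0])}
```

Keeping request parsing and image encoding in small helpers makes them easy to check locally before paying for GPU time on the endpoint.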

### Import required packages and establish connection to AML workspace

In [None]:
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    CodeConfiguration,
    DataCollector,
    DeploymentCollection,
    Environment,
    ManagedOnlineDeployment,
    ManagedOnlineEndpoint,
    Model,
)
from azure.identity import DefaultAzureCredential


subscription_id = "<YOUR-SUBSCRIPTION-ID>"
resource_group = "<YOUR-RESOURCE-GROUP>"
workspace = "<YOUR-AML-WORKSPACE-NAME>"
model_name = "flux-schnell"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
ml_client
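
If you would rather not hard-code the workspace identifiers in the notebook, one option is to read them from environment variables and fail early when a value is missing or still a `<PLACEHOLDER>`. The sketch below assumes hypothetical variable names (`AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, `AZURE_ML_WORKSPACE`); they are not defined by the Azure ML SDK.

```python
import os

# Hypothetical environment variable names, chosen for this sketch only.
REQUIRED_SETTINGS = (
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_ML_WORKSPACE",
)


def load_workspace_settings(environ=None) -> dict:
    """Return the workspace settings, raising if any is unset or a <PLACEHOLDER>."""
    environ = os.environ if environ is None else environ
    settings = {}
    missing = []
    for key in REQUIRED_SETTINGS:
        value = environ.get(key, "").strip()
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(key)
        settings[key] = value
    if missing:
        raise ValueError("missing or placeholder settings: " + ", ".join(missing))
    return settings


# Usage (commented out because it needs real values and the Azure SDK):
# s = load_workspace_settings()
# ml_client = MLClient(
#     DefaultAzureCredential(),
#     s["AZURE_SUBSCRIPTION_ID"],
#     s["AZURE_RESOURCE_GROUP"],
#     s["AZURE_ML_WORKSPACE"],
# )
```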

### Register custom inferencing environment for FLUX Schnell model

The conda dependencies are installed on top of a baseline Azure ML inferencing image.

In [None]:
environment = Environment(
    name="flux-schnell-env",
    image="mcr.microsoft.com/azureml/curated/minimal-py311-inference:11",
    conda_file="conda_dependencies.yaml"
)
# Register (or update) the environment in the workspace
ml_client.environments.create_or_update(environment)

### Create the Azure ML managed online endpoint

In [None]:
# Endpoint name (must be unique within the Azure region)
online_endpoint_name = "flux-schnell-endpoint"

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="DEMO - Endpoint for Black Forest Labs FLUX Schnell Model",
    auth_mode="key"
)
endpoint = ml_client.online_endpoints.begin_create_or_update(endpoint).result()
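
A fixed name such as `flux-schnell-endpoint` can collide with an endpoint that already exists in the same region. A minimal sketch for generating a distinct name is shown below. The helper `make_endpoint_name` is hypothetical, and the validation pattern reflects my understanding of the naming rules (starts with a letter, contains only letters, digits, and hyphens, 3 to 32 characters), so check it against the current Azure ML documentation.

```python
import re
import uuid

# Assumed naming rule: letter first, then letters/digits/hyphens, 3-32 chars total.
_ENDPOINT_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{2,31}")


def make_endpoint_name(prefix: str = "flux-schnell", suffix_length: int = 8) -> str:
    """Append a random hex suffix to the prefix and validate the result."""
    suffix = uuid.uuid4().hex[:suffix_length]
    name = f"{prefix}-{suffix}"
    if not _ENDPOINT_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid endpoint name: {name!r}")
    return name


# Usage: online_endpoint_name = make_endpoint_name()
```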

### Define deployment and push to managed online endpoint

The deployment targets a [Standard_NC24ads_A100_v4 compute target](https://learn.microsoft.com/en-us/azure/virtual-machines/nc-a100-v4-series). The custom scoring script `score.py` is responsible for loading the model and fulfilling user requests.

Model data collectors are also configured to capture incoming requests/arguments and to save all outgoing images, encoded as base64 strings.

In [None]:
inputs_collection = DeploymentCollection(enabled=True)
outputs_collection = DeploymentCollection(enabled=True)

mdc = DataCollector(
    collections={
        "model_inputs": inputs_collection,
        "model_outputs": outputs_collection,
    }
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=online_endpoint_name,
    environment=environment,
    environment_variables={
        "MODEL_BASE_PATH": "/",
        "MODEL_NAME": "flux-schnell",
    },
    instance_type="Standard_NC24ads_A100_v4",
    instance_count=1,
    data_collector=mdc,
    code_configuration=CodeConfiguration(code=".", scoring_script="score.py")
)
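# Create the deployment; this can take a while as the image is built and the model is loaded
ml_client.online_deployments.begin_create_or_update(blue_deployment).result()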
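
If the deployment fails or the container keeps restarting, the container logs are usually the first place to look. The sketch below wraps `online_deployments.get_logs` from the Azure ML SDK in a small hypothetical helper that keeps only lines that look like errors. The keyword list is an assumption and will need tuning for your scoring script.

```python
def find_error_lines(ml_client, endpoint_name, deployment_name="blue", lines=200):
    """Fetch recent deployment logs and return the lines that look like errors."""
    logs = ml_client.online_deployments.get_logs(
        name=deployment_name,
        endpoint_name=endpoint_name,
        lines=lines,
    )
    keywords = ("Traceback", "Error", "Exception")  # assumed markers of failure
    return [line for line in logs.splitlines() if any(k in line for k in keywords)]


# Usage (requires the ml_client created earlier in this notebook):
# for line in find_error_lines(ml_client, online_endpoint_name):
#     print(line)
```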

### Update endpoint to route all traffic to the newly created deployment

In [None]:
endpoint.traffic = {"blue": 100}
endpoint = ml_client.online_endpoints.begin_create_or_update(endpoint).result()
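
Once traffic is routed, the endpoint can be called over HTTPS with key authentication. The sketch below uses only the standard library. The request body (`{"prompt": ...}`) and the `image` key in the response are assumptions about what `score.py` accepts and returns, and the helper names are hypothetical.

```python
import base64
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, prompt, deployment=None):
    """Build a POST request for the endpoint; does not send anything."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment:
        # Optional header that pins the call to one deployment, bypassing traffic rules.
        headers["azureml-model-deployment"] = deployment
    body = json.dumps({"prompt": prompt}).encode("utf-8")  # assumed request schema
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


def decode_image_from_response(response_body: bytes) -> bytes:
    """Extract raw image bytes from a response shaped like {"image": "<base64>"} (assumed)."""
    payload = json.loads(response_body)
    return base64.b64decode(payload["image"])


# Usage (commented out because it calls the live endpoint):
# keys = ml_client.online_endpoints.get_keys(name=online_endpoint_name)
# request = build_scoring_request(endpoint.scoring_uri, keys.primary_key, "a red fox")
# with urllib.request.urlopen(request) as response:
#     image_bytes = decode_image_from_response(response.read())
# with open("output.png", "wb") as f:
#     f.write(image_bytes)
```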