## Object detection and segmentation pipeline

# This notebook demonstrates how to build an ML pipeline in Azure ML that:
# 1. Takes input images and runs YOLO object detection on them
# 2. Uses the detected bounding boxes to run SAM segmentation
# 3. Outputs both detection and segmentation results
#
# The pipeline combines two popular computer vision models:
# - YOLO (You Only Look Once): For fast object detection
# - SAM (Segment Anything Model): For high-quality image segmentation
#
# We'll walk through:
# - Setting up the Azure ML environment and dependencies
# - Registering the YOLO and SAM models
# - Configuring data storage and I/O
# - Defining and connecting the pipeline components
# - Submitting and monitoring the pipeline run


In [2]:
import os

# Handle to the workspace
from azure.ai.ml import MLClient

from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

# Authentication package
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential, AzureCliCredential

# Maybe we need to access the storage account
from azure.storage.blob import BlobServiceClient, ContainerClient, BlobClient

try:
    credential = AzureCliCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential not work
    credential = InteractiveBrowserCredential()
ml_client = MLClient.from_config(credential)


AzureCliCredential.get_token failed: ERROR: AADSTS70043: The refresh token has expired or is invalid due to sign-in frequency checks by conditional access. The token was issued on 2025-01-28T08:50:37.3447482Z and the maximum allowed lifetime for this request is 14400. Trace ID: 31e3904a-65e3-4956-a228-58ee31493300 Correlation ID: 045a2d8a-f43c-4e24-9b11-06ffbc149907 Timestamp: 2025-02-06 08:11:43Z
Interactive authentication is needed. Please run:
az login --scope https://management.azure.com/.default

Found the config file in: /home/daniel/repos/aml/config.json


## Object detection and segmentation pipeline
We will be making a pipeline that runs object detection using a supplied yolo model, and segmenting out the detections using SAM2. For this we need to do the following: 
    1. Create an environment that can run the models.  
    2. Register the models
    3. Connect the correct storage locations to AML
    4. Define the pipeline steps
    5. Create and run the pipeline

### Environment:
We can define a conda environment and register it. Azure ML will build it for us and then we can use it to run code:

In [3]:
%%writefile ../environments/yoloenv.yaml
name: yoloenv
channels:
  - conda-forge
dependencies:
  - python=3.8
  - numpy=1.21.2
  - pip=21.2.4
  - scikit-learn=0.24.2
  - scipy=1.7.1
  - pandas>=1.1,<1.2
  - pip:
    - opencv-python
    - tensorflow 
    - keras
    - azureml-mlflow==1.42.0
    - azureml-core
    - azure-core
    - azure-ai-ml
    - pillow
    - ultralytics 
    


Overwriting ../environments/yoloenv.yaml


In [5]:
from azure.ai.ml.entities import Environment

custom_env_name = "yoloenvcpu"

yoloenv = Environment(
    name=custom_env_name,
    description="Custom environment for yolo stuff",
    tags={"scikit-learn": "0.24.2"},
    conda_file= "../environments/yoloenv.yaml",
    image="mcr.microsoft.com/azureml/minimal-ubuntu22.04-py39-cpu-inference:latest",
    )
## Uncomment to create the environment
#ml_client.environments.create_or_update(yoloenv)

### From Docker:
The models we are running require a few things that are not offered in the base images offered by AML though. We need to install some libraries. Fortunately we can also create environments using docker specifications:

In [6]:
%%bash
rm -rf azureml-environment
mkdir azureml-environment
echo """
FROM pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime

# Downloads to user config dir
ADD https://ultralytics.com/assets/Arial.ttf https://ultralytics.com/assets/Arial.Unicode.ttf /root/.config/Ultralytics/

# Install linux packages
ENV DEBIAN_FRONTEND noninteractive
RUN apt update
RUN TZ=Etc/UTC apt install -y tzdata
RUN apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++

# Security updates
# https://security.snyk.io/vuln/SNYK-UBUNTU1804-OPENSSL-3314796
RUN apt upgrade --no-install-recommends -y openssl tar

RUN pip install -U ultralytics 
RUN pip install azureml-mlflow==1.52.0
RUN pip install mlflow==2.4.2
""" > azureml-environment/Dockerfile

In [7]:
from azure.ai.ml.entities import Environment, BuildContext

env_docker_context = Environment(
    build=BuildContext(path="azureml-environment"),
    name="yolofromdocker",
    description="Environment created from a Docker context.",
)
## Uncomment to create the environment
#ml_client.environments.create_or_update(env_docker_context)

## Detection

In [6]:
%%writefile ../components/detect_with_yolo/detect_with_yolo.py

import os
import argparse
import glob
import mlflow
from ultralytics import YOLO

def main():
    parser = argparse.ArgumentParser(description="Read blob")
    parser.add_argument("--input_path", type=str, help="Input path")
    parser.add_argument("--model", type=str, help="Model path")
    parser.add_argument("--output_path", type=str, help="Output path")
    args = parser.parse_args()
    

    
    input_path = args.input_path
    print(f"Input path: {input_path}")
    files = glob.glob(input_path + "/**/**.jpg", recursive=True)
    print(f"We have found {len(files)} files in the input path")
    print(f"Some example paths: \n {files[:5]}")

    model = args.model
    print(f"Model path: {model}")

    yolomodel = YOLO(model)
    print("Model loaded successfully")

    os.makedirs(args.output_path, exist_ok=True)
    detections = yolomodel(files, save_txt=True, project=args.output_path)
    print("Predictions made")
    print(detections)
    




if __name__ == "__main__":
    main()

Overwriting ../components/detect_with_yolo/detect_with_yolo.py


In [7]:
from azure.ai.ml import command, Input, Output
inputs = {"input_path": Input(type="uri_folder"), "model": Input(type="custom_model")}
outputs = {"output_path": Output(type="uri_folder")}

detect_command = command(
    name="detect_with_yolo",
    code="../components/detect_with_yolo",
    inputs=inputs,
    outputs=outputs,
    command = "python detect_with_yolo.py --input_path $input_path --model $model --output_path $output_path",
    environment="yolofromdocker:2"
)

In [8]:
#import dsl
from azure.ai.ml import dsl

@dsl.pipeline(name="detect_with_yolo", description="Test yolo pipeline", compute = "defaultclustersbxdondorp")
def test_read_blob_pipeline(input_path: str, model: str):
    detect_component = detect_command(input_path=input_path, model=model)
    return {"output_path": detect_component.outputs.output_path}

In [10]:
pipeline = test_read_blob_pipeline(ml_client.data.get("coco128", version=1), model=ml_client.models.get("yolomodel", version=1))

In [185]:
ml_client.jobs.create_or_update(pipeline)

[32mUploading detect_with_yolo (0.0 MBs): 100%|██████████| 1004/1004 [00:00<00:00, 28090.73it/s]
[39m

pathOnCompute is not a known attribute of class <class 'azure.ai.ml._restclient.v2023_04_01_preview.models._models_py3.UriFolderJobOutput'> and will be ignored


Experiment,Name,Type,Status,Details Page
notebooks,tough_rod_s1nv190nwy,pipeline,NotStarted,Link to Azure Machine Learning studio


In [11]:
os.makedirs("../components/segment", exist_ok=True)

In [54]:
%%writefile ../components/segment/segment.py
import os
import argparse
import glob
import mlflow
from ultralytics import SAM
import numpy as np

def load_bboxes(path):
    with open(path) as f:
        lines = f.readlines()
    bboxes = []
    for line in lines:
        bbox = line.rstrip("\n").split(" ")
        bboxes.append([float(b) for b in bbox[1:]])
    return bboxes

def match_img_to_annotations(img_paths, annotations):
    matched = []
    for img_path in img_paths:
        img_name = os.path.basename(img_path).split(".")[0]
        for annotation in annotations:
            if img_name in annotation:
                matched.append([img_path, annotation])
                break
    return matched

def main():
    parser = argparse.ArgumentParser(description="Read blob")
    parser.add_argument("--input_data", type=str, help="Input path")
    parser.add_argument("--annotations", type=str, help="Annotations path")
    parser.add_argument("--segmentation_model", type=str, help="Segmentation model path")
    parser.add_argument("--output_path", type=str, help="Output path")
    args = parser.parse_args()

    input_data = args.input_data
    print(f"Input data: {input_data}")
    files = glob.glob(input_data + "/**/**.jpg", recursive=True)
    print(f"We have found {len(files)} files in the input path")
    print(f"Some example paths: \n {files[:5]}")

    annotations = args.annotations
    print(f"Annotations path: {annotations}")
    all_annotations = glob.glob(annotations + "/**/**.txt", recursive=True)
    print(f"We have found {len(all_annotations)} annotations in the input path")
    print(f"Some example paths: \n {all_annotations[:5]}")

    segmentation_model = args.segmentation_model
    print(f"Segmentation model path: {segmentation_model}")
    model = SAM(segmentation_model)
    print("Model loaded successfully")

    matched = match_img_to_annotations(files, all_annotations)
    print(f"Matched {len(matched)} images with annotations")

    output_path = os.path.join(args.output_path, "segmentation")
    os.makedirs(output_path, exist_ok=True)

    for img_path, annotation in matched:
        savepath = os.path.join(output_path, os.path.basename(img_path).split(".")[0])
        os.makedirs(savepath, exist_ok=True)


        bboxes = load_bboxes(annotation)
        segmentation = model(img_path, bboxes=bboxes, save=True, save_txt=True, project=savepath)
        # for s in segmentation:
        #     for i, mask in enumerate(s.masks):
        #         np.save(os.path.join(output_path, f"{os.path.basename(img_path)}_mask{i}.npy"), mask)
            
        print(f"Segmented image saved for {os.path.basename(img_path)}")
    

if __name__ == "__main__":
    main()



Overwriting ../components/segment/segment.py


In [156]:
ds = ml_client.data.get("output", version=1)

In [157]:
inputs = {"input_data": Input(type="uri_folder"), "annotations": Input(type="uri_folder"), "segmentation_model": Input(type="custom_model")}
outputs = {"output_path": Output(type="uri_folder", mode = "rw_mount", path = ds.path)}

segmentation_command = command(
    name="segment",
    code="../components/segment",
    inputs=inputs,
    outputs=outputs,
    command = """python segment.py --input_data $input_data --annotations $annotations --segmentation_model $segmentation_model --output_path $output_path""",
    environment="yolofromdocker:2"
)

In [163]:
!touch "outputs/test"

In [158]:
@dsl.pipeline(name="detect_and_segment", description="Test yolo pipeline", compute = "defaultclustersbxdondorp")
def detect_and_segment_pipeline(input_path: str, model: str, segmentation_model: str):
    detect_component = detect_command(input_path=input_path, model=model)
    segment_component = segmentation_command(input_data=input_path, annotations=detect_component.outputs.output_path, segmentation_model=segmentation_model)
    return {"output_path": segment_component.outputs.output_path}

In [159]:
segmentation_pipeline = detect_and_segment_pipeline(input_path= ml_client.data.get("coco128", version=1), model=ml_client.models.get("yolomodel", version=1), segmentation_model=ml_client.models.get("sam2model", version=1))

In [160]:
ml_client.jobs.create_or_update(segmentation_pipeline)

pathOnCompute is not a known attribute of class <class 'azure.ai.ml._restclient.v2023_04_01_preview.models._models_py3.UriFolderJobOutput'> and will be ignored
pathOnCompute is not a known attribute of class <class 'azure.ai.ml._restclient.v2023_04_01_preview.models._models_py3.UriFolderJobOutput'> and will be ignored


Experiment,Name,Type,Status,Details Page
notebooks,sleepy_gas_jqsyk3kkf9,pipeline,NotStarted,Link to Azure Machine Learning studio


In [161]:
segment[0].masks[0].class_name

NameError: name 'segment' is not defined

In [186]:
from ultralytics import YOLO 

In [200]:
from ultralytics import SAM

In [236]:
sm = SAM("sam2.1_b.pt")

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/sam2.1_b.pt to 'sam2.1_b.pt'...


100%|██████████| 154M/154M [00:14<00:00, 11.6MB/s] 


In [237]:
dm = YOLO("yolo11n.pt")

In [251]:
d = dm("coco128/images/train2017/000000000599.jpg", save_txt=True, project="detection_output")


image 1/1 /home/daniel/repos/aml/notebooks/coco128/images/train2017/000000000599.jpg: 416x640 1 cat, 1 couch, 1 potted plant, 2 remotes, 132.1ms
Speed: 23.0ms preprocess, 132.1ms inference, 8.8ms postprocess per image at shape (1, 3, 416, 640)
Results saved to [1mdetection_output/predict[0m
1 label saved to detection_output/predict/labels


In [262]:
def load_bboxes(path):
    with open(path) as f:
        lines = f.readlines()
    bboxes = []
    for line in lines:
        bbox = line.rstrip("\n").split(" ")
        bboxes.append([float(b) for b in bbox[1:]])
    return bboxes

In [263]:
load_bboxes("detection_output/predict/labels/000000000599.txt")

[[0.363869, 0.544014, 0.528958, 0.887638],
 [0.823807, 0.589523, 0.274796, 0.222137],
 [0.767803, 0.490098, 0.299068, 0.16051],
 [0.913592, 0.310914, 0.171335, 0.61133],
 [0.501443, 0.49617, 0.985432, 0.988912]]

In [253]:
load_bboxes("detection_output/000000000599.txt")

FileNotFoundError: [Errno 2] No such file or directory: 'detection_output/000000000599.txt'

In [247]:
d[0]

ultralytics.engine.results.Results object with attributes:

boxes: ultralytics.engine.results.Boxes object
keypoints: None
masks: None
names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant',

In [287]:
segment = sm("/home/daniel/repos/aml/notebooks/coco128/images/train2017/000000000599.jpg", bboxes=d[0].boxes.xywh)


image 1/1 /home/daniel/repos/aml/notebooks/coco128/images/train2017/000000000599.jpg: 1024x1024 1 0, 1 1, 1 2, 1 3, 1 4, 4677.4ms
Speed: 96.3ms preprocess, 4677.4ms inference, 13.9ms postprocess per image at shape (1, 3, 1024, 1024)


In [308]:
for s in segment:
    m = s.masks
    for mask in m:
        print(mask)

ultralytics.engine.results.Masks object with attributes:

data: tensor([[[False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False],
         ...,
         [False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False]]])
orig_shape: (407, 640)
shape: torch.Size([1, 407, 640])
xy: [array([[        325,         232],
       [        326,         232],
       [        326,         231],
       [        325,         232],
       [        324,         234],
       [        321,         231],
       [        319,         231],
       [        318,         232],
       [        317,         231],
       [        313,         231],
       [        312,         230],
       [        311,         230],
       [        309,         228],
       [        307,         228],
       [       

ultralytics.engine.results.Masks object with attributes:

data: tensor([[[False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False],
         ...,
         [False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False]]])
orig_shape: (407, 640)
shape: torch.Size([1, 407, 640])
xy: [array([[        510,         209],
       [        510,         211],
       [        511,         212],
       [        511,         211],
       [        512,         210],
       [        512,         209],
       [        510,         209],
       [        503,         208],
       [        506,         208],
       [        506,         210],
       [        502,         210],
       [        501,         211],
       [        499,         209],
       [        513,          58],
       [       

In [298]:
import numpy as np

In [299]:
np.save("mask.npz", segment[0].masks[0].data.numpy())

In [300]:
segment[0]

ultralytics.engine.results.Results object with attributes:

boxes: ultralytics.engine.results.Boxes object
keypoints: None
masks: ultralytics.engine.results.Masks object
names: {0: '0', 1: '1', 2: '2', 3: '3', 4: '4'}
obb: None
orig_img: array([[[ 19,   8,  10],
        [ 25,  14,  16],
        [ 30,  19,  21],
        ...,
        [ 97, 119, 137],
        [ 98, 115, 134],
        [ 92, 109, 128]],

       [[ 21,  10,  12],
        [ 25,  14,  16],
        [ 29,  18,  20],
        ...,
        [ 95, 115, 132],
        [ 96, 114, 131],
        [ 91, 109, 126]],

       [[ 21,  10,  12],
        [ 23,  12,  14],
        [ 25,  14,  16],
        ...,
        [ 96, 115, 130],
        [ 97, 113, 129],
        [ 93, 109, 125]],

       ...,

       [[ 79,  52,  31],
        [ 86,  59,  38],
        [ 86,  59,  38],
        ...,
        [ 20,  12,  13],
        [ 21,  10,  12],
        [ 21,  11,  11]],

       [[ 81,  54,  33],
        [ 84,  57,  36],
        [ 84,  57,  36],
        ...,
 

In [249]:
from azure.ai.ml.entities import Model

sam = Model(
    name="sam2model",
    description="ultralytics Sam B",
    path = "sam2.1_b.pt",
    tags={"sam": "sam2.1_b"},
    version = "1"
)

In [250]:
ml_client.models.create_or_update(sam)

Uploading sam2.1_b.pt (< 1 MB): 0.00B [00:00, ?B/s] (< 1 MB):   3%|▎         | 4.19M/162M [00:01<00:58, 2.71MB/s] (< 1 MB):   5%|▌         | 8.39M/162M [00:04<01:29, 1.71MB/s] (< 1 MB):  10%|█         | 16.8M/162M [00:07<00:45, 3.16MB/s] (< 1 MB):  13%|█▎        | 21.0M/162M [00:08<00:24, 5.74MB/s] (< 1 MB):  31%|███       | 50.3M/162M [00:09<00:09, 11.2MB/s] (< 1 MB):  28%|██▊       | 46.1M/162M [00:10<00:05, 19.3MB/s] (< 1 MB):  18%|█▊        | 29.4M/162M [00:10<00:03, 35.8MB/s] (< 1 MB):  26%|██▌       | 41.9M/162M [00:10<00:03, 35.8MB/s] (< 1 MB):  64%|██████▎   | 103M/162M [00:11<00:01, 46.2MB/s]  (< 1 MB):  49%|████▉     | 79.7M/162M [00:12<00:01, 48.0MB/s] (< 1 MB):  57%|█████▋    | 92.3M/162M [00:13<00:01, 45.9MB/s] (< 1 MB):  54%|█████▍    | 88.1M/162M [00:13<00:01, 38.4MB/s] (< 1 MB): 88.1MB [00:13, 6.33MB/s]                           




Model({'job_name': None, 'intellectual_property': None, 'is_anonymous': False, 'auto_increment_version': False, 'auto_delete_setting': None, 'name': 'sam2model', 'description': 'ultralytics Sam B', 'tags': {'sam': 'sam2.1_b'}, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/11f51dee-57cd-4d47-b542-8e244706e163/resourceGroups/sbx-dondorp/providers/Microsoft.MachineLearningServices/workspaces/amlsbxdondorp/models/sam2model/versions/1', 'Resource__source_path': '', 'base_path': '/home/daniel/repos/aml/notebooks', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x7f41d93e5ae0>, 'serialize': <msrest.serialization.Serializer object at 0x7f40996cd570>, 'version': '1', 'latest_version': None, 'path': 'azureml://subscriptions/11f51dee-57cd-4d47-b542-8e244706e163/resourceGroups/sbx-dondorp/workspaces/amlsbxdondorp/datastores/workspaceblobstore/paths/LocalUpload/fbd3099ebcf4dc91440fe8adfd0e8b6e/sam2.1_b.pt', 'datastore': None, 'utc_time_created': None, 

In [46]:
from azure.ai.ml.entities import Model
yolomodel = Model(
    name="yolomodel",
    description="yolo model",
    tags={"yolo": "v11"},
    path="yolo11n.pt",
    version="1",
    )

In [None]:
yolomodel

In [47]:
ml_client.models.create_or_update(yolomodel)

Uploading yolo11n.pt (< 1 MB): 0.00B [00:00, ?B/s] (< 1 MB): 100%|██████████| 5.61M/5.61M [00:00<00:00, 10.3MB/s] (< 1 MB): 100%|██████████| 5.61M/5.61M [00:00<00:00, 10.2MB/s]




Model({'job_name': None, 'intellectual_property': None, 'is_anonymous': False, 'auto_increment_version': False, 'auto_delete_setting': None, 'name': 'yolomodel', 'description': 'yolo model', 'tags': {'yolo': 'v11'}, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/11f51dee-57cd-4d47-b542-8e244706e163/resourceGroups/sbx-dondorp/providers/Microsoft.MachineLearningServices/workspaces/amlsbxdondorp/models/yolomodel/versions/1', 'Resource__source_path': '', 'base_path': '/home/daniel/repos/aml/notebooks', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x7fc610588ee0>, 'serialize': <msrest.serialization.Serializer object at 0x7fc61055cbe0>, 'version': '1', 'latest_version': None, 'path': 'azureml://subscriptions/11f51dee-57cd-4d47-b542-8e244706e163/resourceGroups/sbx-dondorp/workspaces/amlsbxdondorp/datastores/workspaceblobstore/paths/LocalUpload/c3409828f28205b32b60531141544634/yolo11n.pt', 'datastore': None, 'utc_time_created': None, 'flavors': N