<img src='https://github.com/Ikomia-dev/notebooks/blob/main/examples/img/banner_ikomia.png?raw=true'>




# How to run MobileSAM with the Ikomia API 

**MobileSAM** (Faster Segment Anything) is a streamlined and efficient variant of the Segment Anything Model (SAM), optimized for mobile applications. 

MobileSAM addresses the main bottleneck of the original SAM: its resource-intensive image encoder. It replaces that encoder with a lightweight one, significantly reducing the model's size and computational cost while aiming to preserve segmentation quality.

![illustration](https://raw.githubusercontent.com/ChaoningZhang/MobileSAM/master/assets/mask_point.jpg)

## Setup

Install the Ikomia Python API with pip:


In [None]:
!pip install ikomia

---

**-Google Colab ONLY- Restart runtime**

Click on the "RESTART RUNTIME" button at the end of the previous cell's output.

---

## Run MobileSAM on your image

### Box prompt

In [None]:
from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mobile_segment_anything", auto_connect=True)

# Setting parameters: boxes on the wheels
algo.set_parameters({
    "input_box": "[[425, 600, 700, 875], [1240, 675, 1400, 750], [1375, 550, 1650, 800]]"
})

# Run directly on your image
wf.run_on(url="https://github.com/facebookresearch/segment-anything/blob/main/notebooks/images/truck.jpg?raw=true")


In [None]:
from ikomia.utils.displayIO import display
from PIL import ImageShow

# Register the IPython viewer first so images render inline in the notebook
ImageShow.register(ImageShow.IPythonViewer(), 0)

# Display segmentation mask
display(algo.get_image_with_mask())
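
The `input_box` parameter above is passed as a string rather than as a Python list. When boxes are computed in code, a small helper can serialise them into the same textual form. The following is a minimal sketch: `boxes_to_param` is a hypothetical helper, not part of the Ikomia API, and it assumes the algorithm accepts the JSON-style list text shown in the cell above.

```python
import json


def boxes_to_param(boxes):
    """Serialise XYXY boxes into the string form used for "input_box".

    Hypothetical helper for illustration (not part of the Ikomia API).
    """
    cleaned = []
    for box in boxes:
        if len(box) != 4:
            raise ValueError(f"expected 4 coordinates, got {len(box)}: {box!r}")
        x1, y1, x2, y2 = (int(v) for v in box)
        # XYXY format: top-left corner first, bottom-right corner second
        if x2 <= x1 or y2 <= y1:
            raise ValueError(f"box must satisfy x1 < x2 and y1 < y2: {box!r}")
        cleaned.append([x1, y1, x2, y2])
    return json.dumps(cleaned)


# Same wheel boxes as in the box prompt cell above
wheel_boxes = [(425, 600, 700, 875), (1240, 675, 1400, 750), (1375, 550, 1650, 800)]
box_param = boxes_to_param(wheel_boxes)
print(box_param)
```

The resulting string can then be passed as `algo.set_parameters({"input_box": box_param})`.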

### Point prompt (select mask output)

In [None]:
# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mobile_segment_anything", auto_connect=True)

# Setting parameters: a single point prompt; mask_id selects which of the three candidate masks is returned
algo.set_parameters({
    "input_point": "[500, 375]",
    "mask_id":"1"
})

# Run directly on your image
wf.run_on(url="https://github.com/facebookresearch/segment-anything/blob/main/notebooks/images/truck.jpg?raw=true")

# Display your image
display(algo.get_image_with_mask())

In [None]:
# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mobile_segment_anything", auto_connect=True)

# Setting parameters: same point prompt, this time selecting the third candidate mask
algo.set_parameters({
    "input_point": "[500, 375]",
    "mask_id":"3"
})

# Run directly on your image
wf.run_on(url="https://github.com/facebookresearch/segment-anything/blob/main/notebooks/images/truck.jpg?raw=true")

# Display your image
display(algo.get_image_with_mask())
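
The two cells above differ only in `mask_id`. To compare all three candidate masks for the same point, the parameter dictionaries can be generated in a loop. This is a sketch only: `mask_id_variants` is a hypothetical helper, not part of the Ikomia API, and values are kept as strings to match the cells above.

```python
def mask_id_variants(point, mask_ids=(1, 2, 3)):
    """Build one parameter dict per candidate mask for a single point prompt.

    Hypothetical helper for illustration (not part of the Ikomia API).
    """
    x, y = point
    return [
        {"input_point": f"[{int(x)}, {int(y)}]", "mask_id": str(mask_id)}
        for mask_id in mask_ids
    ]


# Same point as in the cells above
variants = mask_id_variants((500, 375))
for params in variants:
    print(params)
```

Each dictionary can be passed to `algo.set_parameters(...)` before calling `wf.run_on(...)`, as in the cells above.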

In [None]:
# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mobile_segment_anything", auto_connect=True)

# Setting parameters: a box prompt combined with a point labeled as background (0)
algo.set_parameters({
    "input_box": "[425, 600, 700, 875]",
    "input_point": "[500, 375]",
    "input_point_label": "0"
})

# Run directly on your image
wf.run_on(url="https://github.com/facebookresearch/segment-anything/blob/main/notebooks/images/truck.jpg?raw=true")

# Display your image
display(algo.get_image_with_mask())
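
When a box is combined with a point prompt, it can be useful to check where the point lies relative to the box before running the workflow. The sketch below assumes XYXY pixel coordinates, as described in the parameter list; `point_in_box` is a hypothetical helper, not part of the Ikomia API.

```python
def point_in_box(point, box):
    """Return True if an (x, y) point lies inside an XYXY box (edges included).

    Hypothetical helper for illustration (not part of the Ikomia API).
    """
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2


# Values from the cell above
box = (425, 600, 700, 875)
point = (500, 375)
inside = point_in_box(point, box)
print(inside)  # False: y=375 is above the box, whose y range is 600 to 875
```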

### Automatic mask generator

In [None]:
# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mobile_segment_anything", auto_connect=True)

# Setting parameters: automatic mask generation with a 16 x 16 grid of sampled points
algo.set_parameters({
    "points_per_side": "16",
})

# Run directly on your image
wf.run_on(url="https://github.com/Ikomia-dev/notebooks/blob/main/examples/img/img_work.jpg?raw=true")

# Display your image
display(algo.get_image_with_mask())

#### List of parameters

- **input_box** (list): A Nx4 array of given box prompts to the model, in [XYXY] or [[XYXY], [XYXY]] format.
- **draw_graphic_input** (Boolean): When set to True, it allows you to draw graphics (box or point) over the object you wish to segment. If set to False, MobileSAM will automatically generate masks for the entire image.
- **mask_id** (int) - default '1': When [a single graphic point](https://github.com/Ikomia-hub/infer_mobile_segment_anything#a-single-point) is selected, MobileSAM will generate three outputs for that point (the 3 best scores). You can select which mask to output using the mask_id parameter (1, 2 or 3).
- **input_point** (list, *optional*): A Nx2 array of point prompts to the model. Each point is in [X,Y] in pixels.
- **input_point_label** (list, *optional*): A length N array of labels for the point prompts. 1 indicates a foreground point and 0 indicates a background point.
- **points_per_side** (int) - default '32' : (Automatic detection mode). The number of points to be sampled along one side of the image. The total number of points is points_per_side**2. 
- **points_per_batch** (int) - default '64': (Automatic detection mode).  Sets the number of points run simultaneously by the model. Higher numbers may be faster but use more GPU memory.
- **stability_score_thres** (float) - default '0.95': Filtering threshold in [0,1], using the stability of the mask under changes to the cutoff used to binarize the model's mask predictions.
- **box_nms_thres** (float) - default '0.7': The box IoU cutoff used by non-maximal suppression to filter duplicate masks.
- **iou_thres** (float) - default '0.88': A filtering threshold in [0,1], using the model's predicted mask quality.
- **crop_n_layers** (int) - default '0': If >0, mask prediction will be run again on crops of the image. Sets the number of layers to run, where each layer has 2**i_layer number of image crops.
- **crop_nms_thres** (float) - default '0': The box IoU cutoff used by non-maximal suppression to filter duplicate masks between different crops.
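- **crop_overlap_ratio** (float) - default 'float(512 / 1500)': Sets the degree to which crops overlap. In the first crop layer, crops will overlap by this fraction of the image length. Later layers with more crops scale down this overlap.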
- **crop_n_points_downscale_factor** (int) - default '1' : The number of points-per-side sampled in layer n is scaled down by crop_n_points_downscale_factor**n.
- **min_mask_region_area** (int) - default '0': If >0, postprocessing will be applied to remove disconnected regions and holes in masks with area smaller than min_mask_region_area.
- **input_size_percent** (int) - default '100': Percentage size of the input image. Can be reduced to lower memory usage.
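
To get a feel for how `points_per_side`, `points_per_batch` and `crop_n_points_downscale_factor` interact in automatic mode, the arithmetic from the descriptions above can be sketched in a few lines. `auto_mask_workload` is a hypothetical helper, not part of the Ikomia API. It counts only the grid on the full image (not the extra crops), and the use of floor division for the per-layer points is an assumption.

```python
import math


def auto_mask_workload(points_per_side=32, points_per_batch=64,
                       crop_n_layers=0, crop_n_points_downscale_factor=1):
    """Estimate the sampling workload of the automatic mask generator.

    Hypothetical helper for illustration (not part of the Ikomia API).
    Returns (total grid points on the full image, number of batches,
    points per side for each layer, layer 0 being the full image).
    """
    total_points = points_per_side ** 2
    n_batches = math.ceil(total_points / points_per_batch)
    # Layer n samples points_per_side scaled down by factor**n (floor division assumed)
    points_per_layer = [
        points_per_side // (crop_n_points_downscale_factor ** n)
        for n in range(crop_n_layers + 1)
    ]
    return total_points, n_batches, points_per_layer


# Setting used in the automatic mask generator cell: points_per_side = 16
print(auto_mask_workload(points_per_side=16))  # (256, 4, [16])

# Default values from the parameter list
print(auto_mask_workload())  # (1024, 16, [32])
```

Halving `points_per_side` divides the number of sampled points by four, which is why lowering it from 32 to 16 noticeably reduces run time and memory use.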