# `facebook/maskformer-swin-tiny-ade`

## Description
MaskFormer addresses instance, semantic, and panoptic segmentation with the same paradigm: predicting a set of masks and their corresponding labels. Hence, all three tasks are treated as if they were instance segmentation.

## Running environment
- Custom Notebook image: `projects.registry.vmware.com/models/notebook/hf-inference-deploy@sha256:8c5960ce436881f37336b12556d7a661ea20e4dbfe9ac193516cf384daa51c19`
- 1 CPU, 5G memory

## Model inference

Import the packages. *Note* that if you are using the custom notebook image listed above, you do not need to install any extra packages with `pip`.

In [1]:
from transformers import MaskFormerFeatureExtractor, MaskFormerForInstanceSegmentation
from PIL import Image
import requests

import os
os.environ['HTTP_PROXY'] = 'http://proxy.vmware.com:3128'
os.environ['HTTPS_PROXY'] = 'http://proxy.vmware.com:3128'
os.environ['CURL_CA_BUNDLE'] = ''



*Note: The `os.environ` assignments in the cell above set the HTTP/HTTPS proxy so that downloads do not fail with an HTTPS proxy error.*
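If the proxy settings should apply only while the model files are being downloaded, they can be scoped instead of set for the whole session. The following is a minimal sketch, not part of the original notebook: `proxy_env` is a hypothetical helper name, and the `clear_ca_bundle` flag simply mirrors the `CURL_CA_BUNDLE = ''` assignment from the cell above.

```python
import os
from contextlib import contextmanager


@contextmanager
def proxy_env(proxy_url, clear_ca_bundle=False):
    """Temporarily set the proxy variables and restore the old values on exit.

    `proxy_env` is a hypothetical helper, not part of any library.
    """
    updates = {"HTTP_PROXY": proxy_url, "HTTPS_PROXY": proxy_url}
    if clear_ca_bundle:
        # Same assignment as in the notebook cell above.
        updates["CURL_CA_BUNDLE"] = ""

    # Remember the previous values (None means "was not set").
    saved = {name: os.environ.get(name) for name in updates}
    os.environ.update(updates)
    try:
        yield
    finally:
        for name, old_value in saved.items():
            if old_value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old_value
```

With this helper, the download calls could be wrapped as `with proxy_env('http://proxy.vmware.com:3128', clear_ca_bundle=True): ...`, leaving the environment unchanged afterwards.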

Open the image and preprocess it with the feature extractor to obtain the model inputs.

In [2]:
url = "https://huggingface.co/datasets/hf-internal-testing/fixtures_ade20k/resolve/main/ADE_val_00000001.jpg"
image = Image.open(requests.get(url, stream=True).raw)
feature_extractor = MaskFormerFeatureExtractor.from_pretrained("facebook/maskformer-swin-tiny-ade")
inputs = feature_extractor(images=image, return_tensors="pt")

Downloading (…)rocessor_config.json: 100%|██████████| 380/380 [00:00<00:00, 37.6kB/s]


Load the pre-trained model and run inference on the inputs produced in the previous cell.

In [3]:
model = MaskFormerForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-tiny-ade")
outputs = model(**inputs)
# model predicts class_queries_logits of shape `(batch_size, num_queries, num_labels + 1)` and masks_queries_logits of shape `(batch_size, num_queries, height, width)`
class_queries_logits = outputs.class_queries_logits
masks_queries_logits = outputs.masks_queries_logits

Downloading (…)lve/main/config.json: 100%|██████████| 12.5k/12.5k [00:00<00:00, 4.50MB/s]
Downloading pytorch_model.bin: 100%|██████████| 167M/167M [00:10<00:00, 16.0MB/s] 
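To make the two outputs more concrete, here is a small NumPy sketch of how per-query class logits and mask logits can be combined into a per-pixel class map. It assumes the class logits carry one extra "no object" entry in the last position, and it omits the resize to the original image size. It illustrates the idea on toy arrays and is not the library's implementation; `combine_queries` is a hypothetical name.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def combine_queries(class_logits, mask_logits):
    """Turn per-query predictions into a (batch, height, width) class map.

    class_logits: (batch, num_queries, num_labels + 1), last entry = "no object"
    mask_logits:  (batch, num_queries, height, width)
    """
    # Class probabilities per query, with the "no object" entry dropped.
    class_probs = softmax(class_logits, axis=-1)[..., :-1]
    # Per-pixel membership probability for each query mask.
    mask_probs = sigmoid(mask_logits)
    # Weight each query mask by its class probabilities and sum over queries.
    scores = np.einsum("bqc,bqhw->bchw", class_probs, mask_probs)
    # Pick the highest-scoring class at every pixel.
    return scores.argmax(axis=1)


# Toy input: 1 image, 2 queries, 2 labels (+ "no object"), 2x2 pixels.
toy_class_logits = np.array([[[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]])
toy_mask_logits = np.array([[
    [[4.0, -4.0], [4.0, -4.0]],   # query 0 covers the left column
    [[-4.0, 4.0], [-4.0, 4.0]],   # query 1 covers the right column
]])
toy_map = combine_queries(toy_class_logits, toy_mask_logits)
```

In the toy input, query 0 votes for label 0 on the left column and query 1 votes for label 1 on the right column, so `toy_map` assigns label 0 to the left pixels and label 1 to the right pixels.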


Post-process the inference result into a semantic segmentation map and print its shape.

In [4]:
predicted_semantic_map = feature_extractor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]

In [5]:
list(predicted_semantic_map.shape)

[512, 683]
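The shape `[512, 683]` is the image height and width, and each entry of the map is a class id. One way to inspect the result is to count how many pixels fall into each class. The sketch below assumes the map has been converted to nested Python lists and that a mapping from class id to label name is available; `summarize_semantic_map` is a hypothetical helper, and the labels in the toy example are made up for illustration.

```python
from collections import Counter


def summarize_semantic_map(rows, id2label):
    """Return (label, pixel_count, fraction) tuples, most frequent class first.

    rows: nested lists of integer class ids, one inner list per image row
    id2label: dict mapping class id -> label name
    """
    counts = Counter(class_id for row in rows for class_id in row)
    total = sum(counts.values())
    return [
        (id2label.get(class_id, f"unknown({class_id})"), n, n / total)
        for class_id, n in counts.most_common()
    ]


# Toy 2x3 map with made-up labels.
toy_rows = [[0, 0, 1], [0, 2, 2]]
toy_labels = {0: "wall", 1: "floor", 2: "sky"}
toy_summary = summarize_semantic_map(toy_rows, toy_labels)
```

For the notebook's result, this could be called as `summarize_semantic_map(predicted_semantic_map.tolist(), model.config.id2label)`.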