UperNet
===

**Unified Perceptual Parsing for Scene Understanding**

 * Paper: https://arxiv.org/abs/1807.10221

![UperNet Overview](../assets/upernet_overview.png)

In [1]:
from PIL import Image
import torch
from transformers import (
    AutoImageProcessor,
    UperNetForSemanticSegmentation
)

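# Run on the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the preprocessing pipeline and the pretrained weights from the same checkpoint.
image_processor = AutoImageProcessor.from_pretrained(
    "openmmlab/upernet-convnext-tiny"
)
model = UperNetForSemanticSegmentation.from_pretrained(
    "openmmlab/upernet-convnext-tiny"
).to(device)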




In [3]:
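image_path = "../samples/plants.jpg"
image = Image.open(image_path).convert("RGB")

# Resize and normalize the image, then move the resulting tensors to the model's device.
inputs = image_processor(images=image, return_tensors="pt")
inputs = inputs.to(device)

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    outputs = model(**inputs)

# Per-pixel class scores with shape (batch, num_classes, height, width).
logits = outputs.logits
list(logits.shape)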

[1, 150, 512, 512]
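
The logits above hold one score per class for every pixel of the preprocessed 512×512 input, so they are not yet a segmentation map at the original image resolution. A minimal sketch of one common way to get there, assuming bilinear upsampling followed by an argmax over the class dimension: the helper name `logits_to_segmentation` is hypothetical, and a small random tensor stands in for `outputs.logits` so the snippet runs without downloading the checkpoint.

```python
from typing import Tuple

import torch
import torch.nn.functional as F


def logits_to_segmentation(
    logits: torch.Tensor, target_size: Tuple[int, int]
) -> torch.Tensor:
    """Turn (N, C, h, w) class logits into an (N, H, W) map of class ids.

    `target_size` is (height, width) of the original image.
    Hypothetical helper for illustration; not part of the Transformers API.
    """
    # Upsample the score maps to the requested resolution.
    upsampled = F.interpolate(
        logits, size=target_size, mode="bilinear", align_corners=False
    )
    # Pick the highest-scoring class for every pixel.
    return upsampled.argmax(dim=1)


# Stand-in for `outputs.logits` (assumption: 150 classes, small spatial size).
torch.manual_seed(0)
dummy_logits = torch.randn(1, 150, 64, 64)

seg_map = logits_to_segmentation(dummy_logits, target_size=(96, 128))
print(list(seg_map.shape))  # [1, 96, 128]
```

With the real model, `target_size` would be `image.size[::-1]`, since PIL reports size as (width, height) while the tensors are laid out as (height, width).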