Description
System Info
- `transformers` version: 4.34.0
- Platform: Linux-5.15.0-89-generic-x86_64-with-glibc2.31
- Python version: 3.10.13
- Huggingface_hub version: 0.21.4
- Safetensors version: 0.4.0
- PyTorch version (GPU?): 2.1.0+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
Who can help?
No response
Information
- The official example scripts
- My own modified scripts

Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
```python
from transformers import Mask2FormerImageProcessor
import numpy as np

ignore_index = 255
image_processor = Mask2FormerImageProcessor(
    ignore_index=ignore_index,
    do_resize=False,
    do_rescale=False,
    do_normalize=False,
)

image_norm = np.random.rand(4, 4, 3)
# Create void mask (all pixels have ignore_index)
semantic_mask = np.ones(image_norm.shape[:2], dtype=np.uint8) * 255
print(semantic_mask)

inputs = image_processor(
    image_norm,
    segmentation_maps=semantic_mask,
    return_tensors="pt",
)
print(inputs)
```
===========================================================

```
[[255 255 255 255]
 [255 255 255 255]
 [255 255 255 255]
 [255 255 255 255]]
Traceback (most recent call last):
  File "/home/anba/catkin_ws/src/tas_dev/dev/anba/Mask2Former/test.py", line 21, in <module>
    inputs = image_processor(
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/transformers/models/mask2former/image_processing_mask2former.py", line 566, in __call__
    return self.preprocess(images, segmentation_maps=segmentation_maps, **kwargs)
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/transformers/models/mask2former/image_processing_mask2former.py", line 764, in preprocess
    encoded_inputs = self.encode_inputs(
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/transformers/models/mask2former/image_processing_mask2former.py", line 943, in encode_inputs
    masks, classes = self.convert_segmentation_map_to_binary_masks(
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/transformers/models/mask2former/image_processing_mask2former.py", line 558, in convert_segmentation_map_to_binary_masks
    return convert_segmentation_map_to_binary_masks(
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/transformers/models/mask2former/image_processing_mask2former.py", line 284, in convert_segmentation_map_to_binary_masks
    binary_masks = np.stack(binary_masks, axis=0)  # (num_labels, height, width)
  File "<array_function internals>", line 180, in stack
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/numpy/core/shape_base.py", line 422, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack

Process finished with exit code 1
```
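For context, the failure appears to come from stacking an empty list of binary masks when every pixel equals `ignore_index`. A minimal sketch of that logic (my simplification, not the exact library code) reproduces the error with plain NumPy:

```python
import numpy as np

ignore_index = 255
segmentation_map = np.full((4, 4), ignore_index, dtype=np.uint8)

# Simplified approximation of the label handling in
# convert_segmentation_map_to_binary_masks: one binary mask per
# non-ignored label found in the map.
all_labels = np.unique(segmentation_map)           # array([255])
labels = all_labels[all_labels != ignore_index]    # empty -> no masks to build

binary_masks = [segmentation_map == label for label in labels]  # []
np.stack(binary_masks, axis=0)  # ValueError: need at least one array to stack
```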
Expected behavior
If it is intended that fully-void segmentation maps should never be passed, then this result is fine. However, when training segmentation models, shouldn't it be possible to include images that contain only the background/void class? I would expect the processor to return empty mask/class labels for such images instead of raising.
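In the meantime, a caller-side guard works as a sketch. The helper name and the empty-target construction below are my assumptions, not an official transformers API; the `mask_labels`/`class_labels` keys match what the processor returns for non-void maps:

```python
import numpy as np
import torch

def encode_with_void_fallback(image_processor, image, semantic_mask, ignore_index=255):
    """Skip the segmentation map when every pixel is void and build empty targets.

    Hypothetical workaround: the zero-length mask/class tensors are my
    assumption about how an all-void image should be represented.
    """
    if np.all(semantic_mask == ignore_index):
        # Encode the image alone; the processor handles this case fine.
        inputs = image_processor(image, return_tensors="pt")
        height, width = semantic_mask.shape
        # Empty targets: no instances, no class labels for this image.
        inputs["mask_labels"] = [torch.zeros((0, height, width))]
        inputs["class_labels"] = [torch.zeros((0,), dtype=torch.int64)]
        return inputs
    return image_processor(image, segmentation_maps=semantic_mask, return_tensors="pt")
```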