mediapipe_tasks_vision_image_segmenter_imagesegmentergraph__mediapipe_tasks_components_processors_imagepreprocessinggraph__ImageToTensorCalculator #5298

Closed
tarakang opened this issue Apr 6, 2024 · 10 comments
Labels
os:macOS (Issues on MacOS) · platform:python (MediaPipe Python issues) · task:image segmentation (Issues related to image segmentation: Locate objects and create image masks with labels)

Comments


tarakang commented Apr 6, 2024

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

MacOS 14.3.1

MediaPipe Tasks SDK version

0.10.9

Task name (e.g. Image classification, Gesture recognition etc.)

selfie_multiclass_hair

Programming Language and version (e.g. C++, Python, Java)

Python

Describe the actual behavior

An error occurs when I pass in an image of type float32

Describe the expected behaviour

The segmenter is expected to handle images of type float32

Standalone code/steps you may have used to try to get what you need

import os
from functools import reduce
import wget
import cv2, sys
import numpy as np
from PIL import Image
import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
ImageSegmenter = mp.tasks.vision.ImageSegmenter
ImageSegmenterOptions = mp.tasks.vision.ImageSegmenterOptions
VisionRunningMode = mp.tasks.vision.RunningMode

MASK_OPTION_1_HAIR = 'hair'

output_image_path = './mask.jpg'

mask_targets = ['hair']
mask_dilation = 1

model_folder_path = 'mediapipe'
os.makedirs(model_folder_path, exist_ok=True)

model_path = os.path.join(model_folder_path, 'selfie_multiclass_256x256.tflite')
model_url = 'https://storage.googleapis.com/mediapipe-models/image_segmenter/selfie_multiclass_256x256/float32/latest/selfie_multiclass_256x256.tflite'
if not os.path.exists(model_path):
    print(f"Downloading 'selfie_multiclass_256x256.tflite' model")
    wget.download(model_url, model_path)

options = ImageSegmenterOptions(base_options=BaseOptions(model_asset_path=model_path),
                                running_mode=VisionRunningMode.IMAGE,
                                output_category_mask=True)


def get_mediapipe_image(numpy_image: np.ndarray) -> mp.Image:
    # The input array is float32, so a VEC32F* image format is used
    image_format = mp.ImageFormat.VEC32F1

    if numpy_image.shape[-1] == 4:
        image_format = mp.ImageFormat.VEC32F1
    elif numpy_image.shape[-1] == 3:
        image_format = mp.ImageFormat.VEC32F1
        # Convert BGR to RGB
        numpy_image = cv2.cvtColor(numpy_image, cv2.COLOR_BGR2RGB)

    return mp.Image(image_format=image_format, data=numpy_image)


def process(image):
    # Create the image segmenter
    with ImageSegmenter.create_from_options(options) as segmenter:

        # Retrieve the masks for the segmented image
        media_pipe_image = get_mediapipe_image(numpy_image=image)

        segmented_masks = segmenter.segment(media_pipe_image)

        masks = []
        for i, target in enumerate(mask_targets):
            mask_index = 1  # Hair mask index
            masks.append(segmented_masks.confidence_masks[mask_index] if i == 0 else np.zeros(media_pipe_image.shape[:2], dtype=np.uint8))

        image_data = media_pipe_image.numpy_view()

        # convert the image shape from "rgb" to "rgba" aka add the alpha channel
        if image_data.shape[-1] == 3:
            image_shape = (image_data.shape[0], image_data.shape[1], 4)
            alpha_channel = np.ones((image_shape[0], image_shape[1], 1), dtype=np.uint8) * 255
            image_data = np.concatenate((image_data, alpha_channel), axis=2)

        image_shape = image_data.shape

        mask_background_array = np.zeros(image_shape, dtype=np.uint8)
        mask_background_array[:] = (0, 0, 0, 0)
        image_array = np.zeros(image_shape, dtype=np.uint8)
        image_array[:] = (255, 255, 255, 255)

        mask_arrays = []

        for i, mask in enumerate(masks):
            condition = np.stack((mask.numpy_view(),) * image_shape[-1], axis=-1) > 0.25
            mask_array = np.where(condition, image_array, mask_background_array)
            # mask_array = np.where(condition, image_data,mask_background_array)
            mask_arrays.append(mask_array)

        # Merge our masks taking the maximum from each
        merged_mask_arrays = reduce(np.maximum, mask_arrays)

        # Dilate or erode the mask
        if mask_dilation > 0:
            merged_mask_arrays = cv2.dilate(merged_mask_arrays, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2*mask_dilation + 1, 2*mask_dilation + 1), (mask_dilation, mask_dilation)))
        elif mask_dilation < 0:
            merged_mask_arrays = cv2.erode(merged_mask_arrays, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2*mask_dilation + 1, 2*mask_dilation + 1), (mask_dilation, mask_dilation)))

        # Create the image
        mask_image = Image.fromarray(cv2.cvtColor(merged_mask_arrays, cv2.COLOR_BGR2RGB))
        mask_image.save(output_image_path)
        mask_image_np = np.array(mask_image)
        return mask_image_np.astype('float32')

image = sys.argv[1]
im = cv2.imread(image)
h, w, _ = im.shape
inputs = cv2.resize(im, (480, 480))
inputs = inputs.astype('float32')
inputs.shape = (1,) + inputs.shape
inputs /= 255
process(inputs)

Other info / Complete Logs

python3 mult_test.py ./IMG313.jpg 
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1712389878.567908       1 gl_context.cc:344] GL version: 2.1 (2.1 Metal - 88), renderer: Apple M1 Pro
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
E0000 00:00:1712389878.577591       1 calculator_graph.cc:876] INVALID_ARGUMENT: CalculatorGraph::Run() failed: 
Calculator::Process() for node "mediapipe_tasks_vision_image_segmenter_imagesegmentergraph__mediapipe_tasks_components_processors_imagepreprocessinggraph__ImageToTensorCalculator" failed: Unsupported format: 9
Traceback (most recent call last):
  File "/Users/kangjian/media/mult_test.py", line 63, in process
    segmented_masks = segmenter.segment(media_pipe_image)
  File "/Users/kangjian/miniconda3/envs/mediapipe/lib/python3.9/site-packages/mediapipe/tasks/python/vision/image_segmenter.py", line 302, in segment
    output_packets = self._process_image_data({
  File "/Users/kangjian/miniconda3/envs/mediapipe/lib/python3.9/site-packages/mediapipe/tasks/python/vision/core/base_vision_task_api.py", line 95, in _process_image_data
    return self._runner.process(inputs)
ValueError: CalculatorGraph::Run() failed: 
Calculator::Process() for node "mediapipe_tasks_vision_image_segmenter_imagesegmentergraph__mediapipe_tasks_components_processors_imagepreprocessinggraph__ImageToTensorCalculator" failed: Unsupported format: 9

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/kangjian/media/mult_test.py", line 115, in <module>
    process(inputs)
  File "/Users/kangjian/media/mult_test.py", line 106, in process
    return mask_image_np.astype('float32')
  File "/Users/kangjian/miniconda3/envs/mediapipe/lib/python3.9/site-packages/mediapipe/tasks/python/vision/core/base_vision_task_api.py", line 226, in __exit__
    self.close()
  File "/Users/kangjian/miniconda3/envs/mediapipe/lib/python3.9/site-packages/mediapipe/tasks/python/vision/core/base_vision_task_api.py", line 209, in close
    self._runner.close()
ValueError: CalculatorGraph::Run() failed: 
Calculator::Process() for node "mediapipe_tasks_vision_image_segmenter_imagesegmentergraph__mediapipe_tasks_components_processors_imagepreprocessinggraph__ImageToTensorCalculator" failed: Unsupported format: 9

tarakang commented Apr 6, 2024

If I change the format selection to SRGB/SRGBA, i.e.

image_format = mp.ImageFormat.SRGB
if numpy_image.shape[-1] == 4:
    image_format = mp.ImageFormat.SRGBA
elif numpy_image.shape[-1] == 3:
    image_format = mp.ImageFormat.SRGB

I get the following error:

Traceback (most recent call last):
  File "/Users/kangjian/media/mult_test.py", line 115, in <module>
    process(inputs)
  File "/Users/kangjian/media/mult_test.py", line 61, in process
    media_pipe_image = get_mediapipe_image(numpy_image=image)
  File "/Users/kangjian/media/mult_test.py", line 53, in get_mediapipe_image
    return mp.Image(image_format=image_format, data=numpy_image)
RuntimeError: float image data should be either VEC32F1, VEC32F2, or VEC32F4 MediaPipe image formats.
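
In other words, the mp.Image format has to match the NumPy array's dtype: SRGB/SRGBA expect uint8 data, while float32 data is only accepted with the VEC32F* formats, which ImageToTensorCalculator then rejects. Below is a minimal sketch of a helper that works around this by converting the float array back to uint8 before building the mp.Image; the function name and the assumption that float data is BGR(A) scaled to [0, 1] are illustrative, not from this thread:

import cv2
import numpy as np
import mediapipe as mp

def to_mediapipe_image(numpy_image: np.ndarray) -> mp.Image:
    # Assumption: float input holds BGR (or BGRA) data scaled to [0, 1].
    if numpy_image.dtype != np.uint8:
        numpy_image = (np.clip(numpy_image, 0.0, 1.0) * 255).astype(np.uint8)

    if numpy_image.shape[-1] == 4:
        image_format = mp.ImageFormat.SRGBA
        numpy_image = cv2.cvtColor(numpy_image, cv2.COLOR_BGRA2RGBA)
    else:
        image_format = mp.ImageFormat.SRGB
        numpy_image = cv2.cvtColor(numpy_image, cv2.COLOR_BGR2RGB)

    return mp.Image(image_format=image_format, data=np.ascontiguousarray(numpy_image))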

@kuaashish kuaashish added os:macOS Issues on MacOS task:image segmentation Issues related to image segmentation: Locate objects and create image masks with labels platform:python MediaPipe Python issues labels Apr 8, 2024
@kuaashish
Contributor

Hi @kinarr,

I can replicate the problem in Colab Gist, as indicated by @tarakang. I encounter the same error message: "failed: Unsupported format: 9". For now, this issue does not appear to be specific to macOS. Could you please have a look into this issue? From our standpoint, it appears to be a legitimate bug.

Thank you!!

@kuaashish kuaashish added the stat:awaiting googler Waiting for Google Engineer's Response label Apr 12, 2024

kinarr commented Apr 12, 2024

@kuaashish I believe it's caused by the limited image formats supported by MediaPipe, but there should be a way to normalize the inputs so that they can be passed to the model. Let me take a look.


kinarr commented Apr 12, 2024

Oh and there's no need to preprocess the inputs like this:

inputs = inputs.astype('float32')
inputs.shape = (1,) + inputs.shape
inputs /= 255

Please see the following for the correct usage of the API:

https://github.com/google/mediapipe/blob/7dcf9ae0e6ec82559e7e733ff71063960650874e/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_test.cc#L99

https://github.com/google/mediapipe/blob/7dcf9ae0e6ec82559e7e733ff71063960650874e/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_test.cc#L381

These are C++ API calls, but the usage is similar in Python.
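
For reference, a rough Python equivalent of that usage with the Tasks API, kept to a sketch: the file names are placeholders, and the hair confidence mask index of 1 follows the multiclass model card rather than anything in this thread.

import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

# Placeholder paths; substitute your own model and image files.
options = vision.ImageSegmenterOptions(
    base_options=python.BaseOptions(model_asset_path='selfie_multiclass_256x256.tflite'),
    running_mode=vision.RunningMode.IMAGE,
    output_category_mask=True)

with vision.ImageSegmenter.create_from_options(options) as segmenter:
    # create_from_file loads the image as uint8 SRGB; the task resizes and
    # normalizes it internally, so no manual float conversion is needed.
    mp_image = mp.Image.create_from_file('IMG313.jpg')
    result = segmenter.segment(mp_image)
    hair_mask = result.confidence_masks[1].numpy_view()  # index 1 = hair in the multiclass model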


kinarr commented Apr 12, 2024

The model card for the MediaPipe Selfie Segmentation models provides the essential information about the expected input format. Please take a look at the respective model card for each segmentation model: https://developers.google.com/mediapipe/solutions/vision/image_segmenter#multiclass-model

As for the input features, this should be helpful: https://developers.google.com/mediapipe/solutions/vision/image_segmenter#features


kinarr commented Apr 12, 2024

@tarakang I've added your image utility (get_mediapipe_image) to the notebook, so you can uncomment it and try it out too; it should work fine.

@kuaashish
Contributor

Hi @tarakang,

Thank you, @kinarr. It appears there may be an issue with the limited image format support in MediaPipe. Could you kindly test the provided working notebook and let us know if the issue persists?

Thank you!!

@kuaashish kuaashish added stat:awaiting response Waiting for user response and removed stat:awaiting googler Waiting for Google Engineer's Response labels Apr 15, 2024

This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Apr 23, 2024

github-actions bot commented May 1, 2024

This issue was closed due to lack of activity after being marked stale for the past 7 days.

@github-actions github-actions bot closed this as completed May 1, 2024
@kuaashish kuaashish removed stat:awaiting response Waiting for user response stale labels May 3, 2024