
ValueError: operands could not be broadcast together with shapes (480,132) (960,1280) (960,1280) #1

Open
RyuseiiSama opened this issue Jun 5, 2024 · 2 comments

@RyuseiiSama

Hello! I chanced upon this study and was just fiddling around trying to apply it to my own objects.

Steps taken:

  1. Followed your instructions to setup
  2. Ran os_tog.ipynb with your sample images, succeeded
  3. Reran it using my own sample images, saved as both .jpeg and .png (not sure if it's relevant), particularly this cell:
target_object = "marker"
target_task = "handover"

scene_img = cv2.imread("../samples/test2.jpeg")         # OpenCV loads images as BGR
scene_img = cv2.cvtColor(scene_img, cv2.COLOR_BGR2RGB)  # convert to RGB for the framework

grasps = framework.get_prediction(scene_img, target_object, target_task)
  4. Got this output:
[INFO] Found object 'marker' in database.
Attempting K=10
Using KMeans - PyTorch, Cosine Similarity, No Elbow
Output centroids are normalized
used 3 iterations (0.0083s) to cluster 26 items into 10 clusters
Generating Saliency mask
Attempting K=80
Using KMeans - PyTorch, Cosine Similarity, No Elbow
Output centroids are normalized
used 9 iterations (0.022s) to cluster 832 items into 80 clusters
Attempting K=80
Using KMeans - PyTorch, Cosine Similarity, No Elbow
Output centroids are normalized
used 8 iterations (0.0058s) to cluster 393 items into 80 clusters
Starting processing of the affordances

And this error:

	"name": "ValueError",
	"message": "operands could not be broadcast together with shapes (480,132) (960,1280) (960,1280) ",
	"stack": "---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[13], line 7
      4 scene_img = cv2.imread("../samples/test2.jpeg")
      5 scene_img = cv2.cvtColor(scene_img, cv2.COLOR_BGR2RGB)
----> 7 grasps = framework.get_prediction(scene_img, target_object, target_task)

File ~/Desktop/os_tog/os_tog/os_tog/framework.py:68, in OS_TOG.get_prediction(self, scene_img, target_object, target_task)
     66 if self.cfg.MULTI_REF_AFF: # align affordance to object through rotations
     67     ref_aff, ref_img = self.get_nearest_affordance(ref_aff, ref_img, scene_img, (pred_mask[obj_idx], pred_boxes[obj_idx]))
---> 68 pred_aff = self.get_affordance_recognition_predictions(ref_img, ref_aff, scene_img, (pred_mask[obj_idx], pred_boxes[obj_idx]))
     70 grasps = self.get_valid_grasps(scene_img, pred_aff)    
     71 return grasps[0] # return final grasp

File ~/Desktop/os_tog/os_tog/os_tog/framework.py:268, in OS_TOG.get_affordance_recognition_predictions(self, ref_img, ref_aff, obs_img, segm_preds)
    266 if self.cfg.VISUALIZE:
    267     visualize(np.array(ref_img), masks=np.asarray([ref_aff]), title="Reference Affordance", figsize=(5,5)) # may be rotate if u chose MULTI_REF_AFF=True in cfg
--> 268     visualize(obs_img, masks=np.asarray([uncrop_mask]), title="Affordance Prediction", figsize=(5,5))
    269 return uncrop_mask

File ~/Desktop/os_tog/os_tog/os_tog/utils.py:51, in visualize(image, boxes, masks, class_ids, grasps, figsize, ax, title)
     49 if masks is not None:
     50     mask = masks[i, :, :]
---> 51     masked_image = apply_mask(masked_image, mask, color)
     53 # plot grasps
     54 if grasps is not None:

File ~/Desktop/os_tog/os_tog/os_tog/utils.py:123, in apply_mask(image, mask, color, alpha)
    121 \"\"\"Apply a binary mask to an image.\"\"\"
    122 for c in range(3):
--> 123     image[:, :, c] = np.where(mask == 1,
    124                               image[:, :, c] *
    125                               (1 - alpha) + alpha * color[c] * 255,
    126                               image[:, :, c])
    127 return image

File <__array_function__ internals>:180, in where(*args, **kwargs)

ValueError: operands could not be broadcast together with shapes (480,132) (960,1280) (960,1280)
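
For reference, the failure boils down to np.where being given a mask and image channels of different sizes. A minimal standalone sketch with the shapes from the error, independent of the framework:

import numpy as np

# Shapes taken from the traceback: the affordance mask is 480x132,
# while each channel of the scene image is 960x1280.
mask = np.zeros((480, 132), dtype=np.uint8)
channel = np.zeros((960, 1280), dtype=np.float64)

# Raises: operands could not be broadcast together with
# shapes (480,132) (960,1280) (960,1280)
np.where(mask == 1, channel * 0.5, channel)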

Images that appeared:

[screenshot of the notebook's visualization output]

I'm wondering whether it was only meant to run with the sample images? If so, how might I get it working with my own images in the future?

Do note that I am EXTREMELY new to anything computer vision related. That being said, please throw any technicalities at me that could have caused this issue!

Thanks in advance :)

@Sai-Yarlagadda commented Jul 15, 2024

You can add this line before getting the grasps:
scene_img = cv2.resize(scene_img, (480,132))
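
One thing to double-check when copying this: cv2.resize takes its target size as (width, height), whereas the shapes in the error message are NumPy-style (height, width), so (480, 132) produces a 132-row by 480-column image. A quick check:

import cv2
import numpy as np

img = np.zeros((960, 1280, 3), dtype=np.uint8)  # stand-in for a 1280x960 scene
small = cv2.resize(img, (480, 132))             # dsize is (width, height)
print(small.shape)                              # -> (132, 480, 3)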

@valerija-h (Owner)

The demo scene image and the real-world experiment scenes were 640x480 pixels; I believe yours may be 1280x960. Could you try resizing it as @Sai-Yarlagadda suggested, but do it after converting the scene to RGB near the start of the cell, and make the output size 640x480, like so:

scene_img = cv2.imread("../samples/test2.jpeg")
scene_img = cv2.cvtColor(scene_img, cv2.COLOR_BGR2RGB)
# added resizing below
scene_img = cv2.resize(scene_img, (640, 480))

If that doesn't work, could you attach your test image?
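
For completeness, a slightly more defensive version of the start of that cell (just a sketch reusing the names from the cell above; the assert only guards against a bad image path):

import cv2

scene_img = cv2.imread("../samples/test2.jpeg")
assert scene_img is not None, "cv2.imread returns None if the path is wrong"
scene_img = cv2.cvtColor(scene_img, cv2.COLOR_BGR2RGB)

# Only resize if the scene is not already 480 rows x 640 columns (i.e. 640x480).
if scene_img.shape[:2] != (480, 640):
    scene_img = cv2.resize(scene_img, (640, 480))  # dsize is (width, height)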
