Inaccurate grasp predictions #4

Open
abhinavkk opened this issue Apr 7, 2022 · 10 comments

Comments
@abhinavkk

Hello, firstly I want to say this is some great work!

I was using the trained model to generate grasps for my own object point clouds generated from simulation. Surprisingly, the generated hand vertices were very distant from the object. I am not sure of the reason for this; is there any requirement on the input object point cloud's origin and axis orientation before using the network that I may have missed?

I am attaching an image of the predicted grasp for one of the input object point clouds I used:
[Screenshot from 2022-04-07 19-41-42: the predicted hand is far away from the object point cloud]

@hwjiang1510
Owner

Hi, thanks for the question.

In training, we do not apply transformation augmentation (and the ObMan dataset covers only a small range of object translations in 3D), so the model will not work for out-of-distribution object positions.

This can be solved by either of the following:

* Move the object to an in-distribution position, generate a hand, and translate it back. You can use [this](https://github.com/hwjiang1510/GraspTTA/blob/cecb9642e6d63670d4e954cf420d03f1a93b5a90/gen_diverse_grasp_ho3d.py#L60) translation (see the sketch below).
* Train a new model with transformation augmentation.

BTW, the model may not be able to generate hands for incomplete point clouds.
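
A minimal sketch of the first option in Python, assuming `obj_pc` is an (N, 3) NumPy array and `generate_grasp` is a hypothetical callable wrapping the trained model (not a function from this repo); the reference position is the constant from the linked line of `gen_diverse_grasp_ho3d.py`:

```python
import numpy as np

# In-distribution reference position from the linked line of
# gen_diverse_grasp_ho3d.py; it lies inside the ObMan translation range.
REF_POSITION = np.array([-0.0793, 0.0208, -0.6924])

def grasp_with_recenter(obj_pc, generate_grasp):
    """obj_pc: (N, 3) object point cloud in its original frame.
    generate_grasp: hypothetical callable wrapping the trained model,
    returning hand vertices as an (M, 3) array in the frame of its input."""
    # 1. Shift the object so its centroid sits at the in-distribution position.
    offset = REF_POSITION - obj_pc.mean(axis=0)
    hand_verts = generate_grasp(obj_pc + offset)
    # 2. Undo the shift so the hand lands back on the original object.
    return hand_verts - offset
```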

@abhinavkk
Author

abhinavkk commented Apr 11, 2022

Thanks for your quick reply.

Could you help me understand the object distribution you are using? Or, if it depends on the ObMan dataset used for training, how can I find out the distribution used in that dataset?

Sorry if this is a naive question, but I am a little new to this area.

@abhinavkk
Author

abhinavkk commented Apr 14, 2022

> Hi, thanks for the question.
>
> In training, we do not apply transformation augmentation (and the ObMan dataset covers only a small range of object translations in 3D), so the model will not work for out-of-distribution object positions.
>
> This can be solved by either of the following:
>
> * Move the object to an in-distribution position, generate a hand, and translate it back. You can use [this](https://github.com/hwjiang1510/GraspTTA/blob/cecb9642e6d63670d4e954cf420d03f1a93b5a90/gen_diverse_grasp_ho3d.py#L60) translation.
> * Train a new model with transformation augmentation.
>
> BTW, the model may not be able to generate hands for incomplete point clouds.

The translation you suggested did help, thanks! But I would still love to understand how you came up with the distribution and translation for the network.

@hwjiang1510
Owner

> > Hi, thanks for the question.
> > In training, we do not apply transformation augmentation (and the ObMan dataset covers only a small range of object translations in 3D), so the model will not work for out-of-distribution object positions.
> > This can be solved by either of the following:
> >
> > * Move the object to an in-distribution position, generate a hand, and translate it back. You can use [this](https://github.com/hwjiang1510/GraspTTA/blob/cecb9642e6d63670d4e954cf420d03f1a93b5a90/gen_diverse_grasp_ho3d.py#L60) translation.
> > * Train a new model with transformation augmentation.
> >
> > BTW, the model may not be able to generate hands for incomplete point clouds.
>
> The translation you suggested did help, thanks! But I would still love to understand how you came up with the distribution and translation for the network.

Actually, this also happened when I wanted to make use of this model.

@hwjiang1510
Owner

> Thanks for your quick reply.
>
> Could you help me understand the object distribution you are using? Or, if it depends on the ObMan dataset used for training, how can I find out the distribution used in that dataset?
>
> Sorry if this is a naive question, but I am a little new to this area.

Maybe a straightforward method is to check the range and mean of the ObMan hand translations to understand the data distribution.
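
For example, assuming you have already dumped the ObMan hand translations into an (N, 3) array (the file name below is a placeholder, not something shipped with this repo):

```python
import numpy as np

# Placeholder: an (N, 3) array of translations pre-extracted from ObMan annotations.
translations = np.load("obman_hand_translations.npy")

print("mean:", translations.mean(axis=0))
print("std: ", translations.std(axis=0))
print("min: ", translations.min(axis=0))
print("max: ", translations.max(axis=0))
```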

@abhinavkk
Author

Thank you so much! After following your suggestion, I tried predicting grasps for complete point clouds, and to my surprise the results were somewhat unexpected and interesting (see the attached images).

* I notice there might be a scaling step I am missing, because currently the hand and object scales do not look coherent.
* The predictions are strange, as the hands are quite often inside or penetrating the object. Do you think this might require retraining/tuning, since I am using a different dataset (YCB)? If so, I would appreciate any insights on re-tuning the model.

[Attached images: predicted hands penetrating or mis-scaled relative to the YCB objects]

@hwjiang1510
Owner

Yes, you should scale the input object point cloud to roughly match the size of the hand.
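
Something like the following sketch would do; the 0.15 m target extent is an assumption for a roughly hand-sized, graspable object, not a value taken from this repo:

```python
import numpy as np

def scale_to_graspable_size(obj_pc, target_extent=0.15):
    """Scale an (N, 3) point cloud (in metres) so its largest bounding-box
    side is roughly `target_extent`, i.e. comparable to a hand."""
    centroid = obj_pc.mean(axis=0)
    centered = obj_pc - centroid
    extent = (centered.max(axis=0) - centered.min(axis=0)).max()
    scale = target_extent / extent
    # Return the scale factor as well so the result can be mapped back later.
    return centered * scale + centroid, scale
```

Keeping the returned scale factor lets you map the predicted hand back into the original object scale afterwards.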

@abhinavkk
Author

abhinavkk commented Jun 21, 2022

Thank you so much for the help so far! I managed to scale down the input point cloud, and the results are now much better. I am curious, though, what methods we could use to close the contact gap in some of the hand predictions. Currently, a few predictions have a gap between the object and the hand, as shown below:
[Screenshots from 2022-06-21: predicted hands with a visible gap between the fingers and the object]

@fuq1ang

fuq1ang commented Mar 15, 2023

> Hi, thanks for the question.
>
> In training, we do not apply transformation augmentation (and the ObMan dataset covers only a small range of object translations in 3D), so the model will not work for out-of-distribution object positions.
>
> This can be solved by either of the following:
>
> * Move the object to an in-distribution position, generate a hand, and translate it back. You can use [this](https://github.com/hwjiang1510/GraspTTA/blob/cecb9642e6d63670d4e954cf420d03f1a93b5a90/gen_diverse_grasp_ho3d.py#L60) translation.
> * Train a new model with transformation augmentation.
>
> BTW, the model may not be able to generate hands for incomplete point clouds.

I would like to know what the code behind the link in this answer means. I know it is used for translation, but I am curious what you mean by the initial value `np.array([-0.0793, 0.0208, -0.6924])`.

@hwjiang1510
Owner

> > Hi, thanks for the question.
> > In training, we do not apply transformation augmentation (and the ObMan dataset covers only a small range of object translations in 3D), so the model will not work for out-of-distribution object positions.
> > This can be solved by either of the following:
> >
> > * Move the object to an in-distribution position, generate a hand, and translate it back. You can use [this](https://github.com/hwjiang1510/GraspTTA/blob/cecb9642e6d63670d4e954cf420d03f1a93b5a90/gen_diverse_grasp_ho3d.py#L60) translation.
> > * Train a new model with transformation augmentation.
> >
> > BTW, the model may not be able to generate hands for incomplete point clouds.
>
> I would like to know what the code behind the link in this answer means. I know it is used for translation, but I am curious what you mean by the initial value `np.array([-0.0793, 0.0208, -0.6924])`.

It is a random value sampled from the ObMan dataset translation distribution.
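
So any position drawn from that distribution should behave similarly. As an illustration only, once you have measured the ObMan translation mean and standard deviation (for example with the statistics snippet earlier in this thread), you could draw your own value instead of reusing the hard-coded one; the numbers below are placeholders, not the real ObMan statistics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder statistics -- substitute the values you measured from ObMan.
obman_mean = np.array([-0.08, 0.02, -0.69])
obman_std = np.array([0.03, 0.03, 0.05])

# One in-distribution object position, analogous to the hard-coded
# np.array([-0.0793, 0.0208, -0.6924]) in gen_diverse_grasp_ho3d.py.
sampled_position = rng.normal(obman_mean, obman_std)
print(sampled_position)
```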
