Issue with diffusion head #43
Okay, I was able to figure out that the horizon was too large; a horizon of 1-3 was fine, but 9 led to an unstable model.
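For anyone hitting the same thing, the override looked roughly like this (a sketch only, written against Octo-style `ModuleSpec` configs; the import paths, names, and values are assumptions, not a verbatim copy):

```python
from octo.model.components.action_heads import DiffusionActionHead
from octo.utils.spec import ModuleSpec

# Sketch (names assumed): swap in a diffusion head with a shorter prediction
# horizon. A horizon of 9 trained into an unstable model; 1-3 was fine.
# `config` is the finetuning config dict from the example script.
config["model"]["heads"]["action"] = ModuleSpec.create(
    DiffusionActionHead,
    readout_key="readout_action",
    use_map=False,
    pred_horizon=3,   # was 9
    action_dim=14,    # assumed: bimanual Aloha setup
)
```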
Reopening: although the output values are more stable now, the policy is still considerably worse than an L1-based policy.
Thanks for digging into this!
Similar problem here. The paper indicates that the diffusion head outperforms the L1 head, but this conclusion does not necessarily extend to the fine-tuning setting.
Maybe we could set up a Discord to discuss these questions; official replies have been slow these days.
Compared with the L1 and MSE heads, the diffusion head seems to be more unstable when trained for fewer steps. In MuJoCo, the simulation becomes unstable, resulting in the following error:
Even reducing the horizon parameter leads to similar problems, and fine-tuning with a diffusion head seems to be more expensive. I don't know whether the head parameters in the config need to be tuned more carefully, or whether this is simply a flaw of the diffusion approach.
Hi, have you checked the cause of the error? Try printing the actions the model outputs; you may find that the values are out of bounds or cannot be solved by the IK optimizer.
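For example, something like this (a sketch; `policy.sample_actions` and the `dataset_statistics` layout are assumed names, adapt to your setup):

```python
import numpy as np

# Sketch (names assumed): dump the sampled action chunk and flag anything
# outside the range observed in the training data before handing it to IK.
actions = np.asarray(policy.sample_actions(observation))  # shape: (horizon, action_dim)
low = dataset_statistics["action"]["min"]
high = dataset_statistics["action"]["max"]

print("sampled actions:\n", actions)
oob = np.argwhere((actions < low) | (actions > high))
if len(oob):
    print(f"{len(oob)} out-of-range entries at (t, dim):\n", oob)
```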
Thank you for your attention. I tried the Aloha simulation with 1000 and 5000 steps using the diffusion head. Unfortunately, I could not reproduce the problem. Maybe the problem itself is as unstable as the diffusion head.
I was able to fine-tune with a modified version of example 2 with the following action head:
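(Reconstructed below as a sketch; the class name, `ModuleSpec.create`, and the parameter values are assumptions in the style of the example configs, not a verbatim copy of my script.)

```python
from octo.model.components.action_heads import L1ActionHead
from octo.utils.spec import ModuleSpec

# Sketch (values assumed): the L1 head spec that fine-tuned into a usable policy.
config["model"]["heads"]["action"] = ModuleSpec.create(
    L1ActionHead,
    readout_key="readout_action",
    use_map=False,
    pred_horizon=1,   # short horizons worked well in my runs
    action_dim=14,
)
```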
The policy works reasonably well on the robot.
Since then, I've been trying to fine-tune with a diffusion head, but the robot goes out of control with it.
The rest of the script is unchanged.
What else could be the problem?
Update: the diffusion-based model seems to output fairly extreme action values, e.g., values below the dataset minimum or above the dataset maximum.
These are the action statistics:
and these are sample outputs after unnormalization:
which are clearly out of bounds.
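As a stopgap, clipping the unnormalized actions back to the dataset range keeps the controller from receiving impossible targets (a sketch; the `stats` keys are assumed to match the dataset-statistics layout):

```python
import numpy as np

# Sketch (stat keys assumed): the head predicts in normalized space, and a
# plain mean/std unnormalization does not constrain the result, so extreme
# diffusion samples can land outside the dataset's [min, max].
def unnormalize_and_clip(norm_actions, stats):
    actions = norm_actions * stats["std"] + stats["mean"]
    return np.clip(actions, stats["min"], stats["max"])  # hard safety clip
```

Clipping only masks the symptom, of course; the head itself still needs debugging.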