
Issue with diffusion head #43

Open
seann999 opened this issue Jan 29, 2024 · 8 comments
seann999 commented Jan 29, 2024

I was able to fine-tune with a modified version of example 2 with the following action head:

config["model"]["heads"]["action"] = ModuleSpec.create(
    L1ActionHead,
    pred_horizon=9,
    action_dim=11,
    readout_key="readout_action",
)

The policy works reasonably well on the robot.
Since then, I've been trying to fine-tune with a diffusion head instead, but the robot goes out of control with it:

config["model"]["heads"]["action"] = ModuleSpec.create(
    # L1ActionHead,
    DiffusionActionHead,
    use_map=False,

    pred_horizon=9,
    action_dim=11,
    readout_key="readout_action",
)

The rest of the script is unchanged.
What else could be the problem?

Update: the diffusion-based model seems to output fairly extreme action values, i.e. values below the dataset minimum or above the dataset maximum.

These are the action statistics:

'max': array([6.92096353e-03, 7.15068638e-01, 2.65712190e+00, 1.22000003e+00, 4.09653854e+00, 9.43594933e-01, 1.17203128e+00, 2.65219069e+00, 1.00000000e+00, 1.50034428e-01, 4.94167267e-04]),
'mean': array([-1.45641267e+00,  2.27537051e-01, -2.96192672e-02, -6.16574585e-01, -7.61023015e-02,  2.06921268e-02,  4.98067914e-03, -1.26738450e-04, 5.67098975e-01,  7.91745202e-04, -7.36813426e-01]),
'min': array([-2.61999989, -0.05653883, -1.91999996, -1.75      , -2.45071816, -0.88507879, -1.3124876 , -1.5       ,  0.        , -0.09809657, -1.35774779]),
'std': array([0.50778913, 0.21539085, 0.41647774, 0.72304732, 0.8022927, 0.11436515, 0.08398118, 0.30631647, 0.49528143, 0.01135448, 0.26658419])}

and these are sample outputs after unnormalization:

act: [-3.9953585   1.3044913   2.0527694  -4.231811    3.9353614   0.59251785   -0.41492522 -1.5317091   3.0435061  -0.05598066 -2.0697343 ]
act: [ 1.082533    1.3044913   2.0527694   2.998662   -4.087566    0.59251785
 -0.41492522  1.5314556   3.0435061  -0.05598066 -2.0697343 ]

which are clearly out of bounds.
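For reference, this is roughly how I'm checking the outputs against the dataset statistics. It's just a debugging sketch with numpy: the min/max arrays are copied from the statistics above, `act` is one of the unnormalized samples above, and the clipping at the end is only a stopgap for the robot, not a fix for the head itself.

import numpy as np

# Dataset action statistics (copied from above).
act_min = np.array([-2.61999989, -0.05653883, -1.91999996, -1.75, -2.45071816,
                    -0.88507879, -1.3124876, -1.5, 0.0, -0.09809657, -1.35774779])
act_max = np.array([6.92096353e-03, 7.15068638e-01, 2.65712190e+00, 1.22000003e+00,
                    4.09653854e+00, 9.43594933e-01, 1.17203128e+00, 2.65219069e+00,
                    1.00000000e+00, 1.50034428e-01, 4.94167267e-04])

# One of the unnormalized diffusion-head outputs above.
act = np.array([-3.9953585, 1.3044913, 2.0527694, -4.231811, 3.9353614,
                0.59251785, -0.41492522, -1.5317091, 3.0435061, -0.05598066, -2.0697343])

out_of_bounds = (act < act_min) | (act > act_max)
print("out-of-bounds dims:", np.where(out_of_bounds)[0])

# Stopgap before sending the command to the robot; does not fix the underlying issue.
safe_act = np.clip(act, act_min, act_max)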

@seann999 (Author)

Okay, I was able to figure out that the prediction horizon was too large; a pred_horizon of 1–3 was fine, but 9 led to an unstable model.
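For reference, the configuration that ended up stable for me just lowers pred_horizon (shown here with 3 as an example; anything in the 1–3 range worked, while 9 did not):

config["model"]["heads"]["action"] = ModuleSpec.create(
    DiffusionActionHead,
    use_map=False,
    pred_horizon=3,  # 1-3 was stable for me; 9 produced extreme, unstable actions
    action_dim=11,
    readout_key="readout_action",
)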

@seann999 (Author)

Reopening: although the output values are more stable now, the policy is still considerably worse than an L1-based policy.

@seann999 seann999 reopened this Jan 29, 2024
@kpertsch (Collaborator)

Thanks for digging into this!
We are debugging the ALOHA fine-tuning on our side as well, and have also found that the diffusion head is substantially worse than the L1 head in this particular case and outputs nonsensical values. It should work in principle, so we are digging into it. Will update here if we find a solution!


zwbx commented Apr 8, 2024

Similar problem here. The paper indicates that the diffusion head outperforms the L1 head, but I guess that conclusion does not necessarily extend to the fine-tuning setting.


zwbx commented Apr 8, 2024

Maybe we could set up a Discord to discuss these questions; the official replies have been slow these days.

WenchangGaoT pushed a commit to WenchangGaoT/octo1 that referenced this issue May 10, 2024
Merge in R2D2 eval script and small QOL changes to dataset.py
@BUAAZhangHaonan

> Thanks for digging into this! We are debugging the ALOHA fine-tuning on our side as well, and have also found that the diffusion head is substantially worse than the L1 head in this particular case and outputs nonsensical values. It should work in principle, so we are digging into it. Will update here if we find a solution!

Compared with the L1 and MSE heads, the diffusion head seems to behave much less stably when trained for fewer steps. In MuJoCo, the simulation becomes unstable, resulting in the following error:

Nan, Inf or huge value in QACC at DOF 0. The simulation is unstable. Time = 0.0840.
dm_control.rl.control.PhysicsError: Physics state is invalid. Warning(s) raised: mjWARN_BADQACC

Even reducing the horizon parameter leads to similar problems, and fine-tuning with the diffusion head also seems more expensive. I don't know whether the head parameters in the config need to be tuned more carefully, or whether this is simply a flaw in the diffusion approach.
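As a stopgap while debugging, I wrap the environment step so that a bad action ends the episode instead of crashing the whole run. This is only a sketch: `env`, `action_low`, and `action_high` are placeholders for whatever your own wrapper exposes, not Octo or dm_control names.

import numpy as np
from dm_control.rl.control import PhysicsError

def safe_step(env, action, action_low, action_high):
    """Clip the predicted action to known bounds and catch MuJoCo instability."""
    action = np.clip(action, action_low, action_high)
    try:
        return env.step(action)
    except PhysicsError as exc:
        # mjWARN_BADQACC: the physics state blew up, so end this episode early.
        print(f"Physics became unstable, aborting episode: {exc}")
        return None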


zwbx commented May 13, 2024

Hi,

> Compared with the L1 and MSE heads, the diffusion head seems to behave much less stably when trained for fewer steps. In MuJoCo, the simulation becomes unstable, resulting in the following error:
>
> Nan, Inf or huge value in QACC at DOF 0. The simulation is unstable. Time = 0.0840.
> dm_control.rl.control.PhysicsError: Physics state is invalid. Warning(s) raised: mjWARN_BADQACC
>
> Even reducing the horizon parameter leads to similar problems, and fine-tuning with the diffusion head also seems more expensive. I don't know whether the head parameters in the config need to be tuned more carefully, or whether this is simply a flaw in the diffusion approach.

Have you checked what actually causes the error? Try printing the actions the model outputs; you will likely see that the values are out of bounds or can't be solved by the IK optimizer.

@BUAAZhangHaonan

> Have you checked what actually causes the error? Try printing the actions the model outputs; you will likely see that the values are out of bounds or can't be solved by the IK optimizer.

Thank you for your attention. I tried the ALOHA simulation with the diffusion head at 1000 and 5000 training steps; unfortunately, I could not reproduce that error, so maybe the problem itself is as unstable as the diffusion head.
But I did record the action predictions from the two heads.
The first is the output of the L1 head, which simulates normally and performs well:
action[100]: [ 0.57204974 -0.23319605 0.76363015 -0.09170198 -0.07699968 0.77580905 0.96044242 0.75630909 -0.45451248 1.13872492 -0.46336147 -0.58958995 0.5670839 1.06122994]
action[200]: [-0.99930573 0.55937028 -0.52284944 -0.59289199 -0.51960701 0.75874555 0.97301495 0.14920622 0.19651346 -0.29161143 -0.02436048 0.51398659 0.05542321 -1.52965879]
action[300]: [-0.46441016 0.91491914 -1.4249717 -0.39001578 -0.50952721 0.77543199 -0.09499384 -0.16243899 0.69015974 -1.14362192 0.15943812 0.9602797 -0.15596175 0.2628575 ]
Then there is the output of the diffusion head, which essentially freezes after step 100 and keeps producing the same action:
action[100]: [-3.59837627 5. 5. -4.92470407 5. 4.25408506 -4.62254953 -4.55121994 -5. -5. 1.16204941 4.00920582 -4.54648066 5. ]
action[200]: [-3.59454226 5. 5. -4.92671156 5. 4.25115156 -4.61913872 -4.55262327 -5. -5. 1.16107273 4.01593781 -4.55090284 5. ]
action[300]: [-3.59459043 5. 5. -4.92677879 5. 4.25059795 -4.61849213 -4.55246067 -5. -5. 1.16351843 4.01590443 -4.55066919 5. ]
Comparing the two, the latter clearly contains extreme values, which again shows its instability.
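The diffusion outputs also barely change between steps 100, 200, and 300, and several dimensions sit exactly at ±5, which looks like the samples are saturating at some clipping bound rather than tracking the observations. A quick sanity check I use (just a sketch: `actions` is a hypothetical (num_steps, action_dim) array of recorded diffusion-head outputs, and the bound of 5.0 is only what the printed values suggest):

import numpy as np

def saturation_report(actions: np.ndarray, bound: float = 5.0) -> None:
    """Print how often each dimension is pinned at +/-bound and how much it drifts step to step."""
    pinned = np.isclose(np.abs(actions), bound).mean(axis=0)
    drift = np.abs(np.diff(actions, axis=0)).max(axis=0)
    print("fraction of steps pinned at the bound, per dim:", pinned)
    print("max step-to-step change, per dim:", drift)

# actions = np.stack(recorded_diffusion_actions)  # hypothetical list of per-step predictions
# saturation_report(actions, bound=5.0)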
