
Questions about the training and evaluation pipelines #16

Closed
Ji4chenLi opened this issue Jul 5, 2023 · 3 comments

Ji4chenLi commented Jul 5, 2023

Hi Yunfan,

Thank you so much for the great work! Since I'm trying to reproduce the results, I would like to ask some questions regarding the training and evaluation details.

  1. Can you provide the number of training epochs? (Training hyperparameters besides Table 6 #9)
  2. Let's look at Table 7, and denote the number of gradient steps by $N_{gs}$. Since you use learning-rate warm-up and cosine annealing, I assume the learning rate first increases linearly from 0 to 1e-4 for $N_{gs} \in [0, 7\mathrm{K}]$. Then for $N_{gs} \in [7\mathrm{K} + 2i \cdot 17\mathrm{K},\ 7\mathrm{K} + (2i + 1) \cdot 17\mathrm{K}]$ the learning rate decreases from 1e-4 to 0, and for $N_{gs} \in [7\mathrm{K} + (2i + 1) \cdot 17\mathrm{K},\ 7\mathrm{K} + (2i + 2) \cdot 17\mathrm{K}]$ it increases from 0 back to 1e-4. Am I right? (See the sketch after this list.)
  3. I notice that you fine-tune the last two layers and freeze all other layers of T5. Does that correspond to the following code?
        for n, p in self.policy.t5_prompt_encoder.named_parameters():
            # freeze everything, then unfreeze the last block and the final layer norm
            p.requires_grad = False
            if "t5.encoder.block.11.layer.1." in n or "final_layer_norm" in n:
                p.requires_grad = True
  4. When calculating the success rate (SR) for each task distribution and level, how many task instances did you sample? I assume the equation you used is $$SR = \frac{\text{number of successes}}{\text{number of total task instances}}$$
  5. Can you share your vectorized implementation for policy evaluation?
  6. When evaluating the performance of your method and the other baselines, how did you set the parameter hide_arm_rgb when making the env? Should we always set it to True?
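
To make my assumption in question 2 concrete, here is the schedule I have in mind as a small sketch (my reading of Table 7, not necessarily your implementation; I use linear ramps for brevity where the cosine shape would apply):

    def assumed_lr(step, peak=1e-4, warmup=7_000, period=17_000):
        # My assumed cyclical schedule: linear warm-up, then alternating
        # 17K-step decay/recovery phases between the peak LR and 0.
        if step < warmup:
            return peak * step / warmup  # warm-up: 0 -> 1e-4 over 7K steps
        phase, offset = divmod(step - warmup, period)
        frac = offset / period
        # even phases anneal from peak to 0, odd phases climb back to peak
        return peak * (1 - frac) if phase % 2 == 0 else peak * frac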

Thanks and regards,
Jiachen

yunfanjiang (Member) commented

Hi Jiachen,

Thanks for your interest in our project. To answer your questions:

  1. We trained for 10 epochs in total.

  2. The schedule you described is cyclical. We used a schedule that first increases linearly and then decreases monotonically. A similar implementation can be found here.

  3. We used "layer" to refer to a transformer layer (block).

  4. Yes, we computed the success rate for each task averaged over 100 instances.

  5. Our vectorized env implementation is based on this.

  6. Yes, we used that to emulate workspaces that are free of robot-arm occlusions. See the sketches below for 3 and 4–6.
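
To make those answers concrete: for 3, since "layer" means transformer block, unfreezing the last two blocks of a 12-block T5 encoder would look roughly like the sketch below (attribute paths follow the snippet in your question; an illustration, not our exact code):

    for n, p in self.policy.t5_prompt_encoder.named_parameters():
        p.requires_grad = False
        # unfreeze the last two transformer blocks (10 and 11) and the final layer norm
        if ("t5.encoder.block.10." in n
                or "t5.encoder.block.11." in n
                or "final_layer_norm" in n):
            p.requires_grad = True

For 4–6, a minimal (non-vectorized) evaluation loop could look like the following; make_env and the policy/info interfaces here are schematic placeholders rather than the exact VIMA-Bench API:

    NUM_INSTANCES = 100  # instances per task and evaluation level
    successes = 0
    for seed in range(NUM_INSTANCES):
        # hide_arm_rgb=True emulates a workspace free of robot-arm occlusions
        env = make_env(task_name, seed=seed, hide_arm_rgb=True)
        obs = env.reset()
        done, info = False, {}
        while not done:
            action = policy.act(obs)  # placeholder policy interface
            obs, reward, done, info = env.step(action)
        successes += int(info.get("success", False))
    success_rate = successes / NUM_INSTANCES  # SR = successes / total instances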

Feel free to let me know if you have further questions.

Ji4chenLi (Author) commented

Thank you so much for your response! Most of my questions are now addressed. My remaining questions are:

  1. What's the min_lr when performing the cosine annealing? Is it 1e-5, as here and in Chinchilla?
  2. Table 7 gives Warmup Steps = 7K and LR Cosine Annealing Steps = 17K. At which step does the learning rate reach min_lr: step 17K, or step 24K (7K + 17K)?

yunfanjiang (Member) commented
  1. We used min_lr = 1e-7.
  2. It decreased for 17K steps, i.e., it reaches min_lr at step 24K (7K warm-up + 17K annealing). See the sketch below.
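
Putting the numbers together, the full schedule could be sketched as follows (an illustration of the values above, not our exact training code; holding the LR at min_lr after step 24K is an assumption):

    import math

    PEAK_LR, MIN_LR = 1e-4, 1e-7
    WARMUP, ANNEAL = 7_000, 17_000  # Table 7: 7K warm-up, 17K annealing steps

    def lr_at(step):
        if step < WARMUP:
            # linear warm-up from 0 to the peak LR over 7K steps
            return PEAK_LR * step / WARMUP
        # cosine annealing from the peak LR down to min_lr over the next
        # 17K steps (i.e., min_lr is reached at step 24K), then held there
        t = min((step - WARMUP) / ANNEAL, 1.0)
        return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * t))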
