Hello. I ran the code in this repository and have some questions about it:
I used the script to run fine-tuning on the CoNLL task and noticed that the weights of the 2-layer MLP (denoted h_phi in the paper) do not change as training progresses. The reason is that the per-layer learning rates for h_phi are initialized to 0 and set to untrainable.
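Roughly, the pattern looks like the following (a minimal PyTorch sketch of what I mean; the variable names here are my own illustration, not the repo's exact code):

```python
import torch
import torch.nn as nn

num_warp_layers = 2  # e.g. the two layers of the h_phi MLP

# Per-layer inner-loop learning rates for h_phi: created at zero and
# frozen (requires_grad=False), so fine-tuning never updates them and,
# through them, never updates h_phi itself.
warp_layer_lrs = nn.Parameter(torch.zeros(num_warp_layers), requires_grad=False)
```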
This means the task-specific weights of h_phi are never adapted in the inner loop, which seems inconsistent with what the paper says.
The learning rates for the 2-layer MLP denoted g_psi in the paper seem redundant, since they are not used during the adaptation phase.
Hi. As you can see from the code snippet you shared, these are treated as warp layers, and hence they are not adapted during fine-tuning. What you are seeing is the intended behavior.
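To illustrate the distinction, here is a minimal PyTorch sketch of the general pattern (the layer names and sizes are made up for illustration, not taken from this repo): only the task parameters receive inner-loop gradient steps, while the warp layers stay fixed per task and are updated in the outer (meta) loop.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Task parameters (theta) are adapted per task; the warp layer (h_phi)
# is held fixed during adaptation and meta-learned across tasks.
task_layer = nn.Linear(8, 8)   # adapted in the inner loop
warp_layer = nn.Linear(8, 8)   # warp layer: frozen during fine-tuning

x, y = torch.randn(4, 8), torch.randn(4, 8)

# Inner loop: the optimizer only sees the task parameters, so the
# warp layer's weights are untouched no matter how many steps run.
inner_opt = torch.optim.SGD(task_layer.parameters(), lr=0.1)
for _ in range(5):
    loss = F.mse_loss(warp_layer(task_layer(x)), y)
    inner_opt.zero_grad()
    loss.backward()
    inner_opt.step()

# An outer (meta) loop would update warp_layer across tasks instead.
```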