question about the initialization experiment #39

Tsingularity · 2022-04-01T20:28:20Z

Hi, thanks for the great work!

In Section 7.4, you conduct an initialization experiment with real words. Does this initialization apply to the prompts in every layer, or only to the prompts in the first layer? And how does it work together with the re-parameterization method, since the input dimension of the re-parameterization is much smaller?

I also noticed that in your code, instead of directly adding prompts to the input of each layer (as described in your paper), you actually append vectors to the key and value matrices via the past_key_values argument. How does the initialization experiment work in this implementation? Do you initialize the key/value vectors directly? If so, it seems the dimensions don't match.
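To make the dimension question concrete, here is a rough sketch of the shapes as I understand them. It is not taken from the repository: the function `prefix_shapes`, the class `PrefixShapes`, and the `reparam_dim` value are hypothetical names and numbers for illustration, and I am assuming the Hugging Face GPT-2 convention in which each layer's cached key and value have shape `(batch, n_head, seq_len, head_dim)`.

```python
from dataclasses import dataclass


@dataclass
class PrefixShapes:
    """Shapes involved in prefix tuning (batch dimension omitted)."""
    word_embedding: tuple    # real-word embeddings one might use for initialization
    reparam_input: tuple     # small table fed to the re-parameterization MLP (assumed)
    reparam_output: tuple    # MLP output covering keys and values of all layers
    per_layer_key: tuple     # one layer's key (the value has the same shape)


def prefix_shapes(prefix_len, n_layer, n_head, hidden, reparam_dim):
    """Return the shapes under the assumptions stated above (illustrative only)."""
    head_dim = hidden // n_head
    return PrefixShapes(
        word_embedding=(prefix_len, hidden),
        reparam_input=(prefix_len, reparam_dim),
        # 2 accounts for one key and one value vector per layer
        reparam_output=(prefix_len, n_layer * 2 * hidden),
        per_layer_key=(n_head, prefix_len, head_dim),
    )


# GPT-2 medium sizes (24 layers, 16 heads, hidden size 1024);
# prefix_len=10 and reparam_dim=512 are made-up example values.
shapes = prefix_shapes(prefix_len=10, n_layer=24, n_head=16, hidden=1024, reparam_dim=512)
print(shapes)
```

Under these assumptions, a real-word embedding has `hidden` entries per prefix position, while the key/value vectors for all layers need `n_layer * 2 * hidden` entries per position, which is the mismatch I am asking about.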

Thanks!
