Clarification about IA^3 #5

sordonia · 2022-05-18T15:25:12Z

Hi :)

I was reading your interesting paper https://arxiv.org/pdf/2205.05638.pdf.

In Section 3.3, you specify that IA^3 adds a total of d_k + d_v + d_ff parameters.

However, looking at the line below, you seem to allocate two vectors (2 * d parameters) for each linear layer, multi_lora_a and multi_lora_b, multiplying the input by multi_lora_a and the transformed input by multi_lora_b.

t-few/src/models/lora.py

Line 43 in 9dbc9cc

hidden = hidden * self.multi_lora_b.flatten()

Am I missing something?

Thank you for your clarification :-)
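
To make the question concrete, here is a minimal pure-Python sketch of the computation described above: the input is scaled elementwise by one vector, passed through the linear layer, and the result is scaled by a second vector. The function name `scaled_linear`, the arguments `scale_in` and `scale_out` (standing in for multi_lora_a and multi_lora_b), and the toy values are assumptions for illustration, not the repository's code.

```python
def scaled_linear(x, weight, scale_in, scale_out):
    """Apply (weight @ (x * scale_in)) * scale_out with plain lists.

    weight has one row per output feature (shape d_out x d_in).
    scale_in plays the role of multi_lora_a, scale_out of multi_lora_b.
    """
    # Elementwise rescaling of the input.
    scaled_x = [xi * ai for xi, ai in zip(x, scale_in)]
    # Ordinary linear transform (bias omitted for brevity).
    hidden = [sum(w * v for w, v in zip(row, scaled_x)) for row in weight]
    # Elementwise rescaling of the transformed input.
    return [h * b for h, b in zip(hidden, scale_out)]


# Toy layer with d_in = 2 and d_out = 3 (illustrative values only).
x = [1.0, 2.0]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scale_in = [2.0, 3.0]
scale_out = [1.0, 10.0, 100.0]

output = scaled_linear(x, weight, scale_in, scale_out)

# If both vectors are trained, the layer adds d_in + d_out parameters;
# if only the output vector is trained, it adds d_out.
params_both = len(scale_in) + len(scale_out)
params_output_only = len(scale_out)
print(output, params_both, params_output_only)
```

If both vectors were trained, each adapted layer would add d_in + d_out parameters rather than a single vector's worth, which is the discrepancy the question points at.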

sordonia · 2022-05-18T15:26:49Z

Sorry, I just realized that your config file restricts the trainable parameters, so all good. Thank you!

t-few/configs/ia3.json

Line 7 in 9dbc9cc

"trainable_param_names": ".*lora_b.*",
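
A small sketch of how a name filter like the one above can resolve the count: only parameters whose names match the pattern are treated as trainable, so multi_lora_a is allocated but never updated. The parameter names, the sizes, the helper `select_trainable`, and the use of `re.fullmatch` are assumptions for illustration; the repository may apply the pattern differently.

```python
import re

# Hypothetical parameter names and sizes for one adapted linear layer.
# d_in and d_out are illustrative values, not taken from the repository.
d_in, d_out = 8, 16
param_sizes = {
    "layer.weight": d_in * d_out,
    "layer.bias": d_out,
    "layer.multi_lora_a": d_in,   # scales the layer input
    "layer.multi_lora_b": d_out,  # scales the layer output
}


def select_trainable(param_sizes, pattern):
    """Return the parameter names whose full name matches the regex."""
    return [name for name in param_sizes if re.fullmatch(pattern, name)]


# Same pattern as in the config line quoted above.
trainable = select_trainable(param_sizes, ".*lora_b.*")
trainable_count = sum(param_sizes[name] for name in trainable)
print(trainable, trainable_count)
```

Under these assumptions only the output-side vector is selected, so the trainable count per layer is d_out rather than d_in + d_out.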

HaokunLiu · 2022-05-19T01:08:03Z

Hey, you found the hidden story. IA3 is actually morphed from LoRA.

sordonia closed this as completed May 18, 2022
