Multi-task batching #30

Open
einarbmag opened this issue Jul 28, 2023 · 4 comments

Comments

@einarbmag

In the paper, you mention that IA^3 is compatible with multi-task batching, a requirement for being comparable to ICL. Unfortunately, the current Huggingface PEFT implementation does not support this, and adding support would apparently require a big refactoring (see huggingface/peft#759).

Do you know of an implementation or example that shows how to do this?

@dptam
Collaborator

dptam commented Nov 21, 2023

Hi @einarbmag, sorry for the long delay.
I could be wrong, but I think mixture-of-experts code might have an implementation of this, since it uses different experts within a batch. @muqeeth might know more about this.

@muqeeth
Collaborator

muqeeth commented Nov 23, 2023

Hi @einarbmag, here is one possible implementation we can use for a batch containing examples from multiple tasks:

Assume B is the batch size, N is the number of tasks, and H is the hidden dimension at which IA^3 is applied.

  • Task indices T are represented by a B x N tensor. Each row is one-hot: for each example, the entry at that example's task index is 1 and all other entries are 0.
  • IA^3 vectors V are defined as an N x H tensor.

We can obtain the required IA^3 vector for each example with L_batch = torch.matmul(T, V), which has shape B x H.

Then, we modify the input activations, which have the shape (B x num_tokens x H), by multiplying them elementwise with L_batch, unsqueezed along the sequence dimension to shape (B x 1 x H) so that it broadcasts across tokens.
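
The steps above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not code from the repository or from PEFT: the function name `apply_ia3_multitask`, the integer `task_ids` input, and the toy sizes are all assumptions made for the example.

```python
import torch
import torch.nn.functional as F


def apply_ia3_multitask(activations, task_ids, ia3_vectors):
    """Scale each example's activations by the IA^3 vector of its task.

    (Hypothetical helper for illustration only.)
    activations: (B, num_tokens, H) float tensor
    task_ids:    (B,) long tensor with values in [0, N)
    ia3_vectors: (N, H) float tensor, one IA^3 vector per task (V above)
    """
    num_tasks = ia3_vectors.shape[0]
    # T: (B, N) one-hot task indicator, cast to the dtype of V for matmul.
    T = F.one_hot(task_ids, num_classes=num_tasks).to(ia3_vectors.dtype)
    # L_batch: (B, H), row b is the IA^3 vector of example b's task.
    L_batch = torch.matmul(T, ia3_vectors)
    # Unsqueeze to (B, 1, H) so the scaling broadcasts over the token dimension.
    return activations * L_batch.unsqueeze(1)


# Toy sizes, chosen arbitrarily for the example.
B, num_tokens, H, N = 4, 3, 8, 2
torch.manual_seed(0)
V = torch.randn(N, H)
x = torch.randn(B, num_tokens, H)
task_ids = torch.tensor([0, 1, 1, 0])  # task index of each example in the batch

out = apply_ia3_multitask(x, task_ids, V)
```

Indexing directly with `V[task_ids]` gives the same (B x H) tensor as the one-hot matmul without materializing T, which may be preferable when N is large.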

@dodoyeon

Hi, I need to use mixed-task batches, so I'll implement this myself if I have to.
Did you implement mixed-task batching for IA3?

@dptam
Collaborator

dptam commented Feb 29, 2024

Hi @dodoyeon, sorry, we did not, but Muqeeth's sketch above can provide a starting point!
