multi-task training #4

Closed
puraminy opened this issue Jan 2, 2023 · 4 comments
@puraminy commented Jan 2, 2023

Hi
How can I train the model when I provide multiple datasets? The code concatenates the datasets, but as I understand it each batch must contain only one task. How do you shuffle such batches?

Currently I get an error on this line:

    def check_uniqueness(self, samples):
        assert len(np.unique(samples)) == 1

@AkariAsai (Owner)

Hi, thank you so much for your interest in our work! Could you tell me a bit more about the error? Which datasets are you using for multi-task training, and which configuration are you using?

I suspect it happens when the concatenated data has different data fields (e.g., MultiRC or ReCoRD uses additional metadata fields). I should have documented this in the README, but when I did multi-task training on tasks with different formats, I modified the task field to avoid the error (alternatively, I think you can simply comment out check_uniqueness, but I haven't tried that).
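
Something along these lines should work for the task-field remapping (a rough sketch, not the actual ATTEMPT code; it assumes HuggingFace datasets.Dataset objects with a "task" column, and the real field name in the code may differ):

    # Rough sketch: give every dataset the same task value before concatenation
    # so that check_uniqueness always sees a single task id per batch.
    from datasets import concatenate_datasets

    def concatenate_with_shared_task(train_datasets, shared_task="multi"):
        remapped = [ds.map(lambda ex: {"task": shared_task}) for ds in train_datasets]
        return concatenate_datasets(remapped)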

@puraminy (Author)

Thanks for your reply.

Actually, I am working on extending your project to my own use case. Since you check the uniqueness of task_ids within a batch, I assumed you somehow guarantee this in the multi-task setting, where multiple tasks exist and each has its own task id.
I thought this would have to be handled when the datasets are concatenated.

Anyway, I implemented a function like interleave(train_datasets, ...) to do so and used it in place of concatenate(train_datasets, ...).

Still, I am not sure this is a good setup for multi-task training, i.e., having batches from different tasks where each batch comes entirely from a single task, particularly in my prompt-tuning setting, where only the prompts are trained and each task's prompt can be independent of the others apart from some shared data. Some care probably needs to be taken with the optimizer and scheduler...
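
Roughly, the interleaving I mean looks like this (a simplified sketch over plain Python lists rather than the project's actual dataset objects; the signature and data layout are illustrative):

    # Sketch of batch-level interleaving: every batch holds a single task,
    # but batches from different tasks are shuffled together during training.
    import random

    def interleave(train_datasets, batch_size, seed=42):
        # train_datasets: list of per-task example lists (illustrative layout).
        # Yields batches; each batch is drawn from exactly one task.
        rng = random.Random(seed)
        chunks = []
        for examples in train_datasets:
            examples = list(examples)        # copy so we can shuffle per task
            rng.shuffle(examples)
            # cut each task into batch-sized chunks (last chunk may be smaller)
            for i in range(0, len(examples), batch_size):
                chunks.append(examples[i:i + batch_size])
        rng.shuffle(chunks)                  # mix tasks at the batch level
        for batch in chunks:
            yield batch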

@puraminy (Author) commented Jan 19, 2023

Anyway, thanks. I resolved the problem; for now I have just commented that check out.

By the way, what does the multi_task option mean? I did not find it being set in any configuration file.

https://github.com/AkariAsai/ATTEMPT/search?q=multi_task

@AkariAsai (Owner)

Thanks again for your detailed comments and interest! Yes, different mini-batch constructions could be considered. I personally think that having multiple different tasks in the same mini-batch might help learn better attention layers, but we haven't explored those different strategies.

The multi_task option indicates multi-task training, but it is unnecessary (if you set the shared_attn option, the code assumes multi-task training automatically). It is a leftover from an older version that I forgot to remove while refactoring. Thanks for the heads-up!
