Multi Task Learning implementation and example #778

Merged: 8 commits merged into deepset-ai:master on Jun 9, 2021

Conversation

@johann-petrak (Contributor) commented on May 20, 2021

This fixes #408 for me. I did some testing with two classification heads so far and most things work fine: I am getting proper evaluations for each head during training, and the inferencer provides a list of two predictions for each instance.

Still to check:

  • why do we not get the proper task name for the predictions from the inferencer?
  • what if we use one or more per-token heads?

Help with checking these points, and with how best to test this in general, would be appreciated.

Timoeller edit: fixes #742
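
(For reference, with two classification heads the inferencer output for one instance is shaped roughly like the sketch below. This is illustrative only; the exact keys come from each head's formatted predictions, and the label values are made up.)

```python
# Illustrative sketch only (not actual FARM output): one result dict per
# prediction head, both currently reporting the same generic "task" value,
# which is why the proper task name is hard to recover.
example_output = [
    {
        "task": "text_classification",   # task *type*, identical for both heads
        "predictions": [
            {"context": "some input text", "label": "positive", "probability": 0.93},
        ],
    },
    {
        "task": "text_classification",   # second head, indistinguishable by "task" alone
        "predictions": [
            {"context": "some input text", "label": "urgent", "probability": 0.71},
        ],
    },
]
print(example_output)
```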

@Timoeller (Contributor) commented:

Incredible. You are doing a little sweep through all of FARM and it is much cleaner than before : ) Thanks a lot @johann-petrak

@Timoeller requested a review from tholor on May 20, 2021, 16:44
@tholor (Member) commented on May 21, 2021

I had a quick look at your code and I think your changes make sense. I haven't checked whether they break anything around NaturalQuestions, but the NQ inference part was already broken before, and we will need to refactor it anyway.

> why do we not get the proper task name for the predictions from the inferencer?

I guess you mean "task":"text_classification" in the dict returned by the inferencer?
This is set in the head itself and reflects the task type rather than the task name:
https://github.com/deepset-ai/FARM/blob/master/farm/modeling/prediction_head.py#L413

I believe the order of the predictions should reflect the order of your prediction heads. However, I can totally see that for multiple classification heads (especially with very similar predictions) you want a clear identification in the results. We could probably add a field "task_name" to the dict at the above location in the code by reading from the prediction head's self.task_name.
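
A minimal sketch of what that could look like, using a dummy head rather than FARM's actual PredictionHead code (the attribute self.task_name and the output keys follow the discussion above; everything else is made up for illustration):

```python
# Dummy sketch, not the real FARM prediction head: shows "task_name"
# added next to the existing "task" (type) key in the formatted output.
class DummyClassificationHead:
    def __init__(self, task_name):
        # name under which the task was registered with the processor
        self.task_name = task_name

    def formatted_preds(self, labels, probs):
        return {
            "task": "text_classification",   # task *type*, kept as before
            "task_name": self.task_name,     # proposed new key
            "predictions": [
                {"label": label, "probability": prob}
                for label, prob in zip(labels, probs)
            ],
        }

head = DummyClassificationHead(task_name="sentiment")
print(head.formatted_preds(["positive"], [0.93]))
```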

@johann-petrak (Contributor, Author) commented:

> I believe the order of the predictions should reflect the order of your prediction heads. However, I can totally see that for multiple classification heads (especially with very similar predictions) you want a clear identification in the results. We could probably add a field "task_name" to the dict at the above location in the code by reading from the prediction head's self.task_name.

My main concern is that it is a bit confusing how the keys/parameter names "task", "task_type", and "task_name" are used. Ideally it would always follow the convention that "task" is a task object, "task_name" is the name of a task, and "task_type" is the type, but this is not always the case. (I know I can be a bit too OCD about such things.)

Of course this is hard to change now, as it would break backwards compatibility, so I have no strong opinion :)

Adding a field "task_name" to the prediction would definitely help to make it immediately clear that "task" is not meant to be the task name. Should we make this part of this PR or open a separate issue for it?

@Timoeller changed the title from "Work on fixing #408" to "Multi Task Learning implementation and example" on May 26, 2021
@Timoeller (Contributor) commented:

Hey @johann-petrak, I agree that the task setup could be improved. Inconsistent naming is very confusing to users (and to us, looking at code we wrote a year ago and shaking our heads), so we appreciate rigorous improvement on this end.

In general, tasks is a nested dict that looks like this:

```python
self.tasks[name] = {
    "label_list": label_list,
    "metric": metric,
    "label_tensor_name": label_tensor_name,
    "label_name": label_name,
    "label_column_name": label_column_name,
    "text_column_name": text_column_name,
    "task_type": task_type,
}
```

I like the suggestion to add the task's name as "task_name" to the prediction head's output. People might want to have n classification heads that would only be distinguishable through this task_name.

So let's add this "task_name" in this PR and improve the general naming of tasks in a separate PR?
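
For illustration, here is a standalone sketch of how that registry gets filled when two classification tasks are registered. It mirrors Processor.add_task but is not FARM's actual implementation, and the default tensor and label names are assumptions:

```python
# Standalone sketch, not FARM code: one registry entry per registered task,
# keyed by the task name, mirroring the nested dict shown above.
tasks = {}

def add_task(name, metric, label_list, label_column_name=None,
             label_name=None, text_column_name=None, task_type=None):
    label_name = label_name or f"{name}_label"      # assumed default
    tasks[name] = {
        "label_list": label_list,
        "metric": metric,
        "label_tensor_name": f"{label_name}_ids",    # assumed default
        "label_name": label_name,
        "label_column_name": label_column_name,
        "text_column_name": text_column_name,
        "task_type": task_type,
    }

add_task("sentiment", "acc", ["negative", "positive"],
         label_column_name="sentiment_label", task_type="classification")
add_task("urgency", "acc", ["not_urgent", "urgent"],
         label_column_name="urgency_label", task_type="classification")
print(tasks["sentiment"]["label_tensor_name"])  # -> "sentiment_label_ids"
```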

@Timoeller (Contributor) commented:

Hey @johann-petrak any progress on the task_name field?

I think the PR is already in good shape and could be merged soon.

@johann-petrak (Contributor, Author) commented:

> Hey @johann-petrak any progress on the task_name field?
>
> I think the PR is already in good shape and could be merged soon.

Sorry, I did not forget about this; I have been crazy busy with other stuff and am planning to get back to this in the next few days.

@johann-petrak (Contributor, Author) commented:

OK, this is now ready from my side. The prediction heads for text classification, regression, and a few others now include the key "task_name" in the formatted prediction result. The key "task" contains the type, as before.
There is also an example script to illustrate using two text classification heads.

This does not address or test using multiple heads of different types, e.g. classification and NER. It is probably better to open or re-use a separate issue for that, as it will also need a good dataset for testing.
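
As a rough outline of the kind of setup the example script covers, here is a sketch of wiring two text classification heads into one model. The FARM calls follow the library's usual pattern but are not copied from the PR; the data path, column names, and label lists are made up, so treat this as a sketch rather than the actual example.

```python
# Sketch of a two-head multi-task setup (assumed data/column names; the
# exact calls may differ from the example script added in this PR).
import torch
from farm.data_handler.processor import TextClassificationProcessor
from farm.modeling.adaptive_model import AdaptiveModel
from farm.modeling.language_model import LanguageModel
from farm.modeling.prediction_head import TextClassificationHead
from farm.modeling.tokenization import Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = Tokenizer.load(pretrained_model_name_or_path="bert-base-uncased")

# One processor with two registered tasks, reading different label columns
# from the same (assumed) train/dev files in data_dir.
processor = TextClassificationProcessor(
    tokenizer=tokenizer,
    max_seq_len=128,
    data_dir="data/multitask",
)
processor.add_task(name="sentiment", task_type="classification", metric="acc",
                   label_list=["negative", "positive"],
                   label_column_name="sentiment_label")
processor.add_task(name="urgency", task_type="classification", metric="acc",
                   label_list=["not_urgent", "urgent"],
                   label_column_name="urgency_label")

# One prediction head per task; task_name ties each head to its task and,
# with this PR, also appears as "task_name" in the formatted predictions.
language_model = LanguageModel.load("bert-base-uncased")
sentiment_head = TextClassificationHead(num_labels=2, task_name="sentiment")
urgency_head = TextClassificationHead(num_labels=2, task_name="urgency")

model = AdaptiveModel(
    language_model=language_model,
    prediction_heads=[sentiment_head, urgency_head],
    embeds_dropout_prob=0.1,
    lm_output_types=["per_sequence", "per_sequence"],
    device=device,
)
# Training with a DataSilo / Trainer then proceeds as in FARM's usual examples.
```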

@tholor (Member) left a review:

Looking good. Let's just remove the diary.md before merging. I also started a quick benchmark run to validate that the changes don't impact performance.

@johann-petrak (Contributor, Author) commented:

Sorry, that was my mistake; that file should not have been added, thanks for spotting it!
Removed now.

@julian-risch (Member) commented:

I had some problems executing the benchmark, but the results are now here: #797 (comment). Ready to merge.

Benchmarks: QA Accuracy

✔️ QA Accuracy Benchmark Passed

| run | f1_change | em_change | tnacc_change | elapsed_change | f1 | em | tnacc | elapsed | f1_gold | em_gold | tnacc_gold | elapsed_gold |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 FARM internal evaluation | 0.017 | 0.0001 | 0.0169 | -0.5025 | 82.6841 | 78.4722 | 84.3763 | 39.4975 | 82.6671 | 78.4721 | 84.3594 | 40 |
| 1 outside eval script | -0.0045 | -0.0077 | - | 0.0436 | 82.9125 | 79.8703 | - | 27.0436 | 82.917 | 79.878 | - | 27 |
| 2 train evaluation | 0.4809 | -0.2106 | 0.1685 | -20.0116 | 82.6359 | 78.4469 | 97.5406 | 1114.99 | 82.155 | 78.6575 | 97.3721 | 1135 |


@julian-risch merged commit 47dfa4f into deepset-ai:master on Jun 9, 2021