Added support for Multitask regression task in HuggingFace models #3389
Description
I have updated the `HuggingFaceModel` class to support different types of tasks. Previously, the task could only be `finetuning` or `pretraining`. If the task was `pretraining`, masked language modeling was assumed as the pretraining mode and the data was prepared accordingly. The problem with this approach is that other pretraining tasks, such as multitask regression, use a different data batching process (the batching does not involve the masking that `mlm` requires), so there was no way to generate data for tasks like `mtr`.

This PR broadens the scope of the task to cover `mlm` for masked language modeling, `mtr` for multitask regression, and `regression` and `classification` for simple regression and classification tasks. The `finetuning` task type is removed, since finetuning itself covers multiple subtasks such as regression, classification, and mtr. Instead, users specify the task directly.
Type of change
Please check the option that is related to your PR.
Checklist
- Run `yapf -i <modified file>` and check no errors (yapf version must be 0.32.0)
- Run `mypy -p deepchem` and check no errors
- Run `flake8 <modified file> --count` and check no errors
- Run `python -m doctest <modified file>` and check no errors
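
To illustrate the batching difference mentioned in the description, here is a minimal, dependency-free sketch of task-dependent batch preparation. It is not the code from this PR: `prepare_batch`, `MASK_TOKEN_ID`, `IGNORE_INDEX`, and the `mask_prob` and `seed` parameters are hypothetical names chosen for illustration, and the masking is simplified (every selected token is replaced by the mask token). Only the task names (`mlm`, `mtr`, `regression`, `classification`) come from the description.

```python
import random

# Task names taken from the PR description; everything else is hypothetical.
SUPPORTED_TASKS = ("mlm", "mtr", "regression", "classification")

MASK_TOKEN_ID = 4     # assumed id of the mask token in the tokenizer vocabulary
IGNORE_INDEX = -100   # assumed label value for positions excluded from the loss


def prepare_batch(task, token_ids, labels=None, mask_prob=0.15, seed=0):
    """Return (inputs, targets) for one batch, depending on the task."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(
            f"unknown task {task!r}; expected one of {SUPPORTED_TASKS}")

    if task == "mlm":
        # Masked language modeling: targets are derived from the inputs
        # themselves, so no external labels are needed.
        rng = random.Random(seed)
        inputs, targets = [], []
        for seq in token_ids:
            masked_seq, target_seq = [], []
            for tok in seq:
                if rng.random() < mask_prob:
                    masked_seq.append(MASK_TOKEN_ID)  # hide the token
                    target_seq.append(tok)            # model must recover it
                else:
                    masked_seq.append(tok)
                    target_seq.append(IGNORE_INDEX)   # not scored
            inputs.append(masked_seq)
            targets.append(target_seq)
        return inputs, targets

    # mtr, regression, classification: inputs are passed through unmasked
    # and the targets come from the dataset labels.
    if labels is None:
        raise ValueError(f"task {task!r} requires labels")
    if len(labels) != len(token_ids):
        raise ValueError("labels and token_ids must have the same length")
    return [list(seq) for seq in token_ids], list(labels)


# Usage: an mtr batch keeps the tokens intact and carries one label vector
# per sequence.
mtr_inputs, mtr_targets = prepare_batch(
    "mtr", [[5, 6, 7]], labels=[[0.1, 0.2]])
```

Under this sketch, passing the removed `finetuning` value raises a `ValueError`, which mirrors the intent that users name the concrete task instead.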