[ENH] Extend HFTransformersForecaster for PEFT methods #6457

Open
geetu040 wants to merge 13 commits into base: main

Conversation

geetu040 (Contributor)

Reference Issues/PRs

Fixes #6435.

What does this implement/fix? Explain your changes.

This PR extends the fit_strategy of HFTransformersForecaster to support the following PEFT methods:

  1. LoRA
  2. LoHa
  3. AdaLoRA

Does your contribution introduce a new dependency? If yes, which one?

Yes, it introduces peft.

Did you add any tests for the change?

Yes, I have added tests for the fit_strategy param in test_hf_transformers_forecaster.py.

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not CONTRIBUTORS.md). Common badges: code (fixing a bug or adding code logic), doc (writing or improving documentation or docstrings), bug (reporting or diagnosing a bug; get this plus code if you also fixed the bug in the PR), maintenance (CI, test framework, release).
    See here for full badge reference
  • Optionally, for added estimators: I've added myself to the maintainers tag - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.

pyproject.toml (outdated review thread, resolved)
fkiraly added the module:forecasting and enhancement labels on May 27, 2024
fkiraly (Collaborator) left a comment:

I'm a bit surprised that a lot of the configs are hard-coded.

E.g., r, lora_alpha, etc. I would suggest allowing the user to pass the parameters either as peft_config or similar, or directly in fit_strategy.

@@ -40,7 +43,8 @@ class HFTransformersForecaster(BaseForecaster):
        Path to the huggingface model to use for forecasting. Currently,
        Informer, Autoformer, and TimeSeriesTransformer are supported.
    fit_strategy : str, default="minimal"
        Strategy to use for fitting the model. Can be "minimal" or "full"
Collaborator:

I think it needs to be much clearer to the user how the fine-tuning is done here - or that fit, in fact, is fine-tuning.

Collaborator:

I think it should also be possible to fine-tune the model with time series that are later not used in the forecast - but that might require the global forecasting interface to be in place.

Contributor (Author):

That will not be an issue, because once the model has been configured to use PEFT in train, it will remain the same in predict as well.

benHeid (Contributor) left a comment:

Thank you for your contribution. I have some requests regarding the configuration of the different testing strategies. Additionally, I would like to know why the new test file is required.

@@ -227,6 +231,31 @@ def _fit(self, y, X, fh):
        elif self.fit_strategy == "full":
            for param in self.model.parameters():
                param.requires_grad = True
        elif self.fit_strategy == "lora":
            peft_config = LoraConfig(
                r=8,
Contributor:

The user would probably like to have control over the peft_configs, i.e., by providing a dict of parameters that is passed to the LoraConfig.

LoraConfig(**self.peft_config_dict)
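
For illustration, a minimal sketch of this suggestion, assuming a user-supplied dict named peft_config_dict (the name used later in this thread); the helper name apply_lora is hypothetical and this is not the code merged in the PR:

    from peft import LoraConfig, get_peft_model

    def apply_lora(model, peft_config_dict):
        # hypothetical helper: unpack the user-supplied dict, e.g.
        # {"r": 8, "lora_alpha": 32, "target_modules": ["q_proj", "v_proj"]},
        # into the peft config object and wrap the model with it
        peft_config = LoraConfig(**peft_config_dict)
        return get_peft_model(model, peft_config)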

Contributor (Author):

Sure, makes sense. Two things:

  1. There is no need to give any default value to the peft_config_dict param, right?
  2. Should I also create an example with PEFT in the docstring?

            )
            self.model = get_peft_model(self.model, peft_config)
        elif self.fit_strategy == "loha":
            peft_config = LoHaConfig(
Contributor:

See comment above for LoRA.

            )
            self.model = get_peft_model(self.model, peft_config)
        elif self.fit_strategy == "adalora":
            peft_config = AdaLoraConfig(
Contributor:

See comment above for LoRA.

Contributor:

Why is this test file required?

I suppose that these tests are covered by the automated tests that are triggered with the params set via get_test_params. Thus, I propose to add tests for the different fit strategies in get_test_params.
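
For context, a hedged sketch of what moving the fit strategies into get_test_params could look like; the class name is a placeholder and the constructor arguments and values are illustrative assumptions, not the parameter sets finally used in this PR. The model paths are the ones mentioned later in this thread.

    class HFTransformersForecasterSketch:
        # illustration only; not the estimator class changed in this PR

        @classmethod
        def get_test_params(cls, parameter_set="default"):
            # each dict is one set of constructor kwargs exercised by the
            # automated estimator checks
            return [
                # small pretrained model with the default fine-tuning strategy
                {
                    "model_path": "huggingface/informer-tourism-monthly",
                    "fit_strategy": "minimal",
                },
                # second model, exercising the new PEFT code path
                {
                    "model_path": "huggingface/autoformer-tourism-monthly",
                    "fit_strategy": "lora",
                },
            ]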

Contributor (Author):

I just thought get_test_params returning 5 sets of parameters was too much, because check_estimator was taking too long to execute - I'll put them in get_test_params now.

Collaborator:

You can also try to find parameter sets with shorter fit or inference time while maintaining coverage - is that possible?


    validation_split : float, default=0.2
        Fraction of the data to use for validation
    config : dict, default={}
        Configuration to use for the model. See the `transformers`
        documentation for details.
    peft_config_dict : dict, default={}
        Configuration dictionary specifying parameters and settings relevant to
Collaborator:

Please explain the possible fields, or make them available directly.

Contributor (Author):

I created an example in the docstring for that. Instead, I think I should link the peft configuration documentation here.

By "available directly", did you mean to set the default parameters to something other than {}? But the default one will not work for all PEFT methods, as they take different params.

Collaborator:

by "available directly" did you mean to set the default parameters to something other that {}? but the default one will not work for all peft methods as they take different params.

Yes, I meant as args. How bad is the variability in arguments? One option that you can take is always to take the union, and ignore arguments that are not applicable.

More quantitatively, how many config args are there in total, how many are common?

Instead I think I should link here the peft configuration documentation

That may be a good idea. Also, are there any common config sets that we could hardcode by a string?

Contributor (Author):

First, I don't think we can have params with a default value of a dictionary object - flake8 B006: Do not use mutable data structures for argument defaults.

More quantitatively: how many config args are there in total, and how many are common?

Mostly 4-5.

Are there any common config sets that we could hardcode by a string?

We can use these - https://github.com/sktime/sktime/blob/44580748a82c2c4139c6e79a05a5663b2949a9d0/sktime/forecasting/hf_transformers_forecaster.py#L234C1-L258C65

Contributor (Author):

The only important param for peft_config is target_modules, which has to be a list of strings. The rest of the params can be set by default.
I suggest that we only make 2 changes here:

  1. add a link to the peft documentation
  2. set the default value of peft_config_dict to {"target_modules": ["q_proj", "v_proj"]} instead of {}, so the user can run the code without any peft_config_dict argument and can also override it through the argument

fkiraly (Collaborator), Jun 6, 2024:

  1. I would suggest adding the link, but also documenting the most important params. You say there are only 4-5?

  2. Yes, default settings in estimators should always run, and that should be tested by get_test_params.

Just following up on my query above:

  • how many parameters - inside the configs - are there in total?
  • how many are shared for different choices of PEFT method?

fkiraly (Collaborator), Jun 6, 2024:

First, I don't think we can have params with a default value of a dictionary object

Yes, one should never use mutable defaults, as they can lead to hard-to-diagnose errors. But that does not mean you cannot have defaults for mutable parameters at all.

The way you set a mutable default is: you set the default to None, and you conditionally write to a private attribute, e.g., self._my_config, depending on the public value of self.my_config.
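
A minimal sketch of that pattern, using hypothetical names (ExampleForecaster, peft_config_dict) and the target_modules default discussed above:

    class ExampleForecaster:
        # illustrates the None-default pattern; not the actual estimator

        def __init__(self, peft_config_dict=None):
            # keep the public parameter exactly as passed in
            self.peft_config_dict = peft_config_dict
            # resolve the effective default into a private attribute, so no
            # mutable object ever appears in the signature
            if peft_config_dict is None:
                self._peft_config_dict = {"target_modules": ["q_proj", "v_proj"]}
            else:
                self._peft_config_dict = peft_config_dict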

Contributor (Author):

how many parameters - inside the configs - are there in total?
how many are shared for different choices of PEFT method?

15+ in total, 4-5 used commonly, 2 of which are common to all configs

Collaborator:

Hm, I would have a slight tendency towards making the common ones explicit, and passing the rest as a dict.

Contributor (Author):

In the current implementation, the common parameter is provided as a default, so the code runs when parameters are left at their defaults, and users can provide additional params specific to a PEFT strategy through peft_config_dict.

geetu040 requested reviews from benHeid and fkiraly on June 10, 2024
    validation_split : float, default=0.2
        Fraction of the data to use for validation
    config : dict, default={}
        Configuration to use for the model. See the `transformers`
        documentation for details.
    peft_config_dict : dict, default={"target_modules": ["q_proj", "v_proj"]}
Contributor:

Sorry for coming up with this so late. Would it be possible to just pass the LoraConfig/LoHaConfig object from HuggingFace directly? In the code, the only adaptation would be get_peft_model(model, peft_config).

This should significantly reduce the maintenance workload for us, since all new adaptation methods would be directly available if HF releases a new peft library version.
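
A hedged sketch of what this could look like from the user side, assuming a peft_config parameter and a "peft" fit strategy as discussed below; LoraConfig and get_peft_model are the standard peft API, while the constructor arguments shown here are illustrative rather than the merged signature:

    from peft import LoraConfig
    from sktime.forecasting.hf_transformers_forecaster import HFTransformersForecaster

    # the user builds any peft config object directly ...
    peft_config = LoraConfig(r=8, lora_alpha=32, target_modules=["q_proj", "v_proj"])

    # ... and hands it to the forecaster; inside _fit the only peft-specific
    # step is then get_peft_model(self.model, peft_config)
    forecaster = HFTransformersForecaster(
        model_path="huggingface/informer-tourism-monthly",
        fit_strategy="peft",
        peft_config=peft_config,
    )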

Contributor (Author):

I have made the changes.

  1. Just to confirm, we are backing out from this original suggestion - we now have peft_config as the param instead of peft_config_dict - but the user still has control, so it should not be an issue:

#6457 (comment)

The user would probably like to have control over the peft_configs, i.e., by providing a dict of parameters that is passed to the LoraConfig.

  2. fit_strategy can now be one of ["minimal", "full", "peft"] instead of ["minimal", "full", "lora", "loha", "adalora"]

fkiraly (Collaborator) commented on Jun 13, 2024:

Tests fail - please ensure there are no syntax errors in the code when requesting a review.

fkiraly (Collaborator) commented on Jun 14, 2024:

Btw, I am noticing that the tests take very long; the estimator by itself seems to be adding 10 minutes? Is there a way to choose a parameter set that makes the tests faster? Think smaller data set, etc.

geetu040 (Contributor, Author):

Btw, I am noticing that the tests take very long; the estimator by itself seems to be adding 10 minutes? Is there a way to choose a parameter set that makes the tests faster? Think smaller data set, etc.

@fkiraly we are already running a single epoch, but I'll see if it can be made faster with a different batch size or config.
There are 3 test params:

  1. for testing huggingface/informer-tourism-monthly model
  2. for testing huggingface/autoformer-tourism-monthly model
  3. for testing peft fit strategy

(2) and (3) can be merged into one test case. Would you agree with that?

Also, some test cases are failing without any log, and this seems to be happening only in the macOS environment (though not all macOS environments).
Do you have any suggestions on that?

geetu040 (Contributor, Author):

In the latest workflow, test cases are under 10 minutes. Do we still need to work on them?

I have also tried runs with different parameters, noting the durations, with these results:

param 1: testing informer

1.48 s ± 108 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

param 2: testing autoformer

1.54 s ± 151 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

param 3: testing peft {r=8, lora_alpha=32, target_modules=["q_proj", "v_proj"]}

1.62 s ± 131 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

suggested: testing peft with smallest config {r=2, lora_alpha=8, target_modules=["q_proj"]}

1.49 s ± 135 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@fkiraly @benHeid Should I replace "param 3" with the "suggested" param?
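
The timings quoted above are in IPython %timeit format. A sketch of how such a comparison could be reproduced locally, where the dataset, forecasting horizon, and constructor arguments are assumptions for illustration only:

    from sktime.datasets import load_airline
    from sktime.forecasting.hf_transformers_forecaster import HFTransformersForecaster

    y = load_airline()

    # constructor arguments are illustrative; swap in the "param 3" or
    # "suggested" settings above to compare the peft configurations
    forecaster = HFTransformersForecaster(
        model_path="huggingface/informer-tourism-monthly",
        fit_strategy="minimal",
    )

    # in an IPython session, matching "mean ± std. dev. of 7 runs, 1 loop each":
    # %timeit -n 1 -r 7 forecaster.clone().fit(y, fh=[1, 2, 3])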

Labels
enhancement (Adding new functionality), module:forecasting (forecasting module: forecasting, incl. probabilistic and hierarchical forecasting)
Projects
Status: Under review
Development

Successfully merging this pull request may close these issues.

[ENH] Add fine tuning methods using PEFT for HFTransformersForecaster
4 participants