
Integrate adapter for s3prl frontend #5609

Merged: 65 commits into espnet:master on Feb 22, 2024
Conversation

Stanwang1210
Contributor

What?

This PR adds support for attaching adapters to s3prl frontend SSL models.
It is integrated with the existing LoRA implementation.
Currently, only the Houlsby adapter is supported; however, other types of adapters can be integrated easily.
Two config YAML files give examples of the adapter configuration.

TODO: The adapter load/save functions are not done yet; they will be handled soon.
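For illustration, the adapter-related options might look roughly like the following sketch (the key names use_adapter, adapter, and adapter_conf appear later in this thread; the sub-keys and values below are assumptions, not the contents of the actual example YAMLs):

```python
# Illustrative only: adapter-related options written as a Python dict.
# In practice these options live in the ESPnet training YAML config.
adapter_options = {
    "use_adapter": True,             # turn adapter insertion on
    "adapter": "houlsby",            # "lora" is the other type handled in this PR
    "adapter_conf": {
        "bottleneck": 32,            # assumed bottleneck (hidden) dimension
        "target_layers": [0, 1, 2],  # encoder layer indices to adapt (assumed key name)
    },
}
```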

Why?

The original LoRA implementation is not compatible with other kinds of adapters.

See also

#5034
@ftshijt

mergify bot added the ESPnet2 label Jan 5, 2024
sw005320 added the New Features and ASR (Automatic speech recognition) labels Jan 5, 2024
sw005320 added this to the v.202312 milestone Jan 5, 2024
@sw005320
Contributor

sw005320 commented Jan 5, 2024

Cool!
How about adding the result and corresponding config as well?

@Stanwang1210
Contributor Author

For the LoRA case, I didn't modify any code except for the entry point to the create_lora_adapter function.
So the main modification is in the Houlsby adapter case.
For the target layer indices (idx) assigned in the adapter_conf,
model.frontend.upstream.upstream.model.encoder.layers.idx will become HoulsbyTransformerSentenceEncoderLayer.

Basically, it looks like this: [screenshot of the modified layer]

Did I make it clear?
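For readers unfamiliar with the technique, below is a minimal sketch of a Houlsby-style bottleneck adapter; it shows the generic idea behind HoulsbyTransformerSentenceEncoderLayer rather than the exact code added in this PR:

```python
import torch
import torch.nn as nn


class HoulsbyAdapter(nn.Module):
    """Generic Houlsby bottleneck adapter: down-project, nonlinearity,
    up-project, with a residual connection around the whole block."""

    def __init__(self, dim: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual path keeps the layer close to identity at initialization.
        return x + self.up(self.act(self.down(x)))


# Quick check: shapes are preserved, so the adapter can be dropped into an
# existing encoder layer's forward pass.
out = HoulsbyAdapter(dim=768)(torch.randn(2, 10, 768))
```

In the PR, the targeted s3prl encoder layers are replaced by HoulsbyTransformerSentenceEncoderLayer, which inserts adapters of this kind into the layer's forward computation.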

@sw005320
Contributor

sw005320 commented Jan 8, 2024

Cool.
Can you add some usages to https://github.com/espnet/espnet/blob/master/egs2/ml_superb/asr1/README.md?

@ftshijt, following the convention, it would be better to have the results and the model link as well.

@Stanwang1210
Contributor Author

Sorry for the late reply.
I have now added the model link and results to the README.
Please let me know if I missed anything.

@ftshijt

Collaborator

@ftshijt left a comment


Thanks for the updates! The current implementation looks good to me, but I would like to have more discussion on some design points.

if adapter == "lora" and lora is None:
raise RuntimeError("Requiring loralib. Do 'pip install loralib'")

# TODO: houlsby adapter may need S3PRL?
Collaborator

Is it possible to make it general to other modules as well? If that's difficult, we may keep it to s3prl only for now.

In that case, you may consider adding a check to ensure that s3prl is installed when using the Houlsby adapter.
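A minimal sketch of such a check, mirroring the loralib guard quoted above (it assumes the s3prl module is imported under a try/except and bound to a variable named s3prl, analogous to lora):

```python
# Assumed guard, analogous to the existing loralib check.
if adapter == "houlsby" and s3prl is None:
    raise RuntimeError("Requiring s3prl. Do 'pip install s3prl'")
```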

Collaborator

Is this update compatible with pretrained models whose configs use use_lora and save_lora_only?

Contributor Author

Sorry for the late reply.
Regarding @ftshijt's questions, I think it is currently difficult to integrate adapters and LoRA together. The reason is that the adapter implementation needs to modify the forward function of the SSL models, like here, while LoRA does not. Therefore, it may require some effort to integrate them.

Contributor Author

Sorry for the late reply.
@simpleoier I think it's compatible, but it needs some revision.
I did this by renaming use_lora to use_adapter (more general), and likewise for save_lora_only.
The reason is that in most cases we add LoRA only to the pre-trained model. However, in SSL settings we usually need to initialize a downstream model for each task. That is to say, when applying LoRA to SSL models as adapters, we need to save not only the LoRA parameters but also the downstream model and other tunable parameters. Therefore, I chose to save all parameters with requires_grad = True.
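A short sketch of the "save everything trainable" idea described above (illustrative; not the exact checkpointing code in ESPnet):

```python
import torch


def save_trainable(model: torch.nn.Module, path: str) -> None:
    """Save only parameters with requires_grad=True, which covers the
    adapter/LoRA weights, the downstream model, the weighted-sum weights,
    and any other tunable parameters."""
    state = {
        name: param.detach().cpu()
        for name, param in model.named_parameters()
        if param.requires_grad
    }
    torch.save(state, path)
```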

Contributor Author

I made some revisions here.
Please check d7f0b39

Comment on lines 367 to 376
# TODO: This kind of design may not facilitate the integration of other kind of the adapter
# Maybe we can only save those with requires_grad = True ?
# If we use lora.lora_state_dict, we will not save the downstream model in SSL settings
Collaborator

I do not have a specific question on this one, but I feel that saving only the parameters with requires_grad = True is a bit risky.

Pinging @sw005320 @wanchichen @simpleoier @pyf98 for some discussion here.

Collaborator

I think so. One case is that it can fail if the pretrained SSL models are updated.

Collaborator

For that purpose, I would suggest we simply follow the setting with the current design.

Contributor Author

The reason why I chose to save all parameters with requires_grad = True is that in the adapter setting we usually need to save the downstream model and other tunable parameters as well (e.g., the weighted-sum weights). Therefore, we may not be able to follow what loralib does, which only saves the LoRA layers.

I understand your concern about the current design. However, for the case where the SSL models are updated, I think it would be better to save the SSL weights as well (otherwise I cannot see why we would update them without saving them).

If we have to follow what loralib does by offering an option that saves only the adapter parameters, then I would like to add another option for saving parameters with requires_grad = True. In that case, we can fulfill the requirement while also making sure adapters and LoRA work in s3prl settings.

Do you think it's a good idea?
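For reference, the adapter-only alternative corresponds to the helper that loralib already ships; a sketch assuming loralib is installed and model is the ESPnet model:

```python
import loralib as lora
import torch

# Saves only the LoRA parameters; in SSL settings the downstream model and
# weighted-sum weights would then need to be saved separately (or via the
# requires_grad-based option sketched earlier).
torch.save(lora.lora_state_dict(model), "lora_only.pth")
```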

Contributor Author

I made some revisions here.
Please check d7f0b39

@ftshijt @simpleoier

Contributor

mergify bot commented Jan 30, 2024

This pull request is now in conflict :(

mergify bot added the conflicts label Jan 30, 2024
@ftshijt
Collaborator

ftshijt commented Jan 30, 2024

Could you fix the conflicts and update the pre-trained model in the README section? Then we can move forward and merge it, as it is a dependency for some other projects.

kan-bayashi modified the milestones: v.202312, v.202405 Feb 6, 2024
Contributor

mergify bot commented Feb 6, 2024

This pull request is now in conflict :(

@sw005320
Contributor

Please fix the above conflict.

@@ -96,6 +96,66 @@ General steps to run tasks in LID trask are as follows:
```
./run_multi.sh --asr_config <your_training_config> --duration {10min, 1h} --lid true --only_lid false
```
## Adapter usage guidelines
Contributor

Would it be better to note that the challenge does not allow the fine-tuning of SSL models?
@ftshijt, what do you think?

    HoulsbyTransformerSentenceEncoderLayer = None
else:

    class HoulsbyTransformerSentenceEncoderLayer(TransformerSentenceEncoderLayer):
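For context, the quoted lines follow a guarded-definition pattern: if the optional dependency that provides TransformerSentenceEncoderLayer cannot be imported, the class is set to None so the module still imports. A minimal sketch of that pattern (the actual import path used in the PR may differ):

```python
try:
    # Assumed source of the base class; the PR may import it from s3prl or
    # fairseq under a different path.
    from fairseq.modules import TransformerSentenceEncoderLayer
except ImportError:
    TransformerSentenceEncoderLayer = None

if TransformerSentenceEncoderLayer is None:
    HoulsbyTransformerSentenceEncoderLayer = None
else:

    class HoulsbyTransformerSentenceEncoderLayer(TransformerSentenceEncoderLayer):
        """Subclass that inserts Houlsby adapters into the forward pass
        (adapter insertion omitted in this sketch)."""
```

Note that the subclass only exists when the optional dependency is available, which may be relevant to the test-coverage question below.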
Contributor

It seems that this class does not go through the test.
Can you check it?

Contributor Author

Could you check this now?

Stanwang1210 and others added 23 commits February 20, 2024 21:38
@Stanwang1210
Contributor Author

It seems like the CI error happened for some odd reason. Was this caused by my PR?

@sw005320
Contributor

It's not related to your PR.
This happens accidentally.
I reran the CI.

@sw005320
Contributor

@ftshijt, do you know why codecov complains about this?
I think @Stanwang1210 correctly prepared the test.

If this is due to some issues in codecov, we can ignore it and merge this PR.

[screenshot of the failing codecov check]

@ftshijt
Collaborator

ftshijt commented Feb 21, 2024


Not exactly codecov itself; I think it is mainly due to an unsuccessful run of other CI tests (i.e., the timeout in VITS decoding), which automatically stopped some running CI jobs, resulting in no execution of the test function from Stan.

@sw005320
Contributor

Thanks, @Stanwang1210!

sw005320 merged commit 98b0387 into espnet:master Feb 22, 2024
27 checks passed
Labels: ASR (Automatic speech recognition), ESPnet2, New Features, README
5 participants