[Q] What's the best way to override a PyTorch module used in a timm model? #2101
Replies: 3 comments 1 reply
-
`F.sdpa` (`F.scaled_dot_product_attention`) usually works fine, but there was a sequence of bugs related to masking over recent PyTorch 2.x releases. `F.sdpa` usage can be disabled in timm by setting `TIMM_FUSED_ATTN=0` in your environment.
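A minimal sketch of using that flag from Python (assuming, as I read the timm source, that the variable is checked when timm is imported, so it must be set beforehand):

```python
import os

# Disable timm's fused attention (F.scaled_dot_product_attention) path.
# Set the flag before importing timm, since it is read at import time.
os.environ['TIMM_FUSED_ATTN'] = '0'

import timm

model = timm.create_model('vit_base_patch16_224')
```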
-
To replace modules, the best way is to do as the freeze_bn or syncbn examples do: iterate over the modules recursively and rebuild the model, swapping out the ones that match your criteria. See https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/norm_act.py#L148-L187 ... But if the bits of code you want to replace are called functionally, you cannot do that. You could FX-trace the model and manipulate the graph, but that is complicated and has limitations; at that point you typically need to alter the code itself, or have flags in place to allow different paths.
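A minimal sketch of that recursive rebuild pattern, loosely modeled on the linked norm_act.py helpers (`MySiLU`, `replace_modules`, and the match criterion are placeholders of mine, not timm APIs):

```python
import torch.nn as nn
import timm

class MySiLU(nn.SiLU):
    """Custom activation; put your overridden forward logic here."""
    def forward(self, x):
        return super().forward(x)

def replace_modules(module, match=nn.SiLU, factory=MySiLU):
    # Swap the module itself if it matches the criterion.
    if isinstance(module, match):
        return factory(inplace=module.inplace)  # carry over relevant state
    # Otherwise recurse into children, reassigning any that were swapped.
    for name, child in module.named_children():
        setattr(module, name, replace_modules(child, match, factory))
    return module

# EfficientNets in timm use SiLU activations, so this swaps all of them.
model = replace_modules(timm.create_model('efficientnet_b0'))
```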
-
Simply write an unbound function `f` with `self` as its first input. Then import transformers and assign `f` to a module's `forward`. This way you override the Hugging Face functions.
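A minimal sketch of that trick, using `nn.SiLU` as the target (note this is a class-level assignment, so every instance in the process is affected):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# An unbound function with `self` as the first argument, like a method.
def f(self, x):
    # Custom behavior goes here; this just reproduces stock SiLU.
    return F.silu(x, inplace=self.inplace)

# Assign at the class level: all nn.SiLU instances now route through f.
nn.SiLU.forward = f

print(nn.SiLU()(torch.randn(2)))
```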
-
For my research, I need to override the forward method of some modules, such as `nn.SiLU`. I have been doing this manually so far, going through the code of the ViT model and changing the modules it uses to my overridden versions. But I'd like to have a general way to use any timm (or, ideally, any PyTorch) model with my modifications.

I have seen forward hooks, but ideally I want to be able to use `class MySiLU(nn.SiLU)`, which gives me more flexibility. A forward hook would also be inefficient for my purposes, as it would compute the forward pass twice (I need to recompute the forward pass for my changes).

I also need to override `F.scaled_dot_product_attention`. Should I just do `F.scaled_dot_product_attention = my_scaled_dot_product_attention` (roughly the sketch at the end of this post)? Ideally, I also need access to the parent module that called `F.scaled_dot_product_attention`. Is that possible through some hack?

Is `F.scaled_dot_product_attention` even reliable? @rwightman mentions that `F.scaled_dot_product_attention` is buggy in "DINOv2 worse performance compared to the original version · Issue #2094 · huggingface/pytorch-image-models". I am using PyTorch `1.13.1+cu116`, as it's the latest version my vGPU driver supports.

Any suggestions?
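For reference, this is roughly the monkey-patch I have in mind (placeholder names of mine, and it assumes a PyTorch version that actually has `F.scaled_dot_product_attention`):

```python
import torch.nn.functional as F

_orig_sdpa = F.scaled_dot_product_attention

def my_scaled_dot_product_attention(query, key, value, attn_mask=None,
                                    dropout_p=0.0, is_causal=False):
    # Custom logic would go here; delegating to the original for now.
    return _orig_sdpa(query, key, value, attn_mask=attn_mask,
                      dropout_p=dropout_p, is_causal=is_causal)

# Rebind the attribute; any code that looks up
# F.scaled_dot_product_attention at call time picks up the patch.
F.scaled_dot_product_attention = my_scaled_dot_product_attention
```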