Registering (forward/backward) hooks on intermediate Mamba layers #59
In the fused fast implementations, the inner modules might not be getting called directly. Instead, we use their weights (e.g. …).
I tried setting the fast path variable of the layers to false, but I still couldn't get it to work. Is it even possible in the pretrained versions?
It should be. Pretrained versions just give a set of weights, which are the same no matter the computation path. You should double-check that it's running the path you intend and that the modules are being called directly. For example, you want to hit the following line to call the conv module, instead of using the fused path: https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba_simple.py#L169
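A minimal sketch of checking that (the checkpoint name and shapes are illustrative, not from this thread):

```python
import torch
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# Illustrative: load any pretrained Mamba LM (checkpoint name is just an example).
model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-130m", device="cuda")

# Disable the fused fast path per layer so forward() takes the slow branch.
for layer in model.backbone.layers:
    layer.mixer.use_fast_path = False

hits = []
model.backbone.layers[0].mixer.conv1d.register_forward_hook(
    lambda module, inputs, output: hits.append(output)
)

model(torch.randint(0, 1000, (1, 16), device="cuda"))

# If causal-conv1d is installed, the slow branch still calls causal_conv1d_fn with
# conv1d's weight instead of conv1d.forward(), so this can still print 0.
print(len(hits))
```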
Thank you, I hadn't seen that logic there. Since it does not seem feasible to make sure that causal-conv1d is not installed, I will probably have to use some hacky workaround to get around that.
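One such workaround (purely an illustration, not something proposed in the thread) would be to null out the `causal_conv1d_fn` / `causal_conv1d_update` names that `mamba_simple.py` imports at module level, so its `is None` checks fall back to calling the conv module directly:

```python
import mamba_ssm.modules.mamba_simple as mamba_simple

# Hacky workaround sketch: pretend causal-conv1d is not installed so the reference
# conv path (self.conv1d(...)) is taken. This affects every Mamba layer built from
# this module and slows the forward pass down.
mamba_simple.causal_conv1d_fn = None
mamba_simple.causal_conv1d_update = None

# Combine with layer.mixer.use_fast_path = False on each layer so the fused
# mamba_inner_fn path is skipped as well.
```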
It should be fairly easy to add another flag to the init function and change that line to something like the sketch below.
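For instance (the flag name and default are hypothetical, not part of the library), the relevant lines in mamba_ssm/modules/mamba_simple.py could become something like:

```python
# Hypothetical sketch of the change, inside Mamba.__init__:
#     def __init__(self, ..., use_fast_path=True, force_module_calls=False, ...):
self.force_module_calls = force_module_calls  # new, illustrative flag

# And the conv branch around L169 of mamba_simple.py:
if causal_conv1d_fn is None or self.force_module_calls:
    # Calls self.conv1d directly, so forward hooks registered on it fire.
    x = self.act(self.conv1d(x)[..., :seqlen])
else:
    x = causal_conv1d_fn(
        x=x,
        weight=rearrange(self.conv1d.weight, "d 1 w -> d w"),
        bias=self.conv1d.bias,
        activation=self.activation,
    )
```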
Yes, that's what I was thinking.
Hi! I'm also facing the same issue with register_forward_hook when trying to get the input and output of linear layers in Mamba blocks. Did you solve the problem?
I'm trying to look at the internal activations of a MambaLMHeadModel (e.g. the one you get when you load pretrained). If I look through the modules, I can find all the layers, and each of the modules of the Mamba block.
If I register a forward hook to "backbone.layers.0.mixer" (or any other layer) I can get an activation every forward pass. On the other hand, if I hook on "backbone.layers.0.mixer.in_proj" or "backbone.layers.0.mixer.conv1d", they don't get called during the forward pass of the model.
Is this the expected behaviour?
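A minimal sketch of the setup described above (the checkpoint name is illustrative):

```python
import torch
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# Checkpoint name is just an example; any pretrained Mamba LM shows the same behaviour.
model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-130m", device="cuda")

activations = {}

def save_output(name):
    def hook(module, inputs, output):
        activations[name] = output
    return hook

# Fires on every forward pass:
model.backbone.layers[0].mixer.register_forward_hook(save_output("mixer"))
# These may never fire when the fused fast path is used, because only their
# weights are read, not their forward() methods:
model.backbone.layers[0].mixer.in_proj.register_forward_hook(save_output("in_proj"))
model.backbone.layers[0].mixer.conv1d.register_forward_hook(save_output("conv1d"))

model(torch.randint(0, 1000, (1, 16), device="cuda"))
print(activations.keys())  # typically only dict_keys(['mixer'])
```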