When initializing a Mamba block, why is it that we don't need to pass in a sequence length input?
For example, I want to do next-step prediction by mapping timesteps [s ... t] to [s + 1 ... t + 1]. Does this mean that I can train Mamba on arbitrary sequence lengths, and that it will then accurately do next-step prediction on inputs of any arbitrary length? I'm wondering if I can do autoregressive prediction this way, or whether we are expected to pass in a constant sequence length.
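For context on why no sequence-length argument is needed: a state-space model processes the input as a recurrent scan over time, so its parameters are independent of the number of timesteps. Here is a minimal numpy sketch of a discrete linear SSM scan (not the actual `mamba_ssm` implementation, which is selective and CUDA-accelerated; the function name `ssm_scan` and the toy parameters are illustrative assumptions) showing that the same parameters handle any sequence length:

```python
import numpy as np

def ssm_scan(A, B, C, x):
    """Run a discrete linear SSM, h_t = A h_{t-1} + B x_t, y_t = C h_t,
    over a 1-D input of arbitrary length T. Note that no parameter shape
    depends on T; the length is simply len(x)."""
    d_state = A.shape[0]
    h = np.zeros(d_state)
    ys = []
    for x_t in x:              # scan over timesteps, one at a time
        h = A @ h + B * x_t    # recurrent state update
        ys.append(C @ h)       # per-step output
    return np.array(ys)

rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)            # toy state matrix (stable)
B = rng.standard_normal(4)
C = rng.standard_normal(4)

# The same parameters process sequences of different lengths,
# which is also what makes stepwise autoregressive generation possible:
# carry h forward and feed each new token one step at a time.
y_short = ssm_scan(A, B, C, rng.standard_normal(10))
y_long = ssm_scan(A, B, C, rng.standard_normal(1000))
print(y_short.shape, y_long.shape)  # (10,) (1000,)
```

The output at each step depends only on the carried state `h`, so training on one length and running inference at another is structurally fine; whether the model generalizes well to much longer sequences than it saw in training is an empirical question.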