Hi, thanks so much for your work.
I am testing the model with multiple configs. When calling the step() method to get both the output and the state, I observed that the sLSTM layer does not have a step() method. Instead, to get its state, one must pass the argument return_last_state=True. As a result, an xLSTM language model that contains an sLSTM block cannot return its state through step(). This is the code I used:
from omegaconf import OmegaConf
import torch
from dacite import from_dict
from dacite import Config as DaciteConfig
from xlstm import xLSTMLMModel, xLSTMLMModelConfig
xlstm_cfg = """
vocab_size: 50304
mlstm_block:
  mlstm:
    conv1d_kernel_size: 4
    qkv_proj_blocksize: 4
    num_heads: 4
slstm_block:
  slstm:
    backend: vanilla
    num_heads: 4
    conv1d_kernel_size: 4
    bias_init: powerlaw_blockdependent
  feedforward:
    proj_factor: 1.3
    act_fn: gelu
context_length: 256
num_blocks: 7
embedding_dim: 128
slstm_at: [1]
"""
# Build the typed config from the YAML string and instantiate the model.
cfg = OmegaConf.create(xlstm_cfg)
cfg = from_dict(data_class=xLSTMLMModelConfig, data=OmegaConf.to_container(cfg), config=DaciteConfig(strict=True))
model = xLSTMLMModel(cfg)
x = torch.randint(0, 50304, size=(4, 256)).to("cpu")
model = model.to("cpu")
# Single-token step with input shape (1, 1); this is the call that fails.
model.step(torch.Tensor([1]).unsqueeze(dim=0).long())
This is the error: AttributeError: 'sLSTMLayer' object has no attribute 'step'
Could you add a step() method to the sLSTM layer so that its interface is consistent with the other layers? Thank you so much.
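To make the request concrete, here is a minimal sketch of the kind of adapter I have in mind: a step() that treats one token as a length-1 sequence and calls forward with return_last_state=True. It uses a toy stand-in layer in plain Python so it runs without the library. ToyForwardOnlyLayer, StepAdapter, the state argument name, and the assumed (output, state) return value are all hypothetical and are not the xlstm API; the real sLSTMLayer.forward signature would need to be checked before adapting this.

```python
class ToyForwardOnlyLayer:
    """Hypothetical stand-in for a layer that has forward() but no step().

    Its recurrent state is just a running sum. This is NOT sLSTMLayer.
    """

    def forward(self, xs, state=None, return_last_state=False):
        total = 0.0 if state is None else state
        outputs = []
        for x in xs:  # iterate over the sequence dimension
            total += x
            outputs.append(total)
        if return_last_state:
            return outputs, total
        return outputs


class StepAdapter:
    """Adds step() on top of a forward-only layer (assumed interface)."""

    def __init__(self, layer):
        self.layer = layer

    def step(self, x, state=None):
        # Treat one token as a length-1 sequence and request the final state.
        outputs, new_state = self.layer.forward(
            [x], state=state, return_last_state=True
        )
        return outputs[0], new_state


layer = StepAdapter(ToyForwardOnlyLayer())
state = None
step_outputs = []
for token in [1.0, 2.0, 3.0]:
    out, state = layer.step(token, state)
    step_outputs.append(out)

# Stepping token by token should match a single full forward pass.
full_outputs = ToyForwardOnlyLayer().forward([1.0, 2.0, 3.0])
assert step_outputs == full_outputs
```

The point of the sketch is only the equivalence check at the end: a step() built this way should reproduce what a full forward pass returns, which is the behavior I would expect from a native sLSTM step().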