Two modification of codes is needed to run sample program in version 1.0.4 #24

Open
miaozhixu opened this issue Jun 18, 2024 · 2 comments


miaozhixu commented Jun 18, 2024

In the source file xlstm/blocks/slstm/layer.py, line 134:
    if return_last_state:
        x_conv = self.conv1d(x, conv_state, return_last_state=return_last_state)
    else:
        x_conv, conv_state = self.conv1d(
            x, conv_state, return_last_state=return_last_state
        )
These lines call the forward method of the CausalConv1d class.

In the source file xlstm/components/conv.py, line 126:
    if return_last_state:
        return y[:, :, : -self.pad].transpose(2, 1), x[:, -self.pad :]
    else:
        return y[:, :, : -self.pad].transpose(2, 1)

When return_last_state is False, the forward method in conv.py returns only one value; the last state of x is not included.
But layer.py, line 137, expects the method to return two values in that case.

Should I swap the two calls in layer.py (lines 135 and 137)?
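
For readers who want to see the mismatch without installing the package, here is a minimal sketch in plain Python. It assumes nothing about the real classes: forward_stub and call_as_in_issue are hypothetical stand-ins that only copy the return shapes and branch order quoted above, with lists in place of tensors.

```python
def forward_stub(x, return_last_state=False, pad=3):
    """Hypothetical stand-in for CausalConv1d.forward: returns one value or a pair."""
    y = [v * 2 for v in x]           # placeholder for the convolution output
    if return_last_state:
        return y, x[-pad:]           # (output, last `pad` inputs)
    return y                         # output only


def call_as_in_issue(x, return_last_state):
    """Mirrors the branch order quoted from layer.py in this issue."""
    if return_last_state:
        # A pair comes back but is bound to a single name.
        x_conv = forward_stub(x, return_last_state=return_last_state)
        return x_conv
    # A single value comes back but is unpacked into two names.
    x_conv, conv_state = forward_stub(x, return_last_state=return_last_state)
    return x_conv, conv_state


try:
    call_as_in_issue([1, 2, 3, 4], return_last_state=False)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

With four elements in the single returned list, unpacking into two names raises a ValueError, which matches the "too many values to unpack" error this issue was first titled after. With return_last_state=True the call does not fail here, but x_conv ends up holding a tuple rather than the output alone.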


miaozhixu commented Jun 19, 2024

Given the meaning of return_last_state, I think the two branches in xlstm/blocks/slstm/layer.py that receive the return value of CausalConv1d.forward() need to be swapped:
    if return_last_state:
        x_conv, conv_state = self.conv1d(x, conv_state, return_last_state=return_last_state)
    else:
        x_conv = self.conv1d(x, conv_state, return_last_state=return_last_state)
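
As a quick check of the swapped branch order, the sketch below runs both paths against a stand-in function. forward_stub and call_swapped are hypothetical names used only for illustration, and lists replace tensors, so this shows the calling pattern rather than the library's behavior.

```python
def forward_stub(x, return_last_state=False, pad=2):
    """Hypothetical stand-in for CausalConv1d.forward: returns one value or a pair."""
    y = [v + 1 for v in x]           # placeholder for the convolution output
    if return_last_state:
        return y, x[-pad:]           # (output, last `pad` inputs)
    return y                         # output only


def call_swapped(x, conv_state=None, return_last_state=False):
    """Branch order proposed in this comment: unpack only when a pair is returned."""
    if return_last_state:
        x_conv, conv_state = forward_stub(x, return_last_state=True)
    else:
        x_conv = forward_stub(x, return_last_state=False)
    return x_conv, conv_state


out_only = call_swapped([1, 2, 3, 4])
out_and_state = call_swapped([1, 2, 3, 4], return_last_state=True)
print(out_only)
print(out_and_state)
```

In this toy version, conv_state simply stays at whatever was passed in when no state is requested, and is replaced by the returned state otherwise.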

miaozhixu changed the title from "Calling sLSTM block yields a ValueError: too many values to unpack" to "2 bugs in version 1.0.4, they are related to the last state" on Jun 19, 2024

miaozhixu commented Jun 19, 2024

When I set return_last_state in the README sample:
    y = xlstm_stack(x, return_last_state=True)
and swap the lines in the sLSTM layer source, execution stops at xlstm_block.py, line 76.
So I modified that line as follows to check return_last_state:
    if kwargs.get('return_last_state', False):
        x = x + self.xlstm(self.xlstm_norm(x), **kwargs)[0]
    else:
        x = x + self.xlstm(self.xlstm_norm(x), **kwargs)

Now the tensor x is no longer added to a tuple.
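
One possible variation, shown here only as a sketch and not as the project's fix, is to inspect what the inner layer returned instead of reading kwargs. inner_layer_stub and residual_step are hypothetical names, and lists stand in for tensors.

```python
def inner_layer_stub(x, return_last_state=False, **kwargs):
    """Hypothetical stand-in for self.xlstm(self.xlstm_norm(x), **kwargs)."""
    out = [0.5 * v for v in x]
    if return_last_state:
        return out, {"last": x[-1]}  # (output, state)
    return out                       # output only


def residual_step(x, **kwargs):
    """Residual add that accepts either return shape of the inner layer."""
    result = inner_layer_stub(x, **kwargs)
    state = None
    if isinstance(result, tuple):
        out, state = result          # split the pair before adding
    else:
        out = result
    y = [a + b for a, b in zip(x, out)]
    # Hand the state back only when the caller asked for it.
    return (y, state) if kwargs.get("return_last_state", False) else y


plain = residual_step([2.0, 4.0])
with_state = residual_step([2.0, 4.0], return_last_state=True)
print(plain)
print(with_state)
```

Unlike indexing the result with [0], this version keeps the returned state and passes it back to the caller. Whether the surrounding block is meant to forward that state is an assumption here.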

miaozhixu changed the title from "2 bugs in version 1.0.4, they are related to the last state" to "Two modification of codes is needed to run sample program in version 1.0.4" on Jun 21, 2024