
Lstm panics when batch size is greater than 1 #872

Closed
mosheduminer opened this issue Oct 17, 2023 · 3 comments


mosheduminer commented Oct 17, 2023

Describe the bug
The LSTM module provided by burn seems to always fail (panic) when batch size is not equal to 1.

To Reproduce
To reproduce, you can run this code:

use burn::{
    backend::TchBackend,
    nn::LstmConfig,
    tensor::{Data, Tensor},
};

type B = TchBackend<f32>;

fn main() {
    // This part works
    let lstm = LstmConfig::new(1, 1, false, 1).init::<B>();

    let a = Tensor::<B, 3>::from_data(Data::from([[[0.0]]]));
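    // Shape [1, 1, 1]: batch size 1, sequence length 1, 1 feature.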
    println!("shape: {:?}", a.shape());
    lstm.forward(a, None);
    println!("success");

    // This part panics
    let lstm = LstmConfig::new(1, 1, false, 2).init::<B>();

    let a = Tensor::<B, 3>::from_data(Data::from([[[0.0]], [[0.0]]]));
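    // Shape [2, 1, 1]: batch size 2, sequence length 1, 1 feature.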
    println!("shape: {:?}", a.shape());
    lstm.forward(a, None); // Panic
}

Expected behavior
I expect the LSTM to handle any batch size correctly, without panicking.

Desktop (please complete the following information):

  • OS: Windows 11

Burn version

  • 0.9.0 (using burn = { version = "0.9.0", features = ["train", "tch"] })

Additional context
I see that the batch size parameter was recently removed from the LstmConfig (though this change hasn't been released yet), and the batch size is now inferred dynamically at runtime. It's possible that this bug was fixed as part of that change, but I haven't checked whether that is the case.
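For reference, here is a minimal sketch of the same reproduction against that post-change API, assuming LstmConfig::new now takes only (d_input, d_hidden, bias) and the batch size is read from the input tensor's first dimension at forward time; this reuses the imports and the B alias from the snippet above and is not verified against main:

// Hypothetical post-change usage: no batch size in the config.
let lstm = LstmConfig::new(1, 1, false).init::<B>();

// Batch size 2 would be inferred from the [2, 1, 1] input shape.
let input = Tensor::<B, 3>::from_data(Data::from([[[0.0]], [[0.0]]]));
lstm.forward(input, None);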


agelas commented Oct 17, 2023

@mosheduminer Based on the description, this bug is almost certainly still present in development/main.

@nathanielsimard Based on the Discord snippets, if I had to guess, it probably has to do with the squeezing/unsqueezing, so either this:

for (t, input_t) in batched_input.iter_dim(1).enumerate() {
    let input_t = input_t.squeeze(1);

or this:

// store the state for this timestep
batched_cell_state = batched_cell_state.slice_assign(
  [0..batch_size, t..(t + 1), 0..self.d_hidden],
  cell_state.clone().unsqueeze(),
);
batched_hidden_state = batched_hidden_state.slice_assign(
  [0..batch_size, t..(t + 1), 0..self.d_hidden],
  hidden_state.clone().unsqueeze(),
);
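
To make that guess concrete, here is a sketch of the shape mismatch I would expect in the second snippet, assuming unsqueeze() prepends the new dimension (so a [batch_size, d_hidden] tensor becomes [1, batch_size, d_hidden]). The reshape shown at the end is only an illustrative fix, not a tested patch:

// Suspected shapes inside the timestep loop:
//   cell_state / hidden_state:  [batch_size, d_hidden]
//   cell_state.unsqueeze():     [1, batch_size, d_hidden]  (new leading dim)
//   slice being assigned:       [batch_size, 1, d_hidden]  (0..batch_size, t..t + 1, 0..d_hidden)
// The last two only match when batch_size == 1, which would explain the panic.
// An explicit reshape would sidestep the ambiguity:
batched_cell_state = batched_cell_state.slice_assign(
    [0..batch_size, t..(t + 1), 0..self.d_hidden],
    cell_state.clone().reshape([batch_size, 1, self.d_hidden]),
);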

agelas mentioned this issue Oct 17, 2023

agelas commented Oct 25, 2023

@nathanielsimard I think this issue can be closed!

mosheduminer (Author)

I guess you're waiting on me to close this?
