DPLSTM for multiclass text classification #72

Closed
iamsusiep opened this issue Oct 7, 2020 · 2 comments
@iamsusiep

Hi, I was trying to use an LSTM for text classification on sequences, referring to char-lstm-classification.py.

import torch
import torch.nn as nn
from opacus.layers import DPLSTM


class LSTMClassifier(nn.Module):
    # Based on https://github.com/prakashpandey9/Text-Classification-Pytorch/blob/master/load_data.py
    # combined with the Opacus char-lstm-classification example
    def __init__(
        self,
        batch_size,
        output_size,
        hidden_size,
        vocab_size,
        embedding_length,
        weights,
    ):
        super(LSTMClassifier, self).__init__()

        self.batch_size = batch_size
        self.output_size = output_size
        self.hidden_size = hidden_size
        self.vocab_size = vocab_size
        self.embedding_length = embedding_length

        self.embedding = nn.Embedding(
            vocab_size, embedding_length
        )  # Initializing the look-up table.

        self.lstm = DPLSTM(embedding_length, hidden_size, batch_first=False)
   
        self.out_layer = nn.Linear(hidden_size, output_size)

    def forward(self, input, hidden):
        input_emb = self.embedding(input)
        input_emb = input_emb.permute(1, 0, 2)
        lstm_out, _ = self.lstm(input_emb, hidden)
        # batch dimension = 1 is needed throughout, so we add an additional
        # dimension and subsequently remove it before the softmax
        output = self.out_layer(lstm_out[-1].unsqueeze(0))
        return output[-1]

    def init_hidden(self):
        return (
            torch.zeros(1, self.batch_size, self.hidden_size),
            torch.zeros(1, self.batch_size, self.hidden_size),
        )

This model works with a regular nn.LSTM but fails with DPLSTM; the error appears on loss.backward():

RuntimeError: the size of tensor a (100) must match the size of tensor b (16) at non-singleton dimension 0

Here 16 is the batch size and 100 is the input text sequence length.
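For reference, here is a quick shape check with a plain nn.LSTM (a minimal sketch; the embedding and hidden sizes 300 and 256 are made up, while 16 and 100 match the batch size and sequence length from the error):

import torch
import torch.nn as nn

batch_size, seq_len, emb_dim, hidden_size = 16, 100, 300, 256
lstm = nn.LSTM(emb_dim, hidden_size, batch_first=False)

x = torch.randn(seq_len, batch_size, emb_dim)  # (seq, batch, emb) since batch_first=False
h0 = (torch.zeros(1, batch_size, hidden_size), torch.zeros(1, batch_size, hidden_size))
out, _ = lstm(x, h0)

print(out.shape)         # torch.Size([100, 16, 256]) -> (seq, batch, hidden)
print(out[-1].shape)     # torch.Size([16, 256])      -> last time step, batch dimension kept
print(out[:, -1].shape)  # torch.Size([100, 256])     -> every time step of the last sample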
Are there any insights into why this error occurs? Thank you!

@iamsusiep changed the title from "DPLSTM for multiclass sentiment" to "DPLSTM for multiclass text classification" on Oct 7, 2020
@Darktex
Contributor

Darktex commented Oct 7, 2020

Hi @iamsusiep! Can you please share the entire script so that we can reproduce this faster?

@iamsusiep
Author

Closing the issue as it has been resolved (fixed by changing lstm_out[-1].unsqueeze(0) to lstm_out[:, -1]).
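For anyone who hits the same error, the fixed forward pass looks roughly like this (a sketch based on the change above; the exact output layout of DPLSTM may differ between Opacus versions, so treat the indexing comments as an assumption):

    def forward(self, input, hidden):
        input_emb = self.embedding(input)
        input_emb = input_emb.permute(1, 0, 2)
        lstm_out, _ = self.lstm(input_emb, hidden)
        # Index the final position per sample with lstm_out[:, -1] instead of
        # lstm_out[-1].unsqueeze(0), so the batch dimension is preserved for
        # Opacus' per-sample gradient computation.
        output = self.out_layer(lstm_out[:, -1])
        return output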
