Not found recurrent layer in model files #6

Open · courao opened this issue Apr 21, 2019 · 1 comment
Labels: enhancement (New feature or request)

courao commented Apr 21, 2019

I checked the network roughly, and it seems there are no recurrent layers such as Bi-LSTM. Is this repo a different implementation of CRNN? I only see a CNN backbone and fully connected layers, but no RNN layers.

zhiqwang (Owner) commented Apr 21, 2019

As I mentioned in issue #4, the current network only contains a CNN backbone; the fully connected (FC) layers act as the decoder. I tested on a dataset containing only digits, and the results show that CNN (encoder) + FC (decoder) is better than the CNN (encoder) + [RNN + FC] (decoder) architecture. I believe the same holds for Chinese character datasets; I haven't tested the current architecture on English character datasets.
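For reference, this is roughly the shape of the CNN (encoder) + FC (decoder) variant being compared; a minimal illustrative sketch, not the repo's exact code (the class name CRNN_FC and the meta keys here are placeholders):

import torch.nn as nn
import torch.nn.functional as F


class CRNN_FC(nn.Module):
    # Illustrative sketch: CNN encoder + plain FC decoder, no recurrent layers.

    def __init__(self, features, meta):
        super(CRNN_FC, self).__init__()
        self.features = nn.Sequential(*features)
        self.avgpool = nn.AdaptiveAvgPool2d((1, None))
        # The decoder is just a linear classifier over per-column features.
        self.classifier = nn.Linear(meta['output_dim'], meta['num_classes'])

    def forward(self, x):
        out = self.features(x)   # [batch, channels, height, width]
        out = self.avgpool(out)  # [batch, channels, 1, width]
        # Treat the width axis as the sequence axis: [width, batch, channels].
        out = out.permute(3, 0, 1, 2).reshape(out.size(3), out.size(0), -1)
        out = self.classifier(out)  # [seq_len, batch, num_classes]
        return F.log_softmax(out, dim=2)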

You can modify the ./model/crnn.py script as follows to add recurrent layers, the same Bi-LSTM layers used in the author's original CRNN implementation. Note that I attach an FC layer after the recurrent layers; you can remove those FC layers, but they also perform better on my datasets.

import torch.nn as nn
import torch.nn.functional as F


class BLSTM(nn.Module):
    # Bidirectional LSTM followed by a linear projection.

    def __init__(self, input_size, hidden_size, output_size):
        super(BLSTM, self).__init__()

        self.rnn = nn.LSTM(input_size, hidden_size, bidirectional=True)
        # The bidirectional LSTM concatenates both directions,
        # so the linear layer sees hidden_size * 2 features.
        self.out = nn.Linear(hidden_size * 2, output_size)

    def forward(self, input):
        rnn_output, _ = self.rnn(input)  # [seq_len, batch, hidden_size * 2]
        output = self.out(rnn_output)    # [seq_len, batch, output_size]

        return output


class CRNN(nn.Module):

    def __init__(self, features, meta):
        super(CRNN, self).__init__()
        self.features = nn.Sequential(*features)
        # Pool the feature-map height down to 1, keep the width (sequence axis).
        self.avgpool = nn.AdaptiveAvgPool2d((1, None))
        self.encoder = BLSTM(meta['output_dim'], meta['hidden_dim'], meta['hidden_dim'])
        self.decoder = BLSTM(meta['hidden_dim'], meta['hidden_dim'], meta['hidden_dim'])
        self.classifier = nn.Linear(meta['hidden_dim'], meta['num_classes'])
        self.meta = meta

    def forward(self, x):
        # x -> features: [batch, channels, height, width]
        out = self.features(x)
        # features -> pool: [batch, channels, 1, width]
        out = self.avgpool(out)
        # [batch, channels, 1, width] -> [width, batch, channels], i.e. a
        # sequence along the width axis. Use reshape rather than view, since
        # the tensor is no longer contiguous after permute.
        out = out.permute(3, 0, 1, 2).reshape(out.size(3), out.size(0), -1)
        # encoder -> decoder -> classifier -> log-softmax over classes
        out = self.classifier(self.decoder(self.encoder(out)))
        out = F.log_softmax(out, dim=2)

        return out
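For completeness, here is a minimal sketch of wiring up the modified model; the backbone layers and meta values below are illustrative placeholders, not the repo's actual configuration:

import torch
import torch.nn as nn

# Hypothetical tiny backbone; the real feature extractor lives in ./model/crnn.py.
features = [
    nn.Conv2d(1, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2, 2),
]
# Illustrative meta values; match output_dim to the backbone's channel count.
meta = {'output_dim': 64, 'hidden_dim': 256, 'num_classes': 11}

model = CRNN(features, meta)
x = torch.randn(4, 1, 32, 100)  # [batch, channel, height, width]
out = model(x)
print(out.shape)  # torch.Size([50, 4, 11]), i.e. [seq_len, batch, num_classes]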

zhiqwang added a commit that referenced this issue on Apr 24, 2019
zhiqwang mentioned this issue on May 3, 2019
zhiqwang added the enhancement label on Jun 13, 2019