
Why do you make the input to the LSTM module have length 1 at the time dimension? #6

Open
xiaobin-xs opened this issue Aug 1, 2023 · 0 comments

Comments

@xiaobin-xs

In the code for many of the decoder models, you have self.agvpool = nn.AdaptiveAvgPool2d((1,1)) (for example, in builder/models/detector_models/resnet_dilation_lstm.py at line 119). If I understand it correctly, this averages the output of the CNN module over the channel and time dimensions, so the output is 1-by-1 in those two dimensions. I can understand that for the channel dimension, but not for the time dimension, because this output is then fed to the LSTM module, whose whole point is to process time-series signals. If the sequence has length 1 at the time dimension, why is an LSTM needed at all? Am I missing anything?
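
To make the concern concrete, here is a minimal sketch with assumed tensor shapes (the batch/channel/frequency/time sizes below are illustrative and not taken from the repository): after AdaptiveAvgPool2d((1, 1)), the last two dimensions of the CNN output collapse to 1, so the tensor handed to the LSTM is a sequence of length 1.

```python
import torch
import torch.nn as nn

# Assumed CNN feature-map shape: (batch, channels, dim1, dim2), where the last
# dimension is the one the question refers to as the time dimension.
batch, channels, dim1, time = 4, 256, 8, 32
cnn_out = torch.randn(batch, channels, dim1, time)

avgpool = nn.AdaptiveAvgPool2d((1, 1))   # same module as in resnet_dilation_lstm.py
pooled = avgpool(cnn_out)                # -> (4, 256, 1, 1): both trailing dims averaged to 1

# Reshape into (batch, seq_len, features) for a batch_first LSTM.
seq = pooled.flatten(2).permute(0, 2, 1) # -> (4, 1, 256): a sequence of length 1

lstm = nn.LSTM(input_size=channels, hidden_size=128, batch_first=True)
out, (h, c) = lstm(seq)                  # the LSTM sees only a single time step
print(pooled.shape, seq.shape, out.shape)
# torch.Size([4, 256, 1, 1]) torch.Size([4, 1, 256]) torch.Size([4, 1, 128])
```

With a sequence length of 1, the LSTM performs a single cell update from its initial state, so the recurrence is never exercised across time, which is exactly what prompts the question above.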
