
Size mismatch during the forward-propagation? #9

Closed
zjysteven opened this issue May 16, 2019 · 2 comments

@zjysteven

Hi! First, thank you for this well-designed benchmark!

However, I have a question about the tensor sizes before and after the CNN. The input to the CNN is (Cin x T x 1024) (I'm using channels-first notation for convenience), and there are three max-pooling operations with kernel sizes of 8, 8, and 4 along the last dimension. I would therefore expect the CNN output to be (Cout x T x 4), which is inconsistent with the (Cout x T x 2) shown in your image of the network architecture. Please correct me if I'm missing something.
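
For reference, here is a minimal shape check (just a sketch, assuming channels-last Keras tensors and placeholder layer sizes, not the actual code from this repo) that reproduces the 1024 → 4 reduction from the three pooling steps:

```python
# Sketch only: verify how pool sizes 8, 8, 4 along the frequency axis
# reduce a 1024-bin input. T, n_freq, c_in are assumed dimensions.
import tensorflow as tf

T, n_freq, c_in = 128, 1024, 8
x = tf.zeros((1, T, n_freq, c_in))          # (batch, time, freq, channels)

for pool in (8, 8, 4):
    x = tf.keras.layers.Conv2D(64, (3, 3), padding='same')(x)
    x = tf.keras.layers.MaxPooling2D(pool_size=(1, pool))(x)

print(x.shape)   # (1, 128, 4, 64): time axis preserved, 1024 / (8*8*4) = 4
```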

Also, I'm wondering why you chose to use only a single linear projection layer, without any non-linear activation, for the DOA prediction. Was this choice based on performance in your experiments? Thanks!

@sharathadavanne
Owner

Hi @zjysteven, thanks for pointing out the error. You are right: the CNN output dimension should be (Cout x T x 4). I will update the image soon.

I remember trying ReLU and tanh activations in the penultimate layer of the DOA branch, but I don't think I got good results. Similarly, adding more fully-connected layers didn't help either. Both of these studies were done for the Cartesian DOA output discussed in the original SELDnet paper, and I used the same model here with only one change, i.e., spherical DOA coordinates as output. So I am not sure whether more fully-connected layers or different activations would help for spherical coordinates.
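
For anyone reading along, a rough sketch of the two DOA-head variants being compared (assumed layer sizes and names, not the exact code from this repo; 2 * n_classes stands for azimuth and elevation per class in the spherical output):

```python
# Sketch only: baseline single linear DOA projection vs. a variant with an
# extra fully-connected ReLU layer. All dimensions are assumed.
import tensorflow as tf

T, rnn_units, n_classes = 128, 128, 11
rnn_out = tf.keras.Input(shape=(T, rnn_units))

# Baseline head: one time-distributed linear layer, no non-linear activation
doa_linear = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(2 * n_classes, activation='linear'))(rnn_out)

# Variant: extra fully-connected layer with ReLU before the linear projection
h = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(128, activation='relu'))(rnn_out)
doa_deeper = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(2 * n_classes, activation='linear'))(h)
```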

@zjysteven
Author

@sharathadavanne Thanks for your quick response! That makes sense: while reproducing the baseline, I used more fully-connected layers together with ReLU activation, but I couldn't achieve DOA results similar to yours. I thought I might be suffering from over-fitting, which is why I asked the question above. Thanks again for your confirmation!
