
Softmax in Model Output, then using CE Loss #23

Open
kerrgarr opened this issue Dec 9, 2021 · 1 comment

Comments


kerrgarr commented Dec 9, 2021

Thank you for the interesting work here.

I've just encountered one issue with the code. The ConvLSTM model applies a Softmax as its last layer, but the training script then uses CrossEntropyLoss. In PyTorch, CrossEntropyLoss already applies a log-softmax to its input, so the current setup effectively computes a softmax of a softmax. Instead, the ConvLSTM should output the raw logits from the final classification (Linear) layer and feed those into the loss. Softmax probabilities can still be computed later, in the test-set evaluation step, to report accuracy.
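To illustrate the point, here is a minimal sketch in plain Python (not code from this repository; the `softmax` and `cross_entropy` helpers mimic what `nn.CrossEntropyLoss` does internally). Feeding already-softmaxed outputs into a loss that applies its own log-softmax compresses the probabilities toward uniform and inflates the loss:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(inputs, target):
    # Mimics PyTorch's CrossEntropyLoss for a single example:
    # it applies log-softmax internally, so it expects raw logits.
    return -math.log(softmax(inputs)[target])

logits = [2.0, 0.5, -1.0]  # illustrative raw model outputs
target = 0

loss_from_logits = cross_entropy(logits, target)           # intended usage
loss_double_softmax = cross_entropy(softmax(logits), target)  # softmax-of-softmax bug

print(loss_from_logits, loss_double_softmax)
```

The double-softmax version yields a larger loss for the correct class and much weaker gradients near convergence, since the second softmax sees inputs squeezed into [0, 1].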

Please let me know if others agree with this small change to the code.

Also, what type of attention is being used? Is it dot-product attention?

@takekbys

I have the same understanding: the Softmax in ConvLSTM.output_layers is not necessary.
