Incorrect implied shape inside loss function #8350
Comments
When I reshaped my label to (32, 1), as the error message stated, I got the training to run, but the alignment between the output batch and the associated label makes no sense... There is now a single label for an entire batch.
Please post your custom loss function.
No custom loss function. It looks like the built-in loss function is inferring the label shape incorrectly.
My labels are one-hot 1-D arrays... so it appears that the 0th entry of each label tensor (that's what shape (32,) is, right?) in the batch is used as the label for every training example in the batch, which doesn't make sense to me.
When I decrease the batch size to 1, the training labels are simply integers (each of which is actually the 0th entry in the one-hot array for the actual label), confirming my above assumption...
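For context, a minimal sketch of what the default sparse_label=True setting expects (the batch size of 32 and the 51 classes come from this thread; the constant label values are placeholders):

```python
from mxnet import nd, gluon

# Default SoftmaxCrossEntropyLoss: sparse_label=True, so labels must be
# integer class indices of shape (batch_size,), not one-hot vectors
sparse_loss = gluon.loss.SoftmaxCrossEntropyLoss()

logits = nd.random.uniform(shape=(32, 51))  # network output: 32 examples, 51 classes
idx_labels = nd.array([3] * 32)             # integer class indices, shape (32,)

print(sparse_loss(logits, idx_labels).shape)  # (32,): one loss per example
```

This is why a (32, 1) label "works": each row's single entry is silently treated as a class index, exactly the mismatch described above.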
You can use gluon.loss.SoftmaxCrossEntropyLoss, where you can specify sparse_label=False.
This is my new loss function:
And I'm still getting: MXNetError: Shape inconsistent, Provided=(32,51), inferred shape=(32,1)
Note: batch size is 32, and one-hot length is 51
I had to set from_logits=True
@nickeleres Just to mention that what I meant is sparse_label=False; from_logits is used if log_softmax is applied prior to the loss function.
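To illustrate that distinction, a small sketch of when from_logits=True applies (shapes reused from this thread; label values are placeholders):

```python
from mxnet import nd, gluon

# from_logits=True tells the loss the inputs are already log-probabilities,
# i.e. log_softmax was applied before calling the loss
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss(sparse_label=False, from_logits=True)

logits = nd.random.uniform(shape=(32, 51))
log_probs = nd.log_softmax(logits)            # apply log_softmax manually
labels = nd.one_hot(nd.array([3] * 32), 51)   # one-hot labels, shape (32, 51)

loss = loss_fn(log_probs, labels)             # shape (32,)
```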
Ok. So what is the explicit correct loss function for one-hot labels?
gluon.loss.SoftmaxCrossEntropyLoss(sparse_label=False)
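A minimal usage sketch of that answer (batch size 32 and 51 classes as in this thread; the label values are placeholders):

```python
from mxnet import nd, gluon

# sparse_label=False lets the loss consume one-hot labels directly
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss(sparse_label=False)

output = nd.random.uniform(shape=(32, 51))   # network output, shape (32, 51)
label = nd.one_hot(nd.array([3] * 32), 51)   # one-hot labels, shape (32, 51)

loss = loss_fn(output, label)
print(loss.shape)  # (32,): one scalar loss per training example
```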
Awesome, thank you so much.
I've seen this brought up in a couple of other issues, but it hasn't been resolved as far as I know.
The data I am feeding into my loss function has the following shapes (batch size 32): output (32, 51) and label (32, 51).
When the output and label are fed into the loss function, though, I get the following error:
MXNetError: Shape inconsistent, Provided=(32,51), inferred shape=(32,1)
Why is the loss function inferring an incorrect shape? On the line just above the loss call, Gluon knows the correct shape of each input matrix, yet the loss function infers the label shape incorrectly.
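For anyone comparing against their own code, a minimal sketch that should reproduce the error under the default settings (shapes taken from this issue; label values are placeholders):

```python
from mxnet import nd, gluon

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()  # sparse_label=True by default

output = nd.random.uniform(shape=(32, 51))      # network output
label = nd.one_hot(nd.array([3] * 32), 51)      # one-hot labels, shape (32, 51)

# With sparse labels, the loss picks one probability per example using the
# label as an index, so it infers a label shape of (32, 1) and rejects (32, 51):
# MXNetError: Shape inconsistent, Provided=(32,51), inferred shape=(32,1)
loss = loss_fn(output, label)
loss.asnumpy()  # evaluation forces the deferred error to surface
```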