train_reloss.py is not runnable #2
Comments
Thanks for your reply.
Yes. This was fixed in commit 7a23ccc.
In line 36 of the latest version, the targets are obtained by taking a max over the logits along dimension 1 (is this the batch dimension rather than the token dimension, -1?), which also ignores the input targets from the batch. I wonder why this is done.
The full logits data in the TensorDataset has shape […]. To compute the accuracy, the straightforward way is to use the original targets in the dataset (this can be done by commenting out lines 36~38). However, we only use the trained model to predict the logits, which limits the diversity of the accuracy values (e.g., almost all accuracies are above 90% on CIFAR-10). A surrogate loss trained this way therefore could not be used in the early period of training. To increase the diversity of accuracy, we generate pseudo targets with a randomly sampled accuracy (line 37). In line 36, the […] Hope this helps you understand our code more clearly :)
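For readers following along, the pseudo-target idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from train_reloss.py: it assumes the logits are laid out as one row of per-class scores per sample, uses plain Python lists instead of tensors to stay dependency-free, and the names `argmax` and `make_pseudo_targets` are hypothetical.

```python
import random


def argmax(row):
    """Index of the largest score in one row of logits (the class axis)."""
    return max(range(len(row)), key=row.__getitem__)


def make_pseudo_targets(logits, rng):
    """Build targets that agree with the predictions at a randomly sampled rate.

    logits: list of rows, one row of per-class scores per sample
            (assumed layout; needs at least two classes).
    Returns (targets, accuracy), where accuracy is the fraction of
    samples whose pseudo target equals the predicted class.
    """
    num_classes = len(logits[0])
    preds = [argmax(row) for row in logits]

    # Sample the accuracy this batch should have, then round it to a
    # whole number of samples that will be labeled "correct".
    sampled_acc = rng.random()
    num_correct = round(sampled_acc * len(preds))
    correct_idx = set(rng.sample(range(len(preds)), num_correct))

    targets = []
    for i, pred in enumerate(preds):
        if i in correct_idx:
            targets.append(pred)  # target matches the prediction
        else:
            # Pick uniformly among the other classes, skipping `pred`.
            wrong = rng.randrange(num_classes - 1)
            targets.append(wrong if wrong < pred else wrong + 1)
    return targets, num_correct / len(preds)


rng = random.Random(0)
logits = [[rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(8)]
targets, acc = make_pseudo_targets(logits, rng)
preds = [argmax(row) for row in logits]
measured = sum(p == t for p, t in zip(preds, targets)) / len(preds)
print(acc, measured)  # the two values agree by construction
```

Because the accuracy is sampled per batch, a surrogate trained on such pairs would see both low- and high-accuracy examples, which is the diversity the reply refers to.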
Thanks for the detailed explanation. In the paper, you mention that the training data for the Reloss is generated by GR with probability p and by GM with probability 1-p. However, I cannot find this in the code. I wonder how it is implemented.
Actually, we did not use the random data generator for the classification loss ([…]). The random generator is only used on the scene text recognition task, for comparison with the previous work LS-ED [1]; you can find the details of the random generator for this task in the paper. [1] Learning surrogates via deep embedding. In European Conference on Computer Vision, pp. 205–221. Springer, 2020.
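As a rough illustration of the mixture asked about above (GR with probability p, GM with probability 1-p), one way to draw a training pair is sketched below. It is not taken from the repository: `sample_pair` and both placeholder generators are hypothetical names, and the real generators for scene text recognition are described only in the paper.

```python
import random


def sample_pair(random_generator, model_generator, p, rng):
    """Draw one training pair from GR with probability p, else from GM."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if rng.random() < p:
        return random_generator()
    return model_generator()


# Placeholder generators (hypothetical); each returns a (source, pair) tuple.
def g_random():
    return ("random", None)


def g_model():
    return ("model", None)


rng = random.Random(0)
draws = [sample_pair(g_random, g_model, 0.3, rng)[0] for _ in range(10_000)]
print(draws.count("random") / len(draws))  # close to 0.3
```

Setting p to 0 in this sketch corresponds to the classification setup described in the reply, where only model-based data is used.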
So if we do not want to use the random generator, we should comment out lines 36~38, is that correct? Also, the logits are the predictions from a model and the targets are the ground-truth labels corresponding to the logits, right?
Yes, you can comment out lines 36~38 to use the original predictions and labels without any generated pseudo data, but this risks worse performance if your data distribution is not diverse enough.
What is the best option for natural language generation tasks, including machine reading comprehension and machine translation?
For these two NLP tasks, we do not use any random generation; we directly use the raw outputs and labels produced by the network and the dataset.
Thanks for sharing the information.
Hi,
Thanks for sharing your awesome work. I noticed there are many typos and undefined variables in train_reloss.py, for example in lines 33-40: 'i' is not defined, and the 'logits, targets' from the corresponding batch are not used in the following code.