First, thanks a lot for sharing this great repo.
I have two questions about the computation of the relation prediction accuracy:
1) Suppose the model is already trained and we only want to evaluate it. The reported accuracy changes with the value of the `batch-size` parameter, although it should not, because the model itself does not change. The effect is most visible when the number of test examples is small. The likely reason is that not all batches contain `batch-size` examples (when `num_test_example % batch-size != 0`), so averaging the per-batch accuracies gives the smaller last batch too much weight. I think it would be better for edge_accuracy() in utils.py to return both the average accuracy and the number of examples in the batch, and for the main script to compute the overall accuracy as the example-weighted average.
2) If I understand correctly, we (or you) do not care about the "absolute" class label, so the task is closer to clustering than to classification. For the two-relation case, should the accuracy then be `max(acc, 1.0-acc)`? Also, do you have any ideas for computing the accuracy with more than two relation types? The current edge_accuracy() function seems suitable only for the two-relation case.
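To make question 2) concrete, here is a rough sketch of what I have in mind, borrowed from how clustering accuracy is often computed: try every relabeling of the predicted edge types and keep the best match. The function name `permutation_invariant_accuracy` and the flat-list inputs are my own assumptions for illustration; this is not code from the repo.

```python
from itertools import permutations


def permutation_invariant_accuracy(preds, targets, num_types):
    """Best accuracy over all relabelings of the predicted edge types.

    preds, targets: flat sequences of integer edge-type labels in
    range(num_types). Hypothetical helper, not part of utils.py.
    """
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if len(targets) == 0:
        return 0.0

    best_correct = 0
    # perm[k] is the ground-truth label that predicted label k is mapped to.
    for perm in permutations(range(num_types)):
        correct = sum(perm[p] == t for p, t in zip(preds, targets))
        best_correct = max(best_correct, correct)
    return best_correct / len(targets)


# With two edge types this reduces to max(acc, 1.0 - acc).
print(permutation_invariant_accuracy([0, 0, 1, 1], [1, 1, 0, 1], num_types=2))  # 0.75
```

The loop is factorial in `num_types`, so it only makes sense for a handful of edge types. I assume the relabeling should be chosen once over the whole test set rather than per batch, otherwise each batch could pick a different mapping.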
Regarding 1): You're right, the way we accumulate metrics doesn't correctly account for incomplete batches. This should only make a minor difference in the evaluation scores, but please feel free to submit a pull request.
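For anyone picking this up, a minimal sketch of one possible fix is to accumulate counts instead of averaging per-batch accuracies. The names `batch_accuracy` and `evaluate` and the list-based inputs are hypothetical and only illustrate the idea; they do not match the current signature of edge_accuracy().

```python
def batch_accuracy(preds, targets):
    """Return (number of correct predictions, number of examples) for one batch."""
    correct = sum(int(p == t) for p, t in zip(preds, targets))
    return correct, len(targets)


def evaluate(batches):
    """Accuracy over all examples, independent of how they are batched.

    batches: iterable of (preds, targets) pairs of equal-length label lists.
    """
    total_correct = 0
    total_count = 0
    for preds, targets in batches:
        correct, count = batch_accuracy(preds, targets)
        total_correct += correct
        total_count += count
    return total_correct / total_count if total_count else 0.0


# A full batch of 4 (all correct) followed by an incomplete batch of 1 (wrong).
batches = [([1, 1, 1, 1], [1, 1, 1, 1]), ([0], [1])]
print(evaluate(batches))  # 0.8
```

In this example, averaging the two per-batch accuracies (1.0 and 0.0) would report 0.5, whereas 4 of the 5 examples are actually correct.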