-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
This code chooses best model on test set instead of validation set. #2
Comments
This is actually a common practice in session-based recommendation. The same setting was adopted in some previous works such as STAMP, SR-GNN, and FGNN, and in some recently published works such as DHCN and TAGNN. We could even find the same setting in other fields of recommender systems, such as LightGCN and NGCF. Despite the popularity of our setting, I agree with you that in principle, we should choose the best model on the validation set and evaluate the model on the test set. The common approach is to first extract a most recent fraction of the data as the test set, and then extract a most recent fraction of the remaining data as the validation set. But this approach may also have a problem, i.e., the distributions of the validation set and the test set are different because their collection times are different, so the performance on the validation set does not reveal the true performance on the test set. I guess this is part of the reason some existing methods did not adopt the correct-in-principle approach. I think a better approach is to first extract a most recent fraction of the data and then randomly split it into the validation and test sets. But I did not think of this approach when we published this work. In conclusion, I think our setting is OK because it is very common. Besides, it is used for all models in our experiments. The comparison of models' performance is fair. |
Still confused by this common practice, but thanks for your reply. |
At
utils/train.py
, the train process returns the best MRR and HIT on TEST SET, RESPECTIVELY:I believe this code is not the final version, please update it to the code used in your paper.
The text was updated successfully, but these errors were encountered: