Significance of MSE vs Kendall's Tau / Spearman's Rank Correlation #19

Closed
afzalxo opened this issue Dec 22, 2022 · 1 comment

Comments

afzalxo commented Dec 22, 2022

Hi folks. I am working on designing a surrogate benchmark for some hardware-specific performance metrics, based on the principles suggested in your work.

I am currently training XGB on a small dataset of 500 to 1000 model architectures from an MNASNet-like search space, to see how well the surrogate performs with so little data. The XGB hyperparameters are copied from your work.

I am getting a high validation/test MSE (~0.4 to 0.6), but also a high Kendall's Tau (~0.92) and Spearman's rank correlation (~0.98).

When I train the surrogate on the same number of models, selected randomly from the nb301_dataset you provide (from the random search directory), I get a low MSE (~0.16) but also a low KT (0.60) and Spearman's (0.78).

I'm wondering whether this disparity could be due to sub-optimal hyperparameter values. Do you have any insight into what could cause such a large difference in the predictor's performance? Furthermore, for evaluating a surrogate, do you think Kendall's Tau or Spearman's rank correlation is a better metric than MSE, or vice versa?
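
To make the comparison concrete, here is a minimal sketch of how MSE and the rank metrics can disagree. It uses only synthetic data, not the datasets from this issue, and the target ranges, offset, and noise levels are assumptions chosen for illustration. MSE depends on the scale of the targets and on any systematic offset, while Kendall's Tau and Spearman's only look at ordering. The helpers are plain-Python versions that assume no tied values; `scipy.stats.kendalltau` and `scipy.stats.spearmanr` compute the same quantities with tie handling.

```python
import random
from itertools import combinations


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def kendall_tau(y_true, y_pred):
    """Kendall's tau-a: (concordant - discordant) / total pairs (assumes no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(y_true)), 2):
        sign = (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(y_true) * (len(y_true) - 1) // 2
    return (concordant - discordant) / n_pairs


def ranks(values):
    """Rank of each element, 0 = smallest (assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def pearson(a, b):
    """Pearson correlation coefficient."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(y_true, y_pred):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(y_true), ranks(y_pred))


rng = random.Random(0)

# Case A (assumed): targets spread over a wide range, predictions carry a
# constant offset plus small noise. The offset inflates MSE but not ordering.
y_wide = [rng.uniform(0, 10) for _ in range(200)]
p_wide = [t + 0.7 + rng.gauss(0, 0.2) for t in y_wide]

# Case B (assumed): targets packed into a narrow range, unbiased predictions
# whose noise is comparable to the target spread. MSE is small, ordering is poor.
y_narrow = [rng.uniform(0, 1) for _ in range(200)]
p_narrow = [t + rng.gauss(0, 0.3) for t in y_narrow]

mse_wide, mse_narrow = mse(y_wide, p_wide), mse(y_narrow, p_narrow)
tau_wide, tau_narrow = kendall_tau(y_wide, p_wide), kendall_tau(y_narrow, p_narrow)
rho_wide, rho_narrow = spearman(y_wide, p_wide), spearman(y_narrow, p_narrow)

print(f"wide:   MSE={mse_wide:.3f}  KT={tau_wide:.3f}  Spearman={rho_wide:.3f}")
print(f"narrow: MSE={mse_narrow:.3f}  KT={tau_narrow:.3f}  Spearman={rho_narrow:.3f}")
```

Whether either effect explains the numbers above depends on the spread of the target metric in each dataset, which this sketch does not measure.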

@arberzela
Contributor

Thanks for trying out our code. The discrepancy might be because we used an ensemble of XGB models in NB301. But I agree that better training hyperparameters might improve the performance even further.
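
As a rough illustration of why an ensemble can behave differently from a single model, the sketch below averages the predictions of several members with equal weights. This is an assumption for illustration: the thread does not say how the NB301 ensemble is built or combined, and the "members" here are synthetic noisy predictions, not trained XGB models.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def ensemble_mean(member_predictions):
    """Equal-weight average of the members' predictions, element by element."""
    return [sum(column) / len(column) for column in zip(*member_predictions)]


rng = random.Random(0)

# Synthetic targets (assumed range, for illustration only).
y_true = [rng.uniform(0, 1) for _ in range(100)]

# Five hypothetical ensemble members: each predicts the target plus
# independent Gaussian noise, standing in for separately trained models.
members = [[t + rng.gauss(0, 0.3) for t in y_true] for _ in range(5)]

ensemble = ensemble_mean(members)
member_mses = [mse(y_true, m) for m in members]
ensemble_mse = mse(y_true, ensemble)

print("member MSEs:", [round(m, 4) for m in member_mses])
print("ensemble MSE:", round(ensemble_mse, 4))
```

Because squared error is convex, the averaged prediction never has a higher MSE than the mean MSE of its members, so comparing a single XGB model against an ensemble is not like for like.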
