
Top-5%/top-64 computation #11

Open
egracheva opened this issue Sep 20, 2022 · 3 comments

Comments

@egracheva

Hello,

Thanks for the great paper.

When you compute the top-5%/top-64 score (Tables 4 and 11), how many architectures are considered in total?
Is it 3,000 architectures (the warmup pool only) or the entire dataset?

Cheers,
Ekaterina

@vaenyr (Contributor) commented Sep 20, 2022

Off the top of my head, I would say all models from the search space were included.
This is also what the notebook seems to suggest, although I wasn't the one who ran these experiments.
Perhaps @mohsaied can verify.

@mohsaied (Collaborator)

Correct. They are the top 64 models in the entire search space. The idea is to quantify the degree to which zero-cost warmup improves the sampled architectures. If you took 64 random models, the expected number of top-5% models would simply be 5% of 64 ≈ 3 models. However, when we use a zero-cost metric like synflow and take the top 64 models in the search space, we increase that number significantly, as shown in the tables.

So this comparison shows the best-case scenario of zero-cost warmup. It would be interesting to also try it with smaller warmup sizes, as you suggested, and that should be fairly straightforward to do. If you end up doing this experiment, we'd love a pull request :)
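
For concreteness, here is a minimal sketch of that computation (assuming NumPy arrays `proxy_scores` and `val_accuracies` covering the whole search space; these names are illustrative, not from the repo):

```python
import numpy as np

def top5_top64(proxy_scores, val_accuracies, k=64, top_frac=0.05):
    """Count how many of the k models ranked best by the zero-cost proxy
    land within the top `top_frac` of the search space by accuracy."""
    # Indices of the k models with the highest proxy score.
    top_k_by_proxy = np.argsort(proxy_scores)[-k:]
    # Accuracy threshold separating the top 5% of all models.
    threshold = np.quantile(val_accuracies, 1.0 - top_frac)
    return int(np.sum(val_accuracies[top_k_by_proxy] >= threshold))
```

With 64 randomly sampled models the expected count is 0.05 × 64 ≈ 3, which is the random baseline described above.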

@egracheva (Author)

Thanks for your replies!

I was confused by the fact that these tables appear in the "Warmup" section. Actually, I am still inclined to believe that the numbers are computed for a random warmup of 3,000 architectures.

Some time ago I plotted the synflow metric vs. accuracy, and the numbers in the tables did not seem to fit the shape of the cloud. Now I have double-checked and recomputed the value for the whole set (using the provided nasbench101_correlations notebook). My result with synflow for the top-5%/top-64 is 4 for the whole search space of NAS-Bench-101 (compared to 12 given in the paper).

This is confirmed by the plots below:
[screenshot: synflow score vs. validation accuracy over the full search space]

Zoom:
[screenshot: zoomed view of the same plot]

My ultimate aim is to compare my zero-cost metric against your results. Even though I obtain higher overall and top-10% correlations, my top-5%/top-64 for NAS-Bench-101 is also very low. (I'd guess this is in the nature of the benchmark: perhaps a suboptimal set of training hyperparameters, or not enough epochs.)

I think I can run the multiple-warmup-size test you suggested sometime later (soon); a rough sketch of that sweep follows below.
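
A rough sketch of what that sweep might look like (reusing the same hypothetical `proxy_scores` and `val_accuracies` arrays as in the snippet above; the pool sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = len(val_accuracies)
# Global top-5% accuracy threshold over the whole search space.
threshold = np.quantile(val_accuracies, 0.95)

for pool_size in (1000, 3000, 10000, n):
    # Rank only a random warmup pool by the proxy and keep its best 64.
    pool = rng.choice(n, size=pool_size, replace=False)
    top64 = pool[np.argsort(proxy_scores[pool])[-64:]]
    hits = int(np.sum(val_accuracies[top64] >= threshold))
    print(f"pool={pool_size}: {hits}/64 in the global top 5%")
```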
