NCP Search Space #1

Closed
panizbehboudian opened this issue May 6, 2021 · 4 comments

Comments

@panizbehboudian

Hello!

I am interested in using your search space, specifically for the image segmentation task. Roughly speaking, I want to search through all valid codes and occasionally query the oracle (the dataset) to get the true mean accuracy. The issue is that the dataset you provided (~2500 architectures) is far smaller than the set of all possible codes, so there must be a pattern or set of rules that determines whether a code (or network) is valid. For example, if the second stage has 2 blocks, then the number of channels has to fall within a particular range. Thank you in advance for taking the time to read my question.

Paniz

@dingmyu
Owner

dingmyu commented May 7, 2021

Hi Paniz,

The 2500 architectures are randomly sampled. You can reduce the search space by scaling the number of blocks, the number of convolutions, and the number of channels up or down in equal proportions, as explored in EfficientNet and RegNet.
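
One possible reading of "in equal proportions" is that a single factor scales all three quantities together, so the search collapses to a one-dimensional family of codes. The sketch below illustrates only that idea. The `(blocks, convs, channels)` per-stage layout, the `scale_code` helper, and the `channel_divisor` parameter are hypothetical and are not taken from the NCP repository.

```python
from typing import List, Tuple

# Hypothetical per-stage layout: (blocks, convs, channels).
# The real NCP code format may differ.
Stage = Tuple[int, int, int]


def scale_code(code: List[Stage], factor: float, channel_divisor: int = 8) -> List[Stage]:
    """Scale blocks, convs and channels of every stage by the same factor."""
    scaled = []
    for blocks, convs, channels in code:
        # Keep at least one block and one convolution per stage.
        new_blocks = max(1, int(blocks * factor + 0.5))
        new_convs = max(1, int(convs * factor + 0.5))
        # Snap channels to the nearest multiple of channel_divisor (assumed constraint).
        units = int(channels * factor / channel_divisor + 0.5)
        new_channels = max(channel_divisor, units * channel_divisor)
        scaled.append((new_blocks, new_convs, new_channels))
    return scaled


if __name__ == "__main__":
    base = [(2, 2, 64), (4, 2, 128)]
    # One scaling factor per candidate instead of a free choice per dimension.
    for factor in (0.5, 1.0, 2.0):
        print(factor, scale_code(base, factor))
```

Sweeping a handful of factors over a base code then gives a small set of candidates to check against the sampled architectures.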

In fact, the NCP search space is too large to traverse by evaluating every code, so we use NCP to find the optimal code by backpropagating gradients through a learned neural predictor, which is more efficient.
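
As a rough illustration of gradient-based search through a predictor (not the actual NCP implementation), the sketch below treats the code as a continuous vector and takes projected gradient-ascent steps on a predictor's output. Everything in it is an assumption made for the example: the predictor is a tiny tanh network with random weights standing in for a trained one, and `make_predictor`, `predict`, `grad_wrt_code`, `search`, the bounds, and the step size are hypothetical names and values.

```python
import math
import random


def make_predictor(dim, hidden, seed=0):
    """Random one-hidden-layer tanh net; a stand-in for a trained accuracy predictor."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return w1, w2


def predict(params, code):
    """Predicted score for a (continuous) architecture code."""
    w1, w2 = params
    hidden = [math.tanh(sum(w * x for w, x in zip(row, code))) for row in w1]
    return sum(v * h for v, h in zip(w2, hidden))


def grad_wrt_code(params, code):
    """Gradient of the predicted score with respect to the code, via the chain rule."""
    w1, w2 = params
    grad = [0.0] * len(code)
    for row, v in zip(w1, w2):
        h = math.tanh(sum(w * x for w, x in zip(row, code)))
        scale = v * (1.0 - h * h)  # derivative of tanh is 1 - tanh^2
        for i, w in enumerate(row):
            grad[i] += scale * w
    return grad


def search(params, start, lower, upper, lr=0.1, steps=200):
    """Projected gradient ascent on the code; returns the best code seen and its score."""
    code = list(start)
    best, best_score = list(code), predict(params, code)
    for _ in range(steps):
        grad = grad_wrt_code(params, code)
        # Step uphill, then clip each entry back into its allowed range.
        code = [min(max(x + lr * g, lo), hi)
                for x, g, lo, hi in zip(code, grad, lower, upper)]
        score = predict(params, code)
        if score > best_score:
            best, best_score = list(code), score
    return best, best_score


if __name__ == "__main__":
    params = make_predictor(dim=6, hidden=8)
    start = [1.0] * 6
    best, score = search(params, start, lower=[0.0] * 6, upper=[4.0] * 6)
    print("start score:", predict(params, start))
    print("best score: ", score)
```

In this sketch the code stays continuous. Mapping the result back to valid integer settings (for example by rounding) is a separate step that is deliberately left out, and the predicted score is only as trustworthy as the predictor itself.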

Regards,
Mingyu

@panizbehboudian
Author

Thanks for your prompt response. This makes sense, but I was wondering how one should obtain the ground-truth accuracy. Did you use a predictor to generate the target accuracy values for the 2500 architectures, or do you have an oracle?

@dingmyu
Owner

dingmyu commented May 8, 2021

> Thanks for your prompt response. This makes sense, but I was wondering how one should obtain the ground-truth accuracy. Did you use a predictor to generate the target accuracy values for the 2500 architectures, or do you have an oracle?

Hi Paniz,

I have an oracle: all 2500 models were trained and evaluated under standard settings to get the target accuracies. It is difficult to get the ground-truth (GT) accuracy of every architecture because training a single model takes several hours on 8 GPUs. Our solution is therefore to sample as many architectures as possible (the 2500 models cost about 400K GPU hours) and to use NCP to traverse the space.
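
To put those figures in per-model terms, here is a back-of-the-envelope calculation. It takes the quoted totals at face value and assumes every run uses all 8 GPUs for its whole duration, which the thread does not state explicitly.

```python
# Figures quoted in the comment above.
total_gpu_hours = 400_000
num_models = 2500
gpus_per_run = 8  # assumption: each run occupies all 8 GPUs throughout

gpu_hours_per_model = total_gpu_hours / num_models      # 160.0 GPU hours
wall_clock_hours = gpu_hours_per_model / gpus_per_run   # 20.0 hours per model

print(f"{gpu_hours_per_model:.0f} GPU hours per model")
print(f"{wall_clock_hours:.0f} wall-clock hours per model on {gpus_per_run} GPUs")
```

Under these assumptions each additional ground-truth label costs on the order of 160 GPU hours, which is why exhaustively labelling the space is impractical.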

Thanks,
Mingyu

@dingmyu
Owner

dingmyu commented Jun 15, 2021

Feel free to reopen the issue if you have any further questions.

Mingyu

dingmyu closed this as completed on Jun 15, 2021