
reproduce your one/few-shot results on nas-bench 201 #1

Open
Jxu-Thu opened this issue Jul 12, 2021 · 36 comments

Comments

@Jxu-Thu

Jxu-Thu commented Jul 12, 2021

I tried to reproduce your one-shot results on NAS-Bench-201.
I trained with
https://github.com/aoiang/few-shot-NAS/blob/main/Few-Shot_NasBench201/supernet/one-shot/run.sh

and evaluated with
https://github.com/aoiang/few-shot-NAS/blob/main/Few-Shot_NasBench201/supernet/one-shot/eval.sh

However, your eval.sh is:

for (( c=0; c<=4; c++ ))
do
OMP_NUM_THREADS=4 python ./exps/supernet/one-shot-supernet_eval.py \
  --save_dir ${save_dir} --max_nodes ${max_nodes} --channel ${channel} --num_cells ${num_cells} \
  --dataset ${dataset} --data_path ${data_path} \
  --search_space_name ${space} \
  --arch_nas_dataset ${benchmark_file} \
  --config_path configs/nas-benchmark/algos/FEW-SHOT-SUPERNET.config \
  --track_running_stats ${BN} \
  --select_num 100 \
  --output_dir ${OUTPUT} \
  --workers 4 --print_freq 200 --rand_seed 0 --edge_op ${c}
done

Why does it use FEW-SHOT-SUPERNET.config?

@aoiang (Owner)

aoiang commented Jul 12, 2021

Good catch, and thank you for pointing it out. The only difference between "FEW-SHOT-SUPERNET.config" and "ONE-SHOT-SUPERNET.config" is the number of training epochs, and this script is only used for evaluation, so using FEW-SHOT-SUPERNET.config works as desired. To eliminate the confusion, I will change this line to ONE-SHOT-SUPERNET.config later. Thank you again for your attention.

@aoiang (Owner)

aoiang commented Jul 12, 2021

> I try to reproduce your one-shot results on nas-bench 201. [...] Why does eval.sh use the FEW-SHOT-SUPERNET.config?

I have changed FEW-SHOT-SUPERNET.config to ONE-SHOT-SUPERNET.config; it was a typo. Thanks again for pointing it out; you can now run the script directly. If you have further questions, please let me know.

@linnanwang (Collaborator)

@aoiang can you please list the exact steps to reproduce the results? Can you please run the code on your end to make sure everything is okay with those steps? Once you have done that, please update this thread.

@Jxu-Thu (Author)

Jxu-Thu commented Jul 13, 2021

Thanks for your kind reply!
I ran train.sh and eval.sh and got the following results:

current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|skip_connect1|none2|)
evaluate : loss=78.47, accuracy@1=10.04%, accuracy@5=50.98% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|skip_connect1|skip_connect2|)
evaluate : loss=1018350.04, accuracy@1=12.98%, accuracy@5=55.38% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|skip_connect1|nor_conv_1x12|)
evaluate : loss=122.31, accuracy@1=11.96%, accuracy@5=59.11% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|skip_connect1|nor_conv_3x32|)
evaluate : loss=115.14, accuracy@1=13.03%, accuracy@5=60.53% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|skip_connect1|avg_pool_3x32|)
evaluate : loss=1125109.73, accuracy@1=12.93%, accuracy@5=54.39% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_1x11|none2|)
evaluate : loss=1.50, accuracy@1=53.53%, accuracy@5=93.50% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_1x11|skip_connect2|)
evaluate : loss=26184.69, accuracy@1=12.32%, accuracy@5=57.85% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_1x11|nor_conv_1x12|)
evaluate : loss=3.00, accuracy@1=46.15%, accuracy@5=90.89% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_1x11|nor_conv_3x32|)
evaluate : loss=2.81, accuracy@1=50.91%, accuracy@5=92.60% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_1x11|avg_pool_3x32|)
evaluate : loss=24581.70, accuracy@1=11.95%, accuracy@5=58.06% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_3x31|none2|)
evaluate : loss=1.18, accuracy@1=63.35%, accuracy@5=95.75% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_3x31|skip_connect2|)
evaluate : loss=23631.38, accuracy@1=11.73%, accuracy@5=57.01% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_3x31|nor_conv_1x12|)
evaluate : loss=2.63, accuracy@1=52.52%, accuracy@5=93.15% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_3x31|nor_conv_3x32|)
evaluate : loss=2.54, accuracy@1=54.93%, accuracy@5=93.94% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|nor_conv_3x31|avg_pool_3x32|)
evaluate : loss=21931.14, accuracy@1=11.81%, accuracy@5=57.27% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|avg_pool_3x31|none2|)
evaluate : loss=86.85, accuracy@1=10.03%, accuracy@5=50.57% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|avg_pool_3x31|skip_connect2|)
evaluate : loss=1080665.85, accuracy@1=12.91%, accuracy@5=54.70% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|avg_pool_3x31|nor_conv_1x12|)
evaluate : loss=109.15, accuracy@1=10.89%, accuracy@5=58.14% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|avg_pool_3x31|nor_conv_3x32|)
evaluate : loss=101.58, accuracy@1=12.16%, accuracy@5=60.64% |
current test-geno is Structure(4 nodes with |avg_pool_3x30|+|avg_pool_3x30|avg_pool_3x31|+|avg_pool_3x30|avg_pool_3x31|avg_pool_3x32|)
evaluate : loss=1165846.05, accuracy@1=13.00%, accuracy@5=54.12% |
..... (a lot of info like these)

However, I DO NOT find the Kendall tau result. Could you provide the steps to reproduce your ranking-accuracy result?

As far as I know, evaluating a super-net is just a loop that obtains the proxy accuracy of each architecture in the super-net and then computes the rank correlation between the proxy accuracies and the ground-truth accuracies. Another question: I cannot understand why the eval script needs to be run 5 times for the one-shot super-net (rather than in parallel for speedup)?
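The ranking metric being discussed is straightforward to compute once the proxy and ground-truth accuracies are collected. A minimal pure-Python sketch for illustration (the repository presumably relies on a library routine such as scipy.stats.kendalltau; this pairwise tau-a version is an assumption, not the repo's code):

```python
def kendall_tau(proxy_acc, gt_acc):
    """Pairwise (tau-a) Kendall rank correlation between two score lists."""
    n = len(proxy_acc)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            a = proxy_acc[i] - proxy_acc[j]
            b = gt_acc[i] - gt_acc[j]
            if a * b > 0:          # pair ranked the same way by both lists
                concordant += 1
            elif a * b < 0:        # pair ranked in opposite ways
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Perfectly aligned rankings give tau = 1.0
print(kendall_tau([0.1, 0.5, 0.9], [10.0, 50.0, 90.0]))  # -> 1.0
```

A tau near 1 means the super-net's proxy accuracies rank architectures almost exactly like the ground-truth accuracies do.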

@linnanwang (Collaborator)

@aoiang I need you to list ALL your "specific" steps to answer the above question!!

@aoiang (Owner)

aoiang commented Jul 13, 2021

> @aoiang I need you to list ALL your "specific" steps to answer the above question!!

Sure, I will double-check my code on my side and write down every specific step to address this issue and reproduce the results. Thank you for the kind reminder.

@aoiang (Owner)

aoiang commented Jul 14, 2021

> Thanks for your kind reply! I ran train.sh and eval.sh and got the results above. [...] However, I DO NOT find the result of Kendall tau? [...] Why does the eval script need to be run multiple times for the one-shot super-net?

Thank you for asking. Before answering your questions, I would like to list the specific steps to run our few-shot NAS on NasBench201.

  1. Train the one-shot model using the script in the README. Note that the checkpoint will be saved in ./supernet_checkpoint/one-shot.
  2. Train the few-shot model using the script in the README. Note that the checkpoint will be saved in ./supernet_checkpoint/few-shot.
  3. To evaluate the one-shot model, run the one-shot evaluation script from the README. Please customize the OUTPUT_FILE. After running the evaluation, the program generates an output file containing the evaluated accuracy of every architecture in NasBench201.
  4. To evaluate the few-shot model, run the few-shot evaluation script from the README. Please customize the OUTPUT_FILE. It likewise generates an output file containing the evaluated accuracy of every architecture in NasBench201.
  5. You now have two output files at the OUTPUT_FILE locations you set (named "one-shot supernet" and "few-shot supernet"), one for each model. Following the README, move them to ./supernet_info to replace the original files.
  6. Run rank.py at the root directory of the Nasbench201 project; it will print the Kendall tau.

Back to your questions. First, the output printed to the screen is as desired; in other words, the script was running properly.

For the first question (Kendall tau), follow the steps listed above; after step 6, the Kendall tau will be printed to your screen.

The second question is about the eval script. The total number of architectures in NasBench201 is 15625, built from 5 operator types: skip_connect, average pooling, convolution 3x3, convolution 1x1, and none. I converted all architectures to a JSON file and then split it into 5 files based on the first operator (the files are located in few-shot-NAS/Few-Shot_NasBench201/search_pool/). I did this because it is convenient for training the few-shot models; each file therefore contains 3125 architectures. Back to the evaluation script: we need to evaluate all 15625 architectures in NasBench201 to get their proxy accuracies, and we run the eval script 5 times because the architectures to be evaluated come from those 5 files (you can open them under few-shot-NAS/Few-Shot_NasBench201/search_pool/ for the details). In other words, each run evaluates 3125 architectures, so 5 runs cover all 15625.
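The split described above is simple to sketch: group every NAS-Bench-201 architecture string by the operator on its first edge, giving one pool per operator type. This is an illustrative sketch under assumptions, not the repository's actual conversion code; the helper names and the two sample strings are made up:

```python
from collections import defaultdict

# NAS-Bench-201 architecture strings look like
# "|op~0|+|op~0|op~1|+|op~0|op~1|op~2|"; the first edge's operator is the
# token right after the leading "|".
def first_op(arch_str):
    return arch_str.split("|")[1].split("~")[0]

def split_by_first_op(arch_strings):
    """Group architecture strings into one pool per first-edge operator."""
    pools = defaultdict(list)
    for s in arch_strings:
        pools[first_op(s)].append(s)
    return pools

archs = [
    "|none~0|+|skip_connect~0|none~1|+|none~0|none~1|none~2|",
    "|avg_pool_3x3~0|+|none~0|none~1|+|none~0|none~1|none~2|",
]
print(sorted(split_by_first_op(archs)))  # -> ['avg_pool_3x3', 'none']
```

With the full benchmark's 15625 strings and 5 operator types, each pool comes out to 3125 architectures, matching the 5 files in search_pool/.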

I am sorry for the confusion with the eval script. Thank you again for asking.

@Jxu-Thu (Author)

Jxu-Thu commented Jul 14, 2021

Thanks!
I have reproduced your one-shot results with a Kendall tau of 0.5384 (close to the 0.5436 in your paper) :) .
I am now reproducing the few-shot result on NAS-Bench-201.

@linnanwang (Collaborator)

Thanks for your effort to replicate our results. We're looking forward to your results.

@Jxu-Thu (Author)

Jxu-Thu commented Jul 19, 2021

I cannot reproduce your few-shot results.

I ran the few-shot scripts 5 times following your README.

Each training script logs:

FLOP = 61.52 M, Params = 1.42 MB
search-space : ['none', 'skip_connect', 'nor_conv_1x1', 'nor_conv_3x3', 'avg_pool_3x3']
try to create the NAS-Bench-201 api from /home/data/NASBench-102/NAS-Bench-201-v1_1-096897.pth
[2021-07-14 13:27:57] create API = NASBench201API(15625/15625 architectures, file=NAS-Bench-201-v1_1-096897.pth) done
=> loading checkpoint of the last-info '/home/checkpoint_few_shot_nas/one-shot/seed-0-last-info.pth' start
=> loading checkpoint of the last-info '{'epoch': 600, 'args': Namespace(arch_nas_dataset='/home/data/NASBench-102/NAS-Bench-201-v1_1-096897.pth', channel=16, config_path='configs/nas-benchmark/algos/ONE-SHOT-SUPERNET.config', data_path='/home/data/NASBench-201/cifar10/', dataset='cifar10', edge_op=None, edge_to_split=None, max_nodes=4, num_cells=5, print_freq=200, rand_seed=0, save_dir='/home/checkpoint_few_shot_nas/one-shot', search_space_name='nas-bench-201', select_num=100, track_running_stats=0, workers=4), 'last_checkpoint': PosixPath('/home/checkpoint_few_shot_nas/one-shot/checkpoint/seed-0-basic.pth')}' start with 0-th epoch.

[Search the 000-150-th epoch] Time Left: [00:00:00], LR=0.025
SEARCH [2021-07-14 13:28:01] [000-150][000/391] Time 0.40 (0.40) Data 0.05 (0.05) Base [Loss 0.558 (0.558) Prec@1 82.81 (82.81) Prec@5 99.22 (99.22)]
SEARCH [2021-07-14 13:28:18] [000-150][200/391] Time 0.09 (0.09) Data 0.05 (0.04) Base [Loss 2.289 (1.923) Prec@1 17.97 (27.60) Prec@5 58.59 (65.96)]
SEARCH [2021-07-14 13:28:33] [000-150][390/391] Time 0.06 (0.08) Data 0.02 (0.04) Base [Loss 2.308 (1.970) Prec@1 7.50 (25.30) Prec@5 57.50 (64.64)]
[000-150] search [base] : loss=1.97, a

I see that the few-shot super-net inherits from the one-shot super-net as expected.

After running, I obtain the following files in /home/checkpoint_few_shot_nas/few-shot:

drwxrwxrwx 2 root root 4096 Jul 15 02:50 checkpoint
-rwxrwxrwx 1 root root 609555 Jul 15 09:40 few-shot-NAS_split_edge_ID_0_split_operation_conv1.log
-rwxrwxrwx 1 root root 609622 Jul 15 12:02 few-shot-NAS_split_edge_ID_0_split_operation_conv3.log
-rwxrwxrwx 1 root root 4185 Jul 15 02:51 few-shot-NAS_split_edge_ID_0_split_operation_none.log
-rwxrwxrwx 1 root root 610791 Jul 15 14:15 few-shot-NAS_split_edge_ID_0_split_operation_pool.log
-rwxrwxrwx 1 root root 610863 Jul 15 07:20 few-shot-NAS_split_edge_ID_0_split_operation_skip.log
-rwxrwxrwx 1 root root 1327 Jul 14 14:50 seed-0-last-info-split-0-op-0.pth
-rwxrwxrwx 1 root root 1327 Jul 14 14:57 seed-0-last-info-split-0-op-1.pth
-rwxrwxrwx 1 root root 1327 Jul 14 15:29 seed-0-last-info-split-0-op-2.pth
-rwxrwxrwx 1 root root 1327 Jul 14 15:12 seed-0-last-info-split-0-op-3.pth
-rwxrwxrwx 1 root root 1327 Jul 14 14:59 seed-0-last-info-split-0-op-4.pth

Then I ran few-shot/eval.sh and got the file few-shot-supernet.

Finally, I ran rank.sh and got a Kendall tau of 0.54295.

I ran the code with PyTorch 1.6.0 and CUDA 10.1.

@Jxu-Thu Jxu-Thu changed the title reproduce your one-shot results on nas-bench 201 reproduce your one/few-shot results on nas-bench 201 Jul 19, 2021
@aoiang (Owner)

aoiang commented Jul 19, 2021

> I cannot reproduce your few-shot results. [...] Finally, I ran rank.sh and got a Kendall tau of 0.54295.

Thank you for reproducing the few-shot model. We will double-check our code to see if there is any difference between our implementations. Once we finish the check on our side, I will let you know here.

@linnanwang (Collaborator)

@Jxu-Thu thank you. Did you run each method only once? Yiyang is investigating, and we'd like to confirm this first. Thank you.

@Jxu-Thu (Author)

Jxu-Thu commented Jul 20, 2021

@linnanwang Yeah, I only ran the experiments once.

@Jxu-Thu (Author)

Jxu-Thu commented Jul 21, 2021

I have a question. Previous work needs to re-tune the BN (batch norm) parameters [re-calculating the BN statistics by forwarding the whole validation set] for each architecture when evaluating the performance of the architectures in the search space.
After that, they calculate the Kendall tau rank correlation.

It seems that you do not re-tune the BN in the script exps/supernet/one-shot-supernet_eval.py?
Are there any differences between your implementation and other methods?

@aoiang (Owner)

aoiang commented Jul 21, 2021

> It seems that you do not re-tune the BN in the script exps/supernet/one-shot-supernet_eval.py? Are there any differences between your implementation and other methods?

Thank you for asking. Referring to Table 5 in the original NasBench201 paper (https://arxiv.org/pdf/2001.00326.pdf), there are two blocks in the table representing different BN types: the first block uses track_running_stats, while the second block does not keep running estimates and always uses batch statistics.

For our scripts, take one-shot as an example (bash ./supernet/one-shot/train.sh cifar10 0 NASBENCH201_PATH): the third parameter, "0", controls the BN type and corresponds to the second BN type in the table (no running estimates; always batch statistics). That is why we do not re-tune the BN in the script.
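The distinction can be illustrated with a minimal batch-norm sketch. This is illustrative only, not the repo's code (in PyTorch, nn.BatchNorm2d(..., track_running_stats=False) has the same effect as the second branch below): when no running estimates exist, every forward pass normalizes with the current batch's own statistics, so there is nothing to re-calibrate before evaluation.

```python
import math

# Two BN modes from NasBench201's Table 5: with tracked running estimates
# (pass running_mean/running_var), or always using batch statistics (default).
def batchnorm(x, running_mean=None, running_var=None, eps=1e-5):
    if running_mean is not None:           # BN type 1: use tracked estimates
        mean, var = running_mean, running_var
    else:                                  # BN type 2: always batch statistics
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

batch = [1.0, 2.0, 3.0, 4.0]
out = batchnorm(batch)                     # normalized with its own statistics
print(round(sum(out), 6))  # -> 0.0 (zero mean after batch normalization)
```

With BN=0 (type 2), the per-architecture BN re-tuning step used by earlier work becomes a no-op, which is why the eval script skips it.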

@Jxu-Thu (Author)

Jxu-Thu commented Jul 22, 2021

@aoiang Many thanks! You are right. I found that some codebases still re-tune BN even when BN=0, which is unnecessary.

@Jxu-Thu (Author)

Jxu-Thu commented Jul 29, 2021

Any update on the few-shot results?

@aoiang (Owner)

aoiang commented Jul 29, 2021

> Any update on the few-shot results?

Hi, I have trained and tuned the few-shot supernet in recent days and the results have improved significantly; it is now clearly better than the one-shot model. This is a positive signal. I will upload the checkpoints of my few-shot model later today, and you are free to test them directly on your side. Thank you.

@aoiang (Owner)

aoiang commented Jul 29, 2021

> Any update on the few-shot results?

Hi there, first, sorry for the late reply. I have recently started a full-time job and am busy with work. Also, due to limited computation resources, the experimental results come out slowly.

So far, the Kendall tau is:
few-shot: 0.606
vs.
one-shot: 0.502,

where the few-shot model shows a significant improvement.

We have uploaded the checkpoints of the few-shot model to Google Drive (https://drive.google.com/drive/folders/13sZBqPxQsaoxxsJqDA6moPPgMI_udiHL?usp=sharing). You are free to download and evaluate them yourself. There is a README in the folder; follow it to evaluate the models and get the Kendall tau. With further tuning, the Kendall tau can be much higher. Thank you for your patience.

@linnanwang (Collaborator)

@aoiang great job Yiyang, thank you for the effort!

@Jxu-Thu (Author)

Jxu-Thu commented Jul 31, 2021

I ran two experiments.

seed 0:
one-shot: 0.5384
few-shot (five super-nets): 0.54295

seed 1:
one-shot: 0.530
few-shot (five super-nets): 0.55288

I failed to reproduce your results using your code, though I tried my best to match the numbers in the paper.
However, I re-implemented your few-shot concept myself and got
one-shot: 0.52
few-shot (five super-nets): 0.69 [a big improvement]

This validates the effectiveness of the few-shot concept, even though I cannot reproduce your results with your code.

Anyway, thanks for your kind replies and nice work :)

@linnanwang (Collaborator)

linnanwang commented Jul 31, 2021

What are you talking about? First, you have cost us nearly two weeks helping you get the checkpoint attached above, which gives you a rank correlation of 0.6 based on our implementation. You need to respect our time and read the results above. I don't know what might be wrong; there must be something fishy in your code.

Second, it looks like we have established common ground that few-shot NAS is effective. You independently replicated few-shot NAS and tested it on NAS-Bench-201. Here is the rank correlation:
few-shot 0.69 vs. one-shot 0.52.
0.69 is actually better than the 0.65 claimed in the paper.

Alright, for others who might have a similar question in the future, this is a great thread to read. It looks like we have all reached the conclusion that few-shot NAS is effective; let's move forward and focus on the right thing. We will leave the thread open as a reference for other people.

Thank you all, and great job Yiyang!

@ShunLu91
ShunLu91 commented Nov 3, 2022

> So far, the Kendall tau is: few-shot: 0.606 vs. one-shot: 0.502, where the few-shot model shows a significant improvement. [...] We have uploaded the checkpoints of the few-shot model to Google Drive [...]

Hi aoiang,

Thanks for your nice work and the code.
Unfortunately, when I strictly followed the README to reproduce the results using your released code, I only got a Kendall's Tau of 0.591 for few-shot with 5 supernets. I wonder how to get the Kendall's Tau of 0.653 reported in your paper? I provide my running log and evaluation results below: few_shot_5_supernets_exp.zip
Could you please help me find the cause of this difference?

Great thanks and best wishes.

@Pcyslist

@ShunLu91 @aoiang The Kendall's Tau of 0.653 reported in this paper is a mean value: there are 6 choices of which edge to split when generating the 5 sub-supernets. I guess that splitting on a different edge (one of the six) might give a better Kendall's Tau, perhaps above 0.653. In your training code, can the user change which edge to split? OPERATION_TO_SPLIT (0-4) only selects among the operations on ONE edge; another edge might be better.

@ShunLu91

@Pcyslist Thanks for your constructive suggestion. I concur that a different edge_to_split may yield different results, as Kendall's Tau is quite sensitive in my experience. After checking the experiments above, I looked at the code provided by @aoiang and found that the default edge_to_split is 0, as defined here. In general, it's an interesting discussion and worth studying. I will try different edge_to_split values following your advice.

@Pcyslist

@ShunLu91 I have checked your few-shot training log and found that, when you trained the five sub-supernets, you transferred different model weights to each of them. This is incorrect: you should always transfer the model weights obtained from one-shot training to each sub-supernet.

@Pcyslist

@ShunLu91 In addition, may I ask what type of GPU you use? Why is it so slow for me to train the one-shot supernet with two 3090 GPUs?

@ShunLu91

@Pcyslist Sorry, but I have to clarify that you might have some misunderstanding about my training. I followed the official instructions here to conduct the few-shot training rather than the one-shot training. I split the one-shot supernet into 5 sub-supernets; thus, for evaluation, I transferred the model weights of each sub-network from its corresponding sub-supernet. As for the transfer-learning part, the authors state in Section 4.1.1 of their paper: "For this experiment, we train sub-supernets by skipping the transfer learning described in Section 3.2." Therefore, the experiments on NAS-Bench-201 do not include transfer learning. Besides, I trained each sub-supernet on a single NVIDIA V100 GPU. I am glad to discuss further.

@Pcyslist

Pcyslist commented Mar 23, 2023 via email

@ShunLu91

@Pcyslist Thanks for pointing out this error. I see that a transfer-learning procedure was missing from my training, and I will repeat the experiment. In summary, I should first transfer the weights from a pre-trained one-shot supernet to each sub-supernet and then run the few-shot training. After the few-shot training, I can use the corresponding few-shot sub-supernet to evaluate each subnetwork. Is my understanding correct now? In this case, the training cost is obviously higher than conventional one-shot NAS because of this post-training, i.e., transfer learning followed by few-shot training.
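The weight-transfer step being discussed can be sketched as follows. The names (one_shot_state, init_sub_supernet) are illustrative assumptions, not the repository's API; in PyTorch this corresponds to loading the one-shot checkpoint's state_dict into each sub-supernet before the few-shot training begins:

```python
# Hypothetical sketch of the transfer step: every sub-supernet starts from the
# SAME pre-trained one-shot weights, then continues few-shot training.
def init_sub_supernet(one_shot_state, sub_supernet_state):
    """Copy every shared parameter from the one-shot supernet into the
    sub-supernet; parameters unique to the sub-supernet (if any) keep
    their fresh initialization."""
    for name, weight in one_shot_state.items():
        if name in sub_supernet_state:
            sub_supernet_state[name] = weight
    return sub_supernet_state

one_shot = {"stem.conv": [0.1, 0.2], "cell0.edge0": [0.3]}
subnet = {"stem.conv": [0.0, 0.0], "cell0.edge0": [0.0], "extra.head": [9.9]}
print(init_sub_supernet(one_shot, subnet)["stem.conv"])  # -> [0.1, 0.2]
```

The key point from the discussion above: all five sub-supernets receive identical one-shot weights at initialization, not weights that differ per sub-supernet.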

As for the training speed, I checked my previous one-shot training logs on one NVIDIA V100 GPU and found that the training time is nearly the same as the reported 6.8 hours. Maybe you can increase num_workers to speed up training.

@Pcyslist

@ShunLu91 Yes, you can have a try. I have also solved the training-speed problem: I found that using a single GPU is much faster than using two GPUs.

@ShunLu91

@Pcyslist OK~Great thanks!

@Pcyslist

@ShunLu91 Hi! I wonder how long it took you to evaluate the 15625 networks? I found that evaluating the 15625 networks (about 15 hours) takes longer than training a supernet (about 6 hours); is this normal?

@ShunLu91

@Pcyslist I used a large validation batch size of 512 with num_workers=16 when evaluating each sub-network; it only took 6.5 hours to evaluate all 15625 sub-networks. I suggest you find the most time-consuming step and then optimize it.

@Pcyslist

> I adopted a large valid batch size 512 with num_workers=16 [...] It only took 6.5 hours for me to evaluate 15625 sub-networks.

Thank you. I have completed the whole reproduction pipeline, and the experimental results obtained by running rank.py are as follows:
(screenshot of rank.py output)
I am slightly disappointed with this result. Obviously, few-shot NAS does not improve the Kendall tau much here, which suggests that even when the same edge (edge 0) is selected to split the supernet, each training run carries a lot of randomness.
@ShunLu91

@ShunLu91

@Pcyslist Thanks for your results; they are an important reference for me and other users.
