Expected performance #7

Open
bkj opened this issue Apr 3, 2018 · 12 comments

@bkj

bkj commented Apr 3, 2018

If we run the three experiments from the README:

# Exp. 1
./scripts/ptb_search.sh
./scripts/ptb_final.sh

# Exp. 2
./scripts/cifar10_macro_search.sh
./scripts/cifar10_macro_final.sh

# Exp. 3
./scripts/cifar10_micro_search.sh
./scripts/cifar10_micro_final.sh

what should we expect the final performance metrics to be? Are you able to post the expected results either here or in the README?

Thanks

@hyhieu
Collaborator

hyhieu commented Apr 3, 2018

# Exp. 1
./scripts/ptb_search.sh  # should give you a bunch of architectures
./scripts/ptb_final.sh   # should give you around 55.8 test perplexity on PTB

# Exp. 2
./scripts/cifar10_macro_search.sh  # should give you a bunch of architectures
./scripts/cifar10_macro_final.sh   # should give you around 96.1% accuracy on the test set

# Exp. 3
./scripts/cifar10_micro_search.sh  # should give you a bunch of architectures
./scripts/cifar10_micro_final.sh   # should give you around 96.5% accuracy on the test set

@bkj
Author

bkj commented Apr 3, 2018

Fantastic -- thank you. I'm running cifar10_micro_search.sh now and will post here to confirm once I get some results.

~ Ben

@bkj
Author

bkj commented Apr 4, 2018

OK -- the tail of the cifar10_micro_search.sh log looks like:

Eval at 42018
valid_accuracy: 0.6820
Eval at 42018
test_accuracy: 0.6636
epoch=149   ch_step=42050  loss=0.910298 lr=0.0005   |g|=2.4888   tr_acc=105/160 mins=717.36    
epoch=149   ch_step=42100  loss=1.008317 lr=0.0005   |g|=3.0906   tr_acc=110/160 mins=717.86    
epoch=149   ch_step=42150  loss=0.833895 lr=0.0005   |g|=2.0674   tr_acc=107/160 mins=718.36    
epoch=149   ch_step=42200  loss=0.951047 lr=0.0005   |g|=2.4366   tr_acc=104/160 mins=718.85    
epoch=149   ch_step=42250  loss=0.930920 lr=0.0005   |g|=2.1964   tr_acc=107/160 mins=719.35    
epoch=150   ch_step=42300  loss=0.993480 lr=0.0005   |g|=2.3855   tr_acc=98 /160 mins=719.85    
Epoch 150: Training controller
ctrl_step=4470   loss=3.077   ent=53.16 lr=0.0035 |g|=0.0440   acc=0.7375 bl=0.68  mins=719.85
ctrl_step=4475   loss=1.252   ent=53.17 lr=0.0035 |g|=0.0088   acc=0.7000 bl=0.68  mins=720.05
ctrl_step=4480   loss=2.096   ent=53.14 lr=0.0035 |g|=0.0490   acc=0.7188 bl=0.68  mins=720.26
ctrl_step=4485   loss=3.848   ent=53.13 lr=0.0035 |g|=0.0474   acc=0.7625 bl=0.69  mins=720.46
ctrl_step=4490   loss=2.683   ent=53.13 lr=0.0035 |g|=0.1009   acc=0.7375 bl=0.69  mins=720.66
ctrl_step=4495   loss=-5.616  ent=53.09 lr=0.0035 |g|=0.1095   acc=0.5750 bl=0.69  mins=720.86
Here are 10 architectures
[0 1 1 0 1 1 0 3 1 2 0 0 0 1 1 4 1 0 3 1]
[0 1 0 1 1 0 1 1 0 2 2 3 2 1 3 4 4 0 5 1]
val_acc=0.6375
--------------------------------------------------------------------------------
[0 1 1 1 0 2 0 1 1 0 1 3 0 1 1 4 0 1 0 4]
[0 2 0 4 0 0 0 3 0 2 3 4 0 0 2 1 5 1 2 1]
val_acc=0.6313
--------------------------------------------------------------------------------
[0 2 0 0 1 1 1 1 1 2 0 0 0 0 1 0 1 3 0 0]
[0 1 0 4 0 1 1 2 0 2 3 4 1 0 3 4 5 4 4 1]
val_acc=0.7000
--------------------------------------------------------------------------------
[1 2 1 2 0 1 1 1 0 0 1 1 1 0 1 0 1 1 0 4]
[1 1 0 2 2 0 2 4 3 1 3 1 4 4 4 0 2 3 0 0]
val_acc=0.6875
--------------------------------------------------------------------------------
[1 2 0 2 1 4 1 2 1 3 0 4 0 1 2 4 0 4 1 4]
[1 0 1 0 0 2 0 2 1 4 2 4 4 3 1 0 5 0 5 3]
val_acc=0.6500
--------------------------------------------------------------------------------
[1 3 1 0 1 4 0 1 2 1 1 4 1 4 1 1 0 1 1 1]
[1 1 0 0 2 1 1 4 0 0 1 4 2 1 0 0 5 0 0 2]
val_acc=0.7000
--------------------------------------------------------------------------------
[1 3 1 1 1 4 0 4 1 0 0 0 0 2 1 4 0 0 0 1]
[1 0 1 4 0 1 1 4 1 1 1 0 4 1 2 3 4 0 3 1]
val_acc=0.7375
--------------------------------------------------------------------------------
[0 4 0 1 1 1 0 0 0 0 0 1 0 0 0 4 0 2 1 0]
[1 4 1 0 2 1 2 4 3 4 0 0 3 3 2 4 5 2 2 1]
val_acc=0.6188
--------------------------------------------------------------------------------
[1 1 1 0 1 2 0 2 0 2 1 0 1 4 4 0 1 0 0 4]
[0 2 1 1 0 0 1 1 1 0 0 2 1 4 2 0 4 3 0 4]
val_acc=0.6500
--------------------------------------------------------------------------------
[0 2 0 0 1 3 1 2 1 0 0 0 0 0 0 1 0 0 1 0]
[1 3 0 1 0 4 2 0 1 0 2 2 2 3 0 0 5 2 2 0]
val_acc=0.6625
--------------------------------------------------------------------------------
Epoch 150: Eval
Eval at 42300
valid_accuracy: 0.6796
Eval at 42300
test_accuracy: 0.6660

Took 12 hours on a Titan X PASCAL, as advertised.

Now I think I'm supposed to take the architecture w/ the best validation accuracy, which is:

[1 3 1 1 1 4 0 4 1 0 0 0 0 2 1 4 0 0 0 1]
[1 0 1 4 0 1 1 4 1 1 1 0 4 1 2 3 4 0 3 1]
val_acc=0.7375

Is that right? Or is there somewhere where a large number of architectures are validated and an optimal one is chosen for me?
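
In the meantime, here's a minimal sketch of how I pulled the best candidate out of the log myself. It assumes the search output was captured to a file (search.log here, a name I chose) and that each candidate is printed as two bracketed arc lines followed by a val_acc= line, as in the tail above:

# Keep the two architecture lines printed just before each "val_acc=" line,
# drop grep's "--" group separators, join each (arc, arc, val_acc) triple
# onto one line, then sort numerically on the value after "=" and print
# the best candidate.
grep -B 2 'val_acc=' search.log | grep -v '^--$' | paste - - - |
  sort -t'=' -k2 -rn | head -n 1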

~ Ben

EDIT: Here's a plot of the valid and test accuracies over time:
[screenshot: valid and test accuracy over training steps]
These are the numbers logged like:

Eval at 42300
valid_accuracy: 0.6796
Eval at 42300
test_accuracy: 0.6660

@hyhieu
Collaborator

hyhieu commented Apr 4, 2018

Thank you for dedicating the time and resources to run and verify our code.

We also looked at the architectures sampled at earlier time steps and took the one with the overall best val_acc. However, I think the one you picked might work well too 😄

@bkj
Author

bkj commented Apr 5, 2018

Here's a plot of the test accuracy in cifar10_micro_final.sh:
[screenshot: test accuracy over training epochs in cifar10_micro_final.sh]

This used architectures:

fixed_arc="1 3 1 1 1 4 0 4 1 0 0 0 0 2 1 4 0 0 0 1"
fixed_arc="$fixed_arc 1 0 1 4 0 1 1 4 1 1 1 0 4 1 2 3 4 0 3 1"

w/ all other parameters unchanged.
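
For completeness, here is roughly how this plugs in (assuming, as in my copy of the repo, that scripts/cifar10_micro_final.sh forwards the fixed_arc variable via the --child_fixed_arc flag):

# sketch of the final-training invocation; every flag other than the
# architecture is left exactly as in the stock script
python src/cifar10/main.py \
  --search_for="micro" \
  --child_fixed_arc="${fixed_arc}"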

Final test accuracy of 0.9612 at epoch 630 (w/ maximum test accuracy of 0.9620 at epoch 619). It also reached 0.9611 accuracy at epoch 306 -- so the extra ~300 epochs don't buy much improvement.

Note this final model takes > 1 day to train -- longer than the initial architecture search.

For me, this prompts the question of how much of the difference between methods reported in the paper is due to the hyperparameters of the final retraining step vs the discovered architecture. It'd be interesting to train a standard ResNet architecture w/ the same parameters as cifar10_micro_final.sh to see how it compares.

@axiniu

axiniu commented Apr 18, 2018

@bkj Hello. Is everything going well with this work (the macro search space and the micro search space)?
I also want to know how the following architecture in cifar10_macro_final.sh was obtained:
fixed_arc="0"
fixed_arc="$fixed_arc 3 0"
fixed_arc="$fixed_arc 0 1 0"
fixed_arc="$fixed_arc 2 0 0 1"
fixed_arc="$fixed_arc 2 0 0 0 0"
fixed_arc="$fixed_arc 3 1 1 0 1 0"
fixed_arc="$fixed_arc 2 0 0 0 0 0 1"
fixed_arc="$fixed_arc 2 0 1 1 0 1 1 1"
fixed_arc="$fixed_arc 1 0 1 1 1 0 1 0 1"
fixed_arc="$fixed_arc 0 0 0 0 0 0 0 0 0 0"
fixed_arc="$fixed_arc 2 0 0 0 0 0 1 0 0 0 0"
fixed_arc="$fixed_arc 0 1 0 0 1 1 0 0 0 0 1 1"
fixed_arc="$fixed_arc 2 0 1 0 0 0 0 0 1 0 1 1 0"
fixed_arc="$fixed_arc 1 0 0 1 0 0 0 1 1 1 0 1 0 1"
fixed_arc="$fixed_arc 0 1 1 0 1 0 1 0 0 0 0 0 1 0 0"
fixed_arc="$fixed_arc 2 0 0 1 0 0 0 0 0 0 0 1 0 1 0 1"
fixed_arc="$fixed_arc 2 0 1 0 0 0 1 0 0 1 1 1 1 0 0 1 0"
fixed_arc="$fixed_arc 2 0 0 0 0 1 0 1 0 1 0 0 1 0 1 0 0 1"
fixed_arc="$fixed_arc 3 0 1 1 0 1 0 0 0 0 0 1 0 1 0 1 0 0 0"
fixed_arc="$fixed_arc 3 0 1 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1"
fixed_arc="$fixed_arc 0 1 0 0 1 0 1 1 0 0 0 1 0 0 0 0 0 1 1 0 0"
fixed_arc="$fixed_arc 3 0 1 0 1 1 0 0 1 0 1 1 0 1 1 0 1 0 0 1 0 0"
fixed_arc="$fixed_arc 0 1 0 1 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0"
fixed_arc="$fixed_arc 0 1 1 0 0 0 1 1 1 0 1 0 0 0 1 0 1 0 0 1 1 0 0 0", which has 24 cells.
while ,I just can get architectures like :
[1]
[1 1]
[5 0 0]
[5 0 0 0]
[0 0 1 1 0]
[1 1 0 0 0 0]
[1 1 0 1 1 1 0]
[3 0 0 1 0 1 1 1]
[5 0 0 1 0 0 1 0 0]
[1 1 1 0 0 0 0 1 0 0]
[0 1 1 0 0 0 0 1 1 1 1]
[0 0 1 1 1 1 0 1 0 0 1 1]

which only has 12 cells.
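
My guess at the cause (an assumption on my part; I haven't verified it): the search and final scripts build child networks of different depths, and the number of rows in the architecture matches the layer count. If the scripts expose it the way I read them, the relevant knob is the --child_num_layers flag, e.g.:

# hypothetical depth settings; check your local copies of the scripts
--child_num_layers=12   # scripts/cifar10_macro_search.sh
--child_num_layers=24   # scripts/cifar10_macro_final.sh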

@lianqing11

@bkj Hello, did you change any parameters in the scripts to get this result? I could only get about 0.88 accuracy on CIFAR-10 using the micro structure.

@axiniu

axiniu commented Apr 30, 2018

@bkj Hi, I saw you are also interested in the ENAS-pytorch work at https://github.com/carpedm20/ENAS-pytorch. When I run the ENAS-pytorch code with python main.py --network_type cnn --dataset cifar --controller_optim momentum --controller_lr_cosine=True --controller_lr_max 0.05 --controller_lr_min 0.0001 --entropy_coeff 0.1, I get many errors. What I want to do is find some CNN architectures and visualize them. Would you please tell me what changes I should make to the code before running it? Thanks for your reply.

@Chen-Hailin

Hi @hyhieu, I have run the latest default ./scripts/ptb_final.sh for 600+ epochs, but the perplexity remains at 69+. May I know the expected number of epochs needed to reproduce the claimed 55.8 test perplexity? Or do I need to change the ./scripts/ptb_final.sh configuration?
[screenshot: test perplexity curve over training]

@Moran232

(quoting @bkj's Apr 4, 2018 comment above in full -- the cifar10_micro_search.sh log tail and the question about picking the best-val_acc architecture)

So after you got the architecture, did you retrain it from scratch to reach the 96.5% accuracy?

@upwindflys

@axiniu I met the same problem. Has it been solved now? Thanks a lot.

@chg0901

chg0901 commented Oct 3, 2019

@upwindflys @axiniu
I also get the same 12-cell output, but my macro search reaches quite high accuracy, e.g.:

[2]
[3 0]
[5 1 0]
[5 0 0 1]
[2 0 0 0 1]
[1 0 0 0 0 0]
[5 1 0 1 0 0 0]
[1 0 0 0 1 0 0 0]
[1 0 0 0 0 1 0 0 0]
[5 0 1 0 0 0 1 1 0 1]
[4 0 0 0 1 0 1 0 1 0 0]
[1 0 0 0 0 0 0 1 1 0 1 1]
val_acc=0.8125
--------------------------------------------------------------------------------
Epoch 310: Eval
Eval at 109120
valid_accuracy: 0.8154
Eval at 109120
test_accuracy: 0.8080

But do you know what explains the difference between the 12-cell and 24-cell architectures?
