CNN #1

Open
carpedm20 opened this issue Feb 15, 2018 · 12 comments

4 participants
@carpedm20
Owner

commented Feb 15, 2018

No description provided.

@carpedm20 carpedm20 self-assigned this Feb 15, 2018

@bkj

commented Feb 15, 2018

I've been working on an implementation of the CNN portion of this paper, and I may be able to help w/ the CNN model and cell searches if you're interested.

One issue I don't think they address in the paper is how they're handling spatial dimensions -- do you have any thoughts on this? I'm guessing they pad s.t. the input and output of each layer is the same?

@carpedm20

Owner Author

commented Feb 16, 2018

Sure! I'm taking a break for now, so any help would be appreciated. Yes, I guess they used padding to keep the spatial dimensions consistent, like:

pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
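A minimal sketch of the size arithmetic behind that guess, in plain Python. It assumes the usual output-size formula for pooling and convolution windows (dilation 1, floor rounding); `pooled_size` is a hypothetical helper for illustration, not something from this repo or the paper.

```python
def pooled_size(size, kernel_size, stride=1, padding=0):
    """Output length along one spatial axis for a sliding window.

    Assumes dilation 1 and floor rounding (hypothetical helper).
    """
    return (size + 2 * padding - kernel_size) // stride + 1


# With kernel_size=3, stride=1, padding=1 the spatial size is unchanged.
for size in (8, 16, 32):
    assert pooled_size(size, kernel_size=3, stride=1, padding=1) == size

# Without padding, the same window shrinks each axis by kernel_size - 1.
assert pooled_size(32, kernel_size=3, stride=1, padding=0) == 30
```

For an odd kernel size and stride 1, padding of `(kernel_size - 1) // 2` keeps the input and output sizes equal, which matches the "pad so that input and output are the same" reading above.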
@bkj

commented Feb 16, 2018

Great -- I'll fork and do some work over the weekend. By the way -- how long does the RNN experiment run for, and what's the final PPL you're getting? Is it similar to what they report in the paper?

@carpedm20

Owner Author

commented Feb 16, 2018

40 epochs (train PPL=56) took 6 hours with gpu980, so 150 epochs should take about 22 hours. I haven't reached the end yet, and I think the scale of the reward and loss might need some changes.

@bkj

commented Mar 4, 2018

I've implemented some of the micro-CNN search space, though in a different project that's not totally compatible with this one. I'm going to clean it up over the next couple of days and I'll post a link here when it'd be reasonable for other people to take a look at it.

I'm currently having trouble reproducing the results from the paper -- the ENAS CNN training seems very unstable. I need to do some further experiments to understand how weight sharing affects the convergence of the individual architectures.

@karandwivedi42

commented Mar 22, 2018

@bkj Did you manage to reproduce the results? I also implemented it from scratch but am getting around 82% accuracy.

@bkj

commented Mar 22, 2018

No -- I have not been able to reproduce the results. I moved from using an RL controller to something simpler (random search, basically) and have trained models w/ ENAS-style parameter sharing to 92% test accuracy, while my baseline preactivation ResNet18 gets > 93% when trained w/ the same settings.

~ Ben

@karandwivedi42

commented Mar 22, 2018

Thanks! I am doing the same but getting ~82%. Have you open-sourced your code (or can you please share your code)?

@bkj

commented Mar 22, 2018

Yes it's here -- https://github.com/bkj/ripenet

No documentation yet, open an issue if you have questions.

~ Ben

@axiniu

commented Apr 30, 2018

@carpedm20 @bkj @karandwivedi42 @dukebw Hi, can you run this code successfully? When I run it with:

python main.py --network_type cnn --dataset cifar10 --controller_optim momentum --controller_lr_cosine=True --controller_lr_max 0.05 --controller_lr_min 0.0001 --entropy_coeff 0.1

I get some errors. What I want to do is find CNN architectures and visualize them. Would you please tell me what changes I should make to the code before I run it? Thanks for your reply.
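For readers unfamiliar with the `--controller_lr_cosine`, `--controller_lr_max`, and `--controller_lr_min` flags in that command, here is a rough sketch of one common form of cosine learning-rate annealing between those two values. This is an assumption about what such flags typically mean, not the repo's actual implementation (which may use restarts or a different period); `cosine_lr` and `period` are hypothetical names.

```python
import math


def cosine_lr(step, period, lr_max=0.05, lr_min=0.0001):
    """Anneal from lr_max to lr_min over `period` steps (hypothetical helper).

    Steps beyond `period` are clamped, so the rate stays at lr_min.
    """
    t = min(step, period) / period  # progress in [0, 1]
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))


# The schedule starts at lr_max and ends at lr_min.
assert abs(cosine_lr(0, period=10) - 0.05) < 1e-12
assert abs(cosine_lr(10, period=10) - 0.0001) < 1e-12
```

Halfway through the period the rate sits at the midpoint of the two bounds, roughly 0.025 for the values in the command.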

@karandwivedi42

commented Apr 30, 2018

@axiniu

commented May 2, 2018

@karandwivedi42 Thank you. I have successfully run the code linked in the README of this repo. Now what I want to do is visualize the CNN architectures found by the search.
