Low GPU usage when experiments on cifar10 #3

Closed
gitlabspy opened this issue Sep 6, 2020 · 2 comments
Comments

@gitlabspy

Hi Didrik,
Great work! I have been running experiments with the code you released. On CIFAR-10, GPU utilization stays low even though the process occupies a lot of GPU memory. Do you know what causes this?
A typical reading: 39°C, 35% utilization, 23651 / 32510 MB of memory in use. 35% is close to the highest utilization I see.

Another thing: I found that images are barely reconstructable. From what I understand, a flow is bijective (although, according to this paper, there can be other cases), so it should always be able to recover the image from the latent, and the performance is good even when the multi-scale architecture is on. I have only tried 32x32 images. Would reconstruction be better on higher-resolution images?

I ran the following command, and it has run for about 300 epochs:
python train.py --epochs 500 --batch_size 32 --optimizer adamax --lr 1e-3 --gamma 0.995 --eval_every 1 --check_every 10 --warmup 5000 --num_steps 12 --num_scales 2 --dequant flow --pooling max --dataset cifar10 --augmentation eta --name maxpool

@didriknielsen
Owner

Hi,
Thanks!

I've also found GPU utilization to be low for flow models. I haven't investigated this further, but if someone has any insight, I'd love to hear it!

Regarding reconstruction: If you are using only bijections, the input should be exactly reconstructable (up to numerical error). However, if you make use of surjective transformations (such as Slice, MaxPool2d, etc.), some information is lost in the x->z direction and has to be generated again in the z->x direction. The reconstructions might therefore vary from one inverse pass to the next.
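
To make the difference concrete, here is a minimal pure-Python sketch. It is not the repository's API: `affine_forward`, `slice_forward`, `slice_inverse` and the other names are hypothetical, and the standard normal used to regenerate the dropped values is only a stand-in for whatever learned distribution a real model would use. It compares a round trip through a bijection with a round trip through a slicing surjection.

```python
import random


def affine_forward(x, scale=2.0, shift=0.5):
    # Bijection x -> z: every input maps to a unique output.
    return [scale * v + shift for v in x]


def affine_inverse(z, scale=2.0, shift=0.5):
    # Exact inverse z -> x of the affine map above.
    return [(v - shift) / scale for v in z]


def slice_forward(x, keep):
    # Surjection x -> z: keep the first `keep` entries, discard the rest.
    return x[:keep]


def slice_inverse(z, total, rng):
    # Stochastic inverse z -> x: the discarded entries are unknown, so they
    # are sampled (assumption: standard normal, as a stand-in for a learned
    # distribution).
    return z + [rng.gauss(0.0, 1.0) for _ in range(total - len(z))]


def max_abs_error(a, b):
    # Largest element-wise absolute difference between two lists.
    return max(abs(p - q) for p, q in zip(a, b))


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]

# Round trip through the bijection: recovered up to floating-point error.
x_bij = affine_inverse(affine_forward(x))

# Two round trips through the surjection: the kept half is recovered,
# the dropped half is resampled and differs between the two passes.
x_sur_a = slice_inverse(slice_forward(x, 4), len(x), rng)
x_sur_b = slice_inverse(slice_forward(x, 4), len(x), rng)

bijective_error = max_abs_error(x, x_bij)
surjective_error = max_abs_error(x, x_sur_a)

print("bijection round-trip error: ", bijective_error)
print("surjection round-trip error:", surjective_error)
print("kept half recovered exactly:", x_sur_a[:4] == x[:4])
print("resampled halves identical: ", x_sur_a[4:] == x_sur_b[4:])
```

In this toy setting the bijection's error is at the level of floating-point rounding, while the surjection only recovers the entries it kept; the rest changes on every inverse pass, which is the kind of variation in reconstructions described above.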

@gitlabspy
Author

gitlabspy commented Sep 13, 2020

@didriknielsen It may be rude of me to ask, but could you upload the pretrained model for imagenet64, please?
