I notice that in the config files for all the experiments, channel_mults is set to [1,2,4,8] while attn_res is 16. This means that you don't use attention within the upsampling and downsampling blocks, right? According to the documentation:
:param attn_res: a collection of downsample rates at which attention will take place. May be a set, list, or tuple. For example, if this contains 4, then at 4x downsampling, attention will be used.
Is this an intentional design choice?
Also, you mention in the README that "We used the attention mechanism in low-resolution features (16×16) like vanilla DDPM." Do you mean 32×32? Since the images you train on are 256×256, the feature size is 32×32 when you reach the middle block, where attention is used.
Thank you for the great repo!
There may be a misunderstanding. The attention setting refers to the corresponding feature-map size: 16 means the side length of the feature map after downsampling. Here is an example:

64×64 → downsampled to 32×32
32×32 → downsampled to 16×16, so this layer uses attention
16×16 → 8×8
8×8 → 4×4

You can see this in the UNet code.
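To make the resolution schedule above concrete, here is a minimal sketch of the attention-placement logic as described in the reply. This is an illustrative reconstruction, not the repo's actual UNet code: the function name attention_levels is hypothetical, and it assumes attn_res holds feature-map side lengths and that the resolution halves at every level.

```python
def attention_levels(input_res, channel_mults, attn_res):
    """Return (resolution, uses_attention) for each encoder level.

    Hypothetical helper mirroring the reply's description: attention is
    enabled at any level whose feature-map side length appears in attn_res.
    """
    levels = []
    res = input_res
    for _mult in channel_mults:
        levels.append((res, res in attn_res))
        res //= 2  # downsample before the next level
    levels.append((res, res in attn_res))  # bottleneck / middle block
    return levels

# The reply's example: 64x64 input, attention once features reach 16x16.
for res, attn in attention_levels(64, [1, 2, 4, 8], attn_res={16}):
    print(f"{res}x{res}: attention={attn}")
```

Running this prints attention=True only at the 16×16 level (64, 32, 8, and 4 stay False), matching the 64 → 32 → 16 → 8 → 4 schedule in the example above.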