
Question on network architecture #6

Closed
codeslake opened this issue Sep 27, 2021 · 4 comments
@codeslake commented Sep 27, 2021

Hi,

I've gone through the code and have some questions about the network architecture described in the paper and in the code.

  1. In Figure 2 of the main paper, the network has 4 Aligned Attention (aa) modules, but the code has only 3. Is there a performance decrease when you use 4 aa modules?

  2. For the aa modules defined in dcsr.py, the `scale` and `align` arguments differ depending on `self.flag_8k`.

    • Is it appropriate to assume that the `scale` values are higher when `self.flag_8k == True`, for better matching between higher-resolution features?
    • Why does the aa2 module get `align=False`? I can see that alignment is unnecessary when `scale == 1`, as patches become 1×1 tensors. However, why is `align` still set to `False` for aa2 when `self.flag_8k == True`?
  3. Would you please elaborate on the intuition behind `coarse=True`? I have not checked the patch coordinates used for evaluation in detail, but I assume `coarse` is set to `True` when the LR patch is outside the FoV of the ref patch. When `coarse == True`, the DCSR model downsamples the LR and ref images by factors of 1/16 and 1/8, respectively. Is this for roughly matching the structure, since those patches might not share a common context?

Thanks in advance,

@Tengfei-Wang (Owner)

Hi,

  1. The number of attention modules depends on the resolution factor in our experiments. Fig. 2 illustrates the general architecture of the approach; specifically, it shows the 4× SR network (also see customized dataset #2). You may increase the number to 5 for 8× SR and decrease it to 3 for 2× SR (the released code).

    • Yes. To achieve compelling matching in 8k cases, it is intuitive to increase the kernel size for a proper receptive field.
    • In SRA, we need to load the pre-trained model from the 4k cases, where `align=False` in aa2. We thus simply keep it `False` in 8k cases for consistent loading. I suppose it would also work if it were set to `True` for 8k cases (see the sketch after this list).
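
For concreteness, here is a minimal sketch of how such a configuration switch might look. The class name `AlignedAttention`, its constructor arguments, and the particular `scale` values are illustrative assumptions rather than the released dcsr.py; the only points taken from the answer above are that the 8k branch uses larger matching kernels and that `aa2` keeps `align=False` in both branches so the 4k checkpoint loads consistently.

```python
import torch.nn as nn

class AlignedAttention(nn.Module):
    """Placeholder for the aligned attention module (illustrative only)."""
    def __init__(self, scale, align):
        super().__init__()
        self.scale = scale  # matching kernel/patch size; larger -> wider receptive field
        self.align = align  # whether matched reference patches are spatially aligned

class DCSR(nn.Module):
    def __init__(self, flag_8k=False):
        super().__init__()
        if flag_8k:
            # Assumed values: larger kernels for compelling matching at 8k.
            self.aa1 = AlignedAttention(scale=4, align=True)
            # align stays False so the 4k pre-trained weights load without mismatches.
            self.aa2 = AlignedAttention(scale=2, align=False)
            self.aa3 = AlignedAttention(scale=1, align=False)
        else:  # 4k setting (the released code)
            self.aa1 = AlignedAttention(scale=2, align=True)
            self.aa2 = AlignedAttention(scale=1, align=False)  # 1×1 patches need no alignment
            self.aa3 = AlignedAttention(scale=1, align=False)

model = DCSR(flag_8k=True)
```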

@codeslake (Author)

Thanks for the quick answer!
I've added a 3rd question to the original comment. Would you please leave a comment on that one too?

@Tengfei-Wang (Owner)

> Thanks for the quick answer!
> I've added a 3rd question to the original comment. Would you please leave a comment on that one too?

Yes, you're correct. For patches within the overlapped FoV (near the central region), we perform feature matching locally in a neighboring region. For other patches (outside the overlapped region), we search the whole ref image for reference information. To improve search efficiency, we first coarsely find a candidate region, and then treat it as the reference patch.
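
To make this coarse-to-fine search concrete, here is a minimal sketch, assuming a plain cross-correlation matcher; the function name, `patch_size`, and the scoring scheme are illustrative assumptions, while the 1/16 and 1/8 downsampling factors come from the question above.

```python
import torch
import torch.nn.functional as F

def coarse_candidate_search(lr, ref, patch_size=64):
    # Minimal sketch (not the released DCSR code) of the coarse search.
    # lr:  (1, C, h, w) low-resolution query patch
    # ref: (1, C, H, W) full reference image
    # Downsample LR by 1/16 and ref by 1/8 so both reach a comparable
    # scale for rough structure matching.
    lr_small = F.interpolate(lr, scale_factor=1 / 16, mode='bilinear', align_corners=False)
    ref_small = F.interpolate(ref, scale_factor=1 / 8, mode='bilinear', align_corners=False)

    # Slide the downsampled LR patch over the downsampled ref image
    # (cross-correlation) and pick the best-scoring location. A
    # normalized correlation would be more robust; this keeps it simple.
    scores = F.conv2d(ref_small, lr_small)  # weight shape: (1, C, kh, kw)
    idx = scores.flatten().argmax()
    y = (idx // scores.shape[-1]).item()
    x = (idx % scores.shape[-1]).item()

    # Map the coarse location back to the full-resolution ref (×8) and
    # crop the candidate region, which is then treated as the ref patch.
    y_full, x_full = y * 8, x * 8
    return ref[..., y_full:y_full + patch_size, x_full:x_full + patch_size]

# Hypothetical usage with random tensors:
lr = torch.randn(1, 3, 160, 160)
ref = torch.randn(1, 3, 1024, 1024)
candidate = coarse_candidate_search(lr, ref)
```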

@codeslake (Author)

Thanks, Wang. The answers really helped!
