Hey guys, your work is so cool. But I'm wondering: when you do inpainting/outpainting, do you pad the pixels with zeros or something else? If you don't, the encoder may "see" some context information, which would cause an information leak. That would mean the reconstruction may look good, but only because it uses information it shouldn't know.
Thank you!
Thanks for your interest! For inpainting/outpainting, we first pad the raw pixels with zeros. Then we mask out the corresponding tokens. We observe that in the token space, if we only mask the tokens corresponding to the original pixel mask, the tokens around the input mask will still record the "mask" as part of the ground truth. Therefore, we also mask one more token on each side of the input mask.
More specifically, say the original pixel mask covers pixels 64-191 along each axis (a 128x128 pixel mask) in the original 256x256 image. Then the token mask corresponding to the original mask would cover tokens 4-11 (an 8x8 token mask) in the 16x16 token space, since each token covers 16x16 pixels. However, instead of masking tokens 4-11, we use a token mask covering tokens 3-12 (a 10x10 token mask) so that the remaining tokens do not record the "mask" in the input.
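To make the arithmetic above concrete, here is a minimal NumPy sketch of the pixel-to-token mask conversion. It is not the repository's code: the function name `pixel_mask_to_token_mask`, the 16-pixel patch size (inferred from 256x256 pixels mapping to 16x16 tokens), and the square one-token dilation are assumptions chosen to reproduce the numbers in the example.

```python
import numpy as np

def pixel_mask_to_token_mask(pixel_mask, patch=16, dilate=1):
    """Hypothetical helper: map a pixel mask to a token mask, then grow it.

    pixel_mask: 2D boolean array, True where pixels are masked.
    patch:      pixels per token along each axis (assumed 16 here).
    dilate:     number of extra tokens to mask on every side.
    """
    pixel_mask = np.asarray(pixel_mask, dtype=bool)
    h, w = pixel_mask.shape
    gh, gw = h // patch, w // patch

    # A token is masked if any pixel inside its patch is masked.
    blocks = pixel_mask[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch)
    token_mask = blocks.any(axis=(1, 3))

    # Grow the mask by one token per iteration (3x3 neighbourhood).
    for _ in range(dilate):
        padded = np.pad(token_mask, 1, constant_values=False)
        grown = np.zeros_like(token_mask)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + gh, dx:dx + gw]
        token_mask = grown
    return token_mask

# The example from this thread: pixels 64-191 masked in a 256x256 image.
pixel_mask = np.zeros((256, 256), dtype=bool)
pixel_mask[64:192, 64:192] = True

image = np.random.rand(256, 256, 3)
image[pixel_mask] = 0.0  # zero out the masked raw pixels first

token_mask = pixel_mask_to_token_mask(pixel_mask)
rows = np.flatnonzero(token_mask.any(axis=1))
print(rows.min(), rows.max(), int(token_mask.sum()))
```

With `dilate=0` the same call gives the plain 8x8 token mask (tokens 4-11), so the two variants can be compared directly.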
We will also release a colab for generation and image editing soon (possibly in March).