Is it possible to gain dense correspondence from the known data augmentation? #11
Comments
Hi, I noticed that there is already a paper similar to your idea (https://arxiv.org/pdf/2011.10043.pdf). Please correct me if I've misunderstood what you mean.
Yes, using geometric transformations is a straightforward way. In our framework, the two approaches achieve almost the same results; this part of the experiments will be included in our next version. As discussed in our paper, our proposed method is simpler and more flexible.
Hi, thank you for the information! I hadn't read that paper before, and it looks interesting. But yes, that's what I mean.
Hi, thank you for your reply. Yeah, I get your point. I'm looking forward to the updated version. |
Hi, thank you very much for the nice work!
I have a question about the dense correspondence between views. In the paper, the correspondence is obtained by computing the similarity between feature vectors from the backbone. Since the data augmentation (e.g. rotating, cropping, flipping) applied to each view of the same image is known, it should be possible to obtain the correspondence directly from these transformations.
For example, suppose Image A is a left-right flipped copy of Image B. The two images are encoded into 3x3 feature maps, which can be written as

fa1 fa2 fa3        fb1 fb2 fb3
fa4 fa5 fa6   and  fb4 fb5 fb6
fa7 fa8 fa9        fb7 fb8 fb9

Since A and B are flipped views of the same image, the correspondence could be (fa1, fb3), (fa2, fb2), (fa3, fb1), ...

From my perspective, the transformation-motivated correspondence is more straightforward, but the paper doesn't use it. Are there any intuitions behind this?
Thank you again!
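To make the question concrete, here is a minimal sketch (not the authors' code; all names, shapes, and the random features are illustrative assumptions) contrasting the two ways of getting correspondence for a left-right flip: deriving it directly from the known transformation versus recovering it by feature-vector similarity, as the paper does.

```python
import numpy as np

H = W = 3  # 3x3 feature map, as in the example above
C = 8      # feature dimension (arbitrary)

rng = np.random.default_rng(0)
feat_a = rng.normal(size=(H, W, C))    # features of view A
feat_b = feat_a[:, ::-1, :].copy()     # view B = left-right flip of A

# 1) Transformation-derived correspondence: cell (i, j) in A matches
#    cell (i, W-1-j) in B -- no features needed at all.
flip_pairs = {(i, j): (i, W - 1 - j) for i in range(H) for j in range(W)}

# 2) Similarity-derived correspondence: for each vector in A, take the
#    most similar (highest cosine similarity) vector in B.
a = feat_a.reshape(-1, C)
b = feat_b.reshape(-1, C)
a /= np.linalg.norm(a, axis=1, keepdims=True)
b /= np.linalg.norm(b, axis=1, keepdims=True)
sim = a @ b.T                # (9, 9) cosine-similarity matrix
match = sim.argmax(axis=1)   # best match in B for each cell of A

sim_pairs = {divmod(k, W): divmod(int(match[k]), W) for k in range(H * W)}

# In this idealized setting (identical underlying features), the two
# definitions agree exactly.
assert sim_pairs == flip_pairs
```

In practice the two views would pass through the backbone separately, so the feature vectors at corresponding locations are only approximately equal; the similarity-based matching then becomes a soft version of the transformation-derived correspondence rather than an exact copy of it.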