head_rect, tail_rect #13

Closed
alibabadoufu opened this issue Mar 25, 2020 · 4 comments
Comments

@alibabadoufu

❓ Questions and Help

Thanks for your contribution. I really appreciate your efforts for this repo.

May I ask what head_rect and tail_rect are used for in roi_relation_feature_extractors.py? I couldn't work it out even after reading through the papers (VCTree, Motifs, ...).

@KaihuaTang
Owner

head_rect

This part of the code comes from neural-motifs. It generates two 0/1 masks that represent the locations and shapes of the subject and object bounding boxes, then sends them to a conv layer to extract spatial features for the subject/object pair.
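As a rough illustration of the mask idea described above, here is a minimal pure-Python sketch. The function names `box_to_mask` and `pair_masks`, the `(x1, y1, x2, y2)` box layout, and the inclusive floor/ceil rounding are assumptions made for this example; the actual code in the repo works on batched tensors and may differ in detail.

```python
import math


def box_to_mask(box, size):
    """Return a size x size grid of 0.0/1.0 marking the cells covered by box.

    box is (x1, y1, x2, y2), assumed to be already rescaled to the
    mask's coordinate frame (hypothetical layout for this sketch).
    """
    x1, y1, x2, y2 = box
    # Round outwards so that partially covered cells are included.
    left, top = math.floor(x1), math.floor(y1)
    right, bottom = math.ceil(x2), math.ceil(y2)
    return [
        [1.0 if left <= x <= right and top <= y <= bottom else 0.0
         for x in range(size)]
        for y in range(size)
    ]


def pair_masks(head_box, tail_box, size):
    """Stack the subject (head) and object (tail) masks as two channels."""
    return [box_to_mask(head_box, size), box_to_mask(tail_box, size)]


# One subject/object pair on an 8 x 8 grid.
masks = pair_masks((1.2, 0.5, 3.0, 2.1), (4.0, 4.0, 5.5, 5.5), size=8)
for row in masks[0]:
    print("".join("#" if v else "." for v in row))
```

The two channels together tell the conv layer where each box sits and how large it is, which is the spatial information the reply refers to.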

@alibabadoufu
Author

Why is it resolution * 4 - 1?
I take it that -1 excludes the relation between two identical objects, but why does it need to be multiplied by 4? Thanks so much for your answer in advance.

@KaihuaTang
Owner

-1 means that it excludes the relation between two identical objects, but it needs to multiply by 4? Thanks so much for your answer in advance.

I think you misunderstood my previous answer. These are two masks generated to mark the locations of the subject and object on the original image. The masks should be the same size as the original image, so the resolution is multiplied by 4 to reverse the downsampling of the previous max-poolings, and 1 is subtracted in case a floor operation was involved in the downsampling.
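The arithmetic behind `resolution * 4 - 1` can be checked with the standard output-size formula for a strided conv or pooling layer, floor((n + 2p - k) / s) + 1. The sketch below assumes two stride-2 stages; the kernel and padding values in `STAGES` are picked for illustration and are not quoted in this thread, so treat them as hypothetical.

```python
def out_size(n, kernel, stride, padding):
    """Output length of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# Assumed for illustration: two stride-2 stages, i.e. a total factor of 4.
STAGES = [
    dict(kernel=7, stride=2, padding=3),
    dict(kernel=3, stride=2, padding=1),
]


def downsample(n, stages=STAGES):
    """Apply every stage in turn and return the final spatial size."""
    for stage in stages:
        n = out_size(n, **stage)
    return n


def mask_size(resolution):
    """Mask side length as discussed in the thread."""
    return resolution * 4 - 1


for resolution in (7, 14):
    size = mask_size(resolution)
    print(resolution, size, downsample(size))
```

Under these assumed layers, a mask of side `resolution * 4 - 1` shrinks back to exactly `resolution` after the two stride-2 stages, which matches the "times 4, minus 1" explanation.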

@alibabadoufu
Author

Thanks so much!
