Object Region Attention #12

sanketsans · 2022-10-27T12:16:17Z

Hello,
In the paper, it is mentioned that in the ORViT block the object-region attention uses different inputs for q, k and v, i.e., q is computed from the patch tokens, while k and v are computed from the concatenation of the patch tokens and the object-region tokens.

X has shape THW × d, and C has shape T(HW+O) × d.

So, in the object-region attention, it should be (according to the paper): Q = XWq, K = CWk, V = CWv
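To make the formulation above concrete, here is a minimal sketch of attention where the queries come from the patch tokens X and the keys/values come from the concatenated tokens C. It is only an illustration under assumptions: it uses NumPy rather than the repository's PyTorch code, plain single-head softmax attention over all space-time tokens rather than trajectory attention, and the function name `object_region_attention` and all sizes are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def object_region_attention(patches, objects, Wq, Wk, Wv):
    """Hypothetical helper: Q = X Wq, K = C Wk, V = C Wv.

    patches: (T, HW, d) patch tokens
    objects: (T, O, d) object-region tokens
    """
    T, HW, d = patches.shape
    X = patches.reshape(T * HW, d)                   # (THW, d)
    C = np.concatenate([patches, objects], axis=1)   # (T, HW+O, d)
    C = C.reshape(-1, d)                             # (T(HW+O), d)
    Q = X @ Wq                                       # queries from patches only
    K = C @ Wk                                       # keys from patches + objects
    V = C @ Wv                                       # values from patches + objects
    scores = Q @ K.T / np.sqrt(d)                    # (THW, T(HW+O))
    return softmax(scores) @ V                       # (THW, d)

rng = np.random.default_rng(0)
T, HW, num_obj, d = 2, 6, 3, 4                       # illustrative sizes
patches = rng.standard_normal((T, HW, d))
objects = rng.standard_normal((T, num_obj, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

out = object_region_attention(patches, objects, Wq, Wk, Wv)
print(out.shape)
```

The output has one row per patch token, so the object tokens only influence the result through the keys and values.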

However, in the code, I realize that the concatenated tokens are being passed to the trajectory attention module.

ORViT/slowfast/models/ORViT/orvit.py

Line 149 in 3bfd2c7

all_tokens, thw = self.attn(

Also, in the trajectory attention module,

ORViT/slowfast/models/attention.py

Line 479 in 3bfd2c7

class TrajectoryAttention(nn.Module):

the q, k and v values are all computed from the same concatenated tokens.
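One observation that may be relevant here: for plain softmax attention, running self-attention over the concatenated tokens and then keeping only the patch rows gives the same result as using the patch tokens as queries, because the first HW rows of C Wq are exactly X Wq. The sketch below checks this numerically. It assumes a single frame, single-head attention in NumPy, and a hypothetical helper `attend`; it does not use trajectory attention, and whether the repository actually slices out the patch tokens after the attention call is not confirmed here.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attend(q_tokens, kv_tokens, Wq, Wk, Wv):
    # Hypothetical helper: single-head scaled dot-product attention.
    Q, K, V = q_tokens @ Wq, kv_tokens @ Wk, kv_tokens @ Wv
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

rng = np.random.default_rng(0)
HW, num_obj, d = 6, 3, 4                      # illustrative sizes, one frame
X = rng.standard_normal((HW, d))              # patch tokens
O = rng.standard_normal((num_obj, d))         # object-region tokens
C = np.concatenate([X, O], axis=0)            # concatenated tokens, patches first
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

paper_style = attend(X, C, Wq, Wk, Wv)        # Q from X, K/V from C -> (HW, d)
self_attn = attend(C, C, Wq, Wk, Wv)          # Q, K, V all from C -> (HW+O, d)

# Each output row depends only on its own query row, so the patch rows agree.
same = np.allclose(paper_style, self_attn[:HW])
print(same)
```

If the two formulations agree on the patch rows, the only difference is that the self-attention version also produces updated object tokens, which can be kept or discarded.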

Could you please help me understand this? I can't seem to find where the original patch tokens are set as q for the trajectory attention mechanism.

Thanks :)

malei207 · 2022-12-06T06:29:16Z

Hello, I was wondering whether you were able to run this code? I found that the code has some bugs.

deschanel11 · 2023-02-19T07:52:25Z

Me too. When I tried to run the training code on the AVA dataset with the MVIT_16X4.yaml file, I got an error about an unexpected keyword argument 'drop_rate'. I am also having trouble downloading the Something-Something V2 and SomethingElse datasets, because the download page returns a 503 error. Is there any way to solve these issues?
