
Some questions about Paper #2

Closed
duzhenjiang113 opened this issue Sep 7, 2021 · 4 comments
Labels
good first issue Good for newcomers

Comments

@duzhenjiang113

Hi, Xumin. It's a great paper and I was inspired a lot. I have a question about the ablation experiment in the paper: in the baseline test, when the output of the query generator is replaced, how are the Dynamic Queries of the Transformer decoder generated? Thanks a lot.

@yuxumin
Owner

yuxumin commented Sep 7, 2021

We replace the query generator with learnable parameters in the ablation study, which is similar to the practice in DETR.

@duzhenjiang113
Author

> We replace the query generator with learnable parameters in the ablation study, which is similar to the practice in DETR.

Thanks for your reply. So it seems it's a linear projection without the max-pooling operation. Is my understanding correct?

@yuxumin
Owner

yuxumin commented Sep 7, 2021

Not really; the queries are independent of the encoder outputs. In that case, the encoder outputs are only used as the memory of the decoder (for cross-attention) and to produce the coarse center points, while the decoder queries are pre-defined, learnable parameters. You can find more details in the DETR pipeline (End-to-End Object Detection with Transformers).
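To make the distinction concrete, here is a minimal PyTorch sketch of the DETR-style setup described above. This is not the authors' actual code; the class name, `num_queries`, and all dimensions are illustrative assumptions. The key point is that the queries are an `nn.Embedding` that never sees the encoder output, which enters only as the decoder's cross-attention memory.

```python
import torch
import torch.nn as nn

class LearnableQueryDecoder(nn.Module):
    """Hypothetical sketch: fixed learnable queries, DETR-style."""

    def __init__(self, num_queries=224, d_model=256, nhead=8, num_layers=2):
        super().__init__()
        # Pre-defined, learnable queries -- independent of the encoder output.
        self.query_embed = nn.Embedding(num_queries, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)

    def forward(self, memory):
        # memory: (B, N, d_model) encoder output, used only via cross-attention.
        b = memory.size(0)
        queries = self.query_embed.weight.unsqueeze(0).expand(b, -1, -1)
        return self.decoder(queries, memory)

model = LearnableQueryDecoder()
memory = torch.randn(2, 128, 256)  # dummy encoder features
out = model(memory)                # (2, num_queries, d_model)
```

By contrast, the full model's query generator would derive the queries from `memory` itself, which is exactly the dependency this ablation removes.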

@duzhenjiang113
Author

> Not really, the queries are independent of the outputs of the encoder. In that case, outputs of the encoder are only used as the memory of the decoder (cross attention) and produce the coarse center points, while the queries of the decoder are some pre-defined and learnable parameters.

Thanks, I'll check it.

@yuxumin yuxumin added the good first issue Good for newcomers label Sep 8, 2021