
In PETRv2, which parts of the code embody the temporal information? #60

Closed
guohao02 opened this issue Sep 29, 2022 · 2 comments

@guohao02

No description provided.

@yingfei1016 (Collaborator)

Hi,
(1) Temporal alignment is handled during data processing: https://github.com/megvii-research/PETR/blob/main/tools/generate_sweep_pkl.py.
(2) We load the sweep data in the pipeline: https://github.com/megvii-research/PETR/blob/main/projects/configs/petrv2/petrv2_vovnet_gridmask_p4_800x320.py#L161. The sweep data is concatenated with the key-frame data along the view axis ((B, 6, 3, H, W) -> (B, 12, 3, H, W)). Data augmentation and training then proceed in the same way as for a single frame (see the sketch below).
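For illustration, here is a minimal sketch of the two steps, not the repository's actual code: the transform names, the 4x4 homogeneous / column-vector convention, and the random image tensors are all assumptions.

```python
import torch

# --- (1) Temporal alignment (done offline, cf. generate_sweep_pkl.py) ---
# Express the sweep camera's projection in the KEY frame's lidar coordinates
# by routing through the global frame:
# key_lidar -> global -> sweep_lidar -> sweep image plane.
def sweep_lidar2img_in_key_frame(sweep_lidar2img: torch.Tensor,
                                 sweep_global2lidar: torch.Tensor,
                                 key_lidar2global: torch.Tensor) -> torch.Tensor:
    # All inputs are illustrative 4x4 homogeneous transforms.
    return sweep_lidar2img @ sweep_global2lidar @ key_lidar2global

# --- (2) View-axis concatenation in the data pipeline ---
B, C, H, W = 2, 3, 320, 800
key_imgs = torch.randn(B, 6, C, H, W)    # 6 camera views at the key frame
sweep_imgs = torch.randn(B, 6, C, H, W)  # the same 6 views from a past sweep

# (B, 6, 3, H, W) -> (B, 12, 3, H, W); downstream augmentation and the
# backbone then treat this like a single-frame, 12-camera input.
imgs = torch.cat([key_imgs, sweep_imgs], dim=1)
print(imgs.shape)  # torch.Size([2, 12, 3, 320, 800])
```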

@exiawsh commented Sep 30, 2022

Hello,
I have also studied PETRv2. Here is my understanding:
(1) There is no explicit temporal positional embedding; instead, the network regresses the position offset.
(2) Time is distinguished through the multi-view embedding (MV in the table). That may cause generalization problems in real applications. However, I found that you can encode the time delay with a sin-cos embedding, which gives similar results (slightly worse mAVE).
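As an illustration of that idea, here is a minimal sketch of a sin-cos time-delay embedding; it is my own example, not code from the PETR repository, and the embedding dimension and frequency scale are assumptions.

```python
import torch

def sincos_time_embedding(delta_t: torch.Tensor, dim: int = 128,
                          max_period: float = 10.0) -> torch.Tensor:
    """Map time delays (N,) in seconds to a (N, dim) sinusoidal embedding."""
    half = dim // 2
    # Geometrically spaced frequencies, as in standard sinusoidal embeddings.
    freqs = torch.exp(-torch.log(torch.tensor(max_period))
                      * torch.arange(half) / half)
    args = delta_t[:, None] * freqs[None, :]
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)

# Example: key frame (0 s delay) and a sweep captured ~0.5 s earlier.
emb = sincos_time_embedding(torch.tensor([0.0, 0.5]))
print(emb.shape)  # torch.Size([2, 128])
```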
