
Why don't the track queries get updated for two_stage? #61

Closed
owen24819 opened this issue Sep 15, 2022 · 4 comments

owen24819 commented Sep 15, 2022

if self.two_stage:
    output_memory, output_proposals = self.gen_encoder_output_proposals(memory, mask_flatten, spatial_shapes)

    # hack implementation for two-stage Deformable DETR
    enc_outputs_class = self.decoder.class_embed[self.decoder.num_layers](output_memory)
    enc_outputs_coord_unact = self.decoder.bbox_embed[self.decoder.num_layers](output_memory) + output_proposals

    topk = self.two_stage_num_proposals
    topk_proposals = torch.topk(enc_outputs_class[..., 0], topk, dim=1)[1]
    topk_coords_unact = torch.gather(enc_outputs_coord_unact, 1, topk_proposals.unsqueeze(-1).repeat(1, 1, 4))
    topk_coords_unact = topk_coords_unact.detach()
    reference_points = topk_coords_unact.sigmoid()
    init_reference_out = reference_points
    pos_trans_out = self.pos_trans_norm(self.pos_trans(self.get_proposal_pos_embed(topk_coords_unact)))
    query_embed, tgt = torch.split(pos_trans_out, c, dim=2)
else:
    query_embed, tgt = torch.split(query_embed, c, dim=1)
    query_embed = query_embed.unsqueeze(0).expand(bs, -1, -1)
    tgt = tgt.unsqueeze(0).expand(bs, -1, -1)
    reference_points = self.reference_points(query_embed).sigmoid()

    if targets is not None and 'track_query_hs_embeds' in targets[0]:
        # print([t['track_query_hs_embeds'].shape for t in targets])
        # prev_hs_embed = torch.nn.utils.rnn.pad_sequence([t['track_query_hs_embeds'] for t in targets], batch_first=True, padding_value=float('nan'))
        # prev_boxes = torch.nn.utils.rnn.pad_sequence([t['track_query_boxes'] for t in targets], batch_first=True, padding_value=float('nan'))
        # print(prev_hs_embed.shape)
        # query_mask = torch.isnan(prev_hs_embed)
        # print(query_mask)

        prev_hs_embed = torch.stack([t['track_query_hs_embeds'] for t in targets])
        prev_boxes = torch.stack([t['track_query_boxes'] for t in targets])

        prev_query_embed = torch.zeros_like(prev_hs_embed)
        # prev_query_embed = self.track_query_embed.weight.expand_as(prev_hs_embed)
        # prev_query_embed = self.hs_embed_to_query_embed(prev_hs_embed)
        # prev_query_embed = None

        prev_tgt = prev_hs_embed
        # prev_tgt = self.hs_embed_to_tgt(prev_hs_embed)

        query_embed = torch.cat([prev_query_embed, query_embed], dim=1)
        tgt = torch.cat([prev_tgt, tgt], dim=1)

        reference_points = torch.cat([prev_boxes[..., :2], reference_points], dim=1)

        # if 'track_queries_placeholder_mask' in targets[0]:
        #     query_attn_mask = torch.stack([t['track_queries_placeholder_mask'] for t in targets])

    init_reference_out = reference_points

I am confused about why the track queries don't get updated in the two-stage branch.
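
For what it's worth, here is a minimal sketch of what I imagine prepending the track queries in the two-stage branch could look like, reusing the tensors from the snippet above (this is not in the repository, just an illustration):

# Hypothetical sketch, not part of TrackFormer: the same concatenation as in the
# else branch, applied after the top-k proposal selection of the two-stage path.
if targets is not None and 'track_query_hs_embeds' in targets[0]:
    prev_hs_embed = torch.stack([t['track_query_hs_embeds'] for t in targets])
    prev_boxes = torch.stack([t['track_query_boxes'] for t in targets])

    # zero positional embedding for the track queries, as in the single-stage branch
    query_embed = torch.cat([torch.zeros_like(prev_hs_embed), query_embed], dim=1)
    # previous decoder output embeddings become the track query content
    tgt = torch.cat([prev_hs_embed, tgt], dim=1)
    # the two-stage reference points are 4-d boxes, so the full track boxes would be used
    reference_points = torch.cat([prev_boxes, reference_points], dim=1)
    init_reference_out = reference_points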

Also, nice work by the way!

timmeinhardt (Owner) commented

TrackFormer is not implemented to work with the two-stage approach of Deformable DETR. In my opinion, the two-stage approach is a step back from the end-to-end unified solution of DETR. Hence, we never tried to combine two-stage with our track query approach.

owen24819 (Author) commented

Ah ok. I see. Thanks for the quick response!

Following up with another question: does TrackFormer work with num_feature_levels greater than 1? I understand why you chose a single feature level, but I would like to try more. Although I could run MultiScaleDeformableAttention with multiple feature levels on TrackFormer, I noticed the MultiScaleDeformableAttention package was different from the one in the Deformable DETR paper. Is it ok to use your MultiScaleDeformableAttention package for multiple feature levels, or should I revert to Deformable DETR's MultiScaleDeformableAttention package?

timmeinhardt (Owner) commented

TrackFormer already works with multiple feature levels. All our trainings/evaluations (except the MOTS20 models) run the deformable option, which loads this config.

The underlying MultiScaleDeformableAttention backend should not make a big difference.
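
In the Deformable DETR reference implementation the number of feature levels is just a constructor argument of the attention module, so the number of levels and the particular backend are largely independent choices. A rough sketch (argument names follow the public MSDeformAttn module; double-check them against the version you build):

# Rough sketch, assuming the MSDeformAttn module from the public Deformable DETR
# reference implementation (models/ops/modules/ms_deform_attn.py).
from models.ops.modules import MSDeformAttn

# n_levels controls how many feature levels the deformable attention samples from;
# the rest of the module is identical for 1 or 4 levels.
attn = MSDeformAttn(d_model=256, n_levels=4, n_heads=8, n_points=4)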

owen24819 (Author) commented

Great! Thanks a lot!
