Dear Author,
I encountered an error while reproducing your results. Specifically, TRIM's token compression step (token_reduction in clip_encoder.py) selects zero tokens for certain samples, so the subsequent reshape fails.
This happens after configuring the environment and integrating the TRIM compression method into the llava-v1.5-7b configuration, and it occurs consistently when running evaluations on both VQAv2 and GQA. Have you encountered a similar issue, or could you offer some guidance on how to resolve it?
File "clip_encoder.py", line 199, in forward
image_features, actual_dims = self.token_reduction(image_features, all_image_features, text_features)
File "clip_encoder.py", line 120, in token_reduction
selected_image_features[:, :num_tokens_to_keep] = image_features[token_mask].view(batch_size, num_tokens_to_keep, -1)
RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1]
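
The reshape fails because, once num_tokens_to_keep is 0, the indexed tensor has no elements and the -1 dimension cannot be inferred. One possible guard is to force at least one token to be kept before the reshape. Below is a minimal sketch of that selection logic in plain Python for a single sample. The helper name ensure_min_tokens, the per-token scores, and the min_keep parameter are assumptions made for illustration and are not taken from the TRIM code; in clip_encoder.py the same idea would have to operate on tensors.

```python
from typing import List, Sequence


def ensure_min_tokens(
    token_mask: Sequence[bool],
    scores: Sequence[float],
    min_keep: int = 1,
) -> List[bool]:
    """Return a copy of token_mask with at least min_keep tokens selected.

    Hypothetical helper: token_mask and scores stand in for one sample's
    selection mask and per-token relevance scores.
    """
    if len(token_mask) != len(scores):
        raise ValueError("token_mask and scores must have the same length")
    mask = list(token_mask)
    # Never ask for more tokens than the sample contains.
    min_keep = min(min_keep, len(mask))
    missing = min_keep - sum(mask)
    if missing <= 0:
        return mask  # enough tokens are already selected
    # Rank the unselected tokens by score, highest first, and add the best ones.
    candidates = sorted(
        (i for i, kept in enumerate(mask) if not kept),
        key=lambda i: scores[i],
        reverse=True,
    )
    for i in candidates[:missing]:
        mask[i] = True
    return mask


# A sample for which the selection came back empty.
empty_mask = [False, False, False, False]
scores = [0.1, 0.7, 0.3, 0.2]
print(ensure_min_tokens(empty_mask, scores))  # [False, True, False, False]
```

With such a guard, num_tokens_to_keep would be at least 1 for every non-empty sample, so the reshape would no longer receive an empty tensor. Whether keeping the single highest-scoring token is the intended behaviour for these samples is a question for the authors.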