Hi, I want to test a dataset with 200 classes, but I get an error #37
Comments
RuntimeError: The expanded size of the tensor (1024) must match the existing size (512) at non-singleton dimension 1. Target sizes: [1, 1024]. Tensor sizes: [1, 512]
I want to know whether the model can be evaluated on a dataset with 200 classes.
I think the default max class number is 80.
@weveng @xugy16 There is actually no "max class" concept in GLIP, since it is an open-vocabulary contextualized object grounding method. Just make sure that after converting the 200 classes into one detection prompt, the prompt does not exceed the maximum input length of the text encoder. If it does, take a look at the inference code to see how we deal with the LVIS dataset (~1200 classes).
@Haotian-Zhang Thank you for this tip! I see from the paper that the prompt is split up for training (Appendix B, "When not all object categories can fit into a single prompt") and evaluation (Appendix C.2). This makes a lot of sense. Unfortunately, I cannot find the relevant routines in the code. Could someone point us more specifically to where @weveng and I should be looking for the implementation of these splits/accumulations for training and testing?
I haven't found the corresponding code either. Have you found a way to handle cases where the prompt is too long?
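The splitting strategy described above can be sketched roughly as follows. This is a minimal illustration, not GLIP's actual implementation: `chunk_categories` is a hypothetical helper, and a whitespace word count stands in for the real text-encoder tokenizer (e.g. BERT's); with the actual model you would measure length with `tokenizer.tokenize(...)` instead.

```python
# Hypothetical sketch: pack a long category list into several detection
# prompts, each staying under a token budget for the text encoder.
# NOTE: word count via str.split() is a stand-in for the real tokenizer.

def chunk_categories(categories, max_tokens=256):
    """Greedily pack category names into prompts of at most max_tokens."""
    chunks, current, current_len = [], [], 0
    for name in categories:
        n = len(name.split()) + 1  # +1 accounts for the ". " separator
        if current and current_len + n > max_tokens:
            chunks.append(current)
            current, current_len = [], 0
        current.append(name)
        current_len += n
    if current:
        chunks.append(current)
    # GLIP-style detection prompts: period-separated category names.
    return [". ".join(c) + "." for c in chunks]

prompts = chunk_categories([f"class{i}" for i in range(200)], max_tokens=50)
# Inference would then run once per prompt and merge the per-chunk
# detections, offsetting label indices by each chunk's starting category.
```

With 200 single-word class names and a budget of 50, this yields 8 prompts of 25 classes each; the per-chunk results must be accumulated afterwards, which is the part Appendix C.2 of the paper describes for LVIS.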