Does TokenLearner only support square inputs? #279
Comments
I want to convert your code to PyTorch, but I don't know much about JAX.
Alternatively, could the TokenLearner class get the same tensor shape annotations as the TokenFuser class?
Hi, I updated model.py so that V1.1 supports inputs with non-square shapes. V1 only supports square inputs at this point. In JAX, we follow the channels-last format. The input to the TokenLearner V1.1 module is [B, H, W, C] or simply [B, HW, C]. For 224x224 images with ViT, this would typically be [B, 14, 14, C] or [B, 196, C].
To further clarify: the V1.1 module supports non-square 4D tensor inputs of shape [B, H, W, C] and any 3D tensor inputs of shape [B, HW, C]. The V1 module also supports non-square 4D tensor inputs of shape [B, H, W, C]. However, for 3D tensor inputs of shape [B, HW, C], it expects HW to be a perfect square for now.
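For anyone porting this, the shape rules described above can be written down as a small check. This is only a sketch of the rules as stated in this thread, not code from model.py: the function name `check_tokenlearner_input` and the version strings `"v1"` and `"v1.1"` are hypothetical, and the 16x12 grid is just an illustrative non-square example.

```python
import math


def check_tokenlearner_input(shape, version="v1.1"):
    """Return the (B, HW, C) token layout for an accepted input shape.

    Hypothetical helper: it only encodes the rules stated in this thread
    (channels-last inputs; V1 needs a perfect-square HW for 3D inputs).
    """
    if version not in ("v1", "v1.1"):
        raise ValueError(f"unknown version: {version!r}")

    if len(shape) == 4:
        # [B, H, W, C]: both versions accept non-square H and W.
        b, h, w, c = shape
        return (b, h * w, c)

    if len(shape) == 3:
        # [B, HW, C]: V1.1 accepts any HW, V1 expects a perfect square.
        b, hw, c = shape
        if version == "v1":
            side = math.isqrt(hw)
            if side * side != hw:
                raise ValueError(
                    f"V1 expects HW to be a perfect square for 3D inputs, got HW={hw}"
                )
        return (b, hw, c)

    raise ValueError(f"expected a 3D or 4D channels-last shape, got {shape}")


# A square 14x14 grid and a non-square 16x12 grid, both passed as 4D inputs.
print(check_tokenlearner_input((2, 14, 14, 768), version="v1"))
print(check_tokenlearner_input((2, 16, 12, 768), version="v1"))
```

Under these rules, a non-square grid passed to V1 has to stay in 4D form, because flattening 16x12 to 192 tokens gives a count that is not a perfect square. Note that PyTorch code usually keeps tensors as [B, C, H, W], so a port would need to move the channel axis last before matching these shapes.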
Thanks. My goal is to apply TokenLearner to Human Pose Estimation, where the width and height of each patch are not equal. I can read the TokenLearner code even though I don't know JAX, because each step has detailed shape annotations, like this: scenic/scenic/projects/token_learner/model.py Line 164 in d795fc4
Could you add shape annotations to the TokenFuser class as well? scenic/scenic/projects/token_learner/model.py Line 176 in d795fc4
Thanks a lot!
ImageNet, Kinetics-400, Kinetics-600, Charades, and AViD are all used for classification tasks.
We will update the TokenFuser documentation soon. We have not tried this on segmentation, but we did try it on other types of regression tasks for robotics, and it worked fine in our case.
PyTorch version
TokenLearner has two versions, v1.0 and v1.1.
scenic/scenic/projects/token_learner/model.py
Lines 140 to 141 in 98fdaae
v1.1 is said to only support square inputs.
Does v1.0 also support only square inputs? Why?