Support for ViT-L and ViT-H Image Encoder Checkpoints #7

Berlin000000 · 2024-05-27T02:14:39Z

I have noticed that the current implementation only supports the image encoder checkpoint for ViT-B. Could you please clarify why only the ViT-B checkpoint is supported? Additionally, if I want to use the corresponding checkpoints for larger models, such as ViT-L (Large) or ViT-H (Huge), what steps should I take to implement this support? Are there specific modifications or considerations required to extend compatibility to these larger models?

MathieuNlp · 2024-06-07T13:13:17Z

Hello,

train.py currently loads the vit_b model using the builder function from SAM: https://github.com/MathieuNlp/Sam_LoRA/blob/main/train.py#L30

To load a different model, use one of the other SAM builders defined here: https://github.com/MathieuNlp/Sam_LoRA/blob/main/src/segment_anything/build_sam.py
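As a rough sketch (not code from the repository), switching builders could look like the following. It assumes the repository's copy of build_sam.py still exposes the upstream SAM builder functions `build_sam_vit_b`, `build_sam_vit_l`, and `build_sam_vit_h`, each accepting a `checkpoint` argument. The names `SAM_BUILDERS`, `builder_name`, `load_sam`, and the default import path are hypothetical and only for illustration.

```python
import importlib

# Builder function names as defined in upstream segment_anything/build_sam.py.
# Assumption: the copy in this repository keeps the same names.
SAM_BUILDERS = {
    "vit_b": "build_sam_vit_b",
    "vit_l": "build_sam_vit_l",
    "vit_h": "build_sam_vit_h",
}


def builder_name(model_type: str) -> str:
    """Return the builder function name for the requested encoder size."""
    if model_type not in SAM_BUILDERS:
        raise ValueError(
            f"unknown model type {model_type!r}; expected one of {sorted(SAM_BUILDERS)}"
        )
    return SAM_BUILDERS[model_type]


def load_sam(model_type: str, checkpoint: str,
             module: str = "src.segment_anything.build_sam"):
    """Import the builder module lazily and build SAM from the given checkpoint.

    `module` is an assumed import path (running from the repository root);
    adjust it to match how train.py imports the builders.
    """
    build_sam = importlib.import_module(module)
    builder = getattr(build_sam, builder_name(model_type))
    return builder(checkpoint=checkpoint)
```

For example, `load_sam("vit_l", "path/to/vit_l_checkpoint.pth")` (placeholder path) would take the place of the ViT-B call in train.py. The checkpoint file has to match the builder: each size uses a different encoder width and depth, so a ViT-B checkpoint will not load into a ViT-L or ViT-H model.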

Hope this answers your question.

MathieuNlp mentioned this issue on Jun 7, 2024 in "Two questions that I hope to consult with you" #5 (Open)
