Why divide by 0.07? #15
It is the temperature parameter from the original CLIP.
But I see in https://github.com/mlfoundations/open_clip#usage that they use 100. Do you think it makes the results different?
I don't think it will affect much.
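To see concretely what the scale changes, here is a small self-contained sketch (plain Python, not code from this repository; the similarity values are made up for illustration). Multiplying the cosine similarities before the softmax only sharpens the resulting distribution: dividing by 0.07 applies a scale of about 14.3, while a factor of 100 pushes the softmax close to one-hot.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical cosine similarities of one image against two text prompts.
sims = [0.30, 0.25]

probs_raw = softmax(sims)                        # no scaling: nearly uniform
probs_t007 = softmax([s / 0.07 for s in sims])   # scale ~14.3: moderately sharp
probs_100 = softmax([100.0 * s for s in sims])   # scale 100: nearly one-hot
```

So both choices preserve the argmax; they differ only in how confident the softmax output looks, which matters mostly when the probabilities themselves feed a loss or an anomaly score.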
Another question: I want to use a Linear layer to get local features of the images. Do you think that training the text encoder with your idea, plus a linear adapter, would work better?
You can have a try.
I am trying, but I haven't gotten high accuracy. It is only high in the first epoch, and after that it decreases, and I don't know why :(
I trained on the VisA dataset for 11 epochs, but the loss and image_loss are 3.7960 and 0.5325. I feel something is wrong. I used your settings. Can you share your loss and image_loss values from training?
I don't understand why you divide by 0.07 here:
```python
text_probs = image_features.unsqueeze(1) @ text_features.permute(0, 2, 1)
text_probs = text_probs[:, 0, ...] / 0.07
text_probs = text_probs.squeeze()
```
and again here:
```python
logit_scale = self.logit_scale.exp()  # nn.Parameter(torch.ones([]) * np.log(1 / 0.07))
logits_per_image = logit_scale * image_features @ text_features.t()
logits_per_text = logits_per_image.t()
```
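As a side note, the two snippets above start out applying the same scale: `exp(log(1/0.07))` is algebraically just `1/0.07`, roughly 14.29. A quick check (plain Python, not code from this repository):

```python
import math

# self.logit_scale is initialized from np.log(1 / 0.07), so exp() of it
# recovers exactly 1 / 0.07 before any training updates the parameter.
scale = math.exp(math.log(1 / 0.07))
assert abs(scale - 1 / 0.07) < 1e-9  # ~14.2857
```

The difference is that `logit_scale` is a learnable parameter, so its value can drift from 14.29 during training, whereas the literal `/ 0.07` is fixed; as far as I know, the trained CLIP checkpoints end up with a scale near 100, which is likely why the open_clip example multiplies by 100.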
And in another paper, they multiply by 100, for example:
```python
for layer in range(len(det_patch_tokens)):
    det_patch_tokens[layer] = det_patch_tokens[layer] / det_patch_tokens[layer].norm(dim=-1, keepdim=True)
    anomaly_map = 100.0 * det_patch_tokens[layer] @ text_features
    anomaly_map = torch.softmax(anomaly_map, dim=-1)[:, :, 1]
    anomaly_score = torch.mean(anomaly_map, dim=-1)
    det_loss += loss_bce(anomaly_score, image_label)
```