zero shot classification results #3

Open
gefend opened this issue Jan 16, 2022 · 4 comments

Comments

@gefend

gefend commented Jan 16, 2022

I tried using your script for zero-shot classification together with the pretrained weights (both resnet18 and resnet50). The classification results I got are close to random (accuracy of 17-22% for each class). Maybe an additional step is needed, or the weights I downloaded are not the trained weights?

@marshuang80
Owner

To help you debug this, can you please provide the following information?

  • What dataset are you using?
  • How are you preprocessing the data?
  • What are your classification tasks?
  • How are you generating the class prompts?

@gefend
Author

gefend commented Jan 20, 2022

  1. I'm using the CheXpert 5x200 dataset.
  2. I used the "Zeroshot classification for CheXpert5x200" script from your README file without additional preprocessing.
    The only change I made to the "Zeroshot classification for CheXpert5x200" script was generating the CheXpert 5x200 dataset with your function preprocess_chexpert_5x200_data from gloria.datasets.preprocess_datasets.
  3. The classification task is the same as your classification task on CheXpert 5x200: classifying each image into one of the 5 classes.
  4. As part of using your script, I generate the class prompts with your function generate_chexpert_class_prompts.
    Thank you for your answer!

@marshuang80
Owner

Got it, thanks for the info. May I ask how you are computing the results? Using different random seeds, I was still able to get an accuracy above 60%:

labels = df[gloria.constants.CHEXPERT_COMPETITION_TASKS].to_numpy().argmax(axis=1)
pred = similarities[gloria.constants.CHEXPERT_COMPETITION_TASKS].to_numpy().argmax(axis=1)
acc = len(labels[labels == pred]) / len(labels)
print(acc) # 0.607
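
For reference, the same accuracy computation can be sketched on toy data so it runs without the model or the dataset. This is only an illustration: the `tasks` list is a stand-in for `gloria.constants.CHEXPERT_COMPETITION_TASKS`, the helper name `zero_shot_accuracy` is hypothetical, and the label and similarity values are made up.

```python
import numpy as np
import pandas as pd

# Stand-in for gloria.constants.CHEXPERT_COMPETITION_TASKS (assumed names).
tasks = ["Atelectasis", "Cardiomegaly", "Consolidation", "Edema", "Pleural Effusion"]

# Toy one-hot ground truth, one row per image.
true_idx = [0, 1, 2, 3, 4, 0]
labels_df = pd.DataFrame(np.eye(5, dtype=int)[true_idx], columns=tasks)

# Toy similarity scores: the highest score in each row marks the predicted class.
pred_idx = [0, 1, 2, 3, 0, 0]
similarities = pd.DataFrame(np.eye(5)[pred_idx] * 0.9 + 0.02, columns=tasks)


def zero_shot_accuracy(labels_df, similarities, tasks):
    """Fraction of rows where the arg-max similarity matches the one-hot label."""
    # Selecting the same column list on both frames keeps the class order aligned.
    labels = labels_df[tasks].to_numpy().argmax(axis=1)
    pred = similarities[tasks].to_numpy().argmax(axis=1)
    return float((labels == pred).mean())


acc = zero_shot_accuracy(labels_df, similarities, tasks)
print(acc)
```

With five balanced classes, chance level is 20%, so an accuracy in the 17-22% range is what random predictions would give.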

@gefend
Author

gefend commented Jan 23, 2022

The following is the code I used. I tried replacing the results calculation with the one you suggested, but I get the same results. The only thing I think I'm doing differently from you is that I use only 200 images per run because of my GPU memory capacity.

import torch
import gloria
import pandas as pd
from gloria.datasets.preprocess_datasets import preprocess_chexpert_5x200_data

df = preprocess_chexpert_5x200_data()
df = df[0:200]
# load model
device = "cuda" if torch.cuda.is_available() else "cpu"
gloria_model = gloria.load_gloria(device=device)

cls_prompts = gloria.generate_chexpert_class_prompts()

# process input images and class prompts
processed_txt = gloria_model.process_class_prompts(cls_prompts, device)
processed_imgs = gloria_model.process_img(df['Path'].tolist(), device)

# zero-shot classification on the 200-image subset
similarities = gloria.zero_shot_classification(
    gloria_model, processed_imgs, processed_txt)

labels = df[gloria.constants.CHEXPERT_COMPETITION_TASKS].to_numpy().argmax(axis=1)
pred = similarities[gloria.constants.CHEXPERT_COMPETITION_TASKS].to_numpy().argmax(axis=1)
acc = len(labels[labels == pred]) / len(labels) #0.17
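
One thing that may be worth checking with a 200-row slice is its class balance. The thread does not say how the frame returned by `preprocess_chexpert_5x200_data` is ordered; if it happened to be grouped by class, `df[0:200]` would contain a single class. The sketch below uses a toy frame (grouped by class on purpose, which is an assumption, not a known property of the real data) to show how to inspect the slice and how to draw a class-balanced subset of the same size instead.

```python
import numpy as np
import pandas as pd

# Stand-in for gloria.constants.CHEXPERT_COMPETITION_TASKS (assumed names).
tasks = ["Atelectasis", "Cardiomegaly", "Consolidation", "Edema", "Pleural Effusion"]

# Toy 5x200 frame, deliberately grouped by class (assumption for illustration).
class_idx = np.repeat(np.arange(5), 200)
df = pd.DataFrame(np.eye(5, dtype=int)[class_idx], columns=tasks)
df["Path"] = [f"img_{i}.jpg" for i in range(len(df))]  # hypothetical paths

# Class counts in the first 200 rows.
head_counts = df[0:200][tasks].sum()
print(head_counts)

# Class-balanced alternative: 40 rows per class, 200 rows in total.
balanced = (
    df.assign(cls=df[tasks].to_numpy().argmax(axis=1))
    .groupby("cls")
    .sample(n=40, random_state=0)
)
balanced_counts = balanced[tasks].sum()
print(balanced_counts)
```

If the real slice turns out to be balanced already, the subset size is unlikely to explain the gap, and the preprocessing or weights would be the next things to compare.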
