
image preprocessing #3

Closed
bakachan19 opened this issue May 10, 2023 · 4 comments

Comments


bakachan19 commented May 10, 2023

Hi.
Thank you for this interesting package.

I was looking at the examples in the tools, more precisely at docta_rare_pattern.py, running it with the following command, following the demo in the notebook docta_rare_pattern_clothes.ipynb:
%run ./tools/docta_rare_pattern.py --feature_type 'embedding' --suffix 'c1m_subset'

In this case the dataset is created using:
dataset = Customize_Image_Folder(root=cfg.data_root, transform=None).

Looking at Customize_Image_Folder(), given that transform=None, the image will be transformed using the following function:

if transform is None:
            self.transform = transforms.Compose([
                transforms.Resize(256),
                transforms.CenterCrop(224),
                transforms.ToTensor(),
                transforms.Normalize((0.6959, 0.6537, 0.6371), (0.3113, 0.3192, 0.3214)),
            ])

and followed up by:

feature_all.append(tmp_feature.permute(1, 2, 0).numpy().astype(np.uint8))  

But then, in the docta_rare_pattern.py file, pre_processor.encode_feature() is used to get the image embeddings, and it also customizes the data (it performs another normalization step before extracting the embeddings) with:
dataset_list += [CustomizedDataset(feature=self.dataset.feature, label=self.dataset.label, preprocess=preprocess)], where the preprocess function is retrieved from get_encoder():
model_embedding, _, preprocess = open_clip.create_model_and_transforms(self.cfg.embedding_model)

The CLIP preprocessing function seems to be this one:

Compose(
    Resize(size=224, interpolation=bicubic, max_size=None, antialias=warn)
    CenterCrop(size=(224, 224))
    <function _convert_to_rgb at 0x7ff56ae71f30>
    ToTensor()
    Normalize(mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711])
)

I am wondering why the images are normalized twice: once within Customize_Image_Folder() and a second time with the preprocessing function from open_clip.
Are there any steps that I am missing? I cannot fully understand the reason for the double normalization & preprocessing.
Could you please explain why this strategy is used in the method?
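To make the concern concrete, here is a toy numeric check (not Docta's code; the per-channel stats are copied from the two pipelines above, reduced to the first channel) showing that normalizing with the dataset stats and then again with the CLIP stats is not equivalent to the CLIP normalization alone:

```python
import numpy as np

# Toy illustration: double normalization vs. single normalization.
mean1, std1 = 0.6959, 0.3113          # Customize_Image_Folder stats (1st channel)
mean2, std2 = 0.48145466, 0.26862954  # open_clip preprocess stats (1st channel)

x = np.array([0.2, 0.5, 0.8])         # toy pixel values in [0, 1]

once = (x - mean2) / std2                     # CLIP normalization alone
twice = ((x - mean1) / std1 - mean2) / std2   # dataset norm, then CLIP norm

print(np.allclose(once, twice))  # prints False: the two pipelines disagree
```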

I am greatly thankful for any hints and insights that you can provide.
Thank you for your time.

Later edit: I also noticed this in docta_cifar10.py, where the data is loaded with Cifar10_noisy(cfg, train=True), which has a transform function defined, and then pre_process.encode_feature() is called, which customizes the data with another preprocessing step.

I apologize for the long post.

weijiaheng added a commit that referenced this issue May 10, 2023
@weijiaheng
Contributor

Hi

Thanks for the great catch and your detailed post! We have fixed this "double processing" in our latest commit.

Briefly speaking, in the file customize_img_folder.py, we removed the data-preprocessing code that ran when transform=None, to avoid double preprocessing. In this case, Docta simply appends features by converting each image from a PIL.Image.Image to a numpy array. Thus, the revised code processes & normalizes the image data only once when transform=None (the default value).
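For illustration, a minimal sketch of this behavior (function name assumed, not the actual commit) could look like the following: with transform=None, the image is only converted to a raw uint8 array, and all resizing/normalization is left to the CLIP preprocess applied later.

```python
import numpy as np
from PIL import Image

# Hedged sketch of the revised transform=None behavior (names assumed).
def load_feature(img, transform=None):
    if transform is None:
        return np.asarray(img)   # raw HWC uint8 pixels, no normalization
    return transform(img)        # user-supplied pipeline runs instead

img = Image.new("RGB", (32, 32), color=(128, 64, 32))
feat = load_feature(img)
print(feat.shape, feat.dtype)  # (32, 32, 3) uint8
```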

Please feel free to let us know if you have any additional concerns!

Best,
Jiaheng

@bakachan19
Author

Hi @weijiaheng.
Thank you for your reply and for helping me clear my doubts.

I also went over the CIFAR10 preprocessing again, because I thought the same double-processing issue was present there as well.
But I missed the following:

self.feature = self.data

and here self.data is not normalized yet (since the normalization is applied by the dataloader, not by the dataset); it then simply undergoes the CLIP preprocessing with:
model_embedding, _, preprocess = open_clip.create_model_and_transforms(self.cfg.embedding_model)
and
CustomizedDataset(..., preprocess = preprocess) in line:

dataset_list += [CustomizedDataset(feature=self.dataset.feature, label=self.dataset.label, preprocess=preprocess)]
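In other words, the raw features are stored as-is and the CLIP preprocess is applied exactly once, at access time. A small sketch of that pattern (class name and signature assumed, not Docta's actual class):

```python
import numpy as np

# Sketch of a CustomizedDataset-style wrapper: features stay raw until
# __getitem__, where the preprocess runs as the single normalization step.
class CustomizedDatasetSketch:
    def __init__(self, feature, label, preprocess=None):
        self.feature, self.label, self.preprocess = feature, label, preprocess

    def __len__(self):
        return len(self.feature)

    def __getitem__(self, idx):
        x = self.feature[idx]
        if self.preprocess is not None:
            x = self.preprocess(x)   # applied once, at access time
        return x, self.label[idx]

raw = np.full((2, 2, 3), 255, dtype=np.uint8)  # toy raw CIFAR-like image
ds = CustomizedDatasetSketch([raw, raw], [0, 1],
                             preprocess=lambda a: a.astype(np.float32) / 255.0)
x, y = ds[0]
print(x.max(), y)  # 1.0 0
```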

Thank you once again for your help.
Have an amazing day.

@zwzhu-d
Contributor

zwzhu-d commented May 11, 2023

Yes, the CIFAR data is not double-processed. The transforms in the CIFAR dataloader are left in place for future extensions. Thank you for the good catch.

@bakachan19
Author

Thank you for the clarification @zwzhu-d and @weijiaheng.
It was extremely useful.
