
Possible inconsistency in data preprocessing #4

Closed

alirezazareian opened this issue Jun 23, 2020 · 3 comments

Comments

@alirezazareian

Hi, thank you so much for sharing this code. It is very helpful.

However, I am confused about the data preprocessing configuration. In the config files, a Caffe-style image mean and std are used, but it seems they are not used in the code. Instead, the code seems to hard-code a torchvision-style mean and std (here). Can you confirm that both pretraining and fine-tuning use the latter?

Furthermore, I am not sure whether the images are in the 0-255 range or 0-1. For a Caffe-style mean and std, it should be 0-255, but with your hard-coded mean and std, it seems it should be 0-1. However, I noticed you are using OpenCV to load images, which loads them in 0-255, and I did not find anywhere in the code where they are transformed to 0-1, except in supervised pretraining (here).

Could you please comment on the issues above? In particular, it is important to make sure the configuration is identical across all pretraining and downstream settings. Since you fine-tune all layers and don't freeze the stem, such inconsistencies would be hard to notice, because the fine-tuning process would compensate for them to some extent.

Thank you so much.

@kdexd
Owner

kdexd commented Jun 23, 2020

Hi @alirezazareian:

In the config files, a Caffe-style image mean and std are used, but it seems they are not used in the code.

By "config files", I think you probably mean the Detectron2 config files in configs/detectron2 directory. These config files are used in fine-tuning tasks that use Detectron2, specifically --d2-config argument in scripts/eval_detectron2.py. Detectron2 accepts Caffe-style ImageNet color mean and std (in range 0-255). I simply changed them from BGR order (Detectron2 default) to RGB order. Detectron2 internally loads images in 0-255 range and normalizes them in that range.

I noticed you are using OpenCV to load images, which loads them in 0-255, and I did not find anywhere in the code where they are transformed to 0-1, except in supervised pretraining (here).

VirTex pretraining uses Normalize from albumentations (like our ImageNet supervised pretraining), but with the default max_pixel_value = 255.0 (whereas ImageNet supervised pretraining sets it to 1.0, here). The two are equivalent: both take an image in the 0-255 range and ultimately normalize it to N(0, 1), the standard torchvision convention.
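A small standalone sketch of why the two are equivalent (the image path is hypothetical, used only for illustration):

```python
import albumentations as A
import cv2
import numpy as np

# Load with OpenCV (BGR, 0-255) and convert to RGB.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)

# VirTex pretraining style: uint8 input in [0, 255], default max_pixel_value.
norm_255 = A.Normalize(
    mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0
)

# ImageNet supervised pretraining style: input pre-scaled to [0, 1],
# with max_pixel_value set to 1.0 accordingly.
norm_1 = A.Normalize(
    mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=1.0
)

out_a = norm_255(image=image)["image"]
out_b = norm_1(image=image.astype(np.float32) / 255.0)["image"]

# Both paths produce the same zero-mean, unit-variance float image.
assert np.allclose(out_a, out_b, atol=1e-5)
```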

Also, note that our ImageNet supervised pretraining script is essentially the same as the torchvision pretraining script; we only swap in libraries shared with the rest of our codebase (like albumentations instead of torchvision) and apply some code-style formatting.

Could you please comment on the issues above? In particular, it is important to make sure the configuration is identical across all pretraining and downstream settings.

The Detectron2 config and our VirTex config follow completely different structures and are meant for very different use cases, which limits how much uniformity we can enforce between them. But rest assured: the models receive inputs normalized to N(0, 1) in RGB format during both pretraining and fine-tuning.
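As a quick sanity check (a standalone sketch, not code from this repo), the two conventions agree up to floating-point error, because the 0-255 statistics are just the 0-1 statistics scaled by 255:

```python
import numpy as np

rgb = np.random.randint(0, 256, size=(4, 4, 3)).astype(np.float32)

# Detectron2 path: statistics in the 0-255 range, RGB order.
d2_out = (rgb - [123.675, 116.280, 103.530]) / [58.395, 57.120, 57.375]

# VirTex pretraining path: scale to [0, 1], then use torchvision statistics.
tv_out = (rgb / 255.0 - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]

# Same normalized input either way.
assert np.allclose(d2_out, tv_out, atol=1e-4)
```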

For this issue, I will make the Normalize call uniform between VirTex pretraining (here) and ImageNet supervised pretraining (here), and add some helpful inline comments to clear up the confusion (for you and anyone who might have similar questions in the future).

@kdexd
Owner

kdexd commented Jun 23, 2020

I pushed 91bfd0c with some inline comments and uniform Normalize API calls. My commit message triggered a close on this issue, but feel free to re-open if anything is still unclear. I hope this helps!

@alirezazareian
Author

Thank you for your prompt response. It is much clearer now.
