When using the pretrained models, should I normalize the image tensor before feeding it to the model?
For training, COCODataset applies the usual Normalize transform (code) when loading images. The config files also include NormalizeTensor steps in train_pipeline and val_pipeline, though that part of the config doesn't appear to be used in this repo.
On the inference side, however, the inference.py script only scales the image input to [0, 1], without normalizing (code). Is that a bug, or am I missing something?
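To make the discrepancy concrete, here is a minimal sketch of the two preprocessing variants being compared: scaling alone (what `inference.py` did) versus scaling followed by per-channel normalization (what the training `Normalize` transform does). The mean/std values below are the standard ImageNet statistics, which is an assumption on my part about what the `Normalize` transform uses; the function name is hypothetical.

```python
import numpy as np

# ImageNet channel statistics -- assumed here; check the repo's actual
# Normalize transform config for the values it uses.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image_uint8: np.ndarray) -> np.ndarray:
    """Scale a uint8 HWC image to [0, 1], then normalize per channel."""
    scaled = image_uint8.astype(np.float32) / 255.0     # scaling step only
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD      # the missing step

# A mid-gray image maps to inputs near zero after full preprocessing,
# whereas scaling alone would leave it near 0.5.
img = np.full((4, 4, 3), 128, dtype=np.uint8)
out = preprocess(img)
```

Feeding `[0, 1]`-scaled inputs to a model trained on normalized inputs shifts every channel by roughly `mean / std` units, which is typically enough to degrade keypoint predictions noticeably.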
Hello, thanks for your interest and for the bug report. I agree the preprocessing differs at inference time: the original implementation I built on (https://github.com/jaehyunnn/ViTPose_pytorch) seems to contain this bug, and I didn't notice it.
This should be fixed with c8a7a14; I inspected the inference results visually and they improve. I've also referenced this issue on the original implementation.
Great, thanks for the quick response! The fix looks correct to me. I'm using the ViTPose model directly, so I'm happy to have confirmation on what preprocessing I should do.
Thanks for the helpful repo!