
Difference in CPU/GPU results larger than expected #2

Closed

danieldeutsch opened this issue Nov 5, 2021 · 3 comments
@danieldeutsch
Hello,

I've successfully reproduced the results from the Readme on the CPU, but the results are very different when I switch the code to the GPU. I'm aware of the minor CPU/GPU difference mentioned in the Readme, but this discrepancy is much larger.

For example, the "good" captions have a CLIPScore of 0.8585 on the CPU, but when I switch to the GPU the score is 2.52734375. I think this may be specific to my environment/GPU, because it didn't happen when I ran the code on a GPU in Codalab.

Have you seen this issue before? Here is my exact environment:

clip @ git+git://github.com/openai/CLIP.git@2867559c5fe0b02a2d3167aeacd333b3c4276847
cycler==0.11.0
Cython==0.29.24
ftfy==6.0.3
joblib==1.1.0
kiwisolver==1.3.2
matplotlib==3.4.3
numpy==1.21.3
Pillow==8.4.0
pycocoevalcap @ git+git://github.com/jmhessel/pycocoevalcap.git@273d8d5c42ca81fb43fd6bd699cfc4aa26bcda9a
pycocotools==2.0.2
pyparsing==3.0.4
python-dateutil==2.8.2
regex==2021.11.2
scikit-learn==1.0.1
scipy==1.7.1
six==1.16.0
sklearn==0.0
threadpoolctl==3.0.0
torch==1.7.1
torchvision==0.8.2
tqdm==4.62.3
typing-extensions==3.10.0.2
wcwidth==0.2.5

I tried with CUDA 10.2 and 11.0 with the same result.

Thanks!

@jmhessel
Owner

jmhessel commented Nov 5, 2021

Hi @danieldeutsch !

Thanks for your interest in our work! I spent yesterday debugging this problem. After much digging, it turns out that float16 einsums are broken in numpy versions greater than 1.21. That einsum is used inside sklearn.preprocessing.normalize, which matters here because the GPU version of CLIPScore uses float16 features.

See:

scikit-learn/scikit-learn#21559
and
numpy/numpy#20305

For now, I have added a numpy version check that falls back to a different method of normalization, so the code should work now, though the new normalization scheme differs from the original only up to numerical precision:

https://github.com/jmhessel/clipscore/blob/main/clipscore.py#L148-L157
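The idea behind the workaround can be sketched roughly as follows. This is a hypothetical helper, not the exact code from clipscore.py: instead of calling sklearn.preprocessing.normalize directly on float16 arrays (whose einsum-based norm computation is buggy in affected numpy versions), the features are upcast to float32 and L2-normalized manually.

```python
import numpy as np

def normalize_rows(x):
    # Float16-safe L2 row normalization: upcast to float32 before the
    # reduction, since float16 einsum reductions produce wrong results
    # in affected numpy versions (see numpy/numpy#20305).
    x = x.astype(np.float32)
    norms = np.sqrt(np.sum(x ** 2, axis=1, keepdims=True))
    return x / norms

rng = np.random.default_rng(0)
feats = rng.random((4, 8)).astype(np.float16)  # stand-in for CLIP features
unit = normalize_rows(feats)
# Every row now has (approximately) unit L2 norm.
print(np.allclose(np.linalg.norm(unit, axis=1), 1.0, atol=1e-5))
```

Since the cosine-similarity step of CLIPScore only depends on the direction of the feature vectors, normalizing in float32 sidesteps the buggy float16 path without changing the score beyond numerical precision.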

But, if you want to replicate the settings that we used in the paper exactly, you could pip install numpy==1.20.3.

Sorry for the error --- it was a tricky one to diagnose!!

@jmhessel jmhessel closed this as completed Nov 5, 2021
@danieldeutsch
Author

Awesome! That fixed the problem. Nice debugging --- it sounds like it was hard to find.

For what it's worth, I made a Dockerized version of the metric that you can find here.

Thanks!

@jmhessel
Owner

jmhessel commented Nov 5, 2021

Awesome!!! Thanks Daniel --- I'll check out your repro repo :-) Definitely let me know if I can be helpful further
