
Difference in CPU/GPU results larger than expected #2

Closed

danieldeutsch opened this issue Nov 5, 2021 · 3 comments
@danieldeutsch
Hello,

I've successfully reproduced the results from the Readme on the CPU, but the results are very different when I switch the code to the GPU. I'm aware of the minor CPU/GPU difference mentioned in the Readme, but this discrepancy is much larger.

For example, the "good" captions have a CLIPScore of 0.8585 on the CPU, but when I switch to the GPU the score is 2.52734375. I think this may be specific to my environment/GPU, because it didn't happen when I ran the code on a GPU in Codalab.

Have you seen this issue before? Here is my exact environment:

clip @ git+git://github.com/openai/CLIP.git@2867559c5fe0b02a2d3167aeacd333b3c4276847
cycler==0.11.0
Cython==0.29.24
ftfy==6.0.3
joblib==1.1.0
kiwisolver==1.3.2
matplotlib==3.4.3
numpy==1.21.3
Pillow==8.4.0
pycocoevalcap @ git+git://github.com/jmhessel/pycocoevalcap.git@273d8d5c42ca81fb43fd6bd699cfc4aa26bcda9a
pycocotools==2.0.2
pyparsing==3.0.4
python-dateutil==2.8.2
regex==2021.11.2
scikit-learn==1.0.1
scipy==1.7.1
six==1.16.0
sklearn==0.0
threadpoolctl==3.0.0
torch==1.7.1
torchvision==0.8.2
tqdm==4.62.3
typing-extensions==3.10.0.2
wcwidth==0.2.5

I tried with CUDA 10.2 and 11.0 with the same result.

Thanks!

@jmhessel
Owner

jmhessel commented Nov 5, 2021

Hi @danieldeutsch !

Thanks for your interest in our work! I spent yesterday debugging this problem. After much digging, it turns out that float16 einsums are broken in numpy versions greater than 1.21. That einsum is used inside sklearn.preprocessing.normalize, which matters here because the GPU version of CLIPScore uses float16 features.

See:

scikit-learn/scikit-learn#21559
and
numpy/numpy#20305

For now, I have added a numpy version check that falls back to a different method of normalization, so the code should work now, though the new normalization scheme differs from the original only up to numerical precision:

https://github.com/jmhessel/clipscore/blob/main/clipscore.py#L148-L157
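The idea behind the workaround can be sketched roughly as follows. This is a hypothetical helper, not the exact code from clipscore.py: instead of calling sklearn.preprocessing.normalize directly on float16 arrays (whose einsum-based norm computation is buggy in affected numpy versions), the features are upcast to float32 and L2-normalized manually.

```python
import numpy as np

def normalize_rows(x):
    # Float16-safe L2 row normalization: upcast to float32 before the
    # reduction, since float16 einsum reductions produce wrong results
    # in affected numpy versions (see numpy/numpy#20305).
    x = x.astype(np.float32)
    norms = np.sqrt(np.sum(x ** 2, axis=1, keepdims=True))
    return x / norms

rng = np.random.default_rng(0)
feats = rng.random((4, 8)).astype(np.float16)  # stand-in for CLIP features
unit = normalize_rows(feats)
# Every row now has (approximately) unit L2 norm.
print(np.allclose(np.linalg.norm(unit, axis=1), 1.0, atol=1e-5))
```

Since the cosine-similarity step of CLIPScore only depends on the direction of the feature vectors, normalizing in float32 sidesteps the buggy float16 path without changing the score beyond numerical precision.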

But, if you want to replicate the settings that we used in the paper exactly, you could pip install numpy==1.20.3.

Sorry for the error --- it was a tricky one to diagnose!!

@jmhessel jmhessel closed this as completed Nov 5, 2021
@danieldeutsch
Author

Awesome! That fixed the problem. Nice debugging --- it sounds like it was hard to find.

For what it's worth, I made a Dockerized version of the metric that you can find here.

Thanks!

@jmhessel
Owner

jmhessel commented Nov 5, 2021

Awesome!!! Thanks Daniel --- I'll check out your repro repo :-) Definitely let me know if I can be helpful further
