Difference in CPU/GPU results larger than expected #2
Comments
Hi @danieldeutsch! Thanks for your interest in our work! I spent yesterday debugging this problem. After much digging, it turns out that float16 einsums are broken in recent versions of numpy (see scikit-learn/scikit-learn#21559). For now, I have added a numpy version check that falls back to a different method of normalization, so the code should work now, though the new normalization scheme may differ from the old one (up to numerical precision): https://github.com/jmhessel/clipscore/blob/main/clipscore.py#L148-L157 If you want to replicate the settings that we used in the paper exactly, you could downgrade to an older numpy. Sorry for the error --- was a tricky one to diagnose!!
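The precision hazard and the safer normalization path can be sketched as follows. This is a minimal illustration with random features standing in for the actual CLIP features; the array shapes are assumptions, not the repository's real code.

```python
import numpy as np

# Hypothetical stand-in for a batch of CLIP image features, which are
# often produced in float16 when the model runs on a GPU.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 512)).astype(np.float16)

# On affected numpy versions, a float16 einsum accumulates the sum of
# squares in float16, which can lose precision (or overflow for large
# magnitudes, since float16 tops out at 65504).
sq_norms_f16 = np.einsum('ij,ij->i', feats, feats)

# A safer normalization: upcast to float32 before computing the norms,
# then divide each row by its length.
feats32 = feats.astype(np.float32)
norms = np.linalg.norm(feats32, axis=1, keepdims=True)
unit_feats = feats32 / norms

# Every row of unit_feats should now have (close to) unit length.
print(np.linalg.norm(unit_feats, axis=1))
```

Upcasting before the reduction is the key design choice: the division itself is well-behaved in float16, but the accumulated sum of squares is where precision is lost.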
Awesome! That fixed the problem. Nice debugging; it sounds like it was a hard one to find. For what it's worth, I made a Dockerized version of the metric that you can find here. Thanks!
Awesome!!! Thanks Daniel --- I'll check out your repro repo :-) Definitely let me know if I can help further.
Hello,
I've successfully reproduced the results from the Readme on the CPU, but the results are very different when I switch the code to the GPU. I'm aware of the minor differences the Readme says can happen, but this is much larger.
For example, the "good" captions have a CLIPScore of 0.8585, but when I switch to GPU the score is 2.52734375. I think this may be specific to my environment/GPU because this didn't happen when I ran it on a GPU with Codalab.
Have you seen this issue before? Here is my exact environment:
I tried with CUDA 10.2 and 11.0 with the same result.
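For reference, a small snippet like the following can capture the relevant version details in one place. This is a sketch; the torch import is guarded since PyTorch may not be installed in every environment.

```python
import platform
import sys

import numpy as np

print("python  :", sys.version.split()[0])
print("numpy   :", np.__version__)
print("platform:", platform.platform())

# GPU-related info, if PyTorch is available in the environment.
try:
    import torch
    print("torch   :", torch.__version__)
    print("cuda    :", torch.version.cuda)
    if torch.cuda.is_available():
        print("gpu     :", torch.cuda.get_device_name(0))
    else:
        print("gpu     : none")
except ImportError:
    pass
```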
Thanks!