Kullback Leibler Divergence broadcasting no longer works #13707
Comments
Hi. Reading the solution you linked, it seems like it was more of a trick than something that was officially supported. Can you provide a minimal example of what you think should be working? |
I don't think it was really a trick. It's just that the code supported nd-array input, performing the operation along the last axis, without advertising it. Many scipy functions support this sort of thing, but they will have an update: oops, this wasn't accurate. It's easier than that. I didn't realize the |
This looks like an easy fix. It's still vectorized; we just need to broadcast the two arrays instead of requiring them to be the same shape. @swhalemwo take a look at |
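As a small illustration of the broadcasting mentioned above (a sketch only, not the change made in SciPy): two arrays with shapes `(5, 3, 1)` and `(5, 1, 3)` are not the same shape, but they are broadcast-compatible, and `np.broadcast_arrays` expands both to `(5, 3, 3)`. The array names and sizes here are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: same length along axis 0, different trailing shapes.
pk = rng.random((5, 3, 1))
qk = rng.random((5, 1, 3))

# The shapes differ, so a strict equality check would reject them ...
same_shape = pk.shape == qk.shape

# ... but they can be broadcast against each other to a common shape.
pk_b, qk_b = np.broadcast_arrays(pk, qk)
print(same_shape, pk_b.shape, qk_b.shape)  # False (5, 3, 3) (5, 3, 3)
```

The broadcast results are read-only views that repeat the size-1 axis, so no data is copied until an operation combines them.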
Thanks for the ideas! I have to admit I don't know how broadcasting works in detail, and my code was largely pieced together from Stack Overflow answers. Just to clarify my issue: previously the following code would work to calculate pairwise KLDs, but it now fails due to the shape requirement.
However, when I remove the shape requirement, it works just as before.
|
With the required imports and using explicit broadcasting, that's:
import numpy as np
from scipy.special import rel_entr, entr
from scipy.stats import entropy

distributions = np.random.rand(3, 5)
distributions /= distributions.sum(axis=1, keepdims=True)

# broadcasting before passing the arrays in works
a, b = np.broadcast_arrays(distributions.T[:,:,None], distributions.T[:,None,:])
pairwise_klds1 = entropy(a, b)

def entropy_custom(pk, qk=None, base=None, axis=0):
    """custom version of entropy without shape requirements"""
    pk = np.asarray(pk)
    pk = 1.0*pk / np.sum(pk, axis=axis, keepdims=True)
    if qk is None:
        vec = entr(pk)
    else:
        qk = np.asarray(qk)
        # if qk.shape != pk.shape:
        #     raise ValueError("qk and pk must have same shape.")
        qk = 1.0*qk / np.sum(qk, axis=axis, keepdims=True)
        vec = rel_entr(pk, qk)
    S = np.sum(vec, axis=axis)
    if base is not None:
        S /= np.log(base)
    return S

pairwise_klds2 = entropy_custom(distributions.T[:,:,None], distributions.T[:,None,:])
np.testing.assert_equal(pairwise_klds1, pairwise_klds2)
|
Addressed in gh-13711. |
@mdhaber thanks for the explanation! |
I revisited some older code which involved the calculation of pairwise KLDs. Some years back I had implemented the calculation based on numpy broadcasting. Now I noticed this approach no longer works: since commit 473dd08, scipy.stats.entropy requires the arrays to have identical shapes (previously only identical lengths were required).
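For readers unfamiliar with the pattern, here is a minimal NumPy-only sketch of a pairwise KLD computed by broadcasting. It is not the reporter's original code and does not call `scipy.stats.entropy`; the helper name `pairwise_kld` is hypothetical, and the sketch assumes every probability is strictly positive so that the logarithm is always defined.

```python
import numpy as np

def pairwise_kld(dists):
    """Return K with K[i, j] = D(p_i || p_j) for the rows p_i of `dists`.

    Hypothetical helper for illustration; assumes strictly positive entries.
    """
    p = np.asarray(dists, dtype=float)
    p = p / p.sum(axis=1, keepdims=True)  # normalize each row to sum to 1
    p_i = p[:, None, :]                   # shape (n, 1, k)
    p_j = p[None, :, :]                   # shape (1, n, k)
    # Broadcasting expands both operands to (n, n, k); sum over outcomes.
    return np.sum(p_i * np.log(p_i / p_j), axis=-1)

rng = np.random.default_rng(0)
dists = rng.random((3, 5)) + 0.1          # offset keeps all entries positive
klds = pairwise_kld(dists)
```

The result is an `n x n` matrix whose diagonal is zero, since each distribution has zero divergence from itself.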