PR pandas-dev#22761: add cookbook entry for callable correlation meth…
Daniel Saxton authored and tm9k1 committed Nov 19, 2018
1 parent 35a1dbc commit af092d0
Showing 1 changed file with 36 additions and 0 deletions.
36 changes: 36 additions & 0 deletions doc/source/cookbook.rst
@@ -1223,6 +1223,42 @@ Computation
`Numerical integration (sample-based) of a time series
<http://nbviewer.ipython.org/5720498>`__

Correlation
***********

The `method` argument within `DataFrame.corr` can accept a callable in addition to the named correlation types. Here we compute the `distance correlation <https://en.wikipedia.org/wiki/Distance_correlation>`__ matrix for a `DataFrame` object.

.. ipython:: python

   def distcorr(x, y):
       n = len(x)
       a = np.zeros(shape=(n, n))
       b = np.zeros(shape=(n, n))

       for i in range(n):
           for j in range(i + 1, n):
               a[i, j] = abs(x[i] - x[j])
               b[i, j] = abs(y[i] - y[j])

       a += a.T
       b += b.T

       a_bar = np.vstack([np.nanmean(a, axis=0)] * n)
       b_bar = np.vstack([np.nanmean(b, axis=0)] * n)

       A = a - a_bar - a_bar.T + np.full(shape=(n, n), fill_value=a_bar.mean())
       B = b - b_bar - b_bar.T + np.full(shape=(n, n), fill_value=b_bar.mean())

       cov_ab = np.sqrt(np.nansum(A * B)) / n
       std_a = np.sqrt(np.sqrt(np.nansum(A**2)) / n)
       std_b = np.sqrt(np.sqrt(np.nansum(B**2)) / n)

       return cov_ab / std_a / std_b

   df = pd.DataFrame(np.random.normal(size=(100, 3)))
   df.corr(method=distcorr)

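
The loop-based `distcorr` above can also be written with NumPy broadcasting. The following is a minimal sketch and is not part of this commit: the name `distcorr_vectorized`, the clamp against negative rounding error, and the seeded sample data are assumptions made for illustration. It relies on the callable receiving two 1-D arrays and returning a single float.

```python
import numpy as np
import pandas as pd


def distcorr_vectorized(x, y):
    """Sample distance correlation of two 1-D arrays (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Pairwise absolute differences, built by broadcasting instead of loops.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])

    # Double centering: subtract column means and row means, add the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    # Squared distance covariance and squared distance variances.
    dcov2 = max((A * B).mean(), 0.0)  # clamp tiny negative rounding error
    dvar_x2 = (A * A).mean()
    dvar_y2 = (B * B).mean()

    # A constant input gives a zero variance here and therefore a NaN result.
    return np.sqrt(dcov2 / np.sqrt(dvar_x2 * dvar_y2))


# Seeded data so that repeated runs produce the same matrix.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)))
result = df.corr(method=distcorr_vectorized)
print(result)
```

The broadcast form allocates the same `n x n` distance matrices as the loop version, so memory use still grows quadratically with the number of rows. For any callable, `DataFrame.corr` sets the diagonal to 1 and mirrors the result, so the function is only evaluated for distinct column pairs.
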
Timedeltas
----------
