Use a more helpful error message for invalid correlation methods in DataFrame.corr #22298

dsaxton · 2018-08-12T23:53:02Z

DataFrame.corr currently raises a KeyError when given an invalid correlation method. The proposed change instead raises a ValueError whose message reminds the user of the valid correlation methods.

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.normal(size=(10, 3)))
df.corr(method="turkey")

pep8speaks · 2018-08-12T23:53:08Z

Hello @dsaxton! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on August 14, 2018 at 15:49 Hours UTC

codecov · 2018-08-13T00:59:45Z

Codecov Report

Merging #22298 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #22298      +/-   ##
==========================================
- Coverage   92.08%   92.05%   -0.03%     
==========================================
  Files         169      169              
  Lines       50706    50713       +7     
==========================================
- Hits        46691    46683       -8     
- Misses       4015     4030      +15

Flag	Coverage Δ
#multiple	`90.46% <100%> (-0.03%)`	⬇️
#single	`42.25% <0%> (-0.09%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/frame.py	`97.25% <100%> (+0.01%)`	⬆️
pandas/core/series.py	`93.73% <100%> (+0.01%)`	⬆️
pandas/core/internals/blocks.py	`93.83% <0%> (-0.81%)`	⬇️
pandas/core/dtypes/missing.py	`92.98% <0%> (-0.59%)`	⬇️
pandas/util/testing.py	`85.85% <0%> (-0.21%)`	⬇️
pandas/core/generic.py	`96.44% <0%> (-0.05%)`	⬇️
pandas/plotting/_core.py	`83.48% <0%> (ø)`	⬆️
pandas/core/indexes/multi.py	`95.41% <0%> (+0.07%)`	⬆️
pandas/core/ops.py	`96.71% <0%> (+0.14%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 601d71f...5ba0bc8.

TomAugspurger · 2018-08-13T01:25:02Z

Thanks. Could you add a release note in 0.24.0, and a test to ensure that code is hit for bad inputs?

dsaxton · 2018-08-13T12:43:48Z

I'm working on some additional modifications to the DataFrame correlation methods to address some potential bugs I found and to close another existing issue. Is it okay if I close this pull request and open another with all the changes?

jreback · 2018-08-13T12:49:14Z

no pls don’t

put independent changes in independent PRs

jreback

can you also test / fix this on Series as well.

jreback · 2018-08-14T00:14:20Z

pandas/core/frame.py

@@ -6682,6 +6682,9 @@ def corr(self, method='pearson', min_periods=1):
                        c = corrf(ac, bc)
                    correl[i, j] = c
                    correl[j, i] = c
+        else:
+            raise ValueError("method must be either 'pearson', "
+                             "'spearman' or 'kendall'")


Add the supplied value to the message, e.g. '... {method} was supplied'.format(method=method)
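A minimal sketch of what this suggestion amounts to, assuming the check is pulled into a standalone helper. `check_corr_method` and `VALID_METHODS` are hypothetical names for illustration, not pandas API, and the exact message wording is pieced together from the diff and the comment above rather than taken from the merged code.

```python
# Hypothetical constant: the three method names listed in the diff above.
VALID_METHODS = ("pearson", "spearman", "kendall")


def check_corr_method(method):
    """Hypothetical helper (not pandas API): validate a correlation method name."""
    if method not in VALID_METHODS:
        # Echo the bad value back so the user can spot a typo immediately.
        raise ValueError(
            "method must be either 'pearson', 'spearman' or 'kendall', "
            "'{method}' was supplied".format(method=method)
        )
    return method
```

Calling `check_corr_method("turkey")` raises a `ValueError` that names both the valid options and the value that was passed, which is more actionable than a bare `KeyError`.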

jreback · 2018-08-14T00:14:58Z

pandas/tests/frame/test_analytics.py

@@ -130,6 +130,11 @@ def test_corr_cov_independent_index_column(self):
            assert result.index is not result.columns
            assert result.index.equals(result.columns)

+    def test_corr_invalid_method(self):
+        df = pd.DataFrame(np.random.normal(size=(10, 2)))
+        with pytest.raises(ValueError):


can you check this with an assert_raises_regex (e.g. match the error)

add the gh issue number as a comment

Will do. Is the convention here to use the most specific regex that matches the error message, or just one that is reasonably specific?

reasonable is fine
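The requested test shape can be sketched as follows. This uses the standard library's `unittest.TestCase.assertRaisesRegex` as a stand-in for the `assert_raises_regex` helper mentioned in the review, and `corr_stub` is a hypothetical placeholder for the validation in `DataFrame.corr`, so the block runs without pandas. The message text is an assumption based on the diff in this thread.

```python
import unittest


def corr_stub(method="pearson"):
    """Hypothetical stand-in for the method validation in DataFrame.corr."""
    if method not in ("pearson", "spearman", "kendall"):
        raise ValueError(
            "method must be either 'pearson', 'spearman' or 'kendall', "
            "'{method}' was supplied".format(method=method)
        )
    return method


class TestCorrInvalidMethod(unittest.TestCase):
    def test_corr_invalid_method(self):
        # GH 22298
        # A reasonably specific pattern: the fixed prefix of the message.
        msg = "method must be either 'pearson', 'spearman' or 'kendall'"
        with self.assertRaisesRegex(ValueError, msg):
            corr_stub(method="turkey")
```

Matching on the message, not only the exception type, guards against the test passing because some unrelated `ValueError` was raised earlier in the call.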

jreback

small comments. ping on green.

jreback · 2018-08-14T10:20:49Z

pandas/core/series.py

+        if method in ['pearson', 'spearman', 'kendall']:
+            return nanops.nancorr(this.values, other.values, method=method,
+                                  min_periods=min_periods)
+        else:


you don't need the else here
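The point of this comment is that a branch ending in `return` makes the `else` redundant. A small sketch, with `corr_stub` as a hypothetical stand-in for `Series.corr` that returns a label instead of computing a correlation:

```python
VALID_METHODS = ("pearson", "spearman", "kendall")


def corr_stub(method="pearson"):
    """Hypothetical stand-in for Series.corr; returns a label, not a correlation."""
    if method in VALID_METHODS:
        # Early return: nothing below runs for a valid method.
        return "computed with {method}".format(method=method)
    # No else needed: this line is only reached when the check above failed.
    raise ValueError(
        "method must be either 'pearson', 'spearman' or 'kendall', "
        "'{method}' was supplied".format(method=method)
    )
```

Behavior is identical to the `if`/`else` version; the flat form only removes one level of indentation from the error path.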

jreback · 2018-08-14T10:21:23Z

pandas/tests/frame/test_analytics.py

@@ -130,6 +130,13 @@ def test_corr_cov_independent_index_column(self):
            assert result.index is not result.columns
            assert result.index.equals(result.columns)

+    def test_corr_invalid_method(self):
+        df = pd.DataFrame(np.random.normal(size=(10, 2)))
+        pttrn = ("method must be either 'pearson', 'spearman', "


pttrn -> msg

jreback · 2018-08-14T10:21:34Z

pandas/tests/frame/test_analytics.py

@@ -130,6 +130,13 @@ def test_corr_cov_independent_index_column(self):
            assert result.index is not result.columns
            assert result.index.equals(result.columns)

+    def test_corr_invalid_method(self):
+        df = pd.DataFrame(np.random.normal(size=(10, 2)))


can you add the gh issue number as a comment

jreback · 2018-08-14T10:21:40Z

pandas/tests/series/test_analytics.py

@@ -778,6 +778,14 @@ def test_corr_rank(self):
        tm.assert_almost_equal(A.corr(B, method='kendall'), kexp)
        tm.assert_almost_equal(A.corr(B, method='spearman'), sexp)

+    def test_corr_invalid_method(self):
+        s1 = pd.Series(np.random.randn(10))


jreback · 2018-08-14T10:21:51Z

pandas/tests/series/test_analytics.py

+    def test_corr_invalid_method(self):
+        s1 = pd.Series(np.random.randn(10))
+        s2 = pd.Series(np.random.randn(10))
+        pttrn = ("method must be either 'pearson', 'spearman', "


pttrn -> msg

jreback · 2018-08-14T10:22:04Z

doc/source/whatsnew/v0.24.0.txt

@@ -468,6 +468,7 @@ Other API Changes
 - :meth:`PeriodIndex.tz_convert` and :meth:`PeriodIndex.tz_localize` have been removed (:issue:`21781`)
 - :class:`Index` subtraction will attempt to operate element-wise instead of raising ``TypeError`` (:issue:`19369`)
 - :class:`pandas.io.formats.style.Styler` supports a ``number-format`` property when using :meth:`~pandas.io.formats.style.Styler.to_excel` (:issue:`22015`)
+- :meth:`DataFrame.corr` now raises a ``ValueError`` instead of a ``KeyError`` when supplied with an invalid method.


can you add the issue number

use the PR number as we don't have an issue number

dsaxton · 2018-08-15T03:45:24Z

@jreback Looks like the tests passed. Let me know if there are any other changes that should be made; if not, thanks for all your help.

TomAugspurger · 2018-08-18T19:48:13Z

Thanks @dsaxton!

dsaxton · 2018-08-18T19:53:34Z

@TomAugspurger Happy to help!

more helpful error message for invalid correlation type

e5fc074

placate pep8

ddc0f14

TomAugspurger added this to the 0.24.0 milestone Aug 13, 2018

gfyoung added the Error Reporting label (incorrect or improved errors from pandas) Aug 13, 2018

daniel saxton and others added 4 commits August 13, 2018 09:02

add invalid method test for DataFrame.corr

0fce19f

try to fix test

6490af3

add to release notes for DataFrame.corr error type

97da60c

Merge branch 'master' of https://github.com/pandas-dev/pandas

156dcac

jreback requested changes Aug 14, 2018

add error for Series, format error messages, edit tests

e7d49a1

jreback requested changes Aug 14, 2018

remove else from Series.corr, change pttrn to msg in tests, add PR number to tests and release notes

5ba0bc8

TomAugspurger merged commit 92dcf5f into pandas-dev:master Aug 18, 2018

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018

ERR: Error message for invalid method in DataFrame.corr (pandas-dev#22298)

9ca4397
