Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: DataFrame.melt gives unexpected result when column "value" already exists #35003

Merged
merged 28 commits into from Jul 2, 2020

Conversation

pizzathief
Copy link
Contributor

@pizzathief
Copy link
Contributor Author

Linting .py code
16
##[error]./pandas/core/apply.py:9:1:F401:'pandas._libs.reduction as libreduction' imported but unused

should I remove this line as part of this PR? "libreduction" isn't mentioned anywhere else in the file pandas/core/apply.py , but on the other hand, its out of scope

@jreback jreback changed the title Issue34731 1 BUG: DataFrame.melt gives unexpected result when column "value" already exists Jun 26, 2020
@WillAyd
Copy link
Member

WillAyd commented Jun 26, 2020

Can you merge master to fix CI issue?

# a name in the dataframe already (default name is "value")
df = pd.DataFrame({"col": list("ABC"), "value": range(10, 16, 2)})

with warnings.catch_warnings(record=True) as w:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use tm.assert_produces_warning here (e.g., as in L580 of this file)?

@@ -40,6 +41,14 @@ def melt(
else:
cols = list(frame.columns)

if value_name in frame.columns:
warnings.warn(
'The value column name "%s" conflicts with an existing '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use f-string syntax? We don't use the Py2 syntax any more

warnings.warn(
'The value column name "%s" conflicts with an existing '
"column in the dataframe." % (value_name),
DeprecationWarning,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change this to a FutureWarning?

@@ -1105,6 +1105,7 @@ Reshaping
- Fixed bug in :func:`melt` where melting MultiIndex columns with ``col_level`` > 0 would raise a ``KeyError`` on ``id_vars`` (:issue:`34129`)
- Bug in :meth:`Series.where` with an empty Series and empty ``cond`` having non-bool dtype (:issue:`34592`)
- Fixed regression where :meth:`DataFrame.apply` would raise ``ValueError`` for elements whth ``S`` dtype (:issue:`34529`)
- Issue warning if value column name already exists when using :meth:`DataFrame.melt` (:issue:`34731`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to the Deprecations section?

if value_name in frame.columns:
warnings.warn(
'The value column name "%s" conflicts with an existing '
"column in the dataframe." % (value_name),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we want to give users insight as to what they need to do before the deprecated behavior gets changed - can you try to clarify the message a bit?

@WillAyd WillAyd added the Deprecate Functionality to remove in pandas label Jun 29, 2020
@jreback jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Jun 30, 2020
@jreback jreback added this to the 1.1 milestone Jun 30, 2020
@@ -816,6 +816,7 @@ Deprecations
- :meth:`util.testing.assert_almost_equal` now accepts both relative and absolute
precision through the ``rtol``, and ``atol`` parameters, thus deprecating the
``check_less_precise`` parameter. (:issue:`13357`).
- Issue warning if value column name already exists when using :meth:`DataFrame.melt` (:issue:`34731`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you start with DataFrame.melt (like the other notes above), and also say that this is deprecated and removed in future version.

@@ -40,6 +41,16 @@ def melt(
else:
cols = list(frame.columns)

if value_name in frame.columns:
warnings.warn(
"This dataframe has a column name that matches the value column "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put value in single quotes as this is an argument that the user passed. (and this should be value_name)

if value_name in frame.columns:
warnings.warn(
"This dataframe has a column name that matches the value column "
f'name of the resultant melted dataframe (That being "{value_name})". '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resultant -> resuling DataFrame. you don't need the parenthesized expression.

warnings.warn(
"This dataframe has a column name that matches the value column "
f'name of the resultant melted dataframe (That being "{value_name})". '
"In the future this will raise an error, please set the value_name "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value_name in single quotes

df = pd.DataFrame({"col": list("ABC"), "value": range(10, 16, 2)})

with tm.assert_produces_warning(FutureWarning):
dfm = df.melt(id_vars="value") # noqa F841
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you create an expected value and use assert_frame_equal

also instead of dfm, call this result.

@jreback jreback merged commit 997c914 into pandas-dev:master Jul 2, 2020
@jreback
Copy link
Contributor

jreback commented Jul 2, 2020

thanks @pizzathief

@jbrockmendel
Copy link
Member

@pizzathief want to make a PR to enforce this deprecation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame.melt gives unexpected result when column "value" already exists
5 participants