Melting with not present column does not produce error #23575

Merged: 34 commits, Nov 21, 2018
Changes from 10 commits
Commits
855985d
check for columns in dataframe
michaelsilverstein Nov 8, 2018
40fdb05
check for columns in dataframe
michaelsilverstein Nov 8, 2018
9670da2
check difference with Index; use {} str formatting
michaelsilverstein Nov 13, 2018
3ffc870
missing.any()
michaelsilverstein Nov 13, 2018
8139f78
started test
michaelsilverstein Nov 13, 2018
0a94650
added to whatsnew
michaelsilverstein Nov 13, 2018
d0f6d23
PEP criteria
michaelsilverstein Nov 13, 2018
6c76161
`missing.empty` to accommodate MultiIndex
michaelsilverstein Nov 13, 2018
ad3d926
rm `*`
michaelsilverstein Nov 13, 2018
e097a87
rm comment
michaelsilverstein Nov 13, 2018
5ff3a32
add test for id_var and multiple missing
michaelsilverstein Nov 13, 2018
fcbda15
reformat error statement; Value->KeyError
michaelsilverstein Nov 13, 2018
3175b34
simplified test
michaelsilverstein Nov 13, 2018
515fb9f
Issue -> GH
michaelsilverstein Nov 13, 2018
c7d6fcf
PEP criteria
michaelsilverstein Nov 13, 2018
5911cc3
PEP criteria
michaelsilverstein Nov 13, 2018
47ca7fc
test not working now
michaelsilverstein Nov 13, 2018
d0ee9c5
regex compatible match
michaelsilverstein Nov 14, 2018
c75ab23
PEP criteria
michaelsilverstein Nov 14, 2018
32ed22c
move test to TestMelt() class
michaelsilverstein Nov 14, 2018
e629b2a
PEP
michaelsilverstein Nov 14, 2018
89de406
PEP
michaelsilverstein Nov 14, 2018
1d13f4a
PEP
michaelsilverstein Nov 14, 2018
479b761
Merge branch 'master' into dev_melt_column_check
michaelsilverstein Nov 14, 2018
01e8d74
resolving conflicts
michaelsilverstein Nov 15, 2018
6762b21
Merge branch 'master' of https://github.com/pandas-dev/pandas into de…
michaelsilverstein Nov 15, 2018
eae7716
Merge branch 'master' of https://github.com/pandas-dev/pandas into de…
michaelsilverstein Nov 15, 2018
fba641f
handle multiindex columns
michaelsilverstein Nov 15, 2018
06b7cdb
test single var melt with multiindex
michaelsilverstein Nov 15, 2018
39c746b
test single var melt with multiindex
michaelsilverstein Nov 15, 2018
af170e1
pep8 and index sorting
michaelsilverstein Nov 16, 2018
4c9bc9f
rm extra description
michaelsilverstein Nov 21, 2018
c59d29f
add comment
michaelsilverstein Nov 21, 2018
0db8838
add MI tests
michaelsilverstein Nov 21, 2018
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.24.0.txt
@@ -1344,6 +1344,7 @@ Reshaping
- Bug in :func:`pandas.concat` when concatenating a multicolumn DataFrame with tz-aware data against a DataFrame with a different number of columns (:issue`22796`)
- Bug in :func:`merge_asof` where confusing error message raised when attempting to merge with missing values (:issue:`23189`)
- Bug in :meth:`DataFrame.nsmallest` and :meth:`DataFrame.nlargest` for dataframes that have :class:`MultiIndex`ed columns (:issue:`23033`).
- Bug in :func:`pandas.melt` when passing column names that do not exist in the DataFrame (:issue:`23575`)

.. _whatsnew_0240.bug_fixes.sparse:

11 changes: 11 additions & 0 deletions pandas/core/reshape/melt.py
@@ -12,6 +12,7 @@

from pandas import compat
from pandas.core.arrays import Categorical
from pandas.core.indexes.base import Index
from pandas.core.frame import _shared_docs
from pandas.core.reshape.concat import concat
from pandas.core.tools.numeric import to_numeric
@@ -32,7 +33,12 @@ def melt(frame, id_vars=None, value_vars=None, var_name=None,
            raise ValueError('id_vars must be a list of tuples when columns'
                             ' are a MultiIndex')
        else:
            # Check that `id_vars` are in frame
            id_vars = list(id_vars)
            missing = Index(id_vars).difference(frame.columns)
            if not missing.empty:
                raise ValueError('Columns {missing} are not in'
                                 ' dataframe'.format(missing=missing))
    else:
        id_vars = []

@@ -45,6 +51,11 @@ def melt(frame, id_vars=None, value_vars=None, var_name=None,
                             ' columns are a MultiIndex')
        else:
            value_vars = list(value_vars)
            # Check that `value_vars` are in frame
            missing = Index(value_vars).difference(frame.columns)
            if not missing.empty:
                raise ValueError('Columns {missing} are not in'
                                 ' dataframe'.format(missing=missing))
        frame = frame.loc[:, id_vars + value_vars]
    else:
        frame = frame.copy()
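
The check added in both hunks can be tried in isolation. The following is a minimal sketch, not the code in this PR: `find_missing_columns` and the sample frame are hypothetical names introduced here for illustration, and only `Index.difference` and `Index.empty` are taken from the diff.

```python
import pandas as pd


def find_missing_columns(frame, names):
    """Return the labels in `names` that are not columns of `frame`.

    Hypothetical helper mirroring the `Index(...).difference(frame.columns)`
    check in the diff; it is not part of the pandas API.
    """
    return pd.Index(names).difference(frame.columns)


# Small illustrative frame (made-up data).
df = pd.DataFrame({'Name': ['Susie'], 'burgers': [1], 'fries': [2]})

# 'Burgers' differs from 'burgers' by case, so it is reported as missing.
missing = find_missing_columns(df, ['Burgers', 'fries'])
if not missing.empty:
    print('Columns {missing} are not in dataframe'.format(
        missing=list(missing)))
```

Because the comparison is an exact label match, a name that differs only by case is treated as absent, which is the situation the new test exercises.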
17 changes: 17 additions & 0 deletions pandas/tests/reshape/test_melt.py
@@ -661,3 +661,20 @@ def test_col_substring_of_stubname(self):
                                 i=['node_id', 'A'],
                                 j='time')
        tm.assert_frame_equal(result, expected)

    def test_melt_missing_columns(self):
        # Addresses issue #23575
        # This test is to ensure that pandas raises an error if melting is
        # attempted with column names absent from the dataframe

        # Generate data
        people = ['Susie', 'Alejandro']
        day = ['Monday', 'Tuesday', 'Wednesday']
        cols = ['burgers', 'fries']
        data = [[person, d] + list(np.random.randint(0, 5, len(cols)))
                for person in people for d in day]
        df = pd.DataFrame(data, columns=['Name', 'day', 'burgers', 'fries'])

        # Try to melt with missing column name
        with pytest.raises(ValueError):
            df.melt(['Name', 'day'], ['Burgers', 'fries'])
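
For trying the scenario outside the test suite, here is a rough sketch with fixed data instead of `np.random`. `melt_or_report` is a hypothetical helper, not part of pandas or of this PR. It catches both `ValueError` (raised in the diff above) and `KeyError` (named in the later commit "reformat error statement; Value->KeyError"), since which one is raised depends on the installed pandas version. On a version without this fix, the second call is assumed to return a frame rather than be rejected.

```python
import pandas as pd

# Fixed, made-up data shaped like the frame built in the test.
df = pd.DataFrame({
    'Name': ['Susie', 'Alejandro'],
    'day': ['Monday', 'Tuesday'],
    'burgers': [1, 3],
    'fries': [2, 0],
})


def melt_or_report(frame, id_vars, value_vars):
    """Melt `frame`, or return a short message if pandas rejects the names.

    Hypothetical wrapper for illustration only.
    """
    try:
        return frame.melt(id_vars, value_vars)
    except (KeyError, ValueError) as err:
        return 'rejected: {}'.format(type(err).__name__)


# All names exist: one output row per (input row, value column) pair.
good = melt_or_report(df, ['Name', 'day'], ['burgers', 'fries'])

# 'Burgers' is not a column; with the fix in place this is rejected.
bad = melt_or_report(df, ['Name', 'day'], ['Burgers', 'fries'])

print(good)
print(bad)
```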