BUG: pivot() modifies arguments #37635

Closed
3 tasks done
Jacob-Stevens-Haas opened this issue Nov 4, 2020 · 5 comments · Fixed by #37771
Labels: Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

@Jacob-Stevens-Haas
Contributor

Jacob-Stevens-Haas commented Nov 4, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Minimal Working Example

One of the documentation examples, slightly modified:

import pandas as pd

df = pd.DataFrame({
    "lev1": [1, 1, 1, 2, 2, 2],
    "lev2": [1, 1, 2, 1, 1, 2],
    "lev3": [1, 2, 1, 2, 1, 2],
    "lev4": [1, 2, 3, 4, 5, 6],
    "values": [0, 1, 2, 3, 4, 5],
})
a = ["lev1", "lev2"]
b = ["lev3"]
df.pivot(index=a, columns=b)  # no values= argument is passed
print(a)  # `a` has been extended with the contents of `b`

Displays:

['lev1', 'lev2', 'lev3']

Problem description

I would expect the arguments I pass to .pivot to remain unchanged. I understand that lists are mutable, so a function can modify a list it receives as an argument. I also understand that I can pass a set here instead of a list to keep it from being modified. However, I think it's reasonable to expect the function to be free of side effects in this case, especially when the documentation asks for a list and does not mention side effects. (FWIW, this behavior does not occur if I set the values= parameter.)
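To illustrate the kind of side effect described above, here is a minimal pure-Python sketch. The two helper functions are hypothetical and are not pandas code; they only show how extending a caller's list in place differs from extending a copy, under the assumption that the combined key list is built from `index` and `columns`.

```python
def combine_keys_mutating(index, columns):
    """Hypothetical helper: builds the combined key list in place."""
    keys = index          # no copy, so `keys` is the caller's list
    keys.extend(columns)  # this also changes the caller's list
    return keys


def combine_keys_safe(index, columns):
    """Hypothetical helper: same result, built on a fresh list."""
    keys = list(index)    # shallow copy of the caller's list
    keys.extend(columns)
    return keys


a = ["lev1", "lev2"]
b = ["lev3"]

safe_result = combine_keys_safe(a, b)
a_after_safe = list(a)          # snapshot of `a` after the safe call

mutating_result = combine_keys_mutating(a, b)
a_after_mutating = list(a)      # snapshot of `a` after the mutating call

print(a_after_safe)      # ['lev1', 'lev2']
print(a_after_mutating)  # ['lev1', 'lev2', 'lev3']
```

The safe variant costs one shallow copy per call, which is negligible for short lists of column labels.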

The issue cropped up when I wanted to .pivot multiple DataFrames using the same index. To do so, I created a partial function of .pivot with an anonymous list as the index= parameter. Here's an MWE using the DataFrame from above:

from functools import partial
alt_pivot = lambda df, index, columns: df.pivot(index=index, columns=columns)
ppivot = partial(alt_pivot, index=["lev1", "lev2"])
print(ppivot)
ppivot(df, columns=b)
print(ppivot)

produces the output:

functools.partial(<function <lambda> at 0x00000159C45EAEE8>, index=['lev1', 'lev2'])
functools.partial(<function <lambda> at 0x00000159C45EAEE8>, index=['lev1', 'lev2', 'lev3'])
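One possible caller-side workaround, sketched here without pandas, is to wrap the function so that it only ever receives copies of list arguments. The `copy_list_args` decorator and the `mutating_pivot` stand-in are both hypothetical names introduced for illustration; the stand-in merely imitates a call that extends `index` in place.

```python
from functools import partial, wraps


def copy_list_args(func):
    """Hypothetical decorator: pass shallow copies of list arguments."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        args = tuple(list(x) if isinstance(x, list) else x for x in args)
        kwargs = {
            key: (list(value) if isinstance(value, list) else value)
            for key, value in kwargs.items()
        }
        return func(*args, **kwargs)
    return wrapper


def mutating_pivot(data, index, columns):
    """Hypothetical stand-in for a call that extends `index` in place."""
    index.extend(columns)
    return tuple(index)


# The list stored inside the partial is never handed to the function directly.
guarded = partial(copy_list_args(mutating_pivot), index=["lev1", "lev2"])
first = guarded(None, columns=["lev3"])
second = guarded(None, columns=["lev3"])
stored_index = guarded.keywords["index"]

print(first)         # ('lev1', 'lev2', 'lev3')
print(second)        # ('lev1', 'lev2', 'lev3')
print(stored_index)  # ['lev1', 'lev2']
```

Applied to the example above, the same idea would presumably look like `partial(copy_list_args(alt_pivot), index=["lev1", "lev2"])`, so that repeated calls see the same index each time.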

Expected Output

In the MWE above,

print(a)

should produce

['lev1', 'lev2']

Output of pd.show_versions()

(Version in my project. Confirmed same behavior in pandas==1.1.4 and master branch.)

INSTALLED VERSIONS
------------------
commit : d9fff27
python : 3.7.7.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.16299
machine : AMD64
processor : Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.1.0
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.2
setuptools : 41.2.0
Cython : None
pytest : None
hypothesis : None
sphinx : 3.2.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.17.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : 1.3.19
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

Jacob-Stevens-Haas added the Bug and Needs Triage labels Nov 4, 2020
@jreback
Contributor

jreback commented Nov 5, 2020

Try this on master to see if it's fixed; otherwise we would take a PR to fix it.

@ssortman
Contributor

ssortman commented Nov 6, 2020

Hi. I'm new and looking to help out on this project! Is this an issue that I'd be able to pick up and work on?

@Jacob-Stevens-Haas
Contributor Author

Jacob-Stevens-Haas commented Nov 9, 2020

@jreback yeah, confirmed it in master a few days ago. I'll work on a PR.

@ssortman
Contributor

ssortman commented Nov 9, 2020

@Jacob-Stevens-Haas Hey! I'm a student working on a time-sensitive project, and it would help a lot if I could work on this issue. Thanks for the help.

phofl added the Reshaping label and removed the Needs Triage label Nov 10, 2020
@Jacob-Stevens-Haas
Contributor Author

@ssortman I too am a student and new to this, and would like to try to solve it. Please give me a bit of time to solve it.

jreback added this to the 1.2 milestone Nov 13, 2020