Columns and Index share the same numpy object underneath when pd.DataFrame.cov is used #14617
Comments
kapilsh
commented
Nov 8, 2016
In my use case, I am doing something like below:
In [96]: df = pd.DataFrame({"Value": np.random.randn(1000), "Kind": map(chr, np.random.randint(65, 69, 1000))})
In [97]: df.pivot(values="Value", columns="Kind").ffill().diff().cov()
Out[97]:
Kind             A             B             C             D
Kind
A     6.094439e-01  1.864854e-06 -5.956038e-07 -1.130525e-08
B     1.864854e-06  5.643768e-01  1.384354e-06  2.627663e-08
C    -5.956038e-07  1.384354e-06  4.964671e-01 -1.802524e-08
D    -1.130525e-08  2.627663e-08 -1.802524e-08  3.862837e-01
In [98]: cc = df.pivot(values="Value", columns="Kind").ffill().diff().cov()
In [99]: cc.index is cc.columns
Out[99]: True
As a result,
fails.
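For anyone trying this on Python 3, here is a rough equivalent of the session above. The function name `reproduce` and the fixed seed are my own additions for illustration, and the list comprehension replaces the Python 2 `map(chr, ...)` call. On pandas 0.19.1, as reported, the identity check printed at the end comes out `True`; on releases that include the fix it is expected to be `False`.

```python
import numpy as np
import pandas as pd


def reproduce(seed=0):
    """Rebuild the covariance frame from the report (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "Value": rng.standard_normal(1000),
        # Letters A-D; a list comprehension instead of Python 2's map(chr, ...)
        "Kind": [chr(c) for c in rng.integers(65, 69, 1000)],
    })
    # Same chain as in the report: pivot, forward-fill, difference, covariance
    return df.pivot(columns="Kind", values="Value").ffill().diff().cov()


if __name__ == "__main__":
    cc = reproduce()
    # The labels match either way; the reported bug is that they were also
    # the very same object.
    print("equal labels:", cc.index.equals(cc.columns))
    print("same object: ", cc.index is cc.columns)
```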
Yeah, it should shallow-copy the index first rather than setting the same object, so that metadata will not be shared. Want to do a PR?
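To illustrate the point about shared metadata, here is a small sketch at the `Index` level. It is not the actual patch; `demo_shared_vs_copied` is a made-up name, and using `Index.copy()` (shallow by default) is just one way to get a separate object with the same labels.

```python
import pandas as pd


def demo_shared_vs_copied():
    """Contrast aliasing an Index with shallow-copying it (illustrative only)."""
    labels = pd.Index(["A", "B", "C", "D"], name="Kind")

    # Aliasing: both names point at one object, so metadata changes leak.
    alias = labels
    alias.name = "Renamed"
    leaked_name = labels.name  # the original sees the rename

    # Reset, then shallow-copy: a distinct Index object with the same labels.
    labels.name = "Kind"
    shallow = labels.copy()  # deep=False is the default
    shallow.name = "Renamed"

    return leaked_name, labels.name, shallow.equals(labels)


if __name__ == "__main__":
    leaked, kept, same_labels = demo_shared_vs_copied()
    print("name seen through alias:", leaked)
    print("name after copy rename: ", kept)
    print("labels still equal:     ", same_labels)
```

With the copy, renaming one axis leaves the other alone, which is the behaviour being asked for on the result of cov.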
jreback
added Bug Indexing Reshaping Difficulty Novice Numeric Effort Low
labels
Nov 8, 2016
jreback
added this to the
Next Major Release
milestone
Nov 8, 2016
kapilsh
commented
Nov 8, 2016
Sure! I can do a PR. Feel free to assign it to me.
kapilsh
added a commit
to kapilsh/pandas
that referenced
this issue
Nov 16, 2016
kapilsh 428cb37
kapilsh
referenced
this issue
Nov 16, 2016
Closed
BUG: Columns and Index share the same numpy object underneath when pd.DataFrame.cov is used #14667
kapilsh
commented
Nov 16, 2016
@jreback Made the changes to cov and corr.
mroeschke
added a commit
to mroeschke/pandas
that referenced
this issue
Feb 28, 2017
mroeschke 5a46f0a
mroeschke
referenced
this issue
Feb 28, 2017
Closed
Bug:DataFrame index & column returned by corr & cov are the same (#14617) #15528
jorisvandenbossche
modified the milestone: 0.20.0, Next Major Release
Feb 28, 2017
jreback
closed this
in d0a281f
Feb 28, 2017
AnkurDedania
added a commit
to AnkurDedania/pandas
that referenced
this issue
Mar 21, 2017
mroeschke + AnkurDedania 4a3035b
kapilsh commented Nov 8, 2016 • edited
A small, complete example of the issue
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.36.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: None.None
pandas: 0.19.1
nose: 1.3.7
pip: 9.0.0
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext)
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None