corr return without duplicates and sorted by correlation strength #24728
Comments
This sounds like more of a usage question. We recommend using StackOverflow for these types of questions. Issues are reserved for bug tracking and enhancement requests.
@mroeschke I want to make an enhancement. Is this a feature that would be useful?
Oh I see. I'll open it back up for discussion. I could see this as a useful cookbook example in our documentation; we are somewhat hesitant to expand pandas' large API without strong interest from the community.
I agree with @mroeschke; I have used this pattern before myself and don't need it implemented in pandas.
Code Sample, a copy-pastable example if possible
Functionalized example of what I'm seeking to implement in pandas as a corr argument or separate function:
Problem description
pd.DataFrame.corr() returns a table with redundancies: the matrix is symmetric, so each pair of columns appears twice, and the diagonal only holds each column's correlation with itself. I'm interested in implementing an enhancement (as an argument option, a separate function, etc.) that returns a DataFrame without this redundancy, sorted by correlation strength.
Expected Output
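The output described above could look something like the result of the following sketch. It is only an illustration of the pattern, not the code from the original post: the helper name `corr_pairs` and the column names `var_1`, `var_2`, and `corr` are hypothetical and not part of pandas. It assumes all columns are numeric and that "strength" means the absolute value of the coefficient.

```python
import numpy as np
import pandas as pd


def corr_pairs(df, method="pearson"):
    """Hypothetical helper: one row per unique column pair, strongest first."""
    corr = df.corr(method=method)

    # Indices of the strict upper triangle (k=1 skips the diagonal),
    # so each pair is listed once and self-correlations are dropped.
    rows, cols = np.triu_indices(len(corr.columns), k=1)

    pairs = pd.DataFrame(
        {
            "var_1": corr.columns[rows],
            "var_2": corr.columns[cols],
            "corr": corr.to_numpy()[rows, cols],
        }
    )

    # Sort by absolute value so strong negative correlations rank
    # alongside strong positive ones; the sign is kept in the output.
    order = pairs["corr"].abs().sort_values(ascending=False).index
    return pairs.loc[order].reset_index(drop=True)


if __name__ == "__main__":
    example = pd.DataFrame(
        {
            "a": [1, 2, 3, 4, 5],
            "b": [-1, -2, -3, -4, -5],
            "c": [1, 0, 1, 0, 1],
        }
    )
    print(corr_pairs(example))
```

For a frame with n numeric columns this returns n * (n - 1) / 2 rows instead of the n * n cells of the full matrix. Sorting on the absolute value is a design choice; sorting on the signed value would push strong negative correlations to the bottom.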
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.79+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.22.0
pytest: 3.10.1
pip: 18.1
setuptools: 40.6.3
Cython: 0.29.2
numpy: 1.14.6
scipy: 1.1.0
pyarrow: None
xarray: 0.11.2
IPython: 5.5.0
sphinx: 1.8.3
patsy: 0.5.1
dateutil: 2.5.3
pytz: 2018.9
blosc: None
bottleneck: None
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 2.1.2
openpyxl: 2.5.9
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: None
lxml: 4.2.6
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.15
pymysql: None
psycopg2: 2.7.6.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: 0.2.0
fastparquet: None
pandas_gbq: 0.4.1
pandas_datareader: 0.7.0