
ucc-pandas

An implementation of the Universal Correlation Coefficient in Python via Pandas

Here is a high-level overview, based on the R library that I wrote to compute the UCC. In a nutshell, for two discrete random variables, the UCC indicates whether there is a relationship between them, including a possibly non-linear one.
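To see why a measure that also catches non-linear relationships can be useful, here is a small illustration. It does not compute the UCC; it only uses standard pandas calls, on a made-up toy data set, to show a pair of discrete variables where `y` is completely determined by `x` and yet the ordinary Pearson correlation is essentially zero.

```python
import pandas as pd

# Toy data (made up for illustration): y is an exact, non-linear function of x.
df = pd.DataFrame({"x": [-2, -1, 0, 1, 2]})
df["y"] = df["x"] ** 2

# Pearson correlation only measures linear association.
pearson = df["x"].corr(df["y"])

# Check that each x value maps to exactly one y value,
# i.e. y is fully determined by x.
values_per_x = df.groupby("x")["y"].nunique()
is_function_of_x = bool((values_per_x == 1).all())

print(f"Pearson correlation: {pearson:.3f}")
print(f"y determined by x:   {is_function_of_x}")
```

The linear coefficient misses this dependence entirely, which is the kind of case a coefficient aimed at general relationships is meant to flag.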

TODO

  • Include tests and examples
  • Extend to computing UCCs for pairs of columns from a given list
    • Also, allow for automatic output of scatterplots for pairs having UCC >= a given threshold
  • Print modes:
    • Pretty print mode (for interactive use)
    • CSV output mode (for dumping to a file for later use)
  • Figure out how to make a proper pip package out of this (with setup.py and all that happy stuff).
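As a rough sketch of how the pairwise and print-mode items above might fit together, the snippet below scores every pair of columns from a list, filters by a threshold, and renders the result either as a table or as CSV. Everything here is an assumption for illustration: `pairwise_scores`, `render`, and `placeholder_score` are hypothetical names, not part of ucc-pandas, and `placeholder_score` is a trivial stand-in rather than the UCC. The real coefficient would be passed in as `score_fn`.

```python
from itertools import combinations

import pandas as pd


def pairwise_scores(df, columns, score_fn):
    """Score every unordered pair of the given columns.

    score_fn is any callable taking two Series and returning a number;
    the actual UCC function would be plugged in here (hypothetical API).
    """
    rows = [
        {"col_a": a, "col_b": b, "score": score_fn(df[a], df[b])}
        for a, b in combinations(columns, 2)
    ]
    return pd.DataFrame(rows, columns=["col_a", "col_b", "score"])


def render(scores, mode="pretty"):
    """Return the scores as text: aligned table or CSV."""
    if mode == "pretty":
        return scores.to_string(index=False)
    if mode == "csv":
        return scores.to_csv(index=False)
    raise ValueError(f"unknown mode: {mode!r}")


def placeholder_score(a, b):
    """Stand-in only, NOT the UCC: fraction of rows where the columns agree."""
    return float((a == b).mean())


# Made-up example data.
df = pd.DataFrame({"x": [0, 1, 1, 0], "y": [0, 1, 1, 0], "z": [1, 1, 0, 0]})

scores = pairwise_scores(df, ["x", "y", "z"], placeholder_score)
above = scores[scores["score"] >= 0.75]  # keep pairs at or above a threshold

print(render(above, mode="pretty"))  # interactive use
csv_text = render(scores, mode="csv")  # text suitable for writing to a file
```

Passing the scoring function as an argument keeps the pair enumeration and output code independent of how the coefficient itself is computed.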