Open
Description
Problem description
The current Statsmodels implementation of medcouple runs in O(N^2) time, leading to excessive runtimes and memory issues on large inputs.
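To illustrate where the quadratic cost comes from, here is a minimal pure-Python sketch of a naive medcouple. It is not the statsmodels source; the function name `naive_medcouple` and the tie handling (a sign kernel for values equal to the median) are assumptions made for illustration. The point is that the kernel is evaluated for every pair of one value at or above the median and one value at or below it, so roughly N^2/4 values are computed and held before the final median is taken.

```python
from statistics import median


def naive_medcouple(values):
    """Naive medcouple: median of the kernel over all (upper, lower) pairs."""
    x = sorted(values)
    m = median(x)
    # Centered values at or above the median (ascending, ties with m first).
    upper = [v - m for v in x if v >= m]
    # Centered values at or below the median (reversed so ties with m come first).
    lower = [v - m for v in x if v <= m][::-1]
    # Number of observations exactly equal to the median.
    k = sum(1 for v in upper if v == 0)

    kernel = []
    for i, zp in enumerate(upper):
        for j, zm in enumerate(lower):
            if zp == 0 and zm == 0:
                # Both values tie with the median: use a sign kernel
                # (-1, 0, or +1) instead of the undefined 0/0.
                s = i + j - (k - 1)
                kernel.append((s > 0) - (s < 0))
            else:
                kernel.append((zp + zm) / (zp - zm))
    # len(kernel) == len(upper) * len(lower), i.e. quadratic in N.
    return median(kernel)
```

For a symmetric sample such as `[1, 2, 3, 4, 5]` this returns 0, and negating the data flips the sign of the result.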
Proposed remedy
- I would like to see an efficient medcouple implementation, based on Guy Brys's R code, included in Statsmodels
- The implementation is available on my GitHub (link to repo)
- Guy Brys has granted permission for this in correspondence
- Details follow
Historical context
- Guy Brys authored an R package for efficient medcouple, c. 2004 (link)
- Jordi Gutiérrez Hermoso used that as a reference for a Python 2 implementation, c. 2015 (link)
- There was a conversation about whether to include it in the Python Statsmodels project (link)
- There were concerns because the original reference implementation is licensed under the GNU GPL
- However, as mentioned in that thread, such code may be relicensed with the author's permission
What I did
- Reached out to Guy on LinkedIn (link to profile) to ask for permission
- He granted permission
  - link to permission from guy brys.png in my repo
- Revised Jordi's code for Python 3
- Validated my revised code against the (quadratic) statsmodels implementation
  - Used data from Jordi's repo
  - RMSE was 1.03e-4
    - Much smaller than the statistic's range of [-1, 1]
    - Consistent with implementation-level differences
- Posted the revised code on Github (link to repo)
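The validation step above can be sketched roughly as follows. This is not the script that was actually run; `rmse` and `compare_implementations` are hypothetical helper names, and the two implementations are passed in as plain callables. In practice the reference would be something like `statsmodels.stats.stattools.medcouple` and the candidate would be the revised function from the repo.

```python
import math


def rmse(expected, observed):
    """Root-mean-square error between two equal-length sequences of floats."""
    if len(expected) != len(observed):
        raise ValueError("sequences must have the same length")
    if not expected:
        raise ValueError("sequences must not be empty")
    squared = [(e - o) ** 2 for e, o in zip(expected, observed)]
    return math.sqrt(sum(squared) / len(squared))


def compare_implementations(reference, candidate, datasets):
    """Apply both callables to every dataset and return the RMSE of the results."""
    ref_results = [reference(data) for data in datasets]
    cand_results = [candidate(data) for data in datasets]
    return rmse(ref_results, cand_results)
```

An RMSE that is small relative to the statistic's range of [-1, 1] suggests the two implementations agree up to implementation-level differences.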
Please let me know what else may be needed.