Robustats is a Python library for high-performance computation of robust statistical estimators.
The functions that compute the robust estimators are implemented in C for speed and called by Python.
Estimators implemented in the library:
- Weighted Median (time complexity: O(n)) [1, 2, 3]
- Medcouple (time complexity: O(n * log(n))) [4, 5, 6, 7]
- Mode (time complexity: O(n * log(n))) [8]
This library requires Python 3.
You can install the library using pip.
pip install robustats
You can also install the library directly from GitHub using the following command.
pip install -e 'git+https://github.com/FilippoBovo/robustats.git#egg=robustats'
Alternatively, clone the repository, then install and test the Robustats package as follows.
git clone https://github.com/FilippoBovo/robustats.git
cd robustats
pip install -e .
python -m unittest
This is an example of how to use the Robustats library in Python.
import numpy as np
import robustats
# Weighted Median
x = np.array([1.1, 5.3, 3.7, 2.1, 7.0, 9.9])
weights = np.array([1.1, 0.4, 2.1, 3.5, 1.2, 0.8])
weighted_median = robustats.weighted_median(x, weights)
print("The weighted median is {}".format(weighted_median))
# Output: The weighted median is 2.1
# Medcouple
x = np.array([0.2, 0.17, 0.08, 0.16, 0.88, 0.86, 0.09, 0.54, 0.27, 0.14])
medcouple = robustats.medcouple(x)
print("The medcouple is {}".format(medcouple))
# Output: The medcouple is 0.7749999999999999
# Mode
x = np.array([1., 2., 2., 3., 3., 3., 4., 4., 5.])
mode = robustats.mode(x)
print("The mode is {}".format(mode))
# Output: The mode is 3.0
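To cross-check the weighted median result from the example above, here is a minimal pure-NumPy sketch that computes the lower weighted median by sorting. It is not the algorithm used by Robustats: the library's C implementation runs in O(n), while this sketch is O(n * log(n)). The function name weighted_median_by_sorting is hypothetical, non-negative weights are assumed, and the handling of ties may differ from the library's.

```python
import numpy as np


def weighted_median_by_sorting(x, weights):
    """Lower weighted median via a full sort (hypothetical helper, not part of Robustats)."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if x.ndim != 1 or x.size == 0 or x.shape != weights.shape:
        raise ValueError("x and weights must be non-empty 1-D arrays of equal length")

    # Sort the values and carry the weights along with them.
    order = np.argsort(x)
    x_sorted = x[order]
    cumulative = np.cumsum(weights[order])

    # Pick the first value whose cumulative weight reaches half of the total weight.
    # Assumes non-negative weights, so the cumulative sum is non-decreasing.
    half = 0.5 * cumulative[-1]
    index = np.searchsorted(cumulative, half, side="left")
    return x_sorted[index]


x = np.array([1.1, 5.3, 3.7, 2.1, 7.0, 9.9])
weights = np.array([1.1, 0.4, 2.1, 3.5, 1.2, 0.8])
print("Weighted median by sorting: {}".format(weighted_median_by_sorting(x, weights)))
# Output: Weighted median by sorting: 2.1
```

The sorted values 1.1 and 2.1 carry weights 1.1 and 3.5, whose running total (4.6) is the first to reach half of the total weight (9.1 / 2 = 4.55), so the sketch returns 2.1, in agreement with the output shown above.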
If you wish to contribute to this library, please follow the patterns and style of the rest of the code.
Also install the Git hooks with the following command.
git config core.hooksPath .githooks
Tips:
- In C, use malloc to allocate arrays on the heap instead of declaring them on the stack. With large arrays, stack allocation can overflow the stack and cause a segmentation fault.
- Avoid recursion where possible to limit the space complexity. Use loops instead.
[1] Cormen, Leiserson, Rivest, Stein - Introduction to Algorithms (3rd Edition).
[2] Cormen - Introduction to Algorithms (3rd Edition) - Instructor's Manual.
[3] Weighted median on Wikipedia.
[6] Medcouple implementation in Python by Jordi Gutiérrez Hermoso.