Welcome to the python binding of the Certifiably Optimal RulE ListS (CORELS) algorithm!
CORELS (Certifiably Optimal RulE ListS) is a custom discrete optimization technique for building rule lists over a categorical feature space. Using algorithmic bounds and efficient data structures, our approach produces optimal rule lists on practical problems in seconds.
The CORELS pipeline is simple. Given a dataset matrix of size
n_samples x n_features and a labels vector of size
n_samples, it will compute a rulelist (similar to a series of if-then statements) to predict the labels with the highest accuracy.
More information about the algorithm can be found here
CORELS exists on PyPI, and can be downloaded with
pip install corels
To install from this repo, simply run
pip install . or
python setup.py install from the
Here are some detailed examples of how to install all the dependencies needed, followed by corels itself:
sudo apt install libgmp-dev pip install corels
# Install g++ and gmp /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" brew install g++ gmp pip install corels
Note: Python 2 is currently NOT supported on Windows.
pip install corels
- If you come across an error saying Python version >=3.5 is required, try running
pip install numpybefore again running
pip install corels.
pipdoes not successfully install corels, try using
The docs for this package are hosted on here: https://pycorels.readthedocs.io/
After installing corels, run
pytest (you may have to install it with
pip install pytest first) from the
tests/ folder, where the tests are located.
from corels import * # Load the dataset X, y, _, _ = load_from_csv("data/compas.csv") # Create the model, with 10000 as the maximum number of iterations c = CorelsClassifier(n_iter=10000) # Fit, and score the model on the training set a = c.fit(X, y).score(X, y) # Print the model's accuracy on the training set print(a)
Toy dataset (See picture example above)
from corels import CorelsClassifier # ["loud", "samples"] is the most verbose setting possible C = CorelsClassifier(max_card=2, c=0.0, verbosity=["loud", "samples"]) # 4 samples, 3 features X = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]] y = [1, 0, 0, 1] # Feature names features = ["Mac User", "Likes Pie", "Age < 20"] # Fit the model C.fit(X, y, features=features, prediction_name="Has a dirty computer") # Print the resulting rulelist print(C.rl()) # Predict on the training set print(C.predict(X))
More examples are in the
Email the maintainer at: firstname.lastname@example.org