pyIDS

pyIDS is a custom implementation of the IDS (Interpretable Decision Sets) algorithm introduced in:

LAKKARAJU, Himabindu; BACH, Stephen H.; LESKOVEC, Jure. Interpretable decision sets: A joint framework for description and prediction. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, 2016. p. 1675-1684.

If you find this package useful in your research, please cite our paper describing this implementation:

Jiri Filip, Tomas Kliegr. PyIDS - Python Implementation of Interpretable Decision Sets Algorithm by Lakkaraju et al, 2016. RuleML+RR2019@Rule Challenge 2019. http://ceur-ws.org/Vol-2438/paper8.pdf

Installation

The pyarc, pandas, scipy and numpy packages need to be installed before using pyIDS.

All of these packages can be installed using pip.

For pyarc, please refer to the Installation section of its README file.
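Before running the examples, it can help to confirm that the four dependencies are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and is not part of pyIDS.

```python
import importlib.util

# Packages listed above as prerequisites for pyIDS.
REQUIRED = ["pyarc", "pandas", "scipy", "numpy"]


def missing_packages(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All pyIDS dependencies are importable.")
```

Any package reported as missing can then be installed with pip, following the pyarc README for pyarc itself.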

Examples

training a simple IDS model

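import pandas as pd
from pyids.data_structures.ids_classifier import IDS, mine_CARs

from pyarc.qcba.data_structures import QuantitativeDataFrame


df = pd.read_csv("./data/iris0.csv")

# Mine candidate class association rules (CARs); rule_cutoff caps how many are kept.
cars = mine_CARs(df, rule_cutoff=50)

# One weight per term of the IDS objective function (seven terms in the original paper).
lambda_array = [1, 1, 1, 1, 1, 1, 1]

quant_dataframe = QuantitativeDataFrame(df)

ids = IDS()
ids.fit(quant_dataframe=quant_dataframe, class_association_rules=cars, lambda_array=lambda_array, debug=False)

# Score the fitted model on the same data it was trained on.
acc = ids.score(quant_dataframe)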
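The seven entries of `lambda_array` correspond to the seven weighted terms of the objective function in Lakkaraju et al. (2016), which are combined as a weighted sum. The sketch below illustrates only that combination. The function name `weighted_objective` and the example values are hypothetical, and the order in which pyIDS maps weights to terms is assumed, not confirmed, to follow the paper.

```python
def weighted_objective(objective_values, lambda_array):
    """Combine per-term objective values into one score as a weighted sum."""
    if len(objective_values) != len(lambda_array):
        raise ValueError("need exactly one lambda weight per objective term")
    return sum(lam * value for lam, value in zip(lambda_array, objective_values))


# Hypothetical values for the seven objective terms (illustration only).
example_values = [10.0, 20.0, 5.0, 5.0, 3.0, 40.0, 45.0]

# With all weights set to 1, every term contributes equally.
equal_weights = weighted_objective(example_values, [1, 1, 1, 1, 1, 1, 1])

# Raising one weight makes the corresponding term count for more.
skewed_weights = weighted_objective(example_values, [2, 0, 0, 0, 0, 0, 1])
```

Setting all weights to 1, as in the example above, treats every term as equally important.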

training a One-vs-all IDS model

import pandas as pd
from pyids.data_structures.ids_classifier import IDSOneVsAll, mine_CARs

from pyarc.qcba.data_structures import QuantitativeDataFrame


df = pd.read_csv("./data/iris0.csv")

quant_dataframe = QuantitativeDataFrame(df)

ids = IDSOneVsAll()
ids.fit(quant_dataframe=quant_dataframe, debug=False)

acc = ids.score_auc(quant_dataframe)
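A one-vs-all model treats a multi-class problem as a set of binary problems, one per class. The following is a minimal sketch of that decomposition in plain Python. The helper `one_vs_all_targets` is hypothetical and shows the general idea only, not how `IDSOneVsAll` is implemented internally.

```python
def one_vs_all_targets(labels):
    """Map each class to a binary target list: 1 for that class, 0 for all others."""
    classes = sorted(set(labels))
    return {
        cls: [1 if label == cls else 0 for label in labels]
        for cls in classes
    }


# Hypothetical class labels for four rows of an iris-like dataset.
labels = ["setosa", "versicolor", "setosa", "virginica"]
targets = one_vs_all_targets(labels)
```

Each binary target list can be used to train and score one binary classifier, for example with AUC.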

optimizing for best lambda parameters using coordinate ascent, as described in the original paper

import pandas as pd
from pyids.data_structures.ids_classifier import IDS, mine_IDS_ruleset
from pyids.model_selection import CoordinateAscentOptimizer, train_test_split_pd

from pyarc.qcba.data_structures import QuantitativeDataFrame


df = pd.read_csv("./data/titanic.csv")
df_train, df_test = train_test_split_pd(df, prop=0.2)

ids_ruleset = mine_IDS_ruleset(df_train, rule_cutoff=50)

quant_dataframe_train = QuantitativeDataFrame(df_train)
quant_dataframe_test = QuantitativeDataFrame(df_test)

coordinate_ascent = CoordinateAscentOptimizer(IDS(), debug=True, maximum_delta_between_iterations=200, maximum_score_estimation_iterations=3)
coordinate_ascent.fit(ids_ruleset, quant_dataframe_train, quant_dataframe_test)

best_lambda_array = coordinate_ascent.current_best_params
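Coordinate ascent improves one parameter at a time while holding the others fixed, and repeats until no single change helps. The sketch below shows the generic technique on a toy score function. The function `coordinate_ascent_sketch`, the candidate grid, and the toy score are all hypothetical; it does not model `CoordinateAscentOptimizer` options such as `maximum_delta_between_iterations`.

```python
def coordinate_ascent_sketch(score_fn, initial_params, candidate_values, max_rounds=10):
    """Maximize score_fn by changing one coordinate at a time over a fixed grid."""
    params = list(initial_params)
    best_score = score_fn(params)
    for _ in range(max_rounds):
        improved = False
        for i in range(len(params)):
            for value in candidate_values:
                # Try a new value for coordinate i, keeping the others fixed.
                trial = params[:i] + [value] + params[i + 1:]
                trial_score = score_fn(trial)
                if trial_score > best_score:
                    params, best_score = trial, trial_score
                    improved = True
        if not improved:
            # A full pass found no better value for any coordinate.
            break
    return params, best_score


# Toy score with a known optimum at [3, 1, 2] (stands in for a validation score).
target = [3, 1, 2]


def toy_score(params):
    return -sum((p - t) ** 2 for p, t in zip(params, target))


best_params, best_score = coordinate_ascent_sketch(toy_score, [0, 0, 0], [0, 1, 2, 3, 4])
```

In the pyIDS setting, the score function would be the model's performance on held-out data for a given lambda array, which is far more expensive to evaluate than this toy function.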

or optimizing a One-vs-all IDS model

import pandas as pd
from pyids.data_structures.ids_classifier import IDSOneVsAll, mine_IDS_ruleset
from pyids.model_selection import CoordinateAscentOptimizer, train_test_split_pd

from pyarc.qcba.data_structures import QuantitativeDataFrame


df = pd.read_csv("./data/iris0.csv")
df_train, df_test = train_test_split_pd(df, prop=0.2)

ids_ruleset = mine_IDS_ruleset(df_train, rule_cutoff=50)

quant_dataframe_train = QuantitativeDataFrame(df_train)
quant_dataframe_test = QuantitativeDataFrame(df_test)

coordinate_ascent = CoordinateAscentOptimizer(IDSOneVsAll(), debug=True, maximum_delta_between_iterations=200, maximum_score_estimation_iterations=3)
coordinate_ascent.fit(ids_ruleset, quant_dataframe_train, quant_dataframe_test)

best_lambda_array = coordinate_ascent.current_best_params

using k-fold cross validation with AUC score

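import pandas as pd
from pyids.data_structures.ids_classifier import IDSOneVsAll
# Assumption: KFoldCV is exposed by pyids.model_selection, like the other model-selection helpers.
from pyids.model_selection import KFoldCV

# One dataframe per pre-split file, iris0.csv through iris9.csv.
dataframes = [pd.read_csv("./data/iris{}.csv".format(i)) for i in range(10)]

kfold = KFoldCV(IDSOneVsAll(), dataframes, score_auc=True)
scores = kfold.fit(rule_cutoff=50)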

using k-fold cross validation with accuracy score

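import pandas as pd
from pyids.data_structures.ids_classifier import IDS
# Assumption: KFoldCV is exposed by pyids.model_selection, like the other model-selection helpers.
from pyids.model_selection import KFoldCV

# One dataframe per pre-split file, iris0.csv through iris9.csv.
dataframes = [pd.read_csv("./data/iris{}.csv".format(i)) for i in range(10)]

kfold = KFoldCV(IDS(), dataframes)
scores = kfold.fit(rule_cutoff=50)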
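In k-fold cross validation, each fold is held out once for testing while the remaining folds are used for training. The following is a minimal sketch of that rotation over a list of pre-split folds. The generator `k_fold_splits` is hypothetical, and whether `KFoldCV` treats the list of dataframes in exactly this way is an assumption.

```python
def k_fold_splits(folds):
    """Yield (train, test) pairs, holding out each fold exactly once."""
    for i, test in enumerate(folds):
        # Training data is every row from every fold except the held-out one.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test


# Three tiny hypothetical folds; integers stand in for data rows.
folds = [[1, 2], [3, 4], [5, 6]]
splits = list(k_fold_splits(folds))
```

With ten input files, this rotation would produce ten train/test pairs and therefore ten scores.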