PARs: Predicate-based Association Rules for Efficient and Accurate Model-Agnostic Anomaly Explanation

This repository contains the Python package for explaining data anomalies detected by arbitrary models using PARs. The methodology of PARs is described in the following paper:

Cheng Feng. 2024. PARs: Predicate-based Association Rules for Efficient and Accurate Anomaly Explanation. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM ’24), October 21–25, 2024, Boise, ID, USA. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3627673.3679625.

An extended version of the paper is available on arXiv.

The scripts for experiments in the paper are available here.

Install dependencies

pip install -r requirements.txt

How to use the package

### define features and the PARAnomalyExplainer
# train_df is assumed to be a pandas DataFrame of training data that is already loaded
from pars import NumericFeature, CategoricFeature, PARAnomalyExplainer

features = []
for name in train_df.columns:
    col = train_df[name]
    # heuristic: columns with more than 5 distinct values are treated as numeric
    if len(col.unique()) > 5:
        features.append(NumericFeature(name, min_value=col.min(), max_value=col.max(),
                                       mean_value=col.mean(), std_value=col.std()))
    else:
        # all other columns are treated as categorical
        features.append(CategoricFeature(name, values=col.unique().tolist()))

parexp = PARAnomalyExplainer(features)

### train the PARAnomalyExplainer
parexp.train(train_df, max_predicts4rule_mining=75, max_times4rule_mining=5, set_seed=False)

### you can use PARAnomalyExplainer to find top-k violated PARs for an individual anomaly
rules = parexp.find_violated_pars(anomalies[0], topk=5)
print('Violated PARs:')
for rule in rules:
    print(f'{rule}, sup: {rule.support}, conf: {rule.conf}')

### you can also find a summarized anomaly explanation for a list of anomalies
explanation = parexp.explain_anomalies(anomalies[0:20])
# each explanation item is a tuple containing the following elements:
# (anomalous feature, probability, violated rule, rule confidence, rule support, violated locations, related features)
for exp_item in explanation.summary():
    print(f'anomalous feature: {exp_item[0]}')
    print(f'probability: {exp_item[1]}')
    print(f'representative violated PAR: {exp_item[2]}')
    print(f'confidence of the representative PAR: {exp_item[3]}')
    print(f'support of the representative PAR: {exp_item[4]}')
    print(f'violated locations: {exp_item[5]}')
    print(f'related features in the representative PAR: {exp_item[6]}')
    print()
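
The seven-element tuples returned by the summary can be easier to work with when their fields are accessed by name instead of by index. The following is a minimal sketch that assumes only the field order given in the comment above. `ExplanationItem`, its field names, the helper `to_items`, and the sample row are hypothetical and are not part of the `pars` package.

```python
from collections import namedtuple

# Hypothetical wrapper: the field order follows the README comment,
# the field names themselves are our own choice.
ExplanationItem = namedtuple(
    "ExplanationItem",
    [
        "anomalous_feature",
        "probability",
        "violated_rule",
        "rule_confidence",
        "rule_support",
        "violated_locations",
        "related_features",
    ],
)


def to_items(summary):
    """Wrap each 7-element tuple of a summary in an ExplanationItem."""
    return [ExplanationItem(*row) for row in summary]


# Made-up row standing in for one element of explanation.summary();
# the values and the rule string are placeholders, not real output.
sample_summary = [
    ("temp", 0.8, "placeholder rule", 0.99, 0.4, [3, 7], ["pressure"]),
]

items = to_items(sample_summary)
for item in items:
    print(f"{item.anomalous_feature}: probability={item.probability}, "
          f"related={item.related_features}")
```

With a trained explainer, `to_items(explanation.summary())` would be the corresponding call, assuming `summary()` yields tuples in the order documented above.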

Please check tutorial.ipynb for the complete tutorial.
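
For readers unfamiliar with the `sup` and `conf` values printed for each violated PAR in the usage example, the sketch below computes support and confidence for a single rule using the standard association-rule definitions. This is an assumption for illustration only: the exact definitions used by PARs are given in the paper, and the function `support_and_confidence`, the toy records, and the example rule are all hypothetical and independent of the `pars` package.

```python
def support_and_confidence(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    support    = fraction of all records that satisfy both sides
    confidence = fraction of records satisfying the antecedent
                 that also satisfy the consequent
    """
    n_antecedent = sum(1 for r in records if antecedent(r))
    n_both = sum(1 for r in records if antecedent(r) and consequent(r))
    support = n_both / len(records) if records else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy data, invented for illustration.
records = [
    {"pump": "on", "flow": 5.2},
    {"pump": "on", "flow": 4.9},
    {"pump": "on", "flow": 0.1},
    {"pump": "off", "flow": 0.0},
]

# Example rule built from two predicates: pump == "on"  ->  flow > 1.0
sup, conf = support_and_confidence(
    records,
    antecedent=lambda r: r["pump"] == "on",
    consequent=lambda r: r["flow"] > 1.0,
)
print(f"sup: {sup}, conf: {conf:.3f}")
```

In this toy data the third record satisfies the antecedent but not the consequent, so it is the kind of record that would count as violating the rule.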
