Rule interpretability analysis (including semantic coherence)

This repository contains a collection of scripts for analyzing the results of user experiments that evaluate the interpretability of inductively learnt rules.

In addition to basic interpretability metrics such as average antecedent length, this library computes semantic coherence using the method described in:

Gabriel, A., Paulheim, H., and Janssen, F. Learning semantically coherent rules. ECML/PKDD-14 International Workshop on Interactions between Data Mining and Natural Language Processing, pp. 49–63, Nancy, France, September 2014. CEUR Workshop Proceedings.

The analysis is done in two phases.

compute_pairwise_sim.py extracts attribute names from the input datasets (.csv), precomputes the semantic similarity between all pairs of attribute names, and saves the result to a file in each dataset's directory.
compute_rule_list_sim.py parses the input rule lists (plain text), extracts attribute names, counts the number of attributes in each antecedent, and computes semantic coherence. The results are averages across all rules in the input rule list.
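
To make the second phase concrete, here is a minimal sketch of the two metrics. It assumes that the semantic coherence of a rule is the mean precomputed similarity over all pairs of attributes in its antecedent. The function names, the dictionary layout, and the similarity values are hypothetical and are not taken from the scripts. Skipping rules with fewer than two attributes and scoring missing pairs as 0.0 are also assumptions.

```python
from itertools import combinations
from statistics import mean


def rule_coherence(attributes, pair_sim):
    """Mean similarity over all unordered attribute pairs of one rule.

    Returns None when the rule has fewer than two distinct attributes
    (an assumption made for this sketch).
    """
    pairs = list(combinations(sorted(set(attributes)), 2))
    if not pairs:
        return None
    # Pairs missing from the lookup count as similarity 0.0 (assumption).
    return mean(pair_sim.get(frozenset(p), 0.0) for p in pairs)


def summarize_rule_list(rules, pair_sim):
    """Average antecedent length and average coherence across a rule list."""
    lengths = [len(r) for r in rules]
    scores = [rule_coherence(r, pair_sim) for r in rules]
    scores = [s for s in scores if s is not None]
    return {
        "avg_antecedent_length": mean(lengths) if lengths else 0.0,
        "avg_coherence": mean(scores) if scores else None,
    }


# Hypothetical similarities between attribute names.
pair_sim = {
    frozenset(("age", "income")): 0.5,
    frozenset(("age", "education")): 0.25,
    frozenset(("income", "education")): 0.75,
}
# Each rule is represented by the attribute names in its antecedent.
rules = [["age", "income"], ["age", "income", "education"], ["age"]]
summary = summarize_rule_list(rules, pair_sim)
```

Keying the lookup by `frozenset` makes the similarity symmetric, so the order in which attributes appear in a rule does not matter.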

Inputs and outputs

The scripts assume the following directory structure:

data/{datasetname}/data.csv - dataset, from which the rules were learnt
data/{datasetname}/rules/{user-id}/mined.txt - list of discovered rules
data/{datasetname}/rules/{user-id}/modified.txt - list of rules after user intervention
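
The layout above can be traversed with a few lines of `pathlib`. This is only a sketch: the helper name `find_rule_files` is hypothetical, and the scripts themselves may locate their inputs differently.

```python
from pathlib import Path


def find_rule_files(data_root):
    """Yield (dataset, user_id, kind, path) for each mined/modified rule list."""
    root = Path(data_root)
    # Matches data/{datasetname}/rules/{user-id}/*.txt
    for path in sorted(root.glob("*/rules/*/*.txt")):
        if path.stem not in ("mined", "modified"):
            continue  # ignore any other text files in the user directory
        user_dir = path.parent
        dataset_dir = user_dir.parent.parent
        yield dataset_dir.name, user_dir.name, path.stem, path
```

Calling `list(find_rule_files("data"))` from the repository root would then list every rule file together with its dataset name, user id, and whether it is the mined or the modified list.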

The script compute_pairwise_sim.py saves pair-wise similarities for attribute names to:

data/{datasetname}/word-pairs.csv
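
As an illustration of this precomputation step, the sketch below reads the header row of a dataset, enumerates all attribute-name pairs, and writes one row per pair. The column layout of word-pairs.csv is not documented here, so the three-column output is an assumption. `toy_similarity` is a hypothetical stand-in (token overlap) and is not the semantic similarity measure used by the script or the paper.

```python
import csv
import io
from itertools import combinations


def toy_similarity(a, b):
    """Placeholder measure: Jaccard overlap of underscore-separated tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)


def pairwise_rows(csv_file, similarity=toy_similarity):
    """Read the header row and return (name_a, name_b, similarity) rows."""
    header = next(csv.reader(csv_file))
    return [(a, b, similarity(a, b)) for a, b in combinations(header, 2)]


# In-memory stand-in for data/{datasetname}/data.csv.
sample = io.StringIO("petal_length,petal_width,class\n1.4,0.2,setosa\n")
rows = pairwise_rows(sample)

# Write the rows as CSV (assumed layout: name_a, name_b, similarity).
out = io.StringIO()
csv.writer(out).writerows(rows)
```

Precomputing the pairs once per dataset means the rule-list analysis only needs lookups, which matters when the underlying similarity measure is expensive.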

The script compute_rule_list_sim.py saves results to:

results.csv

Sample datasets

The scripts come with some results of a proof-of-concept user study on the UCI datasets used in the Gabriel et al. paper. These are stored in the data folder.
