Skip to content

[ASONAM23] Mine unexpected patterns from attributed graphs.

Notifications You must be signed in to change notification settings

simondelarue/UnexPatterns

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Unexpected Attributed Subgraphs: a Mining Algorithm

UnexPatterns is a an algorithm that mines unexpected patterns from large attributed graphs.

It relies on the combination of information-theoretic based measure of Unexpectedness [1] as an interestingness measure to select relevant patterns and pruning techniques to reduce the search space.

[1] Dessalles, J.L.: Algorithmic simplicity and relevance. Algorithmic Probability and Friends - LNAI 7070, pp. 119–130, Springer (2013).

Install

Clone this project and install dependencies:

git clone
cd UnexPatterns
pip install -r requirements.txt

Data

The five real-world datasets used in our work are provided either in data/ directory or accessible through Netset:

  • 'wikivitals'
  • 'wikivitals-fr'
  • 'wikischools'
  • 'sanFranciscoCrimes'
  • 'ingredients'

Pattern mining

Parameter file

Use parameters.txt to specify the dataset(s) and parameter(s) value. As an example:

datasets: wikivitals, wikivitals-fr
s: 8
beta: 4
delta: 0
patterns_path: output

where:

  • datasets: Dataset name (if several datasets, names are coma-separated)
  • s: Minimum number of nodes in pattern
  • beta: Minimum number of attributes in pattern
  • delta: Minimum amount of unexpectedness difference between two patterns
  • patterns_path: Path to output directory

Usage

To mine patterns according to parameters.txt, use the following command:

python unexpatterns.py parameters.txt

Output format

Patterns are stored in binary format, within output/patterns/ directory. To extract them, you can use:

with open(f'output/patterns/<patterns>.bin', 'rb') as data:
    patterns = pickle.load(data)

The naming convention of each file is the following: result_<dataset>_<beta>_<s>_<delta>.bin.

The variable patterns then contains a list of patterns, each in the form of a tuple (Xi,Qi) where Xi is the set of node indexes and Qi the set of attribute indexes of the ith pattern.

Experiments

Code for experiments and comparisons with baseline and state-of-the-art aglorithms are provided in experiments/ directory.

About

[ASONAM23] Mine unexpected patterns from attributed graphs.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%