
A subgroup discovery tool that can use ontological domain knowledge (RDF graphs) in the learning process. Subgroup descriptions contain terms from the given domain knowledge and enable potentially better generalizations.

Hedwig

A pattern mining tool that can exploit background knowledge in the form of RDF triples.

Installation

python setup.py install

Example

View all the options:

python -m hedwig --help

Running with default parameters and writing the rules to a file:

python -m hedwig <path-to-folder-with-domain-rdf-files> <examples-file>.n3 -o rules

Running the included numbers mini-example:

python -m hedwig example/numbers/ontology/ example/numbers/data.n3 --output=rules --adjust=none --leaves --support=0 --beam=1

Simple hierarchy example with CSV data

If you want to use just simple hierarchies of features, you don't need to resort to RDF. Just run hedwig with the --format=csv flag, for example:

python -m hedwig --format=csv tests/data/csv/ontology/ tests/data/csv/Cities_clusters.csv -o rules

Hierarchy files must have the .tsv suffix, with the following structure:

class_1<tab>superclass_1_1; superclass_1_2; ...
class_2<tab>superclass_2_1; superclass_2_2; ...
...

If you provide proper URIs, they will be used. Otherwise generic URIs will be constructed from the provided class names.
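
As a rough illustration of the layout above, the following sketch writes a small hierarchy file from a Python dictionary. The class names, the `hierarchy_lines` helper, and the `hierarchy.tsv` file name are made up for this example and are not part of Hedwig; only the `class<tab>superclass; superclass` line layout and the .tsv suffix come from the description above.

```python
import os
import tempfile


def hierarchy_lines(hierarchy):
    """Format a {class: [superclasses]} mapping as 'class<tab>super_1; super_2' lines."""
    return [
        "{}\t{}".format(cls, "; ".join(supers))
        for cls, supers in hierarchy.items()
    ]


# Hypothetical class names, used only for illustration.
hierarchy = {
    "city": ["settlement"],
    "capital": ["city", "administrative_centre"],
}

lines = hierarchy_lines(hierarchy)

# Write the file into a fresh folder; the .tsv suffix is required by the README.
ontology_dir = tempfile.mkdtemp()
hierarchy_path = os.path.join(ontology_dir, "hierarchy.tsv")
with open(hierarchy_path, "w") as handle:
    handle.write("\n".join(lines) + "\n")

print(hierarchy_path)
```

The resulting folder could then presumably be passed as the first argument of the --format=csv command shown earlier, in place of tests/data/csv/ontology/.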

Data files must have the .csv suffix and the following structure:

example_uri_or_label; attr_uri_1; attr_uri_2; ...
http://example.org/uri_1; 0/1; 0/1; ...
http://example.org/uri_2; 0/1; 0/1; ...
...

See the tests/data/csv/ folder for an example input of this type.
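
A data file in this layout can be produced with a few lines of Python. This is only a sketch under assumptions: the example URIs, attribute names, the `data_lines` helper, and the `data.csv` file name are invented for illustration, and the `"; "` separator simply mirrors the layout shown above. The bundled files in tests/data/csv/ remain the reference for the exact format.

```python
import os
import tempfile


def data_lines(attributes, examples, label_header="example"):
    """Build a header line plus one 0/1 row per example.

    `examples` maps an example URI to the set of attributes it has.
    """
    header = "; ".join([label_header] + attributes)
    rows = []
    for uri, present in examples.items():
        flags = ["1" if attr in present else "0" for attr in attributes]
        rows.append("; ".join([uri] + flags))
    return [header] + rows


# Hypothetical attributes and examples, used only for illustration.
attributes = ["capital", "city"]
examples = {
    "http://example.org/uri_1": {"capital"},
    "http://example.org/uri_2": {"capital", "city"},
}

lines = data_lines(attributes, examples)

# Data files must carry the .csv suffix according to the README.
data_path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(data_path, "w") as handle:
    handle.write("\n".join(lines) + "\n")

print("\n".join(lines))
```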

Note

This is a research project, and drastic changes are made fairly regularly. Changes are documented in the CHANGELOG.

Pull requests and issues are welcome.
