Configuration
Marc-Olivier Buob edited this page May 13, 2022 · 1 revision
The entry point is the pattern_clustering/boost.py file. It contains the pattern_distance and pattern_clustering Python functions, which wrap the corresponding underlying C++ functions. Both functions take a set of patterns as parameters. Each pattern is identified by a string name, characterized by a deterministic finite automaton (DFA), and weighted by a density. Intuitively, the density of a pattern characterizes how strict it is; for instance, a float is stricter than an alphanumeric string.
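To make the intuition behind density concrete, here is a toy sketch using only the Python standard library. It is not the library's definition of density: the helper toy_density, the four-character alphabet, and the length bound are all assumptions chosen for illustration. It measures the fraction of short strings that a regular expression accepts, so a stricter pattern gets a smaller value.

```python
import itertools
import re


def toy_density(regexp: str, alphabet: str, max_len: int) -> float:
    """Fraction of non-empty strings of length <= max_len over `alphabet`
    that fully match `regexp`. Hypothetical helper, for intuition only;
    this is NOT how pattern_clustering.language_density is defined."""
    pattern = re.compile(regexp)
    total = 0
    accepted = 0
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            total += 1
            if pattern.fullmatch("".join(chars)):
                accepted += 1
    return accepted / total


# Tiny alphabet so that the enumeration stays small (340 strings).
ALPHABET = "01a."
float_density = toy_density(r"[01]+\.[01]+", ALPHABET, 4)  # float-like
alnum_density = toy_density(r"[01a]+", ALPHABET, 4)        # alphanumeric-like

print(float_density < alnum_density)
```

Under these assumptions the float-like pattern accepts far fewer strings than the alphanumeric-like one, which matches the idea that a float is the stricter pattern.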
As an end user, you only have to decide how to name your patterns and define the corresponding regular expression for each of them. Note that many common patterns are already defined in pattern_clustering.patterns.
To define customized patterns:
- Choose a string identifying this pattern.
- Compute its DFA from its regular expression using pybgl.compile_dfa and insert it in a map_name_dfa dictionary.
- Compute its density using pattern_clustering.language_density and insert it in a map_name_density dictionary.
- Call make_densities(map_name_dfa, map_name_density) to obtain a densities list.
- Call pattern_distance / pattern_clustering, passing map_name_dfa and densities as parameters.
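The steps above can be sketched as a small helper. This is a minimal sketch, not the library's own code: prepare_patterns is a hypothetical name, and the three callables are passed in as arguments so that the example does not depend on the exact import paths. It also assumes that compile_dfa accepts a regular expression string and that language_density accepts a DFA as its only required argument; check both against the installed versions.

```python
from typing import Any, Callable, Dict, List, Tuple


def prepare_patterns(
    map_name_re: Dict[str, str],
    compile_dfa: Callable[[str], Any],
    language_density: Callable[[Any], float],
    make_densities: Callable[[Dict[str, Any], Dict[str, float]], List[float]],
) -> Tuple[Dict[str, Any], List[float]]:
    """Hypothetical helper: build map_name_dfa and densities from a
    {pattern name: regular expression} dictionary."""
    map_name_dfa: Dict[str, Any] = {}
    map_name_density: Dict[str, float] = {}
    for name, regexp in map_name_re.items():
        # Compile the regular expression into a DFA
        # (assumed call shape: compile_dfa(regexp)).
        dfa = compile_dfa(regexp)
        map_name_dfa[name] = dfa
        # Compute the density of the pattern
        # (assumed call shape: language_density(dfa)).
        map_name_density[name] = language_density(dfa)
    # Turn the two dictionaries into the densities list.
    densities = make_densities(map_name_dfa, map_name_density)
    return map_name_dfa, densities
```

With the libraries installed, the callables would be pybgl.compile_dfa, pattern_clustering.language_density, and make_densities. The returned map_name_dfa and densities are the values to pass to pattern_distance or pattern_clustering; any other arguments those functions require are not covered here.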