The evaluation of Community Discovery algorithms is not an easy task. CDlib
implements two families of evaluation strategies:
- Internal evaluation through quality scores
- External evaluation through partitions comparison
Note
The following lists are aligned to CD evaluation methods available in the GitHub main branch of CDlib. In particular, the following methods ara not yet available in the packaged version of the library: modularity_overlap.
Fitness functions allows to summarize the characteristics of a computed set of communities. CDlib
implements the following quality scores:
cdlib.evaluation
avg_distance avg_embeddedness average_internal_degree avg_transitivity conductance cut_ratio edges_inside expansion fraction_over_median_degree hub_dominance internal_edge_density normalized_cut max_odf avg_odf flake_odf scaled_density significance size surprise triangle_participation_ratio purity
Among the fitness function a well-defined family of measures is the Modularity-based one:
erdos_renyi_modularity link_modularity modularity_density modularity_overlap newman_girvan_modularity z_modularity
Some measures will return an instance of FitnessResult
that takes together min/max/mean/std values of the computed index.
FitnessResult
It is often useful to compare different graph partition to assess their resemblance (i.e., to perform ground truth testing). CDlib
implements the following partition comparisons scores:
adjusted_mutual_information adjusted_rand_index f1 nf1 normalized_mutual_information omega overlapping_normalized_mutual_information_LFK overlapping_normalized_mutual_information_MGH variation_of_information
Some measures will return an instance of MatchingResult
that takes together mean and standard deviation values of the computed index.
MatchingResult