Skip to content

Latest commit

 

History

History
98 lines (72 loc) · 2.35 KB

evaluation.rst

File metadata and controls

98 lines (72 loc) · 2.35 KB

Evaluation

The evaluation of Community Discovery algorithms is not an easy task. CDlib implements two families of evaluation strategies:

  • Internal evaluation through quality scores
  • External evaluation through partitions comparison

Note

The following lists are aligned to CD evaluation methods available in the GitHub main branch of CDlib. In particular, the following methods ara not yet available in the packaged version of the library: modularity_overlap.

Fitness Functions

Fitness functions allows to summarize the characteristics of a computed set of communities. CDlib implements the following quality scores:

cdlib.evaluation

avg_distance avg_embeddedness average_internal_degree avg_transitivity conductance cut_ratio edges_inside expansion fraction_over_median_degree hub_dominance internal_edge_density normalized_cut max_odf avg_odf flake_odf scaled_density significance size surprise triangle_participation_ratio purity

Among the fitness function a well-defined family of measures is the Modularity-based one:

erdos_renyi_modularity link_modularity modularity_density modularity_overlap newman_girvan_modularity z_modularity

Some measures will return an instance of FitnessResult that takes together min/max/mean/std values of the computed index.

FitnessResult

Partition Comparisons

It is often useful to compare different graph partition to assess their resemblance (i.e., to perform ground truth testing). CDlib implements the following partition comparisons scores:

adjusted_mutual_information adjusted_rand_index f1 nf1 normalized_mutual_information omega overlapping_normalized_mutual_information_LFK overlapping_normalized_mutual_information_MGH variation_of_information

Some measures will return an instance of MatchingResult that takes together mean and standard deviation values of the computed index.

MatchingResult