For each property p ∈ P in a dataset D, the set of subjects is defined as:
$$S_p = \{\, s \mid \exists\, o : (s, p, o) \in D \,\}$$
Then, we computed the mean and standard deviation of the frequency of their respective objects (o). The mean frequency of the objects of a property p is defined as
whereas the standard deviation is
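One reading of these two definitions can be written out as follows. This is a sketch under stated assumptions, not necessarily the original notation: the symbols $S_p$ (subjects of p), $f_p$, $\mu_p$ and $\sigma_p$ are introduced here for illustration, the frequency of a subject is taken to be the number of distinct objects it is linked to via p, and the standard deviation is taken in its population form.

$$f_p(s) = \bigl|\{\, o \mid (s, p, o) \in D \,\}\bigr|$$

$$\mu_p = \frac{1}{|S_p|} \sum_{s \in S_p} f_p(s)$$

$$\sigma_p = \sqrt{\frac{1}{|S_p|} \sum_{s \in S_p} \bigl(f_p(s) - \mu_p\bigr)^2}$$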
Afterwards, for each dataset, we calculated the following indices:
- net-mean mean frequency (nmf)
- net-mean standard deviation of frequency (nsdf)
- net-mean coefficient of variation (ncv), defined as the ratio of nsdf to nmf.
We also repeated the steps above in the opposite direction, i.e., extracting the objects first and computing the indices for the subjects (s).
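The procedure in both directions can be sketched in Python as below. This is a minimal illustration, not the code used for the reported results. It assumes that a dataset is a list of (s, p, o) triples, that frequency means the number of distinct linked values, that the population standard deviation is used, and that "net-mean" denotes an unweighted average over properties. The function name `dataset_indices` and its `direction` parameter are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def dataset_indices(triples, direction="subject"):
    """Return (nmf, nsdf, ncv) for an iterable of (s, p, o) triples.

    direction="subject": for each property, count the objects of each subject.
    direction="object":  for each property, count the subjects of each object.
    """
    # links[(p, key)] = set of distinct values linked to `key` via property p
    links = defaultdict(set)
    for s, p, o in triples:
        key, value = (s, o) if direction == "subject" else (o, s)
        links[(p, key)].add(value)

    # freqs[p] = list of frequencies, one per subject (or object) of p
    freqs = defaultdict(list)
    for (p, _key), values in links.items():
        freqs[p].append(len(values))

    # Assumption: "net-mean" is the unweighted average over all properties.
    nmf = mean(mean(f) for f in freqs.values())
    nsdf = mean(pstdev(f) for f in freqs.values())
    ncv = nsdf / nmf  # nmf >= 1, since every frequency is at least 1
    return nmf, nsdf, ncv


# Toy dataset with two properties, "p" and "q".
triples = [
    ("a", "p", "x"), ("a", "p", "y"), ("a", "p", "z"),
    ("b", "p", "x"),
    ("c", "q", "x"),
]
subject_side = dataset_indices(triples, direction="subject")
object_side = dataset_indices(triples, direction="object")
```

Under these assumptions, an ncv of 0 would mean that, within every property, all subjects (or objects) have the same frequency. An empty dataset raises `statistics.StatisticsError`, because the mean of no values is undefined.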
As the ncv values are well above 0 in almost all cases, we conclude that subjects and objects are not homogeneously distributed.