Consistent metric names #51

nfriesen · 2014-07-24T09:26:13Z

While looking over the metric implementations for cleanup purposes, I noticed some inconsistencies in metric names:

In most cases the metric name reflects the corresponding quality problem, e.g. 'UndefinedClasses', 'DuplicateInstances' or 'ObsoleteConceptInOtology'. However, some metric names carry a different meaning, e.g. LowUsageOfBlankNodes. Even if it's not possible for all metrics, in some cases it makes more sense for the name to reflect the quality problem, e.g. BlankNodeUsage.
Some metric names are confusing because they don't reflect the metric definition. It would be helpful to either adapt their implementation (if exactly this metric is required for the use case) or rename them. These are the metrics in question:

The ShortURIs metric actually computes the average URI length, so maybe AverageURILength?
The LowBlankNodeUsage metric actually computes the ratio of 'good' entities: the current implementation computes NoBlankNodesRatio, but it would make more sense to define BlankNodesRatio.
The UnstructuredData metric should probably be split into two metrics: UnstructuredData and DeadURIs.
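
To make the naming point concrete, here is a rough sketch of what the names would suggest these metrics compute. This is not the project's actual implementation: the function names are hypothetical, terms are plain strings, and blank nodes are assumed to be written with the N-Triples `_:` prefix.

```python
from typing import Iterable


def is_blank_node(term: str) -> bool:
    # Assumption: blank nodes use the N-Triples "_:label" form.
    return term.startswith("_:")


def average_uri_length(uris: Iterable[str]) -> float:
    """What a metric named AverageURILength would report."""
    uris = list(uris)
    if not uris:
        return 0.0
    return sum(len(u) for u in uris) / len(uris)


def blank_nodes_ratio(terms: Iterable[str]) -> float:
    """Share of terms that ARE blank nodes (the quality problem itself)."""
    terms = list(terms)
    if not terms:
        return 0.0
    return sum(1 for t in terms if is_blank_node(t)) / len(terms)


def no_blank_nodes_ratio(terms: Iterable[str]) -> float:
    """Share of 'good' terms, i.e. the complement of blank_nodes_ratio.

    Note: for an empty input this returns 1.0 (nothing is a blank node).
    """
    return 1.0 - blank_nodes_ratio(terms)


terms = ["http://example.org/a", "_:b1", "http://example.org/b", "_:b2"]
print(blank_nodes_ratio(terms), no_blank_nodes_ratio(terms))
```

The two ratios always sum to 1, so the choice between them is purely a naming and reporting question: a name like BlankNodesRatio reads as the problem being measured, while NoBlankNodesRatio reads as the score of 'good' entities.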

@jerdeb BTW, the test for the Dereferencability metric fails.

jerdeb · 2014-07-24T09:44:48Z

Most of those are specific to EBI use cases and are therefore not generic.
They are marked as such. Also, there are a number of metrics which need to be
reviewed after the deliverable. Keep this ticket open.


nfriesen · 2014-07-25T19:47:43Z

@jerdeb Please check the test for the Dereferencability metric; it fails.

nfriesen assigned clange Jul 24, 2014

nfriesen added todo and removed todo labels Jul 24, 2014
