Advertise score range used by a service in its manifest #147

wetneb · 2023-12-07T14:55:47Z

Services are free to use any numerical range for the scores they give to reconciliation candidates.

For a client, this makes those scores a bit hard to process: we can only compare candidates with one another and have no idea what the maximum or minimum score might be.

Services could advertise their minimum and maximum scores (if any) in their manifest. Some might use unbounded scores on purpose, and it would be nice to be able to indicate that too.

Alternatively, the specifications could mandate a particular score range.

Reconciliation features could also have similar metadata.

thadguidry · 2023-12-07T23:05:35Z

The specs should standardize the score range. I'd avoid negative numbers completely and standardize on [0..100], which is simple enough and wide enough to represent "a percentage of how closely the candidates match the query against this service".

Abbe98 · 2023-12-13T18:46:10Z

I do not have much of an opinion on the range itself, but I'm not fond of the idea of having every client handle different ranges.

tfmorris · 2024-01-10T17:55:44Z

Something to consider here would be how this works for a service which is built on something like Solr/Lucene which doesn't have a fixed scoring range (and scores aren't comparable across queries). What type of normalization would the service do and how would the user interpret the resulting score?
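To make the concern concrete, here is a minimal sketch of one naive option: rescaling each query's raw scores against that query's top score. The function name `rescale_to_top` and the approach itself are assumptions for illustration, not something any service in this thread is known to do.

```python
from typing import List


def rescale_to_top(raw_scores: List[float]) -> List[float]:
    """Rescale one query's raw scores so the best candidate gets 100.

    Hypothetical helper: it only shows what a per-query normalization
    could look like for a backend with unbounded scores.
    """
    if not raw_scores:
        return []
    top = max(raw_scores)
    if top <= 0:
        # No positive score to scale against; report everything as 0.
        return [0.0 for _ in raw_scores]
    return [100.0 * s / top for s in raw_scores]


# Two queries with very different raw scores end up looking identical.
strong_query = rescale_to_top([12.0, 6.0, 3.0])
weak_query = rescale_to_top([0.5, 0.25])
```

With this scheme the top candidate always scores 100, whether or not it is a good match, so the rescaled value still says nothing about absolute match quality and remains incomparable across queries.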

ChristianeKlaes · 2024-01-11T09:36:39Z

Unifying score ranges between reconciliation services would definitely improve further processing of reconciliation results.

Additionally, recon services should clearly state in their manifest/documentation:

- how similarity is computed (similarity measure, similarity features)
- which similarity score leads to automatic matching, if enabled

fsteeg · 2024-01-11T13:15:05Z

> Additionally, recon services should clearly state in their manifest/documentation
>
> - how similarity is computed (similarity measure, similarity features)
> - which similarity score leads to automatic matching, if enabled

For this part see also the discussion in #128.

fsteeg · 2024-01-11T15:55:59Z

We discussed this in today's meeting:

There seems to be consensus that a standardized score range from 0 to 100 would be useful. To address the problem of service backends without fixed scoring ranges, mentioned by @tfmorris in his comment above, this could be made optional. This would also retain compatibility with existing services.

So we could add an optional field in the service manifest, like `"standardizedScore": true`, and specify that this means the score will be between 0 and 100. This would allow clients to check whether the service uses the standard range and process or display candidates accordingly.
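On the client side, the check could look roughly like the sketch below. It assumes the manifest has already been fetched and parsed into a dict; the helper names `uses_standardized_score` and `score_as_fraction` are hypothetical and not part of any specification or library.

```python
from typing import Any, Mapping, Optional


def uses_standardized_score(manifest: Mapping[str, Any]) -> bool:
    """Return True only if the manifest explicitly sets standardizedScore to true."""
    # The field is optional, so a missing key means "range unknown".
    return manifest.get("standardizedScore") is True


def score_as_fraction(manifest: Mapping[str, Any], score: float) -> Optional[float]:
    """Map a candidate score to [0, 1] when the service uses the 0-100 range.

    Returns None when the range is unknown, so the caller can fall back to
    ranking candidates relative to each other.
    """
    if not uses_standardized_score(manifest):
        return None
    # Clamp defensively (an assumption of this sketch, not a spec requirement).
    return min(max(score, 0.0), 100.0) / 100.0


standard = {"name": "Example service", "standardizedScore": True}
legacy = {"name": "Existing service"}
```

Here `score_as_fraction(standard, 50)` yields `0.5`, while `score_as_fraction(legacy, 50)` yields `None`, which keeps existing services without the field working as before.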

wetneb mentioned this issue Dec 11, 2023

Clarify scoring results in reconciliation matches to users OpenRefine/OpenRefine#6234

Open

fsteeg added a commit that referenced this issue Mar 14, 2024

Add optional standardizedScore field in service manifest (#147)

c5a7bbb

fsteeg mentioned this issue Mar 14, 2024

Add optional standardizedScore field in service manifest (#147) #156

Merged

fsteeg added a commit that referenced this issue Apr 11, 2024

Add standardizedScore change to "This Draft" section (#147)

6c058d3

Remove camelCase entry, not merged yet, will add it back in #166

fsteeg closed this as completed in #156 Apr 11, 2024

fsteeg added a commit that referenced this issue Apr 11, 2024

Merge pull request #156 from reconciliation-api/147-standardizedScore

e53a418

Add optional `standardizedScore` field in service manifest (#147)
