Advertise score range used by a service in its manifest #147

Closed
wetneb opened this issue Dec 7, 2023 · 6 comments · Fixed by #156

Comments

@wetneb
Member

wetneb commented Dec 7, 2023

Services are free to use any numerical range for the scores they give to reconciliation candidates.

For a client, this makes those scores a bit hard to process: we can only compare candidates with each other and have no idea what the maximum or minimum score might be.

Services could advertise their min and max scores (if any) in their manifest. Some might use unbounded scores on purpose, and it would be nice to be able to indicate that too.

Alternatively, the specifications could mandate a particular score range.

Reconciliation features could also have similar metadata.
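
To illustrate the first option, here is a minimal sketch of how a client might rescale scores if a manifest advertised its bounds. The `scoreRange` key and its `min`/`max` members are hypothetical names invented for this example and are not part of any specification; a missing or null bound stands for an unbounded end.

```python
from typing import Optional


def rescale(score: float, manifest: dict) -> Optional[float]:
    """Map a raw score to [0, 1] using hypothetical advertised bounds.

    Returns None when the service does not advertise both bounds, in
    which case candidates can only be compared with each other.
    """
    bounds = manifest.get("scoreRange")  # hypothetical manifest key
    if not bounds:
        return None
    lo, hi = bounds.get("min"), bounds.get("max")
    if lo is None or hi is None or hi <= lo:
        return None
    # Clamp in case a score falls slightly outside the advertised range.
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))
```

With an unbounded or unadvertised range the function returns `None`, which is the situation clients are in today.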

@thadguidry
Contributor

thadguidry commented Dec 7, 2023

The specs should standardize the score range. I'd avoid negative numbers completely and standardize on [0..100], which is simple enough and wide enough to represent "a percentage of how closely the candidates match the query against this service".

@Abbe98
Member

Abbe98 commented Dec 13, 2023

I do not have much of an opinion on the range itself, but I'm not fond of the idea of having every client handle different ranges.

@tfmorris
Member

Something to consider here would be how this works for a service built on something like Solr/Lucene, which doesn't have a fixed scoring range (and whose scores aren't comparable across queries). What type of normalization would the service do, and how would the user interpret the resulting score?
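
One possible answer, sketched under assumptions: a service could rescale each result list relative to its own best and worst raw scores (per-query min-max). The function name and the 0 to 100 target are illustrative choices, not something Solr or Lucene provides. The sketch also makes the concern above concrete: the top hit always receives 100, however poor the match, so the result is a rank-relative value rather than an absolute match confidence.

```python
def normalize_per_query(raw_scores: list[float]) -> list[float]:
    """Rescale one query's raw scores to 0..100 (per-query min-max)."""
    if not raw_scores:
        return []
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # All candidates tie; there is no spread to rescale.
        return [100.0] * len(raw_scores)
    return [100.0 * (s - lo) / (hi - lo) for s in raw_scores]
```

Two queries with very different raw score magnitudes would both report a best candidate of 100, which is why normalized values from this scheme still cannot be compared across queries.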

@ChristianeKlaes

Unifying score ranges between reconciliation services would definitely improve further processing of reconciliation results.

Additionally, recon services should clearly state in their manifest/documentation

  • how similarity is computed (similarity measure, similarity features)
  • which similarity score leads to automatic matching, if enabled

@fsteeg
Member

fsteeg commented Jan 11, 2024

> Additionally, recon services should clearly state in their manifest/documentation
>
>   • how similarity is computed (similarity measure, similarity features)
>   • which similarity score leads to automatic matching, if enabled

For this part see also the discussion in #128.

@fsteeg
Member

fsteeg commented Jan 11, 2024

We discussed this in today's meeting:

There seems to be consensus that a standardized score range from 0 to 100 would be useful. To address the problem of service backends without fixed scoring ranges mentioned by @tfmorris in #147 (comment), this could be made optional. This would also retain compatibility with existing services.

So we could add an optional field in the service manifest, like "standardizedScore": true, and specify that this means the score will be between 0 and 100. This would allow clients to check whether the service uses the standard range and process or display candidates accordingly.
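
A minimal sketch of the client-side check described above. Only the `standardizedScore` field comes from this discussion; the helper names, the display strings, and the example manifest are made up for illustration.

```python
import json


def uses_standard_range(manifest: dict) -> bool:
    """True only if the manifest explicitly sets standardizedScore to true.

    The field is optional, so a missing value is treated as "not
    standardized", which keeps existing services working unchanged.
    """
    return manifest.get("standardizedScore") is True


def display_score(score: float, manifest: dict) -> str:
    """Format a candidate score according to what the manifest promises."""
    if uses_standard_range(manifest):
        return f"{score:.0f}/100"
    # Without the flag, the scale is unknown; show the raw value only.
    return f"{score:g} (service-specific scale)"


# Illustrative manifest, not taken from a real service.
manifest = json.loads('{"name": "Example service", "standardizedScore": true}')
```

A client could call `display_score(candidate_score, manifest)` for each candidate and fall back to relative ordering when the flag is absent.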

fsteeg added a commit that referenced this issue Apr 11, 2024
Remove camelCase entry, not merged yet, will add it back in #166
fsteeg added a commit that referenced this issue Apr 11, 2024
Add optional `standardizedScore` field in service manifest (#147)