"Reconcile > Use values as identifiers" does not reconcile #3172
Comments
That is the intended behaviour - the goal is to be able to blindly trust a column of identifiers when you know they are valid, to avoid a costly reconciliation. We could:
Hrm! It feels like it's misleading to have this under the "Reconcile" menu if there's no actual reconciliation or validation happening here.
We could also have a checkbox to optionally validate the ids during this operation, but there is no provision for that in the reconciliation API. If people want validation, they should just use the standard reconciliation operation (although there is no requirement on the services that when queried with their own identifiers, they return the corresponding entity as a match). I think the priority is adding the text to the UI (first bullet point above).
I don't understand why this is a separate operation at all. Why isn't this just a standard Reconcile operation against a property of "id" or whatever the reconciliation service calls it? It'd be trivial, blindingly fast, and have the behavior the user expects.
If you have 100k rows, it is still going to take a fairly long time, and that can be quite frustrating when you have just retrieved these ids from the service (for instance with a SPARQL query). The Wikidata service recognizes Wikidata identifiers in reconciliation queries and processes them as fast as it can (without searching for them), but it still takes quite a bit of time to fetch the label and types for each of them. Moreover, there is no requirement that services behave in this way: it could well be that some services out there do not implement a special case for recon queries that look like ids. This is something we could add to the specs, but I am not sure it is fair to expect that from them right now.
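To illustrate the cost being discussed, here is a minimal sketch of how a column of identifiers could be packed into standard reconciliation query batches. The `{"q0": {"query": ...}}` shape follows the query batch format of the reconciliation API; the function name `build_query_batches` and the batch size of 10 are assumptions made for illustration, not a description of OpenRefine's actual implementation.

```python
import json
from typing import Iterator, List


def build_query_batches(ids: List[str], batch_size: int = 10) -> Iterator[str]:
    """Yield JSON-encoded reconciliation query batches, one per request.

    batch_size is a hypothetical value chosen for illustration.
    """
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        # Each identifier is sent as the text of an ordinary query.
        batch = {f"q{i}": {"query": value} for i, value in enumerate(chunk)}
        yield json.dumps(batch)


if __name__ == "__main__":
    ids = [f"Q{n}" for n in range(1, 100_001)]
    batches = list(build_query_batches(ids))
    # Number of round trips the service would have to answer.
    print(len(batches))
```

At the assumed batch size, 100k rows would mean 10,000 requests, each of which the service has to answer with labels and types, which is the overhead that "Use values as identifiers" skips.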
Marking this as a good first issue: the dialog opened by this operation (to pick the reconciliation service) should warn the user that reconciliation identifiers will not be validated.
@wetneb I am volunteering to work on this issue.
To Reproduce
Run "Reconcile" > "Use values as identifiers" on any column, whether it contains properly formatted unique IDs for the given service or not. You will see that it does not appear to reconcile at all: whatever the cell contents are, every cell is reported as 100% matched and given a real-looking URL, whether or not that URL resolves.
Expected Behavior
Reconciliation seems to be missing the part where it actually validates these unique IDs to see if they match up with existing entities. I expect IDs that fail to match to be reported as failures.
Screenshots
I tested on Wikidata and VIAF. VIAF gives no hover information; the matches just send you to 404s, e.g. http://viaf.org/viaf/Q17291. None of the content in that VIAF column should be matching (Q### is an invalid format) except the first value, which resolves correctly to https://viaf.org/viaf/38242123/.
There are two things happening on Wikidata that I thought I'd mention. IDs that don't exist give the error in the above screenshot. IDs that fit the Q### format but have yet to be assigned look a bit different (so that may be a second bug to work out). I would expect these to have some other kind of obvious error message or flag.
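A purely local format check would catch the VIAF case described above without contacting the service, though it cannot catch well-formed Wikidata IDs that have not been assigned. A minimal sketch, assuming Wikidata item IDs are "Q" followed by a positive integer and VIAF IDs are all digits (the VIAF pattern is inferred only from the URL above); the names `ID_PATTERNS` and `looks_like_id` are hypothetical:

```python
import re

# Assumed identifier formats, for illustration only:
# Wikidata items are "Q" followed by a positive integer;
# VIAF ids are taken to be all digits.
ID_PATTERNS = {
    "wikidata": re.compile(r"Q[1-9][0-9]*"),
    "viaf": re.compile(r"[0-9]+"),
}


def looks_like_id(service: str, value: str) -> bool:
    """Return True if value matches the assumed id format for service."""
    pattern = ID_PATTERNS[service]
    return pattern.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    for value in ["38242123", "Q17291"]:
        print(value, looks_like_id("viaf", value))
```

A check like this only says whether a value could be an identifier; whether the entity exists still requires asking the service.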
Versions
Windows, Firefox