New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optional details for matching properties in reconciliation queries #131
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like unneeded complexity to me. One of the things that I like about the IETF process is that they require prior implementation and testing. This demonstrates that a) it's implementable and b) there's at least one real world use case.
It feels like this spec is accumulating more and more words/features without a commensurate increase in real world testing experience (unless there's implementation testing going on behind the scene that I'm missing).
But it's all just intuition / gut feeling...
draft/index.html
Outdated
This can be used for general matching relations like "skos:exactMatch", "skos:closeMatch", etc. or for specific features like spatial matching with geo data | ||
(e.g. containment search with "schema:containsPlace" etc.) or custom matching on date fields (e.g. services supporting the [[EDTF]] specification could use "EDTF:Level-0" etc. | ||
To allow discovery of supported qualifiers by clients, services that support <code>match_qualifier</code> SHOULD return a <a href='#suggest-responses'>suggest response</a> with supported qualifiers | ||
for a property at <code>/suggest/property/match_qualifier/<pid></code> (relative to the service root).</p></dd> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this list should be small enough that it could just be returned with the response to the original property suggest query, although it probably needs descriptions too to help the user figure out what the heck they mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would be an option too. My line of thought was that since this is a separate selection step in the client (first select the property, then select the qualifier), clients could reuse existing suggest endpoint logic (do the same lookup as for the property itself, but against a different endpoint), instead of building custom UI. But that may be a faulty assumption, since I'm looking at it more from the service side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy either way, but we'd probably need a more precise description of how the URL is constructed: property ids can be arbitrary strings, so we'd need to know if/how they are escaped to form the URL you specify. Probably just URL-encoding the property id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the feedback by @tfmorris here and by @wetneb on the UI differences in our last meeting ("the set is small, and we'd like to see all options, without typing"), I've moved the match_qualifiers
discovery into the property suggest response: b486232.
Thanks a lot for your review @tfmorris! I replied inline to your specific comments.
I got to say I do share that gut feeling! What I actually tried to do here was to condense all these different suggestions from the 5 linked issues into a fairly compact, non-breaking proposal (optional fields and retaining the current behavior if omitted) that addresses actual use cases. There is one use case I forgot to mention: OpenRefine/OpenRefine#5615 could be implemented using
Yes, I agree, and it's what I like so much about working on this spec, that it is based on years of proven functionality, and not "made up". I very much want to keep it that way too! After we addressed any obvious issues with the approach proposed in this PR, we should implement it in the testbench and our service to see if it actually works. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proposal is a nice solution to the recent issues about the difficulty of selecting candidates.
We already discussed the complexity issue and to whether put to put the candidate selection decision in server or client side. The conclusion of this proposal feels balanced to me in this regards. One one hand we unfortunately can't move everything on client side leaving weird fuzzy matching to client as candidates pre-selection requires that kind of complexity. On the other those new parameters are optional leaving recon services developers/maintainers decide what to do.
Thus I feel like this would help recon services to grow in diversity and quality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
I think I disagree that this solves #106 though. After this PR, the query
field still has a sort of special status, and there is no standard way to replace it by a property (since clients cannot know which pid
to use). But I think this PR makes it all the more interesting to solve #106, since the new settings you introduce cannot be applied to the query
itself.
I agree with @tfmorris that we've been introducing quite a few things in the specs and the implementations are lagging behind. I'm keen to try things out in OpenRefine and some services but I have more urgent things on my plate in OpenRefine at the moment. Also it's a requirement for the standardization process at W3C that there are at least 2 independent implementations so there is no risk that we adopt something completely made up and untested.
draft/index.html
Outdated
This can be used for general matching relations like "skos:exactMatch", "skos:closeMatch", etc. or for specific features like spatial matching with geo data | ||
(e.g. containment search with "schema:containsPlace" etc.) or custom matching on date fields (e.g. services supporting the [[EDTF]] specification could use "EDTF:Level-0" etc. | ||
To allow discovery of supported qualifiers by clients, services that support <code>match_qualifier</code> SHOULD return a <a href='#suggest-responses'>suggest response</a> with supported qualifiers | ||
for a property at <code>/suggest/property/match_qualifier/<pid></code> (relative to the service root).</p></dd> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy either way, but we'd probably need a more precise description of how the URL is constructed: property ids can be arbitrary strings, so we'd need to know if/how they are escaped to form the URL you specify. Probably just URL-encoding the property id?
No matter what syntax, we need to keep the ability for users to always optionally provide constraints (filters and property matching criteria) at query time, and not depend on services to have some fancy logic to figure things out the clients wants or wishes to return an adequate response. So I always prefer syntax that keeps more control for users. |
}, | ||
{ | ||
"name": "place of birth", | ||
"description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character", | ||
"id": "P19" | ||
"id": "P19", | ||
"match_qualifiers": ["schema:containsPlace", "schema:containedInPlace"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should settle on using only one format for the match_qualifiers
field, otherwise it's painful to parse for clients. I'd go for the format with objects (which makes it possible to display human-readable info and is more extensible).
Also, the corresponding JSON schema should be updated accordingly (now that we are close to merging this, I think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
<dd>A string to indicate how to match the values in <code>v</code>. | ||
This can be used for general matching relations like "skos:exactMatch", "skos:closeMatch", etc. or for specific features like spatial matching with geo data | ||
(e.g. containment search with "schema:containsPlace" etc.) or custom matching on date fields (e.g. services supporting the [[EDTF]] specification could use "EDTF:Level-0" etc. | ||
To allow discovery of supported qualifiers by clients, services that support <code>match_qualifier</code> SHOULD return the supported <code>match_qualifiers</code> for each property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fsteeg I think we're going to regret this where "match_qualifiers SHOULD return" instead of "MUST return". But we'll see how things pan out over the next year.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this has been discussed before. I guess my motivation was to keep the barrier low for implementers, but using MUST
sounds good to me too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought was that discovery of supported qualifiers will be impossible if we do not say MUST.
@wetneb your thoughts on this also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, why not MUST
, your reasoning makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 1812f84.
Supported match qualifiers MUST be included in property suggestions, see #131 (comment)
This adds three optional fields to properties in reconciliation queries. It's based on our discussions in recent meetings, in particular the suggestion in #105 (comment), adding an additional field to address #114, based on #128 (comment).
Changes from the linked discussions:
mode=must|should
withrequired=true|false
(we discussedfilter
as an option in our last meeting, but I feltrequired
was clearer since filter can be used both meaning "accept" and "reject")match=any|all|none
tomatch_quantifier=any|all|none
match_qualifier=<string>
to address use cases like Date comparison needs an absolute settings to compute relative diff score #114, which came up in otherwise unrelated Add an additional property match context to the features #128 (comment)This is related to multiple issues, which I'd hope this could resolve:
match_quantifier
required
match_quantifier
)match_qualifier
and the mention of the EDTF specificationquery
parameter as one of theproperties
in reconciliation queries #106, which maybe could have been resolved before, sincequery
is optional already, but with this we have precise control over the behavior of the property replacing thequery