Optional details for matching properties in reconciliation queries #131

Merged
merged 4 commits into from Dec 14, 2023
20 changes: 16 additions & 4 deletions draft/examples/reconciliation-query-batch/valid/example-full.json
@@ -7,11 +7,17 @@
"properties": [
{
"pid": "professionOrOccupation",
"v": "Politik*"
"v": "Politik*",
"required": false,
"match_quantifier": "any",
"match_qualifier": "WildcardMatch"
},
{
"pid": "affiliation",
"v": "http://d-nb.info/gnd/2022139-3"
"v": "http://d-nb.info/gnd/2022139-3",
"required": false,
"match_quantifier": "any",
"match_qualifier": "ExactMatch"
}
]
},
@@ -22,11 +28,17 @@
"properties": [
{
"pid": "professionOrOccupation",
"v": "Politik*"
"v": "Politik*",
"required": false,
"match_quantifier": "any",
"match_qualifier": "WildcardMatch"
},
{
"pid": "affiliation",
"v": "http://d-nb.info/gnd/2022139-3"
"v": "http://d-nb.info/gnd/2022139-3",
"required": false,
"match_quantifier": "any",
"match_qualifier": "ExactMatch"
}
]
}
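A minimal sketch of how a client might build the property objects shown in the example above. The helper name `make_property` is hypothetical and not part of the specification; it only assumes the field names visible in this diff (`pid`, `v`, `required`, `match_quantifier`, `match_qualifier`) and the three quantifier values listed in the schema change.

```python
import json

# The three values allowed by the schema's enum for match_quantifier.
ALLOWED_QUANTIFIERS = {"any", "all", "none"}


def make_property(pid, v, required=None, match_quantifier=None, match_qualifier=None):
    """Build one entry of a query's "properties" array (hypothetical helper).

    Optional fields left as None are omitted, so the service falls back to
    its own default behavior for them.
    """
    if match_quantifier is not None and match_quantifier not in ALLOWED_QUANTIFIERS:
        raise ValueError(
            f"match_quantifier must be one of {sorted(ALLOWED_QUANTIFIERS)}"
        )
    prop = {"pid": pid, "v": v}
    if required is not None:
        prop["required"] = bool(required)
    if match_quantifier is not None:
        prop["match_quantifier"] = match_quantifier
    if match_qualifier is not None:
        prop["match_qualifier"] = match_qualifier
    return prop


# The two property objects from example-full.json.
properties = [
    make_property("professionOrOccupation", "Politik*",
                  required=False, match_quantifier="any",
                  match_qualifier="WildcardMatch"),
    make_property("affiliation", "http://d-nb.info/gnd/2022139-3",
                  required=False, match_quantifier="any",
                  match_qualifier="ExactMatch"),
]

print(json.dumps(properties, indent=2))
```

Omitting unset fields rather than sending defaults keeps the query compatible with services that only understand `pid` and `v`.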
12 changes: 10 additions & 2 deletions draft/examples/suggest-properties-response/valid/example.json
@@ -3,12 +3,20 @@
{
"name": "coordinate location",
"description": "geocoordinates of the subject. For Earth, please note that only WGS84 coordinating system is supported at the moment",
"id": "P625"
"id": "P625",
"match_qualifiers": [
{"id": "ExactMatch", "name": "Exact match of the coordinates"},
{"id": "DecimalPlaces-N", "name": "Match the coordinates with a precision of N decimal places"}
]
},
{
"name": "place of birth",
"description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
"id": "P19"
"id": "P19",
"match_qualifiers": [
{"id": "schema:containsPlace", "name": "Containment relation between a place and another that it contains"},
{"id": "schema:containedInPlace", "name": "Containment relation between a place and another that contains it"}
]
},
{
"name": "located in time zone",
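As an illustration of the discovery use case, the following sketch shows how a client could collect the advertised qualifier identifiers per property from suggest-response entries like the ones above. The function name `supported_qualifiers` is hypothetical, the input is assumed to be the already-parsed list of property entries, and the third entry is made up to show the case where `match_qualifiers` is absent.

```python
def supported_qualifiers(entries):
    """Map each property id to the match_qualifier ids it advertises.

    Properties without a "match_qualifiers" field map to an empty list,
    since the field is optional in the suggest response.
    """
    result = {}
    for entry in entries:
        qualifiers = entry.get("match_qualifiers", [])
        result[entry["id"]] = [q["id"] for q in qualifiers]
    return result


# Entries modeled on the example response; descriptions are omitted for brevity.
entries = [
    {
        "name": "coordinate location",
        "id": "P625",
        "match_qualifiers": [
            {"id": "ExactMatch", "name": "Exact match of the coordinates"},
            {"id": "DecimalPlaces-N",
             "name": "Match the coordinates with a precision of N decimal places"},
        ],
    },
    {
        "name": "place of birth",
        "id": "P19",
        "match_qualifiers": [
            {"id": "schema:containsPlace",
             "name": "Containment relation between a place and another that it contains"},
            {"id": "schema:containedInPlace",
             "name": "Containment relation between a place and another that contains it"},
        ],
    },
    # Hypothetical entry without qualifiers, not taken from the example file.
    {"name": "hypothetical property", "id": "PX"},
]

by_property = supported_qualifiers(entries)
print(by_property)
```

A client could consult this mapping before sending a `match_qualifier` in a query, and fall back to the service default when the property advertises none.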
35 changes: 28 additions & 7 deletions draft/index.html
@@ -96,6 +96,11 @@
"publisher": "IETF",
"href": "https://www.rfc-editor.org/rfc/bcp/bcp47.txt"
},
"EDTF": {
"title": "Extended Date/Time Format Specification (part of ISO 8601:2019)",
"publisher": "Library of Congress / International Organization for Standardization",
"href": "https://www.loc.gov/standards/datetime/"
},
}
};
</script>
@@ -517,12 +522,26 @@ <h3>Structure of a Reconciliation Query</h3>
<dt><code>limit</code></dt>
<dd>A limit on the number of candidates to return, which must be a positive integer;</dd>
<dt><code>properties</code></dt>
<dd>An array of objects, where each object maps a <a href='#properties'>property</a> identifier (in the <code>pid</code> field)
to one or more <a>property values</a> (in the <code>v</code> field). These are used to further filter the set of candidates (similar to a WHERE clause in SQL),
by allowing clients to specify other attributes of entities that should match, beyond their name in the <code>query</code> field.
How reconciliation services handle this further restriction ("must match all properties" or "should match some") and how it affects the score, is up to the service.
A reconciliation service that supports properties SHOULD provide a <a>suggest service</a> for discovering these properties;</dd>
</dl>
<dd><p>An array of objects, where each object maps a <a href='#properties'>property</a> identifier (in the <code>pid</code> field) to one or more <a>property values</a> (in the <code>v</code> field).
These are used to further refine the list of candidates by allowing clients to specify other attributes of entities, beyond their name in the <code>query</code> field.
A reconciliation service that supports properties SHOULD provide a <a>suggest service</a> for discovering these properties.</p>
<p>In addition to <code>pid</code> and <code>v</code>, services MAY support the following optional fields that allow clients to specify the effect of each property on the resulting list of candidates.
If these fields are omitted, the exact behavior ("must match all", "should match some", etc.) is up to the service.</p>
<dl>
<dt><code>required</code></dt>
<dd>A boolean indicating if a match for the property is required for an entity to enter the list of candidates (i.e. acting like a filter or a WHERE clause in SQL)
or optional (i.e. only affecting the entity's rank in the list of candidates);</dd>
<dt><code>match_quantifier</code></dt>
<dd>A string to indicate which of the values in <code>v</code> to match. MUST be "any" (equivalent to boolean OR), "all" (equivalent to boolean AND), or "none" (equivalent to boolean NOT);</dd>
<dt><code>match_qualifier</code></dt>
<dd>A string to indicate how to match the values in <code>v</code>.
This can be used for general matching relations like "skos:exactMatch", "skos:closeMatch", etc. or for specific features like spatial matching with geo data
(e.g. containment search with "schema:containsPlace" etc.) or custom matching on date fields (e.g. services supporting the [[EDTF]] specification could use "EDTF:Level-0" etc.).
To allow discovery of supported qualifiers by clients, services that support <code>match_qualifier</code> SHOULD return the supported <code>match_qualifiers</code> for each property
Contributor

@fsteeg I think we're going to regret this where "match_qualifiers SHOULD return" instead of "MUST return". But we'll see how things pan out over the next year.

Member Author

I don't think this has been discussed before. I guess my motivation was to keep the barrier low for implementers, but using MUST sounds good to me too.

Contributor

My thought was that discovery of supported qualifiers will be impossible if we do not say MUST.
@wetneb your thoughts on this also?

Member

Yes, why not MUST, your reasoning makes sense to me.

Member Author

Updated in 1812f84.

in their property <a href='#suggest-responses'>suggest responses</a>.</p></dd>
</dl>
</dd>
</dl>
</p>
<p>
A <dfn>reconciliation query batch</dfn> is an array of <a>reconciliation queries</a>.
@@ -738,7 +757,9 @@ <h3>Suggest Responses</h3>
<dt><code>description</code></dt>
<dd>An optional description which can be provided to disambiguate namesakes, providing more context. This could for instance be displayed underneath the <code>name</code>;</dd>
<dt><code>notable</code></dt>
<dd>When suggesting entities only, this field can be used to supply some important types (not necessarily all types) of the suggested entity. The value must be an array of either type identifiers (as strings) or type objects, containing a <code>id</code> and <code>name</code> field which represent the type.</dd>
<dd>When suggesting entities only, this field can be used to supply some important types (not necessarily all types) of the suggested entity. The value must be an array of either type identifiers (as strings) or type objects, containing an <code>id</code> and <code>name</code> field which represent the type.</dd>
<dt><code>match_qualifiers</code></dt>
<dd>When suggesting properties only, an optional array of objects, each containing an <code>id</code> and <code>name</code> field, which represent the property's <code>match_qualifiers</code> supported in <a>reconciliation queries</a>.</dd>
</dl>
</dd>
</dl>
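To make the semantics of `required` and `match_quantifier` described in the spec text above more concrete, here is a rough service-side sketch. Everything in it is an assumption for illustration: the in-memory entity shape, the function names, the naive scoring (one point per matched property), the defaults used when fields are omitted (the spec leaves those to the service), and the interpretation of "WildcardMatch" as a shell-style pattern via Python's `fnmatch`.

```python
from fnmatch import fnmatchcase


def as_list(value):
    """Normalize a single value or a list of values to a list."""
    return value if isinstance(value, list) else [value]


def value_matches(query_value, entity_values, match_qualifier):
    """Check one query value against an entity's values for a property."""
    if match_qualifier == "WildcardMatch":  # assumed: shell-style wildcard
        return any(fnmatchcase(str(ev), str(query_value)) for ev in entity_values)
    return query_value in entity_values  # assumed default: exact equality


def property_matches(prop, entity):
    """Apply match_quantifier over the values in v."""
    entity_values = as_list(entity.get(prop["pid"], []))
    qualifier = prop.get("match_qualifier", "ExactMatch")  # assumed default
    hits = [value_matches(qv, entity_values, qualifier) for qv in as_list(prop["v"])]
    quantifier = prop.get("match_quantifier", "any")  # assumed default
    if quantifier == "any":
        return any(hits)
    if quantifier == "all":
        return all(hits)
    if quantifier == "none":
        return not any(hits)
    raise ValueError(f"unsupported match_quantifier: {quantifier!r}")


def select_candidates(entities, properties):
    """Filter on required properties, then rank by number of matched properties."""
    ranked = []
    for entity in entities:
        results = [(p, property_matches(p, entity)) for p in properties]
        # A failed required property removes the entity from the candidates.
        if any(p.get("required", False) and not ok for p, ok in results):
            continue
        score = sum(1 for _, ok in results if ok)
        ranked.append((score, entity["id"]))
    ranked.sort(key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return [entity_id for _, entity_id in ranked]


# Hypothetical entities, not taken from the specification.
entities = [
    {"id": "e1", "professionOrOccupation": ["Politikerin"],
     "affiliation": ["http://d-nb.info/gnd/2022139-3"]},
    {"id": "e2", "professionOrOccupation": ["Politiker"]},
    {"id": "e3", "professionOrOccupation": ["Lehrer"]},
]
properties = [
    {"pid": "professionOrOccupation", "v": "Politik*", "required": True,
     "match_quantifier": "any", "match_qualifier": "WildcardMatch"},
    {"pid": "affiliation", "v": "http://d-nb.info/gnd/2022139-3", "required": False,
     "match_quantifier": "any", "match_qualifier": "ExactMatch"},
]

candidates = select_candidates(entities, properties)
print(candidates)
```

Here the required property acts as a filter (the entity without a matching occupation is dropped), while the optional one only moves matching entities up the list.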
17 changes: 17 additions & 0 deletions draft/schemas/reconciliation-query-batch.json
@@ -82,6 +82,23 @@
}
}
]
},
"required": {
"type": "boolean",
"description": "A boolean indicating if a match for the property is required for an entity to enter the list of candidates"
},
"match_quantifier": {
"type": "string",
"description": "A string to indicate which of the values in v to match",
"enum": [
"any",
"all",
"none"
]
},
"match_qualifier": {
"type": "string",
"description": "A string to indicate how to match the values in v"
}
},
"required": [
21 changes: 21 additions & 0 deletions draft/schemas/suggest-properties-response.json
@@ -20,6 +20,27 @@
"description": {
"type": "string",
"description": "An optional description which can be provided to disambiguate namesakes, providing more context."
},
"match_qualifiers": {
"type": "array",
"description": "An optional array of objects representing the match_qualifiers supported for the suggested property",
"items": {
"type": "object",
"properties": {
"id": {
"type": "string",
"description": "Identifier of the match_qualifier"
},
"name": {
"type": "string",
"description": "Name of the match_qualifier"
}
},
"required": [
"id",
"name"
]
}
}
},
"required": [
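For readers who want to see what the added `match_qualifiers` schema fragment enforces, the following hand-rolled check mirrors its constraints (an array of objects, each with string `id` and `name` fields). It is only a sketch with a hypothetical function name; a real implementation would more likely run a JSON Schema validator against the schema file itself.

```python
def validate_match_qualifiers(value):
    """Return a list of error messages; an empty list means the value is valid."""
    if not isinstance(value, list):
        return ["match_qualifiers must be an array"]
    errors = []
    for index, item in enumerate(value):
        if not isinstance(item, dict):
            errors.append(f"item {index} must be an object")
            continue
        # Both fields are listed under "required" in the schema fragment.
        for key in ("id", "name"):
            if key not in item:
                errors.append(f"item {index} is missing required field '{key}'")
            elif not isinstance(item[key], str):
                errors.append(f"item {index} field '{key}' must be a string")
    return errors


valid = [
    {"id": "ExactMatch", "name": "Exact match of the coordinates"},
]
invalid = [
    {"id": "ExactMatch"},          # no name
    {"id": 7, "name": "numeric"},  # id is not a string
    "ExactMatch",                  # not an object
]

print(validate_match_qualifiers(valid))
print(validate_match_qualifiers(invalid))
```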