Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move 'query' field inside 'properties' #149

Merged
merged 4 commits into from Mar 14, 2024
Merged

Conversation

wetneb
Copy link
Member

@wetneb wetneb commented Dec 14, 2023

Closes #134. Closes #106.

@wetneb wetneb requested review from fsteeg and osma December 14, 2023 09:28
A <dfn>property assignment</dfn> specifies the expected value of a property on the entities to match.
These are used to filter the set of candidates (similar to a WHERE clause in SQL),
by allowing clients to specify an attribute of entities that should match. It consists of:
<dl>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTML nitpick: Isn't <dl> a block element in HTML? It shouldn't be placed inside <p> like this, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I created an issue about it because it would require normalization in the rest of the document too: #151

Copy link
Contributor

@osma osma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my perspective, this is a somewhat unexpected solution for #134 and #145. In a way I like it, because it unifies the structure of queries. OTOH, it makes the simple case of querying just for a name more complex, as seen from the examples; the minimal example is a lot less minimal now.

The PR looks well made (except for a minor HTML structure issue that I commented on) and seems to correctly adjust all the relevant examples.

I'm not against this if the CG decides that this is the right way forward. Just a bit concerned about the additional complexity.

Copy link
Member

@tfmorris tfmorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an incompatible change is going to be introduced in the name of orderliness, I think it makes sense to make it as clean as possible and tidy up all the loose ends. In particular, having a missing property imply some special property doesn't seem very clean.

draft/index.html Outdated Show resolved Hide resolved
draft/schemas/reconciliation-query-batch.json Outdated Show resolved Hide resolved
@wetneb
Copy link
Member Author

wetneb commented Jan 13, 2024

@tfmorris in our call this week, we thought that we could accommodate your preference for an explicit pid, because we have an eye on #145 as well, for which we would likely need a similar solution (and we would not be able to use an empty pid for both, obviously).

Also, I want to stress that the consensus we reach in the monthly calls are not (in my opinion) meant to be definitive resolutions of the entire community group, but rather ways to quickly run ideas by other group members and get their feedback early on (saving some rounds of review on GitHub). So there is no problem with questioning those outcomes. That's why we still make PRs for those things.

@osma
Copy link
Contributor

osma commented Jan 15, 2024

I think it was me who suggested an empty pid, in the spirit of "what is the simplest thing that could possibly work?". I have no problem admitting that this is hackish and ugly.

we thought that we could accommodate your preference for an explicit pid, because we have an eye on #145 as well, for which we would likely need a similar solution (and we would not be able to use an empty pid for both, obviously).

It seems to me that the scope of properties is expanding, perhaps way beyond its original intent, if we are going to apply it not just to labels but to types as well. How about renaming it into something more generic such as conditions (or where) and adding a new attribute match_type to specify whether we are matching labels, properties or types? It could look something like this:

      "conditions": [
        {
          "match_type": "label",
          "v": "Christel Hanewinckel"
        },
        {
          "match_type": "property",
          "pid": "professionOrOccupation",
          "v": "Politik*",
          "required": false,
          "match_quantifier": "any",
          "match_qualifier": "WildcardMatch"
        },
        {
          "match_type": "property",
          "pid": "affiliation",
          "v": "http://d-nb.info/gnd/2022139-3",
          "required": false,
          "match_quantifier": "any",
          "match_qualifier": "ExactMatch"
        }
      ]

To handle types as well (#145), we could add "match_type": "type" as well.

@fsteeg
Copy link
Member

fsteeg commented Feb 8, 2024

To handle types as well (#145), we could add "match_type": "type" as well.

So to summarize, we'd have 3 different kinds of match_type for these proposed conditions: label (replacing the old query, no pid required), type (replacing the old type, no pid required), and property (the original case, pid required).

For compatibility, we could also keep the properties name, define "match_type": "property" as the default, and effectively only add a new optional field match_type? And for the possible values we could stick with the existing query, type, property as values.

It seems like a sensible way to structure this. In general, simply using label and type as pid and not adding a new field seems simpler. However, with a match_type we'd avoid possible collisions with existing label or type fields (and workarounds like _label / _type), right? And maybe it's actually cleaner to have a separate field, instead of giving certain pid values a special meaning?

@wetneb
Copy link
Member Author

wetneb commented Feb 8, 2024

That sounds good to me, but I would suggest "match_type": "name" instead of "match_type": "label" since that's currently how we call this field in the specs (and JSON serialization of reconciliation candidates, for instance).

@osma
Copy link
Contributor

osma commented Feb 8, 2024

"match_type": "name" seconded!

draft/index.html Outdated Show resolved Hide resolved
Co-authored-by: Fabian Steeg <steeg@hbz-nrw.de>
@fsteeg fsteeg merged commit 4446d6b into master Mar 14, 2024
1 check passed
@fsteeg fsteeg deleted the 134-query-to-property branch March 14, 2024 14:44
@osma
Copy link
Contributor

osma commented Mar 14, 2024

Should the changes made in this PR also be mentioned in the changes section 1.4.3 This Draft?

@wetneb
Copy link
Member Author

wetneb commented Mar 14, 2024

Yes, we could add update this section with all of the recent changes we have done (not just this one).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants