Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ReferencedEntityIdLookup interface #193

Merged
merged 1 commit into from
Apr 12, 2018
Merged

Conversation

mariushoch
Copy link
Member

@mariushoch mariushoch commented Apr 5, 2018

I chose a more general name here, as this can be used to lookup several kinds of relations (not only "parent" relations).

This is supposed to have an implementation similar to TypeCheckerHelper::isSubclassOf (in WikibaseQualityConstraints).

Needed in the context of T179155, but can probably also be re-used in WikibaseQualityConstraints.

* @param EntityId[] $toIds
* @param PropertyId $propertyId
*
* @return EntityId|null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will this distinguished between “no target entity found” and “search limit exhausted”? (I assume the limit itself will be implementation-specific, but the interface should still provide an appropriate return type, right?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it's probably best to throw an exception in that case?! That would make the interface agnostic from that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, an exception makes sense. And I agree that the limit itself doesn’t belong in the interface.

*
* @since 3.10
*
* @license GPL-2.0+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GPL-2.0-or-later ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch (although this component uses the old naming scheme consistently)

@mariushoch
Copy link
Member Author

Another question is: Do we want to have the recursion/ depth limits in the interface? I think we don't, given when for example using a SPARQL backend (instead of relying on looking up entities), this suddenly doesn't make much sense anymore.

/**
* Get the closest entity (out of $toIds), from a given entity. The starting entity, and
* the target entities are (potentially indirectly, via intermediate entities) linked by
* statements with the given property id.
Copy link
Contributor

@thiemowmde thiemowmde Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty happy with what I see here! :-) The only detail I'm missing is that it's not entirely clear what "linked" exactly means. It could be linked both ways: Either the $fromId contains a property linking to one of the $toIds (this is what we want to do here), or the other way around.

I think we can add a short sentence like "… linked by statements with the given property ID, pointing towards the target entities."

This relationship might become a little more obvious when we order the function arguments like this: ( EntityId $fromId, PropertyId $propertyId, array $toIds ).

Please capitalize "ID" in the text (not in the variable names).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

@mariushoch
Copy link
Member Author

Hm, should ClosestReferencedEntityIdLookup::getClosestEntityId take one entity id as starting point or an array of entity ids (see also T179155#4110697?

@thiemowmde
Copy link
Contributor

thiemowmde commented Apr 6, 2018

Do we want to have the recursion/ depth limits in the interface? I think we don't […]

I agree. Sure, we know we need a limit. But I think it should not be in this interface, but in a configuration variable and in the constructor of the implementation that needs it.

should ClosestReferencedEntityIdLookup::getClosestEntityId take one entity id as starting point or an array of entity ids

In my opinion an array does not make much sense. The method would need to return two entity IDs then: the closest end point that was found, as well as the starting point that lead to this result.

Let's please stick to a single root node for this tree.

@mariushoch
Copy link
Member Author

One thing that came to my mind yesterday is that while in a PHP implementation we can easily make sure that we actually find the closest referenced entity, this is not necessarily true for a SPARQL implementation for example.

Do we want to reflect this in the name of the interface?

@thiemowmde
Copy link
Contributor

[…] find the closest referenced entity, this is not necessarily true for a SPARQL implementation for example.

I agree this should be rephrased then. Maybe instead of talking about "closest", talk about "whichever is found first, but not necessarily the closest in terms of graph distance"?


$message = $message ?: 'Closest referenced entity id lookup failed. Tried to find the closest entity out of ' .
$targets . ' linked from ' . $fromId->getSerialization() . ' via ' . $propertyId->getSerialization();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objection to this, but I do wonder if it is really needed to construct such an elaborate message. Can't the info be kept in a more structured form? Perhaps when the message construction is needed it can be done on demand in the accessing method (not sure how this usually works for exceptions). That way you avoid the work in the constructor.


I'd extract a buildMessage and perhaps a buildTargetsString methods.

@JeroenDeDauw
Copy link
Contributor

Looks good to me +1 (no +2 since I do not know the context)


Another question is: Do we want to have the recursion/ depth limits in the interface?

Without knowing the context and domain, my bet is on those being in the constructor of implementations that care about these being the better approach. I'd go with putting them in the interface if you don't have this info yet when the services are constructed and these arguments are applicable to all "normal" implementations.

@lucaswerkmeister
Copy link
Member

One thing that came to my mind yesterday is that while in a PHP implementation we can easily make sure that we actually find the closest referenced entity, this is not necessarily true for a SPARQL implementation for example.

Do we want to reflect this in the name of the interface?

I just realized that this might have implications for usage tracking too. Consider a situation like the following:

      bar – baz – bat
foo ⧼
      qux

If I ask for getClosestEntityId( foo, property, [ bat, quq ] ), I will get bat returned. What happens now if a relation between qux and quq is added? If we guarantee the closest entity, the function should now return quq, and we might need to purge pages that used the bat result. If we don’t make that guarantee, that’s unnecessary. In other words, I feel like dropping this guarantee means we only need to track usages on the successful chain that led to the result, not on the whole tree we explored.

Does that make any sense? :)

@mariushoch
Copy link
Member Author

I feel like dropping this guarantee means we only need to track usages on the successful chain that led to the result, not on the whole tree we explored.

Oh, that's true… the weak guarantee that it's just going to be some referenced entity allows us to do that :) So even more reasons to change to that.

I chose a more general name here, as this can be used to lookup
several kinds of relations (not only "parent" relations).

This is supposed to have an implementation similar
to TypeCheckerHelper::isSubclassOf (in WikibaseQualityConstraints).

Needed in the context of https://phabricator.wikimedia.org/T179155,
but can probably also be re-used in WikibaseQualityConstraints.
@mariushoch mariushoch changed the title Add ClosestReferencedEntityIdLookup interface Add ReferencedEntityIdLookup interface Apr 12, 2018
@mariushoch
Copy link
Member Author

Renamed the interface (and exception) and clarified in the documentation that this might not return the closest referenced entity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants