RYA-291 Added owl:hasValue inference #174

Closed

Conversation

Contributor

@jessehatfield jessehatfield commented Jun 26, 2017

Given a type associated with a hasValue property restriction: 1) expand queries
for members of the type to also check for anything with the value; and 2) expand
queries for values of that property to check for instances of the type.

Description

  1. Inference engine now stores information about owl:hasValue property restrictions, which relate a property, a value, and a type, such that members of the type are defined to have that value for that property. (https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#Individual_Value_Restriction) These type/property/value associations can be retrieved by type or by property.
  2. HasValueVisitor rewrites queries according to the hasValue restrictions, expanding statement patterns referencing either the type or the property. Class hierarchy is considered, so if something is declared to belong to a subclass of such a type, then its value can be inferred; and if something has the property/value associated with such a type, then its membership in that type and any superclasses can be inferred. The resulting query tree is a union of the hasValue logic and the original statement pattern. Other inference visitors may still transform the original statement pattern.
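A minimal sketch of the two lookups described above, using plain strings in place of the URI/Resource/Value types that appear in the diff. The class name, the `addRestriction`/`addSubClassOf` helpers, and the internal maps are hypothetical and only illustrate the idea; this is not the InferenceEngine implementation.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for the hasValue schema; every name here is an assumption for illustration. */
final class HasValueSchemaSketch {
    // restriction type -> property -> values, as declared in the ontology
    private final Map<String, Map<String, Set<String>>> declared = new HashMap<>();
    // class -> its direct subclasses
    private final Map<String, Set<String>> subClasses = new HashMap<>();

    void addRestriction(String type, String property, String value) {
        declared.computeIfAbsent(type, k -> new HashMap<>())
                .computeIfAbsent(property, k -> new HashSet<>())
                .add(value);
    }

    void addSubClassOf(String sub, String sup) {
        subClasses.computeIfAbsent(sup, k -> new HashSet<>()).add(sub);
    }

    /** The type itself plus all of its direct and indirect subclasses. */
    private Set<String> selfAndSubClasses(String type) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(type);
        while (!todo.isEmpty()) {
            String t = todo.pop();
            if (seen.add(t)) {
                todo.addAll(subClasses.getOrDefault(t, Collections.emptySet()));
            }
        }
        return seen;
    }

    /** property -> values, each individually sufficient to imply membership in the type. */
    Map<String, Set<String>> getHasValueByType(String type) {
        Map<String, Set<String>> result = new HashMap<>();
        for (String t : selfAndSubClasses(type)) {
            declared.getOrDefault(t, Collections.emptyMap()).forEach((p, vs) ->
                    result.computeIfAbsent(p, k -> new HashSet<>()).addAll(vs));
        }
        return result;
    }

    /** type -> values that membership in that type implies for the given property. */
    Map<String, Set<String>> getHasValueByProperty(String property) {
        Map<String, Set<String>> result = new HashMap<>();
        declared.forEach((type, byProperty) -> {
            Set<String> values = byProperty.get(property);
            if (values != null) {
                // Members of a subclass are also members of the restriction itself.
                for (String t : selfAndSubClasses(type)) {
                    result.computeIfAbsent(t, k -> new HashSet<>()).addAll(values);
                }
            }
        });
        return result;
    }
}
```

Lookup by type walks down the hierarchy (a restriction on a subclass is enough to imply the superclass), and lookup by property also covers subclasses of each restriction, since their members inherit the implied value.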

Tests

InferenceEngineTest verifies that the correct schema is extracted from the ontology triples; HasValueVisitorTest verifies that query trees are expanded as expected; and InferenceIT verifies expected query results given ontology and instance triples.

Links

Jira

Checklist

  • Code Review
  • Squash Commits

People To Review

@meiercaleb
@ejwhite922

asfgit commented Jun 26, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/246/

@@ -345,6 +346,7 @@ protected void commitInternal() throws SailException {
&& this.inferenceEngine != null
) {
try {
Contributor

I think we talked about this briefly, but don't most inference visitors key in on the pattern ?x a uri:class1? After a given visitor expands the query, isn't this pattern gone? How can another inference rule be applied without this pattern? Does the new logic get unioned with the old pattern? Are our rules disjoint enough that this is a non-issue? If they do overlap, how do we determine a priority? I understand that currently the order is hard coded, so there is really nothing to do at this point. But I think that some thought should be put into this.

Contributor Author

Yeah, this is a general issue with Rya's approach to inference (for example, TransitivePropertyVisitor is applied before SubPropertyOfVisitor and SameAsVisitor, and any expansion produced by any of the three will be passed over by the others). In this case, since the original statement is preserved as one branch of the union, there's no conflict as long as the HasValueVisitor is applied first. In cases where there is a conflict, keep in mind that it only manifests when both inference rules in question actually would apply to the same query (e.g. if hasAncestor is transitive, and is a subproperty of hasRelative, and the data says A hasAncestor B hasAncestor C hasAncestor D, I don't believe a query for "?x hasRelative D" will return A). A related limitation is that each visitor only applies once, even if another visitor later produces an expansion that would be relevant. For example, if p1 and p2 are inverse properties, and we have a hasValue condition for p1, queries for p2 won't trigger the hasValue expansion.
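The hasAncestor/hasRelative case can be reproduced with a toy model. The sketch below is a deliberate simplification and not Rya's visitor code: a query is reduced to a list of branches, a "transitive" pass runs first, and a "subproperty" pass runs second without revisiting what it adds. All class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class VisitorOrderSketch {
    /** One branch of an expanded query: match a predicate, optionally via its transitive closure. */
    static final class Branch {
        final String predicate;
        final boolean transitive;

        Branch(String predicate, boolean transitive) {
            this.predicate = predicate;
            this.transitive = transitive;
        }
    }

    // Schema from the comment: hasAncestor is transitive and a subproperty of hasRelative.
    static final Set<String> TRANSITIVE = Collections.singleton("hasAncestor");
    static final Map<String, List<String>> SUB_PROPERTIES =
            Collections.singletonMap("hasRelative", Collections.singletonList("hasAncestor"));

    /** Runs the transitive pass, then the subproperty pass, each exactly once. */
    static List<Branch> expandOnce(String predicate) {
        List<Branch> branches = new ArrayList<>();
        // Pass 1: transitivity is decided by looking at the original predicate only.
        branches.add(new Branch(predicate, TRANSITIVE.contains(predicate)));
        // Pass 2: subproperty branches appear after pass 1 has finished, so they stay plain.
        for (String sub : SUB_PROPERTIES.getOrDefault(predicate, Collections.emptyList())) {
            branches.add(new Branch(sub, false));
        }
        return branches;
    }

    /** Subjects s for which (s, predicate, object) is matched by at least one branch. */
    static Set<String> subjectsOf(List<Branch> branches, String object, List<String[]> edges) {
        Set<String> result = new TreeSet<>();
        for (Branch b : branches) {
            Deque<String> frontier = new ArrayDeque<>();
            Set<String> visited = new HashSet<>();
            frontier.push(object);
            while (!frontier.isEmpty()) {
                String target = frontier.pop();
                for (String[] e : edges) {
                    if (e[1].equals(b.predicate) && e[2].equals(target) && visited.add(e[0])) {
                        result.add(e[0]);
                        if (b.transitive) {
                            frontier.push(e[0]); // keep walking only for transitive branches
                        }
                    }
                }
            }
        }
        return result;
    }
}
```

With the chain A, B, C, D linked by hasAncestor, `subjectsOf(expandOnce("hasAncestor"), "D", edges)` walks the whole chain, while `subjectsOf(expandOnce("hasRelative"), "D", edges)` only finds the direct edge from C, which is the gap described above.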

for (Value value : sufficientValues.get(property)) {
relevantValues.statements.add(new NullableStatementImpl(objType, property, value));
}
currentNode = new InferUnion(currentNode, new InferJoin(relevantValues, valueSP));
Contributor

I see. We keep the original statement pattern and union it with the new infer join. This allows additional reasoning rules to be applied. Attempting to follow the logic here -- if a class is comprised of multiple owl:hasValue property restrictions, we're content with finding all classes that match at least one of those restrictions?

Contributor Author

So the way I understand it, something like ":C1 owl:hasValue :v1 ; owl:onProperty :p1" means: "C1 is the set of individuals who have value v1 for property p1." Therefore, anything with p1=v1 is by definition a member of C1. If we have n such classes C1...Cn, and they all have "C(i) rdfs:subClassOf :Superclass1", then we can say: "The set of individuals with value v1 for property p1 is a subset of Superclass1; the set of individuals with value v2 for property p2 is a subset of Superclass1; ... ; the set of individuals with value vn for property pn is a subset of Superclass1." Therefore we can say that anything with p1=v1 is by definition a member of Superclass1 (regardless of its other properties), anything with p2=v2 is by definition a member of Superclass1 (regardless of its other properties), etc. Here, sufficientValues is supposed to capture those (property, value) combinations that, individually, are each sufficient to determine membership in the (super)class.
(owl syntax | rdf-based semantics)

Contributor

Okay. I see. I think my confusion here stems from my limited understanding of restrictions. I was thinking of them as somehow being bound to a class or being a restriction placed on a class instead of as a class in themselves. Thanks for clearing this up.

final Var objVar = node.getObjectVar();
// We can reason over two types of statement patterns:
// { ?var rdf:type :Restriction } and { ?var :property ?value }
// Both require defined predicate
Contributor

So does objVar correspond to some class that is a subClassOf a Restriction?

Contributor Author

In the first case (when the predicate is rdf:type), objVar is some class that can be inferred by a hasValue expression, which would either be the property restriction itself or any superclass of it. E.g. if we have ":A owl:onProperty :p1 ; owl:hasValue :v1" and ":A rdfs:subClassOf :B" then an objVar with value :A or :B should trigger rewriting. In the second case, where the predicate is something other than rdf:type, we instead check to see if any hasValue expressions involve the predicate, and objVar corresponds to the value variable.

Contributor

So a property restriction on A imposes the same property restriction on B? Or does it just trigger the rewriting, where all restrictions on B are retrieved from the inference engine (which could be a subset of the restrictions on A)?

Contributor Author

I wouldn't describe it as an imposition; the semantics of the hasValue expression are: "If you have p1=v1, then you're an A (and vice versa)". So if the query is "Get all the A's," then we want to return anything with p1=v1 (objVar here is A, and we ask the inference engine for any predicate/value pairs that entail membership in A). And the semantics of the subclass relationship are: "If you're an A, then you're also a B," so if the query is instead "Get all the B's," then we again want to return anything with p1=v1, in addition to anything that turns out to be a B for some other reason (objVar here is B, and we ask the inference engine for any predicate/value pairs that entail membership in B, including those that entail membership in A because membership in A entails membership in B).
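As a rough illustration of the "Get all the B's" case, the sketch below evaluates the two branches of the union directly over a list of triples. It assumes the sufficient (property, value) pairs for the type have already been looked up; the class and method names are made up for this example, and subclass inference on explicit rdf:type triples is left out, since that is handled by other rules.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TypeQuerySketch {
    /**
     * Members of a type: explicitly typed subjects, unioned with every subject that has
     * at least one sufficient (property, value) pair. Triples are {subject, predicate, object}.
     */
    static Set<String> membersOf(String type,
                                 Map<String, Set<String>> sufficientValues,
                                 List<String[]> triples) {
        Set<String> members = new TreeSet<>();
        for (String[] t : triples) {
            // Branch 1: the original statement pattern { ?x rdf:type <type> }.
            if (t[1].equals("rdf:type") && t[2].equals(type)) {
                members.add(t[0]);
            }
            // Branch 2: { ?x <property> <value> } for any pair that entails the type.
            Set<String> values = sufficientValues.get(t[1]);
            if (values != null && values.contains(t[2])) {
                members.add(t[0]);
            }
        }
        return members;
    }
}
```

If the lookup for :B yields {:p1 -> {:v1}}, a subject with (:p1, :v1) is returned alongside subjects explicitly typed as :B, while a subject with (:p1, :v2) is not.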

// If the predicate is rdf:type and the type is specified, check whether it can be
// inferred using any hasValue restriction(s)
final Resource objType = (Resource) objVar.getValue();
final Map<URI, Set<Value>> sufficientValues = inferenceEngine.getHasValueByType(objType);
Contributor

Is this loading all hasValue property restrictions associated with objType?

Contributor Author

Yep. Any predicate/value combination that would imply that the subject belongs to that type.

for (URI property : sufficientValues.keySet()) {
final Var propVar = new Var(property.toString(), property);
final TupleExpr valueSP = new DoNotExpandSP(subjVar, propVar, valueVar);
final FixedStatementPattern relevantValues = new FixedStatementPattern(objVar, propVar, valueVar);
Contributor

So we require that both the parent and child class have the same value for each property returned by the inference engine?

else {
// If the predicate has some hasValue restriction associated with it, then finding
// that the object belongs to the appropriate type implies a value.
final Map<Resource, Set<Value>> impliedValues = inferenceEngine.getHasValueByProperty(predURI);
Contributor

Are these all classes that have a hasValue property restriction on the given predicate? Is the whole point here to check whether the class indicated by objVar is a subclass of a Restriction that constrains the value of the predicate?

Contributor Author

All classes with such a restriction, and the values associated with them, and we're trying to match objVar to the value. At this point, the check RDF.TYPE.equals(predURI) failed, which means there isn't any reason to expect objVar to be a class. Instead, we're looking at things like "?subj :p1 ?obj" and saying: If there is a property restriction on p1, involving class C1 and value v1, then this expresses the fact that all members x of :C1 implicitly have the triple (x, :p1, :v1). Therefore, for any x found to belong to :C1, {?subj=x, ?obj=:v1} is a correct solution to the original query. This map from Resources to Values would include an entry mapping C1 to the set {v1}.
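The property case can be sketched the same way. Assuming the map from restriction types to implied values for the predicate is already available, the hypothetical helper below returns {?subj, ?obj} solutions from both the stored triples and the implied ones. The names are illustrative only and do not come from the PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PropertyQuerySketch {
    /**
     * Solutions [subject, object] for the pattern { ?subj <property> ?obj }.
     * impliedValues maps a type to the values that membership in it implies for the property.
     * Triples are {subject, predicate, object}.
     */
    static Set<List<String>> solutions(String property,
                                       Map<String, Set<String>> impliedValues,
                                       List<String[]> triples) {
        Set<List<String>> out = new HashSet<>();
        for (String[] t : triples) {
            // Branch 1: the original statement pattern, matched against stored triples.
            if (t[1].equals(property)) {
                out.add(Arrays.asList(t[0], t[2]));
            }
            // Branch 2: anything typed as a restriction class implicitly has the value.
            if (t[1].equals("rdf:type")) {
                for (String value : impliedValues.getOrDefault(t[2], Collections.emptySet())) {
                    out.add(Arrays.asList(t[0], value));
                }
            }
        }
        return out;
    }
}
```

For a query on :p1 with impliedValues = {:C1 -> {:v1}}, a subject typed as :C1 contributes the solution (subject, :v1) even though no such triple is stored.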



// Get a set of all property restrictions of any type
iter = RyaDAOHelper.query(ryaDAO, null, OWL.ONPROPERTY, null, conf);
Contributor

Oof! That refreshGraph method is getting out of control. Maybe start compartmentalizing the refresh logic by the rule that is getting updated. For example, the logic below could be relegated to an updateHasValue(...) method. I think this is important, especially given that you will be more than doubling the number of rules that the inference engine needs to update.

Contributor Author

Agreed, I'll try and clean this up.

* For a given type, return any properties and values such that owl:hasValue restrictions on
* those properties could imply this type. No matter how many restrictions are returned, each
* one is considered individually sufficient: if a resource has the property and the value, then
* it belongs to the provided type. Takes type hierarchy into account, so the value may imply a
Contributor

This relates to an earlier question that I had. Is this the standard interpretation? Suppose class A is a subclass of two different Restrictions (say B and C), where B requires that prop1 = val1 and C requires that prop2 = val2. By the above interpretation, a class where prop1 = val1 and prop2 = val3 (some other value different from val2) would be considered of type A, correct? Isn't that inconsistent with the definition of A?

Contributor Author

If A were a superclass of B and C, rather than a subclass, then we would consider that individual to be of type A. This method is going to return all the hasValue restrictions on all of a type's subclasses, and since belonging to one subclass is sufficient to determine membership in the superclass, meeting one of those restrictions is also sufficient to determine membership in the superclass. Also keep in mind that hasValue property restrictions don't require any data to be there; they actually imply it: "x is a member of B <-> x has prop1=val1". So in your example, if an individual is known to be of type A, then it must also have types B and C, and it must also have prop1=val1 and prop2=val2. If those triples aren't in the data, that's OK, we can infer them (though that would be the other rewriting case and handled by the other method). The clearest statement of this in the standards is probably the OWL 2 RL/RDF rule set in the OWL 2 Profiles specification, specifically cls-hv1 and cls-hv2 in Table 6. (This list of rules is incomplete since it only covers OWL RL languages, but the terms it does cover have their standard meaning.)
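To make the two directions concrete, here is a small forward-chaining sketch of cls-hv1, cls-hv2, and the subclass rule cax-sco over triples held as three-element lists. It materializes the inferred triples, which is not how this PR works (the PR rewrites queries instead); the class and helper names are invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class HasValueRulesSketch {
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** Applies cls-hv1, cls-hv2 and cax-sco repeatedly until no new triples appear. */
    static Set<List<String>> closure(Set<List<String>> input) {
        Set<List<String>> triples = new HashSet<>(input);
        boolean changed = true;
        while (changed) {
            Set<List<String>> inferred = new HashSet<>();
            for (List<String> hv : triples) {
                if (!hv.get(1).equals("owl:hasValue")) continue;
                String restriction = hv.get(0);
                String value = hv.get(2);
                for (List<String> op : triples) {
                    if (!op.get(0).equals(restriction) || !op.get(1).equals("owl:onProperty")) continue;
                    String property = op.get(2);
                    for (List<String> t : triples) {
                        // cls-hv1: a member of the restriction has the value.
                        if (t.get(1).equals("rdf:type") && t.get(2).equals(restriction)) {
                            inferred.add(triple(t.get(0), property, value));
                        }
                        // cls-hv2: anything with the value is a member of the restriction.
                        if (t.get(1).equals(property) && t.get(2).equals(value)) {
                            inferred.add(triple(t.get(0), "rdf:type", restriction));
                        }
                    }
                }
            }
            // cax-sco: members of a subclass are members of the superclass.
            for (List<String> sc : triples) {
                if (!sc.get(1).equals("rdfs:subClassOf")) continue;
                for (List<String> t : triples) {
                    if (t.get(1).equals("rdf:type") && t.get(2).equals(sc.get(0))) {
                        inferred.add(triple(t.get(0), "rdf:type", sc.get(2)));
                    }
                }
            }
            changed = triples.addAll(inferred);
        }
        return triples;
    }
}
```

Run on the example from this thread (A a subclass of restrictions B and C), an individual typed as A gains types B and C plus prop1=val1 and prop2=val2, while an individual that only has prop1=val1 gains type B but not type A.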

Contributor

Okay. Thanks for clarifying.

Contributor

@meiercaleb meiercaleb left a comment

I'm fine with this PR once you clean up your portion of the refreshGraph() method.

asfgit commented Jul 13, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/259/

Contributor

@isper3at isper3at left a comment

looks good to me, except docs.

import org.openrdf.query.algebra.TupleExpr;
import org.openrdf.query.algebra.Var;

public class HasValueVisitor extends AbstractInferVisitor {
Contributor

doc

Contributor

@ejwhite922 ejwhite922 left a comment

Looks good

private final URI belongsTo = vf.createURI("urn:belongsToTaxon");
private final URI chordata = vf.createURI("urn:Chordata");

@SuppressWarnings("unchecked")
Contributor

To get rid of the SuppressWarnings, make the class org.apache.rya.rdftriplestore.inference.AbstractInferVisitor extend QueryModelVisitorBase<Exception>

import org.openrdf.query.algebra.Var;

public class HasValueVisitor extends AbstractInferVisitor {
public HasValueVisitor(RdfCloudTripleStoreConfiguration conf, InferenceEngine inferenceEngine) {
Contributor

Add javadocs

asfgit commented Jul 14, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/262/

Given a type associated with a hasValue property restriction: 1) expand queries
for members of the type to also check for anything with the value; and 2) expand
queries for values of that property to check for instances of the type.

asfgit commented Jul 17, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/263/

@asfgit asfgit closed this in 2b73c30 Jul 20, 2017
5 participants