# Semantic operations

This notebook demonstrates what the semantic operations between terms mean and how they are useful.

In [1]:
from atomrdf import KnowledgeGraph

In [2]:
kg = KnowledgeGraph()

## Basic comparison operations

Terms support basic comparison operators such as <, >, <=, >=, and ==.

These operations are useful for adding conditions to the SPARQL query. When one of them is applied to a term, the resulting condition is stored in the term's condition string. No other changes are needed.

In [3]:
kg.terms.cmso.hasElementRatio==1.0

cmso:hasElementRatio

In [4]:
kg.terms.cmso.hasElementRatio._condition

'(?cmso:hasElementRatiovalue="1.0"^^xsd:float)'

Applying another comparison to the same term resets the condition.

In [5]:
kg.terms.cmso.hasElementRatio>=1.0

cmso:hasElementRatio

In [6]:
kg.terms.cmso.hasElementRatio._condition

'(?cmso:hasElementRatiovalue>="1.0"^^xsd:float)'
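
The behaviour above can be imitated with a small stand-alone class. This is a minimal sketch, not atomrdf's implementation: `ToyTerm` and its `_store` helper are hypothetical names, and the condition format only mirrors the output shown above.

```python
class ToyTerm:
    """Hypothetical stand-in for an ontology term (not the atomrdf class)."""

    def __init__(self, name):
        self.name = name          # e.g. "cmso:hasElementRatio"
        self._condition = None    # most recent condition string, if any

    def _store(self, op, value):
        # Overwrite any earlier condition, so each comparison starts fresh.
        self._condition = f'(?{self.name}value{op}"{value}"^^xsd:float)'
        # Return the term itself rather than a bool, so it can be passed on.
        return self

    def __eq__(self, value):
        return self._store("=", value)

    def __ge__(self, value):
        return self._store(">=", value)

    def __repr__(self):
        return self.name


ratio = ToyTerm("cmso:hasElementRatio")
ratio == 1.0
print(ratio._condition)
ratio >= 1.0
print(ratio._condition)
```

Returning the term from the comparison, instead of `True` or `False`, is what lets the expression be handed directly to a query function.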

## Logical operators

Logical operators currently supported are & and |. When applied, these operators aggregate the conditions of two terms.

In [7]:
(kg.terms.cmso.hasChemicalSymbol=='Al') & (kg.terms.cmso.hasElementRatio==1.0)

cmso:hasChemicalSymbol

The resulting hybrid condition is stored in the first term; the condition of the second term is reset.

In [9]:
kg.terms.cmso.hasElementRatio._condition

In addition, the condition parents of the first term are recorded.

In [10]:
kg.terms.cmso.hasChemicalSymbol._condition_parents

['cmso:hasElementRatiovalue']
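
The bookkeeping described in this section can be sketched with a toy class. This is an illustration under assumptions, not atomrdf's code: `ToyTerm` and `_combine` are hypothetical, and joining the two conditions with the SPARQL operators `&&` and `||` is an assumption about the stored format.

```python
class ToyTerm:
    """Hypothetical stand-in for an ontology term (not the atomrdf class)."""

    def __init__(self, name):
        self.name = name
        self._condition = None
        self._condition_parents = []

    def __eq__(self, value):
        self._condition = f'(?{self.name}value="{value}")'
        return self

    def _combine(self, other, op):
        # Merge both conditions into the left-hand term.
        self._condition = f"({self._condition}{op}{other._condition})"
        # Remember which term contributed the second half.
        self._condition_parents.append(other.name + "value")
        # Reset the right-hand term so it carries no stale condition.
        other._condition = None
        return self

    def __and__(self, other):
        return self._combine(other, "&&")

    def __or__(self, other):
        return self._combine(other, "||")


symbol = ToyTerm("cmso:hasChemicalSymbol")
ratio = ToyTerm("cmso:hasElementRatio")
(symbol == "Al") & (ratio == 1.0)

print(symbol._condition)
print(ratio._condition)
print(symbol._condition_parents)
```

The parentheses around each comparison matter: in Python, `&` and `|` bind more tightly than `==`.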

## Shift operator

The final class of operator is the >> operator. It can be used to resolve terms that have multiple paths, for example rdfs:label, which multiple entities can have.

This operator simply adds the left-hand term to the parents of the right-most one, thereby allowing the path to be resolved correctly.

In [11]:
kg.terms.asmo.hasInputParameter>>kg.terms.rdfs.label

rdfs:label

In [12]:
kg.terms.rdfs.label._parents

[asmo:hasInputParameter]

In [13]:
kg.terms.asmo.hasInputParameter._parents

[]
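
A toy version of this operator makes the direction of the bookkeeping explicit. This is a sketch with a hypothetical `ToyTerm` class, not atomrdf's implementation; it only reproduces the two-term behaviour shown above.

```python
class ToyTerm:
    """Hypothetical stand-in for an ontology term (not the atomrdf class)."""

    def __init__(self, name):
        self.name = name
        self._parents = []

    def __rshift__(self, other):
        # `a >> b` records `a` as a parent of `b` and returns `b`.
        other._parents.append(self)
        return other

    def __repr__(self):
        return self.name


param = ToyTerm("asmo:hasInputParameter")
label = ToyTerm("rdfs:label")

result = param >> label
print(result)
print(label._parents)
print(param._parents)
```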

That summarises all the possible options. Now we have to lay down the logic behind what will happen in a query formulation. 

## Query formulation

A query has two main components:

- source: the term from which the query originates. It should be returned.
- destination(s): the terms that should be found; these may or may not have an associated condition.

Let us consider the question: which samples have a composition of Al=1.0, and what are their space-group symbols?

In this particular case, the source is Sample, and the destinations are the space-group symbol (with no condition) and the composition (with a condition).

The function call should therefore look like this:

```
query(source, [dest1, (dest2==1.0) & (dest3==40), dest4])
```
Only one destination may carry a condition; otherwise an error should be raised.

Algorithm:

- get destinations
    - if single term, convert to list
    - check if more than one of them have an associated condition -> if so throw error
    - iterate through the list; if they have condition parents, add them explicitly
- get source
- construct the select distinct command:
    - add source `variable_name`
    - iterate over destinations, add their `variable_name`
- construct the SPARQL query path triples by iterating over destinations
    - for each destination:
        - check if it has a parent by looking at `._parents`
        - if it has `_parents`, call the stepped path method
        - else just get the path
        - replace the ends of the path with `variable_name`
        - if it doesn't exist in the collection of lines, add the lines
- formulate the condition, given by the `FILTER` command:
    - extract the filter text from the term
    - loop over destinations:
        - call `replace(destination.query_name, destination.variable_name)`
- finish
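
The steps above can be sketched in plain Python. This is a minimal illustration under assumptions, not atomrdf's implementation: `Dest` and `build_query` are hypothetical names, each destination carries its path triples precomputed (so the `_parents` and stepped-path lookup is skipped), and PREFIX declarations are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Dest:
    """Hypothetical destination term with a precomputed path from the source."""
    query_name: str       # name used inside the condition string
    variable_name: str    # SPARQL variable name that replaces it
    triples: list         # path triples ending at this destination
    condition: str = None
    condition_parents: list = field(default_factory=list)


def build_query(source_var, destinations):
    # A single term is accepted and wrapped in a list.
    if not isinstance(destinations, (list, tuple)):
        destinations = [destinations]
    destinations = list(destinations)

    # At most one destination may carry a condition.
    conditioned = [d for d in destinations if d.condition is not None]
    if len(conditioned) > 1:
        raise ValueError("only one destination may carry a condition")

    # Condition parents must be listed as destinations too.
    for d in list(destinations):
        for parent in d.condition_parents:
            if parent not in destinations:
                destinations.append(parent)

    # SELECT DISTINCT: the source first, then every destination.
    names = [source_var] + [d.variable_name for d in destinations]
    lines = ["SELECT DISTINCT " + " ".join(f"?{n}" for n in names), "WHERE {"]

    # Path triples, skipping lines that were already added.
    seen = set()
    for d in destinations:
        for triple in d.triples:
            if triple not in seen:
                seen.add(triple)
                lines.append(f"    {triple} .")

    # FILTER: swap every query name for its variable name.
    if conditioned:
        text = conditioned[0].condition
        for d in destinations:
            text = text.replace(d.query_name, d.variable_name)
        lines.append(f"    FILTER {text}")

    lines.append("}")
    return "\n".join(lines)


symbol = Dest(
    "cmso:hasSpaceGroupSymbolvalue", "symbol",
    ["?sample cmso:hasMaterial ?material",
     "?material cmso:hasStructure ?structure",
     "?structure cmso:hasSpaceGroupSymbol ?symbol"],
)
number = Dest(
    "cmso:hasNumberOfAtomsvalue", "number",
    ["?sample cmso:hasNumberOfAtoms ?number"],
    condition='(?cmso:hasNumberOfAtomsvalue="4"^^xsd:integer)',
)

print(build_query("sample", [symbol, number]))
```

Because the condition string already contains the leading `?`, replacing the query name with the bare variable name is enough to turn it into a valid `FILTER` expression.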
    


In [None]:
query = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?symbol
WHERE {
    ?sample cmso:hasNumberOfAtoms ?number .
    ?sample cmso:hasMaterial ?material .
    ?material cmso:hasStructure ?structure .
    ?structure cmso:hasSpaceGroupSymbol ?symbol .
    FILTER (?number="4"^^xsd:integer)
}"""

In [None]:
kg.query_sample([kg.terms.cmso.hasSpaceGroupSymbol, kg.terms.cmso.hasNumberOfAtoms==4])

In [None]:
res = kg.query_sample([kg.ontology.terms.cmso.hasAltName=='bcc'])

In [None]:
res = kg.query_sample(kg.ontology.terms.pldo.SymmetricalTiltGrainBoundary)

In [None]:
res = kg.query_sample([kg.ontology.terms.pldo.SymmetricalTiltGrainBoundary, 
                     kg.ontology.terms.pldo.hasSigmaValue])

In [None]:
res = kg.query_sample(kg.ontology.terms.ldo.ScrewDislocation)
