# 3.2 Complex Calculations

Reasoning can be used to perform complex calculations and analysis, as `BIND` offers the flexibility to describe mathematical expressions using variables and built-in functions.

Complex calculations often require complex patterns, so layering rules can be helpful: one rule can refer to patterns that were created by others.

It is very common in practice, not just in calculations, to construct intricate patterns over several rules - breaking the problem down into bite-sized chunks.

This can also greatly benefit the efficiency of reasoning - some common examples are explored throughout the workshop.

## Example

Below is an example of how multiple rules can work together in order to calculate a result - in this case to perform Term Frequency analysis for articles with specific content tags in order to recommend similar articles.

Our Term Frequency analysis assigns common terms a low weighting and rare terms a much higher one.

When generating a recommendation, this ensures that common terms contribute less to the score than rare, specific terms.

A node is created for each pair of articles that share tags, and it is given a similarity score based on their shared weighted tags.
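The weighting described above can be sketched in plain Python, independently of RDFox. This is only an illustration of the arithmetic, not how RDFox evaluates the rules: it assumes `LOG` is the natural logarithm, uses the same four articles as the data cell that follows, and the names `article_tags`, `log_frequency` and `similarity` are hypothetical.

```python
import math
from itertools import combinations

# Toy data mirroring the four articles used in this example: article -> tags
article_tags = {
    "article1": {"AI", "semanticReasoning"},
    "article2": {"semanticReasoning"},
    "article3": {"AI"},
    "article4": {"AI"},
}

# Step 1: count how many articles mention each tag, then take log(1 / count).
# Assumption: natural logarithm.
tag_counts = {}
for tags in article_tags.values():
    for tag in tags:
        tag_counts[tag] = tag_counts.get(tag, 0) + 1
log_frequency = {tag: math.log(1 / n) for tag, n in tag_counts.items()}

# Step 2: for every pair of articles that share at least one tag,
# sum -1 / log_frequency over the shared tags.
# A shared tag is mentioned at least twice, so its log frequency is never 0.
similarity = {}
for a, b in combinations(sorted(article_tags), 2):
    shared = article_tags[a] & article_tags[b]
    if shared:
        similarity[(a, b)] = sum(-1 / log_frequency[t] for t in shared)

for pair, score in similarity.items():
    print(pair, score)
```

The rarer tag (`semanticReasoning`, two mentions) gives a higher score than the more common one (`AI`, three mentions), which is the behaviour the weighting is meant to produce.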

In [1]:
cal_data = """
@prefix : <https://rdfox.com/example#> .

:article1 a :Article ;
        :hasTag :AI ,
                :semanticReasoning .

:article2 a :Article ;
        :hasTag :semanticReasoning .

:article3 a :Article ;
        :hasTag :AI .

:article4 a :Article ;
        :hasTag :AI .

"""

In [2]:
cal_rules = """

[?tag, :hasLogFrequency, ?termLogFrequency] :-
    AGGREGATE (
        [?article, :hasTag, ?tag]
        ON ?tag
        BIND COUNT(?article) AS ?totalTagMentions
    ),
    BIND (LOG(1/?totalTagMentions) AS ?termLogFrequency) .

[?recommendationNode, :hasRecommendedArticle, ?article1],
[?recommendationNode, :hasRecommendedArticle, ?article2],
[?recommendationNode, :pairHasSimilarityScore, ?similarityScore] :-
    AGGREGATE(
        [?article1, :hasTag, ?tag],
        [?article2, :hasTag, ?tag],
        [?tag, :hasLogFrequency, ?termLogFrequency]
        ON ?article1 ?article2
        BIND SUM(-1/?termLogFrequency) AS ?similarityScore
    ),
    SKOLEM(?article1,?article2,?recommendationNode),
    FILTER (?article1 > ?article2) .

"""

In [3]:
import requests

# Set up the SPARQL endpoint
rdfox_server = "http://localhost:12110"

# Helper function to raise exception if the REST endpoint returns an unexpected status code
def assert_response_ok(response, message):
    if not response.ok:
        raise Exception(
            message + "\nStatus received={}\n{}".format(response.status_code, response.text))

# Clear data store
clear_response = requests.delete(
    rdfox_server + "/datastores/default/content?facts=true&axioms&rules")
assert_response_ok(clear_response, "Failed to clear data store.")

# Add data
payload = {'operation': 'add-content-update-prefixes'}
data_response = requests.patch(
    rdfox_server + "/datastores/default/content", params=payload, data=cal_data)
assert_response_ok(data_response, "Failed to add facts to data store.")

# Add rules
rules_response = requests.post(rdfox_server + "/datastores/default/content", data=cal_rules)
assert_response_ok(rules_response, "Failed to add rule.")

# Get and issue select query
with open("../queries/3_2-ComplexCalculationsQuery.rq", "r") as file:
    cal_query = file.read()
response = requests.get(
    rdfox_server + "/datastores/default/sparql", params={"query": cal_query})
assert_response_ok(response, "Failed to run select query.")
print('\n=== Similar Articles ===')
print(response.text) 



=== Similar Articles ===
?article1	?article2	?similarityScore
<https://rdfox.com/example#article1>	<https://rdfox.com/example#article2>	1.4426950408889634e+0
<https://rdfox.com/example#article1>	<https://rdfox.com/example#article4>	9.1023922662683732e-1
<https://rdfox.com/example#article1>	<https://rdfox.com/example#article3>	9.1023922662683732e-1
<https://rdfox.com/example#article3>	<https://rdfox.com/example#article4>	9.1023922662683732e-1



## info rulestats

Run `info rulestats` in the RDFox shell to show information about the rules in your data store.

For this example, so far, you will see:

=================== RULES STATISTICS =====================
|Component|    Nonrecursive rules|    Recursive rules|    Total rules|
|---------|----------------------|-------------------|---------------|
|        1|                     1|                  0|              1|
|        2|                     1|                  0|              1|
|Total:|                        2|                  0|              2|

We have imported two rules above, so the total number of rules is 2.

Both are non-recursive (see 3.3 for recursion) which is also reflected in the table.

Since one rule depends on the other, we end up with 2 components (layers, or strata).

This can be a great launching point for debugging rules, particularly if the values shown are not what you expect.

Find out more about the `info` command in the docs [here](https://docs.oxfordsemantic.tech/rdfox-shell.html#info).
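To see why two dependent rules land in two components, the layering can be sketched as a small dependency calculation. This is a simplified illustration and not RDFox's actual algorithm: it assumes non-recursive rules only, describes each rule just by the predicates in its body and head, and the rule names and the `component_levels` helper are hypothetical.

```python
# Each rule from the example, reduced to the predicates it reads and writes.
rules = {
    "log_frequency": {
        "body": {":hasTag"},
        "head": {":hasLogFrequency"},
    },
    "similarity": {
        "body": {":hasTag", ":hasLogFrequency"},
        "head": {":hasRecommendedArticle", ":pairHasSimilarityScore"},
    },
}

def component_levels(rules):
    """Assign each rule a layer: 1 + the highest layer of any rule it depends on.

    Assumption: no rule depends on itself, directly or indirectly
    (recursive rules would make this sketch loop forever).
    """
    # Map each predicate to the rules that can derive it
    producers = {}
    for name, rule in rules.items():
        for predicate in rule["head"]:
            producers.setdefault(predicate, set()).add(name)

    levels = {}

    def level(name):
        if name not in levels:
            depends_on = {
                producer
                for predicate in rules[name]["body"]
                for producer in producers.get(predicate, ())
            }
            levels[name] = 1 + max((level(d) for d in depends_on), default=0)
        return levels[name]

    for name in rules:
        level(name)
    return levels

print(component_levels(rules))
```

The rule that derives `:hasLogFrequency` reads only asserted facts, so it sits in the first layer; the similarity rule reads `:hasLogFrequency`, so it must sit in a later one.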

## Overly complex rules

It can be tempting to write huge rules that match sprawling patterns and infer many heads at once, but this can be deeply inefficient.

Rules should be streamlined, only considering the minimum number of body atoms required to infer the relevant head facts, otherwise computation will be spent matching irrelevant body patterns for some head atoms.

This often means splitting large rules into several parts.

Take these rules as an example:

In [4]:
# In profiler/3_2-1.dlog
combo_rule = """

# There is no relation between the atoms involving ?a and ?x
[?a, :hasNewProp, "new prop"],
[?x, a, :newClass] :-
    [?a, :hasProp, "prop" ],
    [?x, a, :Class ].

"""

# In profiler/3_2-2.dlog
separate_rules = """

[?a, :hasNewProp, "new prop"]:-
    [?a, :hasProp, "prop" ].

[?x, a, :newClass] :-
    [?x, a, :Class ].


"""

Run the cell below and then, in the RDFox shell, import the combo rule:

`import profiler/3_2-1.dlog`

Notice that the reasoning profile reports 6 iterator operations. This is because the combo rule creates a cross product of its body patterns, which it must check to infer each head atom even though the patterns are unrelated.

Then **run the cell below again** to clear the rules and, in the RDFox shell, import the separated rules:

`import profiler/3_2-2.dlog`

This time, notice that even though there are more rules imported, the rules only require 2 iterator operations each, for a total of 4.

### A problem of scale

While 6 iterator operations rather than 4 is an almost invisible difference, this problem scales poorly with the number of triples and the complexity of the rule, and can cause drastic slowdowns in reasoning.
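The scaling can be illustrated with a rough count of body matches in plain Python. This is a sketch under stated assumptions: it counts matched variable bindings rather than the iterator operations reported by the RDFox profiler, and the function names and the figure of 1,000 resources per pattern are hypothetical.

```python
from itertools import product

def combo_rule_matches(prop_subjects, class_members):
    # The two body atoms share no variable, so every ?a pairs with every ?x
    return sum(1 for _ in product(prop_subjects, class_members))

def separate_rule_matches(prop_subjects, class_members):
    # Each rule matches only its own body atom
    return len(prop_subjects) + len(class_members)

# Hypothetical data: 1,000 resources matching each body atom
prop_subjects = [f"a{i}" for i in range(1000)]
class_members = [f"x{i}" for i in range(1000)]

print("combo rule:    ", combo_rule_matches(prop_subjects, class_members))
print("separate rules:", separate_rule_matches(prop_subjects, class_members))
```

With one match per pattern the two approaches are nearly identical, but the combined rule grows with the product of the match counts while the separated rules grow with their sum.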

In [6]:
profiler_data = """
    @prefix : <https://rdfox.com/example/> .

    :a :hasProp "prop" .

    :x a :Class .
    
"""

# Clear data store
clear_response = requests.delete(
    rdfox_server + "/datastores/default/content?facts=true&axioms&rules")
assert_response_ok(clear_response, "Failed to clear data store.")

# Add data
payload = {'operation': 'add-content-update-prefixes'}
data_response = requests.patch(
    rdfox_server + "/datastores/default/content", params=payload, data=profiler_data)
assert_response_ok(data_response, "Failed to add facts to data store.")

print(data_response)

<Response [200]>


## Exercise

Complete the rule `3_2-ComplexCalculationsRules.dlog` in the `rules` folder so that the query below will return the percentage of articles each tag appears in, rounded to the nearest percentage point.

### Hints & helpful resources

[Mathematical functions in RDFox](https://docs.oxfordsemantic.tech/querying.html#mathematical-functions)
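As a further hint, the target arithmetic can be sketched in plain Python before it is expressed as a rule. This is not the Datalog solution: the data below is made up for illustration, the names `toy_articles` and `percentage_by_tag` are hypothetical, and Python's `round` rounds halves to the nearest even number, which may differ from the rounding function you choose in RDFox.

```python
# Hypothetical toy data (not the contents of 3_2-ComplexCalculationsData.ttl)
toy_articles = {
    "article001": {"AI", "semanticReasoning"},
    "article002": {"AI"},
    "article003": {"AI", "Datalog"},
    "article004": {"semanticReasoning"},
}

total_articles = len(toy_articles)

# Count the number of articles each tag appears in
articles_per_tag = {}
for tags in toy_articles.values():
    for tag in tags:
        articles_per_tag[tag] = articles_per_tag.get(tag, 0) + 1

# Percentage of all articles that mention each tag, to the nearest point
percentage_by_tag = {
    tag: round(100 * count / total_articles)
    for tag, count in articles_per_tag.items()
}

# Highest percentage first, as in the exercise query
for tag, pct in sorted(percentage_by_tag.items(), key=lambda kv: -kv[1]):
    print(pct, tag)
```

Note that two separate aggregates are involved, the total number of articles and the number of articles per tag, which is a natural place to layer rules.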

In [43]:
cal_sparql = """

SELECT ?percentage ?tag
WHERE {
    ?tag :mentionedInPercentageOfArticles ?percentage .
    FILTER (?percentage > 1)
} ORDER BY DESC (?percentage)

"""

Here is a representative sample of the data in `3_2-ComplexCalculationsData.ttl`.

In [44]:
sample_data = """
@prefix : <https://rdfox.com/example#> .

:blogA :containsArticle :article001,
                :article002 .

:article001 a :Article ;
        :hasTag :AI ,
                :semanticReasoning .

"""

In [46]:
# Clear data store
clear_response = requests.delete(
    rdfox_server + "/datastores/default/content?facts=true&axioms&rules")
assert_response_ok(clear_response, "Failed to clear data store.")

# Get and add data
with open("../data/3_2-ComplexCalculationsData.ttl", "r") as file:
    cal_data = file.read()
payload = {'operation': 'add-content-update-prefixes'}
data_response = requests.patch(
    rdfox_server + "/datastores/default/content", params=payload, data=cal_data)
assert_response_ok(data_response, "Failed to add facts to data store.")

# Get and add rules
with open("../rules/3_2-ComplexCalculationsRules.dlog", "r") as file:
    cal_rules = file.read()
rules_response = requests.post(rdfox_server + "/datastores/default/content", data=cal_rules)
assert_response_ok(rules_response, "Failed to add rule.")

# Issue select query
response = requests.get(
    rdfox_server + "/datastores/default/sparql", params={"query": cal_sparql})
assert_response_ok(response, "Failed to run select query.")
print('\n=== Percentage of articles mentioning tags ===')
print(response.text)


=== Percentage of articles mentioning tags ===
?percentage	?tag
16.0	<https://rdfox.com/example#SemanticReasoning>
14.0	<https://rdfox.com/example#Technology>
12.0	<https://rdfox.com/example#KnowledgeRepresentation>
10.0	<https://rdfox.com/example#RDFox>
8.0	<https://rdfox.com/example#Datalog>
7.0	<https://rdfox.com/example#LLMs>
7.0	<https://rdfox.com/example#AI>



## You should see...

=== Percentage of articles mentioning tags ===
|?percentage|?tag|
|-----------|-------------|
|16.0|	<https://rdfox.com/example#SemanticReasoning>|
|14.0|	<https://rdfox.com/example#Technology>|
|12.0|	<https://rdfox.com/example#KnowledgeRepresentation>|
|10.0|	<https://rdfox.com/example#RDFox>|
|8.0|	<https://rdfox.com/example#Datalog>|
|7.0|	<https://rdfox.com/example#LLMs>|
|7.0|	<https://rdfox.com/example#AI>|

## BONUS: Transactions

A transaction is the window in which RDFox performs one or several read and/or write operations, ending when the operations are totally complete and the data store is self-consistent.

Unless specified otherwise, a transaction is created when any command is executed and closes automatically once the command has completed.

However, transactions can be manually opened with `begin` and closed with `commit` or `rollback` depending on whether the results of the transaction should be committed or discarded.

Multiple rule files can be imported in one transaction. This can be much more efficient if the rules interact with one another, as RDFox computes an optimized order in which to import the rules.

For example:

`begin`\
`import rule1.dlog`\
`import rule2.dlog`\
`commit`