In [None]:
import rdflib
from ipyradiant import (
    CustomURIRef,
    FileManager,
    MultiPanelSelect,
    PathLoader,
    PredicateMultiselectApp,
    collapse_predicates,
)
from rdflib import URIRef

## In this notebook we will show an example of how a user can take an RDF graph (`rdflib.graph.Graph`) and collapse predicates down while adding object data to subject nodes. This converts the graph from an `rdflib.graph.Graph` instance to an LPG (labeled property graph), in this case a `networkx.Graph` object.

## Start by using the `PredicateMultiselectApp`

### We also utilize the `CustomURIRef` class from `ipyradiant`. This allows us to get a 'pretty' representation of a URIRef while maintaining access to the full URIRef.
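As a rough illustration of the idea (not `ipyradiant`'s actual implementation), the sketch below wraps a URI string so that it displays as a short prefixed label while the full URI stays available through a `.uri` attribute. The class name `PrettyURI`, the helper `shorten`, and the `prefix:local` label format are assumptions made for this example.

```python
def shorten(uri, namespaces):
    """Replace the longest matching namespace with its prefix."""
    matches = [ns for ns in namespaces if uri.startswith(ns)]
    if not matches:
        return uri  # no known namespace: fall back to the full URI
    ns = max(matches, key=len)
    return f"{namespaces[ns]}:{uri[len(ns):]}"


class PrettyURI:
    """Hypothetical wrapper: short label for display, full URI for lookups."""

    def __init__(self, uri, namespaces):
        self.uri = uri  # the full URI is kept untouched
        self.label = shorten(uri, namespaces)

    def __repr__(self):
        return self.label


namespaces = {"http://www.w3.org/": "w3"}
item = PrettyURI("http://www.w3.org/ns/oa#cachedSource", namespaces)
print(repr(item))  # short label for display
print(item.uri)  # full URI for queries
```

Keeping the full URI on the object means the short label is only a display concern and can never be confused with the identifier used in the graph.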

We can borrow the loading widgets seen in other examples like the Tab App in order to
get a graph to play around with.

In [None]:
lw = FileManager(loader=PathLoader(path="data"))
# here we hard set what we want the file to be, but ideally a user can choose a file to work with.
lw.loader.file_picker.value = lw.loader.file_picker.options["oa.jsonld"]
lw

Now we can use the `PredicateMultiselectApp` to move predicates around based on
which ones we want to select. The 'Add predicates where object is a literal' button
automatically moves a predicate to the 'predicates to collapse' side if that predicate
is _always_ associated with a literal object.
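The "always associated with a literal object" rule can be sketched in plain Python. This is a minimal illustration, not the widget's actual code: triples are modelled as tuples, the `Lit` class is a hypothetical stand-in for an RDF literal, plain strings stand in for URIs, and the data is made up.

```python
class Lit(str):
    """Stand-in for an RDF literal; plain strings stand in for URIs."""


def literal_only_predicates(triples):
    """Return predicates whose object is a literal in every triple."""
    always_literal = {}
    for _, predicate, obj in triples:
        is_lit = isinstance(obj, Lit)
        # a single non-literal object disqualifies the predicate for good
        always_literal[predicate] = always_literal.get(predicate, True) and is_lit
    return {p for p, ok in always_literal.items() if ok}


# toy data, not taken from oa.jsonld
triples = [
    ("ex:a", "rdfs:comment", Lit("first comment")),
    ("ex:b", "rdfs:comment", Lit("second comment")),
    ("ex:a", "rdfs:seeAlso", Lit("see the manual")),
    ("ex:b", "rdfs:seeAlso", "ex:a"),  # URI object: disqualifies rdfs:seeAlso
    ("ex:a", "rdf:type", "ex:Thing"),
]
print(literal_only_predicates(triples))
```

Only `rdfs:comment` qualifies here, because `rdfs:seeAlso` points at a URI in one of its triples.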

In [None]:
graph = lw.graph
pma = PredicateMultiselectApp(
    graph=graph, namespaces={URIRef("http://www.w3.org/"): "w3"}
)
pma

## Collapsing of Graph

### Step 1. We start by getting all the predicates from the `PredicateMultiselectApp` above. For this example we are taking all the predicates where the object is always a literal (these can be selected by clicking the button in the above application).

For the sake of the example, we will trigger the button from code so that
`pma.multiselect.selected_things_list` is populated.

In [None]:
b = ""
pma.populate_predicates(b)

In [None]:
predicates_to_collapse = [_.uri for _ in pma.multiselect.selected_things_list]

### Step 2. Now we can call the `ipyradiant` function `collapse_predicates`, which collapses the data in the graph based on the specified `predicates_to_collapse`.

In [None]:
# for the `collapse_predicates` function, we will use the same namespaces as we did with the `PredicateMultiselectApp`
netx_graph = collapse_predicates(
    graph, predicates_to_collapse, namespaces=pma.namespaces
)

## Evidence of Correct Operation

Here we can see what the first node in the graph looks like now. We can see that
`rdfs:comment`, a previous predicate, has been collapsed onto the node and the
corresponding object, "A object of the relationship is a copy of the Source resource's
representation, appropriate for the Annotation." is attached to it.

In [None]:
netx_graph.nodes[rdflib.term.URIRef("http://www.w3.org/ns/oa#cachedSource")]

With this collapsing algorithm, if an object is a `URIRef` that is also the subject of
another triple in the graph, its node is kept. The data is still copied onto the
subject node if the user chooses to collapse that predicate. If the object is a literal,
or a `URIRef` that is not the subject of any triple, its node disappears from the graph:
the predicate is collapsed onto the subject and the value is stored on the subject node.
Examples of these behaviors are seen below:
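The rule described above can be approximated in a few lines of plain Python. This is a simplified sketch under stated assumptions, not the code behind `collapse_predicates`: triples are tuples of strings, the result is a pair of plain containers instead of a `networkx` graph, and the function name `collapse` and the toy triples are hypothetical.

```python
def collapse(triples, predicates_to_collapse):
    """Return (nodes, edges) for a simple property graph."""
    nodes = {}  # node id -> dict of attributes
    edges = []  # (subject, object, predicate)
    for s, p, o in triples:
        attrs = nodes.setdefault(s, {})  # every subject becomes a node
        if p in predicates_to_collapse:
            # collapsed: store the object as data on the subject, no edge
            attrs.setdefault(p, []).append(o)
        else:
            # kept: the object becomes a node linked by an edge
            nodes.setdefault(o, {})
            edges.append((s, o, p))
    return nodes, edges


# toy triples; the last one is a hypothetical triple that makes "oa:" a subject
triples = [
    ("oa:Motivation", "rdfs:subClassOf", "skos:Concept"),
    ("oa:cachedSource", "rdfs:isDefinedBy", "oa:"),
    ("oa:cachedSource", "rdfs:comment", "toy comment"),
    ("oa:", "rdf:type", "owl:Ontology"),
]
nodes, edges = collapse(
    triples, {"rdfs:subClassOf", "rdfs:isDefinedBy", "rdfs:comment"}
)
print("skos:Concept" in nodes)  # object that is never a subject
print("oa:" in nodes)  # object that is also a subject
print(nodes["oa:cachedSource"])  # collapsed data stored on the subject
```

Because every subject is added as a node regardless of which predicates are collapsed, an object that is also a subject survives without any special-casing, while objects reached only through collapsed predicates never become nodes.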

We will focus on two triples. The first is
`(rdflib.term.URIRef('http://www.w3.org/ns/oa#Motivation'), rdflib.term.URIRef('http://www.w3.org/2000/01/rdf-schema#subClassOf'), rdflib.term.URIRef('http://www.w3.org/2004/02/skos/core#Concept'))`.
Since this object is not a subject, when we collapse this predicate we can expect the
object node `rdflib.term.URIRef('http://www.w3.org/2004/02/skos/core#Concept')` to
disappear.

In [None]:
example_1 = collapse_predicates(
    graph,
    predicates_to_collapse=[
        rdflib.term.URIRef("http://www.w3.org/2000/01/rdf-schema#subClassOf")
    ],
    namespaces=pma.namespaces,
)
# we can see the data on the node
print(example_1.nodes[rdflib.term.URIRef("http://www.w3.org/ns/oa#Motivation")])
# we can see the object node does not exist anymore
try:
    print(
        example_1.nodes[
            rdflib.term.URIRef("http://www.w3.org/2004/02/skos/core#Concept")
        ]
    )
except KeyError:
    # looking up a node that is not in the graph raises KeyError
    print("The node has been removed from the graph.")

The second triple to focus on is
`(rdflib.term.URIRef('http://www.w3.org/ns/oa#cachedSource'), rdflib.term.URIRef('http://www.w3.org/2000/01/rdf-schema#isDefinedBy'), rdflib.term.URIRef('http://www.w3.org/ns/oa#'))`.
Since this object is also a subject, when we collapse the predicate we can expect the
node `rdflib.term.URIRef('http://www.w3.org/ns/oa#')` to still be in the graph.

In [None]:
example_2 = collapse_predicates(
    graph,
    predicates_to_collapse=[
        rdflib.term.URIRef("http://www.w3.org/2000/01/rdf-schema#isDefinedBy")
    ],
    namespaces=pma.namespaces,
)
# we can see the data on the node
print(example_2.nodes[rdflib.term.URIRef("http://www.w3.org/ns/oa#cachedSource")])
# we can see the object node is still in the graph
try:
    # the lookup raises KeyError if the node is not in the graph
    example_2.nodes[rdflib.term.URIRef("http://www.w3.org/ns/oa#")]
    print("The node still exists.")
except KeyError:
    print("The node has been removed from the graph.")