# Querying

This notebook demonstrates how to retrieve, search, and download resources with the Forge, and how to traverse the graph with SPARQL.

In [1]:
from kgforge.core import KnowledgeGraphForge

In [2]:
import getpass

In [3]:
token = getpass.getpass()

In [4]:
forge = KnowledgeGraphForge("../../configurations/demo-forge-nexus.yml", token=token)

## Imports

In [5]:
from kgforge.core import Resource

In [6]:
import json

In [7]:
def pp(x):
    print(json.dumps(x, indent=4))

## Retrieval

In [12]:
jane = Resource(type="Person", name="Jane Doe", email="jane.doe@example.org")

In [13]:
forge.register(jane)

<action> _register_one
<succeeded> True


In [14]:
resource = forge.retrieve(jane.id)

In [15]:
resource == jane

True

### specific version

In [16]:
jane = Resource(type="Person", name="Jane Doe")

In [17]:
forge.register(jane)

<action> _register_one
<succeeded> True


In [18]:
forge.tag(jane, "v1")

<action> _tag_one
<succeeded> True


In [19]:
jane.email = "jane.doe@epfl.ch"

In [20]:
forge.update(jane)

<action> _update_one
<succeeded> True


In [21]:
jane._store_metadata._rev

3

In [22]:
jane_v1 = forge.retrieve(jane.id, version=1)

In [23]:
jane_v1_tag = forge.retrieve(jane.id, version="v1")

In [24]:
jane_v1 == jane_v1_tag

True
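
The outputs above are consistent with each of the three actions (register, tag, update) producing a revision, and with a tag being a name that points at one revision. The following is a minimal in-memory sketch of that reading. `VersionedRecordSketch` is a hypothetical class written for illustration; it is not part of kgforge and does not talk to a store.

```python
from typing import Any, Dict, List, Union


class VersionedRecordSketch:
    """Hypothetical in-memory model of one resource's revision history."""

    def __init__(self, payload: Dict[str, Any]):
        self._revisions: List[Dict[str, Any]] = [dict(payload)]  # revision 1
        self._tags: Dict[str, int] = {}

    @property
    def rev(self) -> int:
        return len(self._revisions)

    def tag(self, name: str) -> None:
        # Assumption: tagging names the current revision and itself adds a revision.
        self._tags[name] = self.rev
        self._revisions.append(dict(self._revisions[-1]))

    def update(self, payload: Dict[str, Any]) -> None:
        self._revisions.append(dict(payload))

    def retrieve(self, version: Union[int, str, None] = None) -> Dict[str, Any]:
        if version is None:
            return self._revisions[-1]
        # A string is resolved as a tag, an integer as a revision number.
        rev = self._tags[version] if isinstance(version, str) else version
        return self._revisions[rev - 1]


record = VersionedRecordSketch({"type": "Person", "name": "Jane Doe"})
record.tag("v1")
record.update({"type": "Person", "name": "Jane Doe", "email": "jane.doe@epfl.ch"})
print(record.rev)
print(record.retrieve(version=1) == record.retrieve(version="v1"))
```

In this model, `version=1` and `version="v1"` resolve to the same payload, which has no `email` because the tag was set before the update.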

In [25]:
# print(jane_v1)
pp(forge.as_json(jane_v1))

{
    "id": "https://bbp.epfl.ch/dke/kgforge_tests/74ed38f7-7172-4e38-a71d-91c26011b405",
    "type": "Person",
    "name": "Jane Doe"
}


### error handling

In [26]:
resource = forge.retrieve("123")

<action> retrieve
<error> RetrievalError: not found



In [27]:
print(resource)

None


## Searching

In [28]:
jane = Resource(type="Person", name="Jane Doe")

In [29]:
john = Resource(type="Person", name="John Smith")

In [30]:
association_jane = Resource(type="Association", agent=jane)

In [31]:
association_john = Resource(type="Association", agent=john)

In [32]:
associations = [association_jane, association_john]

In [33]:
forge.register(associations)

<count> 2
<action> _register_many
<succeeded> True


`Association` is a type known to the Model, and the `paths` method loads the data structure for the given type. Refer to the `11 - Modeling.ipynb` notebook to learn about Modeling and Types.

In [34]:
p = forge.paths("Association")

Autocompletion is available on `p`, and the resulting paths can be used to build a search.

In [35]:
resources = forge.search(p.type == "Association", limit=5)

In [36]:
type(resources)

list

In [37]:
len(resources)

5

In [38]:
type(resources[0])

kgforge.core.resource.Resource

In [39]:
forge.as_dataframe(resources)

Unnamed: 0,id,type,agent.type,agent.name,distribution
0,https://bbp.epfl.ch/dke/kgforge_tests/8d0bbdc6...,Association,Person,Jane Doe,"[{'type': 'DataDownload', 'contentSize': {'uni..."
1,https://bbp.epfl.ch/dke/kgforge_tests/a471332d...,Association,Person,John Smith,
2,https://bbp.epfl.ch/dke/kgforge_tests/97567895...,Association,Person,Jane Doe,
3,https://bbp.epfl.ch/dke/kgforge_tests/c7df2735...,Association,Person,Jane Doe,
4,https://bbp.epfl.ch/dke/kgforge_tests/7fae8c71...,Association,Person,John Smith,
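
The dotted column names in the table above (`agent.type`, `agent.name`) reflect nested properties being flattened, while list values such as `distribution` stay in a single cell. The sketch below reproduces that shape in plain Python. `flatten` is a hypothetical helper, not the code `as_dataframe` uses, and the identifier is made up.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys; lists and scalars are kept as single values."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{column}."))
        else:
            flat[column] = value
    return flat


row = flatten({
    "id": "https://example.org/association/1",  # hypothetical identifier
    "type": "Association",
    "agent": {"type": "Person", "name": "Jane Doe"},
})
print(list(row))  # ['id', 'type', 'agent.type', 'agent.name']
```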


In [40]:
forge.as_dataframe(resources, store_metadata=True)

Unnamed: 0,id,type,agent.type,agent.name,distribution,_constrainedBy,_createdAt,_createdBy,_deprecated,_incoming,_outgoing,_project,_rev,_self,_updatedAt,_updatedBy
0,https://bbp.epfl.ch/dke/kgforge_tests/8d0bbdc6...,Association,Person,Jane Doe,"[{'type': 'DataDownload', 'contentSize': {'uni...",https://bluebrain.github.io/nexus/schemas/unco...,2020-04-21T08:37:30.417693Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...,False,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/proje...,1,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,2020-04-21T08:37:30.417693Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...
1,https://bbp.epfl.ch/dke/kgforge_tests/a471332d...,Association,Person,John Smith,,https://bluebrain.github.io/nexus/schemas/unco...,2020-04-08T11:32:15.157150Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...,False,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/proje...,1,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,2020-04-08T11:32:15.157150Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...
2,https://bbp.epfl.ch/dke/kgforge_tests/97567895...,Association,Person,Jane Doe,,https://bluebrain.github.io/nexus/schemas/unco...,2020-04-08T11:32:15.162391Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...,False,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/proje...,1,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,2020-04-08T11:32:15.162391Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...
3,https://bbp.epfl.ch/dke/kgforge_tests/c7df2735...,Association,Person,Jane Doe,,https://bluebrain.github.io/nexus/schemas/unco...,2020-04-08T11:52:13.302810Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...,False,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/proje...,1,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,2020-04-08T11:52:13.302810Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...
4,https://bbp.epfl.ch/dke/kgforge_tests/7fae8c71...,Association,Person,John Smith,,https://bluebrain.github.io/nexus/schemas/unco...,2020-04-08T11:52:13.324540Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...,False,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,https://staging.nexus.ocp.bbp.epfl.ch/v1/proje...,1,https://staging.nexus.ocp.bbp.epfl.ch/v1/resou...,2020-04-08T11:52:13.324540Z,https://staging.nexus.ocp.bbp.epfl.ch/v1/realm...


### nested field querying

You have autocompletion on `p` and also on nested properties like `p.agent`.
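
One way to picture how an expression such as `p.agent.name == "John Smith"` can become a search filter is through attribute access and operator overloading. The sketch below is an illustration under that assumption. `PathSketch` and `FilterSketch` are hypothetical names, and unlike the real `p`, which is built from the Model, the sketch accepts any attribute name.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class FilterSketch:
    path: Tuple[str, ...]
    operator: str
    value: Any


class PathSketch:
    """Hypothetical stand-in for `p`: records attribute access as a path."""

    def __init__(self, path: Tuple[str, ...] = ()):
        self._path = path

    def __getattr__(self, name: str) -> "PathSketch":
        # Only called for attributes that do not exist, e.g. p.agent or p.agent.name.
        if name.startswith("_"):
            raise AttributeError(name)
        return PathSketch(self._path + (name,))

    def __eq__(self, value: Any) -> FilterSketch:  # type: ignore[override]
        # Comparing a path with a value yields a filter description, not a boolean.
        return FilterSketch(self._path, "==", value)


p = PathSketch()
filters = [p.type == "Association", p.agent.name == "John Smith"]
for f in filters:
    print(".".join(f.path), f.operator, repr(f.value))
```

Each comparison produces a small, inspectable object, which is what lets several of them be passed together to a search call.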

In [41]:
resources = forge.search(p.type == "Association", p.agent.name == "John Smith", limit=5)

In [42]:
len(resources)

5

In [43]:
forge.as_dataframe(resources)

Unnamed: 0,id,type,agent.type,agent.name,agent
0,https://bbp.epfl.ch/dke/kgforge_tests/ca829dc4...,Association,Person,John Smith,
1,https://bbp.epfl.ch/dke/kgforge_tests/54423100...,Association,,,"[{'type': 'Person', 'name': 'Jane Doe'}, {'typ..."
2,https://bbp.epfl.ch/dke/kgforge_tests/a471332d...,Association,Person,John Smith,
3,https://bbp.epfl.ch/dke/kgforge_tests/7fae8c71...,Association,Person,John Smith,
4,https://bbp.epfl.ch/dke/kgforge_tests/2868b2e3...,Association,Person,John Smith,


## Graph traversing

SPARQL is used to traverse the graph. Queries can be simplified: no prefixes, compacted URIs, or full URIs are required, only the terms (properties) that the Model makes available to the user. Refer to the `11 - Modeling.ipynb` notebook to learn about Templates.

In [44]:
jane = Resource(type="Person", name="Jane Doe")

In [45]:
john = Resource(type="Person", name="John Smith")

In [46]:
association = Resource(type="Association", agent=[jane, john])

In [47]:
forge.register(association)

<action> _register_one
<succeeded> True


In [48]:
forge.template("Association")

{
    id: ""
    type: Association
    agent:
    {
        type: Person
        address:
        {
            type: PostalAddress
            postalCode: ""
            streetAddress: ""
        }
        birthDate: 9999-12-31
        deathDate: 9999-12-31
        gender:
        [
            female
            male
        ]
        givenName: ""
        name: ""
    }
}


In [49]:
query = """
    SELECT ?x ?name
    WHERE {
        ?x type Association ;
           agent ?agent .
        ?agent name ?name .
    }
"""

In [50]:
resources = forge.sparql(query, limit=5)

In [51]:
type(resources)

list

In [52]:
len(resources)

5

In [53]:
type(resources[0])

kgforge.core.resource.Resource

### rewritten query display

In [54]:
resources = forge.sparql(query, debug=True)

Submitted query:
   PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
   PREFIX prov: <http://www.w3.org/ns/prov#>
   PREFIX schema: <http://schema.org/>
   
       SELECT ?x ?name
       WHERE {
           ?x rdf:type schema:Association ;
              schema:agent ?agent .
           ?agent schema:name ?name .
       }
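
The rewriting shown above can be approximated with a term lookup. The sketch below is a naive illustration, not the Forge's implementation: `CONTEXT`, `PREFIXES`, and `rewrite` are hypothetical, the mapping is copied from the submitted query above, and string literals or terms inside full IRIs are not handled.

```python
import re

# Hypothetical term-to-prefixed-name mapping, mirroring the rewritten query above.
CONTEXT = {
    "type": "rdf:type",
    "Association": "schema:Association",
    "agent": "schema:agent",
    "name": "schema:name",
}
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "schema": "http://schema.org/",
}

# A bare word that is neither a variable (?x, $x) nor part of a prefixed name or <IRI>.
TERM = re.compile(r"(?<![?$:<\w])([A-Za-z_]\w*)(?![:>\w])")


def rewrite(query: str) -> str:
    """Replace known bare terms and prepend the PREFIX lines that are needed."""
    body = TERM.sub(lambda m: CONTEXT.get(m.group(1), m.group(1)), query)
    used = [prefix for prefix in PREFIXES if re.search(rf"\b{prefix}:", body)]
    header = "\n".join(f"PREFIX {prefix}: <{PREFIXES[prefix]}>" for prefix in used)
    return f"{header}\n{body}"


query = """
    SELECT ?x ?name
    WHERE {
        ?x type Association ;
           agent ?agent .
        ?agent name ?name .
    }
"""
print(rewrite(query))
```

Words that are not in the mapping, such as `SELECT` and `WHERE`, pass through unchanged, and variables like `?name` are left alone because of the leading `?`.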



## Downloading

In [55]:
jane = Resource(type="Person", name="Jane Doe")

In [56]:
! ls -p ../../data | grep -Ev '/$'

associations.tsv
persons.csv


In [57]:
distribution = forge.attach("../../data")

In [58]:
association = Resource(type="Association", agent=jane, distribution=distribution)

In [59]:
forge.register(association)

<action> _register_one
<succeeded> True


In [60]:
forge.download(association, "distribution.contentUrl", "./downloaded/")

In [61]:
! ls ./downloaded/

associations.tsv persons.csv
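
The second argument of the `download` call above, `"distribution.contentUrl"`, is a dotted path into the resource. Since two files were downloaded, the path presumably fans out over a list of distributions. The sketch below shows how such a path could be resolved on plain dictionaries. `collect` is a hypothetical helper, not the Forge's code, and the URLs are placeholders.

```python
from typing import Any, List


def collect(node: Any, path: str) -> List[Any]:
    """Follow a dotted path through dicts, fanning out over lists."""
    if isinstance(node, list):
        return [value for item in node for value in collect(item, path)]
    if not path:
        return [node]
    if not isinstance(node, dict):
        return []
    head, _, rest = path.partition(".")
    return collect(node[head], rest) if head in node else []


association = {
    "type": "Association",
    "distribution": [
        {"type": "DataDownload", "contentUrl": "https://example.org/files/1"},  # placeholder
        {"type": "DataDownload", "contentUrl": "https://example.org/files/2"},  # placeholder
    ],
}
print(collect(association, "distribution.contentUrl"))
```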


In [62]:
! rm -R ./downloaded/