In [None]:
%env AWS_PROFILE=platform-developer

In [None]:
%%graph_notebook_config
{
    "host": <value stored in AWS Secrets Manager under 'NeptuneTest/InstanceEndpoint' in the platform account>,
    "neptune_service": "neptune-db",
    "port": 8182,
    "ssl": true,
    "proxy_port": 443,
    "proxy_host": "catalogue-graph.wellcomecollection.org",
    "auth_mode": "IAM",
    "aws_region": "eu-west-1",
    "load_from_s3_arn": ""
}  

In [None]:
%status

## Sample openCypher queries

Count all `SourceConcept` nodes:

In [None]:
%%oc
MATCH (c:SourceConcept)
RETURN count(c)

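Count `SourceConcept` nodes grouped by their source (LCSH, MeSH, Wikidata). openCypher has no explicit `GROUP BY`; the non-aggregated column in the `RETURN` clause (`c.source`) acts as the grouping key: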

In [None]:
%%oc
MATCH (c:SourceConcept)
RETURN count(c), c.source

We can do the same for `SourceLocation` and `SourceName` nodes:

In [None]:
%%oc
MATCH (l:SourceLocation)
RETURN count(l), l.source

In [None]:
%%oc
MATCH (n:SourceName)
RETURN count(n), n.source

Using openCypher queries, we can easily traverse the edges in the graph. For example, we can use this query to look up the labels of `SourceConcept` parents:

In [None]:
%%oc
MATCH (c:SourceConcept)-[:HAS_PARENT]->(p)
WHERE c.source='nlm-mesh'
RETURN c.label, p.label
LIMIT 10

We can also traverse multiple edges in one pattern using the variable-length `*` operator. For example, the query below retrieves the grandparent labels of `SourceConcept` nodes: `*2` matches paths made of exactly `2` consecutive `HAS_PARENT` edges.

In [None]:
%%oc
MATCH (c:SourceConcept)-[:HAS_PARENT*2]->(p)
WHERE c.source='nlm-mesh'
RETURN c.label, p.label
LIMIT 10
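
To make the hop-count semantics concrete, here is a minimal Python sketch that mimics the pattern on a small in-memory hierarchy. The node names and the `ancestors_at` helper are hypothetical and exist only for illustration; the real graph is queried through `%%oc` as above. The sketch returns distinct end nodes, whereas openCypher returns one row per matched path.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy standing in for HAS_PARENT edges: child -> parents.
HAS_PARENT: Dict[str, List[str]] = {
    "leaf": ["parent"],
    "parent": ["grandparent"],
    "grandparent": ["root"],
}


def ancestors_at(
    start: str, min_hops: int, max_hops: int, edges: Dict[str, List[str]]
) -> Set[str]:
    """Return nodes reachable from `start` in min_hops..max_hops parent steps."""
    found: Set[str] = set()
    frontier: Set[str] = {start}
    for hop in range(1, max_hops + 1):
        # Follow one more HAS_PARENT edge from every node reached so far.
        frontier = {p for node in frontier for p in edges.get(node, [])}
        if hop >= min_hops:
            found |= frontier
    return found


# Analogue of -[:HAS_PARENT*2]-> : exactly two hops.
grandparents = ancestors_at("leaf", 2, 2, HAS_PARENT)

# Analogue of -[:HAS_PARENT*..2]-> : one or two hops.
up_to_two = ancestors_at("leaf", 1, 2, HAS_PARENT)
```

With this toy data, `grandparents` contains only `"grandparent"`, while `up_to_two` contains both `"parent"` and `"grandparent"`, because an omitted lower bound in `*..2` defaults to one hop.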

We can count the number of links between sources via `SAME_AS` edges, grouped by the source of the node each edge starts from. This reveals a high level of Wikidata coverage for both LoC and MeSH `SourceConcept` nodes.

In [None]:
%%oc
MATCH (sc1:SourceConcept)-[:SAME_AS]->(sc2:SourceConcept)
RETURN count(sc1), sc1.source
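
Note that `count(sc1)` counts matched rows (one per `SAME_AS` edge), so a concept with several outgoing `SAME_AS` edges is counted more than once; `count(DISTINCT sc1)` would count concepts instead. The Python sketch below illustrates the difference on made-up data. The edge list, the identifiers, and the source names other than `nlm-mesh` are assumptions for illustration only, not values taken from the graph.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical SAME_AS edges as (source of sc1, id of sc1, id of sc2).
Edge = Tuple[str, str, str]
same_as_edges: List[Edge] = [
    ("lc-subjects", "lcsh-1", "wd-1"),
    ("lc-subjects", "lcsh-1", "wd-2"),  # same concept, second link
    ("nlm-mesh", "mesh-1", "wd-3"),
    ("wikidata", "wd-1", "lcsh-1"),
]


def count_rows_by_source(edges: List[Edge]) -> Counter:
    """Analogue of `RETURN count(sc1), sc1.source`: one count per edge."""
    return Counter(source for source, _, _ in edges)


def count_distinct_by_source(edges: List[Edge]) -> Counter:
    """Analogue of `RETURN count(DISTINCT sc1), sc1.source`: one per concept."""
    distinct_nodes = {(source, node_id) for source, node_id, _ in edges}
    return Counter(source for source, _ in distinct_nodes)


rows = count_rows_by_source(same_as_edges)
concepts = count_distinct_by_source(same_as_edges)
```

Here `rows["lc-subjects"]` is 2 while `concepts["lc-subjects"]` is 1, which is the gap to keep in mind when reading the edge counts as a measure of coverage.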

It is also possible to view an interactive visualisation of query results when returning everything (`*`), which can be accessed via the `Graph` tab. This can be customised with visualisation hints placed after the `%%oc` magic command: `-d` (node property to display), `-de` (edge property to display), `-l` (maximum label length) and `-g` (property to group nodes by).

In [None]:
%%oc -d label -l 20
MATCH (c:SourceConcept)-[r:NARROWER_THAN*]->(p)
WHERE c.id = 'sh00002633'
RETURN *
LIMIT 20

In [None]:
%%oc -d label -l 20
MATCH (m:SourceConcept)-[p:HAS_PARENT*]->(c:SourceConcept)
WHERE m.id = 'D012499'
RETURN *

In [None]:
%%oc -d label -l 20
MATCH (m:SourceConcept)<-[p:HAS_PARENT*]-(c:SourceConcept)
WHERE m.id = 'D012499'
RETURN *

In [None]:
%%oc -d label -l 20 -g source
MATCH (sc1:SourceConcept)-[r:SAME_AS*]->(sc2:SourceConcept)
WHERE sc1.id = 'D012499'
RETURN *

In [None]:
%%oc -d label -l 25 -g source
MATCH (sc1:SourceConcept)-[r:RELATED_TO*..2]->(sc2:SourceConcept)
WHERE sc1.id = 'sh85117296'
RETURN *

In [None]:
%%oc -d label -l 25 -g source
MATCH (sc1:SourceConcept)-[r:NARROWER_THAN*..2]->(sc2:SourceConcept)
WHERE sc1.id = 'sh85117296'
RETURN *

In [None]:
%%oc -d label -l 25 -g source
MATCH (sc1:SourceConcept)<-[r:NARROWER_THAN*..2]-(sc2:SourceConcept)
WHERE sc1.id = 'sh85117296'
RETURN *

In [None]:
%%oc -d label -l 20 -g source
MATCH (sc1:SourceConcept)-[r:SAME_AS]->(sc2:SourceConcept)-[p:HAS_PARENT]->(sc3:SourceConcept)
RETURN *
LIMIT 10

In [None]:
%%oc -d label -l 20
MATCH (sn1:SourceName)-[r:SAME_AS]->(sn2:SourceName)
WHERE sn1.id='n84804337'
RETURN *

In [None]:
%%oc -d label -l 20
MATCH (sn1:SourceName)-[r:SAME_AS]->(sn2:SourceName)
WHERE sn1.id='Q542019'
RETURN *