# GraphSparqlQAChain

Graph databases are a good fit for applications built on network-like models. To standardize the syntax and semantics of such graphs, the W3C recommends Semantic Web technologies (cf. [Semantic Web](https://www.w3.org/standards/semanticweb/)). [SPARQL](https://www.w3.org/TR/sparql11-query/) serves as the query language for these graphs, analogous to SQL for relational databases or Cypher for property graphs. This notebook demonstrates using an LLM as a natural language interface to a graph database by generating SPARQL.

Disclaimer: SPARQL query generation via LLMs is still unreliable. Be especially careful with UPDATE queries, which alter the graph.

You can run queries against several kinds of sources: files on the web, local files, SPARQL endpoints such as [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page), and [triple stores](https://www.w3.org/wiki/LargeTripleStores).
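To make the "SPARQL endpoint" option concrete, here is a minimal sketch of how a query can be sent to an endpoint over HTTP, following the SPARQL 1.1 Protocol's query-via-GET form. It is independent of `RdfGraph`; the helper name `build_sparql_request` and the `User-Agent` value are assumptions made for illustration. The request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint; swap in another endpoint as needed.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a 'query via GET' request (hypothetical helper; nothing is sent)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {
        # Ask for the standard SPARQL JSON results format.
        "Accept": "application/sparql-results+json",
        # Illustrative value; public endpoints often expect a descriptive agent.
        "User-Agent": "sparql-notebook-example/0.1",
    }
    return urllib.request.Request(url, headers=headers)


request = build_sparql_request(
    WIKIDATA_ENDPOINT,
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1",
)
print(request.full_url)

# To actually run the query (requires network access):
# import json
# with urllib.request.urlopen(request) as response:
#     bindings = json.load(response)["results"]["bindings"]
```

Keeping request construction separate from sending makes the URL easy to inspect before anything touches the network.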

In [1]:
from langchain.chains import GraphSparqlQAChain
from langchain.chat_models import ChatOpenAI
from langchain.graphs import RdfGraph

## Load some RDF data

In [2]:
graph = RdfGraph(
    source_file="http://www.w3.org/People/Berners-Lee/card",
    standard="rdf",
    local_copy="test.ttl",
)

Note that providing a `local_copy` is necessary for storing changes locally if the source is read-only.

## Refresh graph schema information
If the schema of the database changes, you can refresh the schema information needed to generate SPARQL queries.

In [3]:
graph.load_schema()

In [4]:
graph.get_schema

'In the following, each IRI is followed by the local name and optionally its description in parentheses. \nThe RDF graph supports the following node types:\n<http://xmlns.com/foaf/0.1/PersonalProfileDocument> (PersonalProfileDocument, None), <http://www.w3.org/ns/auth/cert#RSAPublicKey> (RSAPublicKey, None), <http://www.w3.org/2000/10/swap/pim/contact#Male> (Male, None), <http://xmlns.com/foaf/0.1/Person> (Person, None), <http://www.w3.org/2006/vcard/ns#Work> (Work, None)\nThe RDF graph supports the following relationships:\n<http://xmlns.com/foaf/0.1/nick> (nick, None), <http://www.w3.org/2003/01/geo/wgs84_pos#lat> (lat, None), <http://www.w3.org/2006/vcard/ns#postal-code> (postal-code, None), <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> (type, None), <http://www.w3.org/2000/10/swap/pim/contact#participant> (participant, None), <http://www.w3.org/2000/01/rdf-schema#label> (label, None), <http://xmlns.com/foaf/0.1/maker> (maker, None), <http://purl.org/dc/elements/1.1/title> (titl
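The schema is a plain string, so it can be inspected with ordinary text processing. The sketch below pulls out each IRI and its local name from entries of the form `<IRI> (local name, description)`. The helper `schema_entries` and the regular expression are assumptions for illustration, not part of LangChain; descriptions are skipped on purpose because they may themselves contain commas and parentheses.

```python
import re

# Matches "<IRI> (local_name, " and captures the IRI and the local name.
ENTRY_PATTERN = re.compile(r"<([^<>\s]+)> \(([^,()]+), ")


def schema_entries(schema: str) -> dict[str, str]:
    """Map each local name in a schema string to its IRI (hypothetical helper)."""
    return {local: iri for iri, local in ENTRY_PATTERN.findall(schema)}


# Excerpt copied from the schema output above.
excerpt = (
    "<http://xmlns.com/foaf/0.1/Person> (Person, None), "
    "<http://www.w3.org/2006/vcard/ns#Work> (Work, None)\n"
    "The RDF graph supports the following relationships:\n"
    "<http://xmlns.com/foaf/0.1/nick> (nick, None), "
    "<http://www.w3.org/2003/01/geo/wgs84_pos#lat> (lat, None)"
)

for local, iri in schema_entries(excerpt).items():
    print(f"{local:8} {iri}")
```

Local names are not unique across namespaces, so a dictionary keyed by local name keeps only the last IRI for a repeated name. Key by IRI instead if that matters.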

## Querying the graph

Now, you can use the graph SPARQL QA chain to ask questions about the graph.

In [5]:
chain = GraphSparqlQAChain.from_llm(
    ChatOpenAI(temperature=0), graph=graph, verbose=True
)

In [6]:
chain.run("What is Timothy Berners-Lee's work homepage?")

> Entering new GraphSparqlQAChain chain...
Identified intent:
SELECT
Generated SPARQL:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
    ?person foaf:name "Timothy Berners-Lee" .
    ?person foaf:workplaceHomepage ?homepage .
}
Full Context:
[(rdflib.term.URIRef('https://www.w3.org/'),)]

> Finished chain.


"Timothy Berners-Lee's work homepage is https://www.w3.org/."

## Updating the graph

Analogously, you can update the graph, i.e., insert triples, using natural language.

In [7]:
chain.run(
    "Save that the person with the name 'Timothy Berners-Lee' has a work homepage at 'http://www.w3.org/foo/bar/'"
)

> Entering new GraphSparqlQAChain chain...
Identified intent:
UPDATE
Generated SPARQL:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT {
    ?person foaf:workplaceHomepage <http://www.w3.org/foo/bar/> .
}
WHERE {
    ?person foaf:name "Timothy Berners-Lee" .
}

> Finished chain.


'Successfully inserted triples into the graph.'

Let's verify the results:

In [8]:
query = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?hp
WHERE {
    ?person foaf:name "Timothy Berners-Lee" .
    ?person foaf:workplaceHomepage ?hp .
}
"""
graph.query(query)

[(rdflib.term.URIRef('https://www.w3.org/'),),
 (rdflib.term.URIRef('http://www.w3.org/foo/bar/'),)]
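Because an UPDATE alters the graph, it can be useful to check what kind of SPARQL string you are holding before executing it yourself. The following is a heuristic sketch, not part of LangChain: the helper name `is_update` is an assumption, and it only inspects the first keyword after the prologue, using the operation names from SPARQL 1.1 Update. It is not a SPARQL parser.

```python
import re

# First keywords of SPARQL 1.1 Update operations ("WITH" can precede DELETE/INSERT).
UPDATE_KEYWORDS = {
    "INSERT", "DELETE", "WITH", "LOAD", "CLEAR",
    "CREATE", "DROP", "COPY", "MOVE", "ADD",
}

# A leading PREFIX or BASE declaration in the query prologue.
PROLOGUE = re.compile(
    r"\s*(?:PREFIX\s+[^\s:]*:\s*<[^>]*>|BASE\s*<[^>]*>)", re.IGNORECASE
)


def is_update(query: str) -> bool:
    """Heuristically report whether a SPARQL string would modify the graph."""
    # Drop comment-only lines so they cannot hide the first keyword.
    text = re.sub(r"^[ \t]*#.*$", "", query, flags=re.MULTILINE)
    # Skip over the prologue, one declaration at a time.
    while (match := PROLOGUE.match(text)) is not None:
        text = text[match.end():]
    keyword = re.match(r"\s*([A-Za-z]+)", text)
    return keyword is not None and keyword.group(1).upper() in UPDATE_KEYWORDS


# The two queries generated earlier in this notebook.
select_query = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
    ?person foaf:name "Timothy Berners-Lee" .
    ?person foaf:workplaceHomepage ?homepage .
}
"""
insert_query = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT {
    ?person foaf:workplaceHomepage <http://www.w3.org/foo/bar/> .
}
WHERE {
    ?person foaf:name "Timothy Berners-Lee" .
}
"""

print(is_update(select_query))  # False
print(is_update(insert_query))  # True
```

A check like this can gate a confirmation prompt or a dry run. Treat it as a convenience rather than a security boundary.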

## Now for an OWL ontology
Let's use data from a local file, `graph_sparql_qa_local_input.ttl`.
The TBox defines the vocabulary for describing an actor, including their age.
The ABox contains a node for Tom Cruise and a triple stating that his age is 40.
This value differs from what a vanilla LLM would return based on its training data, so an answer of 40 indicates that the chain read it from the graph.

In [9]:
graph = RdfGraph(
    source_file="graph_sparql_qa_local_input.ttl",
    standard="owl",
)

In [10]:
graph.load_schema()

In [11]:
graph.get_schema

'In the following, each IRI is followed by the local name and optionally its description in parentheses. \nThe OWL graph supports the following node types:\n<http://example.org/example/Actor> (Actor, An actor or actress is a person who acts in a dramatic production and who works in film, television, theatre, or radio in that capacity.), <http://example.org/example/AdministrativeRegion> (AdministrativeRegion, A PopulatedPlace under the jurisdiction of an administrative body. This body may administer either a whole region or one or more adjacent Settlements (town administration)), <http://example.org/example/Animal> (Animal, Kingdom of multicellular eukaryotic organisms.), <http://example.org/example/Person> (Person, Being that has certain capacities or attributes constituting personhood.), <http://example.org/example/Place> (Place, A small area known by a geographical name.)\nThe OWL graph supports the following object properties, i.e., relationships between objects:\n<http://example.or

In [12]:
chain = GraphSparqlQAChain.from_llm(
    ChatOpenAI(temperature=0), graph=graph, verbose=True
)

In [13]:
chain.run("How old is Tom Cruise?")

> Entering new GraphSparqlQAChain chain...
Identified intent:
SELECT
Generated SPARQL:
PREFIX example: <http://example.org/example/>
SELECT ?age
WHERE {
    ?person example:name "Tom Cruise" .
    ?person example:age ?age .
}
Full Context:
[(rdflib.term.Literal('40', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#integer')),)]

> Finished chain.


'Tom Cruise is 40 years old.'