# Connection setup

If you are working in a regular Jupyter notebook and have activated our `ag-tutorial` environment, you do not need to run the following cell, which installs `agraph-python`, `pandas` and `pycurl`. However, if you are in **Google Colab**, please run it.

In [None]:
%pip install agraph-python pandas pycurl

This section sets up the connection parameters. Values entered here are saved in environment variables (and thus will be used as defaults by functions like `ag_connect()`). Depending on how you connect to an AllegroGraph repository, the following variables will look slightly different. The values below are an example of how to connect to a cloud server.

In [2]:
import os

def setup_env_var(var_name, value, description):
    os.environ[var_name] = value
    print("{}: {}".format(description, value))
    
setup_env_var('AGRAPH_HOST', 'https://ag197y8xsj2epl2e.allegrograph.cloud', 'Hostname')
setup_env_var('AGRAPH_PORT', '443', 'Port')
setup_env_var('AGRAPH_USER', 'admin', 'Username')
setup_env_var('AGRAPH_PASSWORD', 'GrMEKDvQFaN2bkrHJeiCbv', 'Password')

Hostname: https://ag197y8xsj2epl2e.allegrograph.cloud
Port: 443
Username: admin
Password: GrMEKDvQFaN2bkrHJeiCbv


<a id='example1'></a>

# Example 1: Creating a repository and triple indices

## Listing catalogs

The first task is to attach to our AllegroGraph Server and open a
repository. To achieve this we build a chain of Python objects, ending in
a “connection” object that lets us manipulate triples in a specific
repository. The overall process of generating the connection object
follows this diagram:

The first example opens (or creates) a repository by building a series
of client-side objects, culminating in a “connection” object.

The connection object contains the methods that let us manipulate
triples in a specific repository.

![images/connection.svg](images/connection.svg)

Before we start, we will extract the location of the AG server from
environment variables:

In [3]:
AGRAPH_HOST = os.environ.get('AGRAPH_HOST', 'localhost')
AGRAPH_PORT = int(os.environ.get('AGRAPH_PORT', '10079'))
AGRAPH_USER = os.environ.get('AGRAPH_USER', 'test')
AGRAPH_PASSWORD = os.environ.get('AGRAPH_PASSWORD', 'xyzzy')
print(AGRAPH_HOST)

https://ag197y8xsj2epl2e.allegrograph.cloud


AllegroGraph connection functions use these environment variables as
defaults, but we will pass the values explicitly to illustrate how to
specify connection parameters in Python.

The example first connects to an AllegroGraph Server by providing the
endpoint (host IP address and port number) of an already-launched
AllegroGraph server. This creates a client-side server object, which can
access the AllegroGraph server’s list of available catalogs through the
[listCatalogs()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.sail.html#franz.openrdf.sail.allegrographserver.AllegroGraphServer.listCatalogs) method. Note that the name of the root catalog
will be represented by None:

In [4]:
from franz.openrdf.sail.allegrographserver import AllegroGraphServer

print("Connecting to AllegroGraph server --",
      "host:'%s' port:%s" % (AGRAPH_HOST, AGRAPH_PORT))
server = AllegroGraphServer(AGRAPH_HOST, AGRAPH_PORT,
                            AGRAPH_USER, AGRAPH_PASSWORD)
print("Available catalogs:")
for cat_name in server.listCatalogs():
    if cat_name is None:
        print('  - <root catalog>')
    else:
        print('  - ' + str(cat_name))
        
print(server.listCatalogs())

Connecting to AllegroGraph server -- host:'https://ag197y8xsj2epl2e.allegrograph.cloud' port:443
Available catalogs:
  - <root catalog>
  - fedshard
  - system
[None, 'fedshard', 'system']


This output says that the server has the root catalog (shown as None)
and two other catalogs, `fedshard` and `system`.

## Listing repositories

In the next part of this example, we use the [openCatalog()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.sail.html#franz.openrdf.sail.allegrographserver.AllegroGraphServer.openCatalog)
method to create a client-side catalog object. In this example we will
connect to the root catalog. When we look inside that catalog, we can
see which repositories are available:

In [5]:
catalog = server.openCatalog('')
print("Available repositories in catalog '%s':" % catalog.getName())
for repo_name in catalog.listRepositories():
    print('  - ' + repo_name)

Available repositories in catalog 'None':
  - actors
  - agraph-example
  - python-tutorial
  - test


The corresponding output lists the available repositories.

## Creating repositories

The next step is to create a client-side repository object representing
the repository we wish to open, by calling the [getRepository()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.sail.html#franz.openrdf.sail.allegrographserver.Catalog.getRepository)
method of the catalog object. We have to provide the name of the desired
repository ('python-tutorial'), and select one of four access modes:

- Repository.RENEW clears the contents of an existing repository
  before opening. If the indicated repository does not exist, it
  creates one.  
- Repository.OPEN opens an existing repository, or throws an
  exception if the repository is not found.  
- Repository.ACCESS opens an existing repository, or creates a new
  one if the repository is not found.  
- Repository.CREATE creates a new repository, or throws an
  exception if one by that name already exists.  
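
The four modes differ only in how they treat an existing or a missing repository. The following is a minimal sketch of that decision table, using a plain dictionary in place of a catalog. The function `open_repo` and the mode strings are hypothetical stand-ins for illustration; they are not part of the AllegroGraph client.

```python
# Toy model of the four access modes. The dict stands in for a catalog;
# none of these names belong to the AllegroGraph client API.
RENEW, OPEN, ACCESS, CREATE = "renew", "open", "access", "create"

def open_repo(catalog, name, mode):
    """Return the list of triples stored under `name`, honoring `mode`."""
    exists = name in catalog
    if mode == OPEN and not exists:
        raise KeyError("repository %r not found" % name)
    if mode == CREATE and exists:
        raise KeyError("repository %r already exists" % name)
    if mode == RENEW or not exists:
        catalog[name] = []  # create a new one, or clear existing contents
    return catalog[name]

catalog = {"actors": ["triple-1", "triple-2"]}
print(len(open_repo(catalog, "actors", ACCESS)))   # existing data is kept
print(len(open_repo(catalog, "actors", RENEW)))    # contents are cleared
print(len(open_repo(catalog, "scratch", ACCESS)))  # created on demand
```

Only RENEW discards data, which is why it suits a tutorial that should start from an empty repository on every run.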

In [6]:
from franz.openrdf.repository.repository import Repository

mode = Repository.RENEW
my_repository = catalog.getRepository('python-tutorial', mode)
my_repository.initialize()

<franz.openrdf.repository.repository.Repository object at 0x7fffeddf5b70>

A new or renewed repository must be initialized, using the
[initialize()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repository.Repository.initialize) method of the repository object.

## Connecting to a repository

The goal of all this object-building has been to create a client-side
connection object, whose methods let us manipulate the triples of the
repository. The repository object’s [getConnection()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repository.Repository.getConnection) method
returns this connection object.

In [7]:
conn = my_repository.getConnection()
print('Repository %s is up!' % my_repository.getDatabaseName())
print('It contains %d statement(s).' % conn.size())

Repository python-tutorial is up!
It contains 0 statement(s).


The [size()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.size) method of the connection object returns how many
triples are present. In this example, this number
will always be zero because we “renewed” the repository.

## Managing indices

Whenever you create a new repository, you should stop to consider which
kinds of triple indices you will need. This is an important efficiency
decision. AllegroGraph uses a set of sorted indices to quickly identify
a contiguous range of triples that are likely to match a specific query
pattern.

These indices are identified by names that describe their organization.
As the `listIndices()` output in this section shows, the default set of
indices on the server used here is **spogi, psogi, posgi, ospgi, gspoi,
gposi**, and **i**, where:

- **S** stands for the subject URI.  
- **P** stands for the predicate URI.  
- **O** stands for the object URI or literal.  
- **G** stands for the graph URI.  
- **I** stands for the triple identifier (its unique id number within the
  triple store).  
  

The order of the letters denotes how the index has been organized. For
instance, the **spogi** index contains all of the triples in the
store, sorted first by subject, then by predicate, then by object, and
finally by graph. The triple id number is present as a fifth column in
the index. If you know the URI of a desired resource (the *subject*
value of the query pattern), then the **spogi** index lets you quickly
locate and retrieve all triples with that subject.
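
To make the idea of a sorted index concrete, here is a small sketch that keeps a handful of triples in "spogi" order and uses binary search to find the contiguous block for one subject. The data and the helper `by_subject` are invented for illustration, and this is not how AllegroGraph stores its indices internally.

```python
from bisect import bisect_left, bisect_right

# Each row is (subject, predicate, object, graph, id), i.e. "spogi" order.
rows = [
    ("ex:bob", "ex:name", "Bob", "", 4),
    ("ex:alice", "rdf:type", "ex:Person", "", 1),
    ("ex:bob", "rdf:type", "ex:Person", "", 3),
    ("ex:alice", "ex:name", "Alice", "", 2),
]
spogi = sorted(rows)  # sort by subject, then predicate, object, graph, id

def by_subject(index, subject):
    """Return the contiguous slice of a subject-first index for `subject`."""
    # A real index would not build this list; it keeps the sketch short.
    subjects = [row[0] for row in index]
    return index[bisect_left(subjects, subject):bisect_right(subjects, subject)]

for row in by_subject(spogi, "ex:alice"):
    print(row)
```

Because the rows are sorted by subject first, all triples about one subject sit next to each other, so the lookup never has to scan the rest of the store.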

The idea is to provide your repository with the indices that your
queries will need, and to avoid maintaining indices that you will never
need.

We can use the connection object’s [listValidIndices()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.listValidIndices) method to
examine the list of all possible AllegroGraph triple indices:

In [8]:
indices = conn.listValidIndices()
print(indices)
group_size = 5
print('All valid triple indices:')
for offset in range(0, len(indices), group_size):
    group = indices[offset:offset + group_size]
    print('  ', ' '.join(group))

['spogi', 'spgoi', 'sopgi', 'sogpi', 'sgpoi', 'sgopi', 'psogi', 'psgoi', 'posgi', 'pogsi', 'pgsoi', 'pgosi', 'ospgi', 'osgpi', 'opsgi', 'opgsi', 'ogspi', 'ogpsi', 'gspoi', 'gsopi', 'gpsoi', 'gposi', 'gospi', 'gopsi', 'i']
All valid triple indices:
   spogi spgoi sopgi sogpi sgpoi
   sgopi psogi psgoi posgi pogsi
   pgsoi pgosi ospgi osgpi opsgi
   opgsi ogspi ogpsi gspoi gsopi
   gpsoi gposi gospi gopsi i


This is the list of all possible valid indices.

AllegroGraph can generate any of these indices if you need them, but it
creates only seven indices by default. We can see the current indices
by using the connection object’s [listIndices()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.listIndices) method:

In [9]:
indices = conn.listIndices()
print('Current triple indices:', ', '.join(indices))

Current triple indices: i, gposi, gspoi, ospgi, posgi, psogi, spogi


There are currently seven indices.

The indices that begin with “g” are sorted primarily by subgraph (or
“context”). If your application does not use subgraphs, you should
consider removing these indices from the repository. You don’t want to
build and maintain triple indices that your application will never use.
This wastes CPU time and disk space. The connection object has a
convenient [dropIndex()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.dropIndex) method:

In [10]:
print("Removing graph indices...")
conn.dropIndex("gospi")
conn.dropIndex("gposi")
conn.dropIndex("gspoi")
indices = conn.listIndices()
print('Current triple indices:', ', '.join(indices))

Removing graph indices...
Current triple indices: i, ospgi, posgi, psogi, spogi


With the graph indices dropped, five indices remain.

The **i** index is for deleting triples by using the triple id number.
It is also required for [free text indexing](#example12).  The
**ospgi** index is sorted primarily by object value, which makes it
possible to efficiently retrieve a range of object values from the
index. Similarly, the **posgi** index lets us quickly reach
triples that all share the same predicate, and the **psogi** index is
also sorted primarily by predicate, then by subject. We mentioned
previously that the **spogi** index speeds up the retrieval of triples
that all have the same subject URI.
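
One rough way to think about which index suits a query is to count how many of its leading sort keys the query pattern fixes. The sketch below implements that rule of thumb. The functions `bound_prefix_length` and `pick_index` are hypothetical, and this heuristic is only an illustration; it is not AllegroGraph's actual query planner.

```python
def bound_prefix_length(index_name, bound):
    """Count how many leading sort keys of the index are fixed by the query."""
    count = 0
    for letter in index_name:
        if letter not in bound:
            break
        count += 1
    return count

def pick_index(available, bound):
    """Choose the index whose leading columns cover the most bound positions."""
    return max(available, key=lambda name: bound_prefix_length(name, bound))

available = ["i", "ospgi", "posgi", "psogi", "spogi"]
print(pick_index(available, {"s"}))       # subject known
print(pick_index(available, {"p", "o"}))  # predicate and object known
print(pick_index(available, {"o"}))       # only the object known
```

A pattern that fixes only the graph would score zero against every index in this list, which illustrates why dropping all graph indices can hurt graph-oriented queries.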

As it happens, we may have been overly hasty in eliminating all of the
graph indices. AllegroGraph can find the right matches as long as there
is *any* one index present, but using the “right” index is much faster.
Let’s put one of the graph indices back, just in case we need it. We’ll
use the connection object’s [addIndex()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addIndex) method:

In [11]:
print("Adding one graph index back in...")
conn.addIndex("gspoi")
indices = conn.listIndices()
print('Current triple indices:', ', '.join(indices))

Adding one graph index back in...
Current triple indices: i, gspoi, ospgi, posgi, psogi, spogi


## Releasing resources

Both the connection and the repository object must be closed to
release resources once they are no longer needed. We can use the
[shutDown()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repository.Repository.shutDown) and [close()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.close)
methods to do this:

In [12]:
conn.close()
my_repository.shutDown()

It is safer and more convenient to ensure that the resources are
released by using the with statement:

In [13]:
with catalog.getRepository('python-tutorial', Repository.OPEN) as repo:
    # Note: an explicit call to initialize() is not required
    # when using the `with` statement.
    with repo.getConnection() as conn:
        print('Statements:', conn.size())

Statements: 0
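
The reason the `with` statement is safer can be seen with a stand-in object. The sketch below uses a hypothetical `FakeConnection` class that only records whether `close()` was called; it shows that the cleanup runs even when the body raises an exception.

```python
class FakeConnection:
    """Stand-in for a connection; it only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow the exception

fake = FakeConnection()
try:
    with fake:
        raise RuntimeError("query failed")
except RuntimeError:
    pass

print(fake.closed)  # True: close() ran despite the error
```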


## Utility functions

Creating the intermediate server, catalog and repository objects can
be tedious when the only thing required is a single connection to one
repository. In such circumstances it might be more convenient to use
the [ag_connect()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.html#franz.openrdf.connect.ag_connect) function. That is what we will do in further
examples. Here is a brief example of using [ag_connect()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.html#franz.openrdf.connect.ag_connect):

In [14]:
from franz.openrdf.connect import ag_connect

with ag_connect('python-tutorial', create=True, clear=True) as conn:
    print('Statements:', conn.size())

Statements: 0


This function takes care of creating all required objects, and the
returned context manager ensures that all necessary initialization
steps are taken and no resources are leaked. The create and
clear arguments ensure that the repository is empty and that it is
created if necessary.


<a id='example2'></a>

# Example 2: Asserting and retracting triples

In this example we show how to create resources describing two people,
Bob and Alice, by asserting triples into the repository. The example
also retracts and replaces a triple. Assertions and retractions to the
triple store are executed by [add()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.add) and [remove()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.remove) methods
belonging to the connection object, which we obtain by calling the
ag_connect() function described in [Example 1: Creating a repository and triple indices](#example1).

Before asserting a triple, we have to generate the URI values for the
subject, predicate and object fields. The AllegroGraph Python client
API predefines a number of classes and predicates for the RDF, RDFS,
XSD, and OWL ontologies. RDF.TYPE is one of the predefined
predicates we will use.

The [add()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.add) and [remove()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.remove) methods take an optional
contexts argument that specifies one or more subgraphs that are
the target of triple assertions and retractions. When the context is
omitted, triples are asserted/retracted to/from the default graph. In
the example below, facts about Alice and Bob reside in the default
graph.

The second example begins by calling ag_connect() to create the
appropriate connection object, which is bound to the variable conn.

In [15]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

The next step is to begin assembling the URIs we will need for the new
triples. The [createURI()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.createURI) method generates a URI from a string.
These are the subject URIs identifying the resources “Bob” and “Alice”:

In [16]:
alice = conn.createURI("http://example.org/people/alice")
bob = conn.createURI("http://example.org/people/bob")
print(alice)

<http://example.org/people/alice>


Bob and Alice will be members of the “person” class (rdf type
person).

In [17]:
person = conn.createURI("http://example.org/ontology/Person")

Both Bob and Alice will have a “name” attribute.

In [18]:
name = conn.createURI("http://example.org/ontology/name")

The name attributes will contain literal values. We have to generate the
[Literal](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.model.html#franz.openrdf.model.Literal) objects from strings:

In [19]:
bobsName = conn.createLiteral("Bob")
alicesName = conn.createLiteral("Alice")

The next line prints out the number of triples currently in the
repository - we expect that to be zero, since we have not yet added
any triples and the connect function should have removed any
existing statements from the repository.

In [20]:
print("Triple count before inserts:", conn.size())

Triple count before inserts: 0


Now we assert four triples, two for Bob and two more for Alice, using
the connection object’s [add()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.add) method. After
the assertions, we count triples again (there should be four) and print
out the triples for inspection.

In [21]:
from franz.openrdf.vocabulary import RDF

# alice is a person
conn.add(alice, RDF.TYPE, person)
# alice's name is "Alice"
conn.add(alice, name, alicesName)
# bob is a person
conn.add(bob, RDF.TYPE, person)
# bob's name is "Bob":
conn.add(bob, name, bobsName)

print("Triple count:", conn.size())
for s in conn.getStatements(None, name, None, None):
    print(s)

Triple count: 4
(<http://example.org/people/bob>, <http://example.org/ontology/name>, "Bob")
(<http://example.org/people/alice>, <http://example.org/ontology/name>, "Alice")


The None arguments to the [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) method say that we
don’t want to restrict what values may be present in the subject,
object or context positions. Only the predicate is fixed, so the call
returns every triple whose predicate is `name`.

The triple count confirms that the repository now holds four triples,
two about Alice and two about Bob. The listing shows only the two name
triples, one literal name for each person.

The next step is to demonstrate how to remove a triple. Use the
[remove()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.remove) method of the connection object, and supply a triple pattern
that matches the target triple. In this case we want to remove Bob’s name
triple from the repository. Then we’ll count the triples again to verify
that there are only three remaining.

In [22]:
conn.remove(bob, name, bobsName)
print("Triple count:", conn.size())

Triple count: 3


A potentially less verbose way of adding triples is to use the
[addData()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addData) method of the connection object with a string
containing triples in [Turtle](https://www.w3.org/TR/turtle/), [N-Triples](https://www.w3.org/TR/n-triples/) or another RDF format.

Let us see how the data used in this example could be added using
[addData()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addData). We will also wrap the whole process in a function
that we’ll use later:

In [23]:
def add_bob_and_alice(conn):
    """Add the four Bob and Alice triples from a Turtle string."""
    conn.addData("""
        @base <http://example.org/> .

        <people/alice> a <ontology/Person> ;
                       <ontology/name> "Alice" .
        <people/bob> a <ontology/Person> ;
                     <ontology/name> "Bob" .
    """)

The string used here is in the [Turtle](https://www.w3.org/TR/turtle/) format. It is also possible
to use other formats by passing the rdf_format argument to
[addData()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addData).
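
The Turtle string relies on `@base` to turn short relative names into full URIs. As a rough illustration of that resolution step, the sketch below uses Python's standard `urljoin()` on the same base and relative names. It only mimics the resolution for these simple cases and does not involve the AllegroGraph client or a Turtle parser.

```python
from urllib.parse import urljoin

base = "http://example.org/"

# Resolve each relative name against the base, as @base does in Turtle.
for relative in ("people/alice", "ontology/Person", "ontology/name"):
    print("<%s> -> <%s>" % (relative, urljoin(base, relative)))
```

The resolved values match the URIs created earlier with `createURI()`, which is why both approaches produce the same triples.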

We should check if the new function behaves as expected by creating a
fresh connection (recall that the clear parameter causes all existing
triples to be deleted):

In [24]:
with ag_connect('python-tutorial', clear=True) as conn:
    add_bob_and_alice(conn)
    print("Triple count:", conn.size())

Triple count: 4



<a id='example3'></a>

# Example 3: A SPARQL query

SPARQL stands for the [“SPARQL Protocol and RDF Query Language,”](http://www.w3.org/TR/rdf-sparql-query/) a recommendation of the
[World Wide Web Consortium (W3C)](http://www.w3.org/). SPARQL is a
query language for retrieving RDF triples.

Our next example illustrates how to evaluate a SPARQL query. This is
the simplest query, the one that returns all triples. Note that we
will use the same triples that were used in [Example 2: Asserting and retracting triples](#example2).

Let’s create the connection first:

In [25]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

And now we can add our data and define the query:

In [26]:
conn.addData("""
    @base <http://example.org/> .

    <people/alice> a <ontology/Person> ;
                   <ontology/name> "Alice" .
    <people/bob> a <ontology/Person> ;
                 <ontology/name> "Bob" .
""")
query_string = "SELECT ?s ?p ?o  WHERE {?s ?p ?o .}"

The SELECT clause returns the variables ?s, ?p and ?o
in the binding set. The variables are bound to the subject, predicate
and object values of each triple that satisfies the WHERE clause. In
this case the WHERE clause is unconstrained. The dot (.) in the
fourth position signifies the end of the pattern.

The connection object’s [prepareTupleQuery()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.prepareTupleQuery) method creates a
query object that can be evaluated one or more times. The results are
returned in an iterator that yields a sequence of binding sets.
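
A binding set can be pictured as a mapping from variable names to values, with one mapping per matching triple. The sketch below models the unconstrained query over an in-memory list. The function `select_all` and the sample data are illustrative assumptions; no SPARQL engine or AllegroGraph call is involved.

```python
sample_triples = [
    ("<http://example.org/people/alice>", "<http://example.org/ontology/name>", '"Alice"'),
    ("<http://example.org/people/bob>", "<http://example.org/ontology/name>", '"Bob"'),
]

def select_all(triples):
    """Yield one binding set (variable name -> value) per triple."""
    for s, p, o in triples:
        yield {"s": s, "p": p, "o": o}

for binding_set in select_all(sample_triples):
    print(binding_set["s"], binding_set["o"])
```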

In [27]:
from franz.openrdf.query.query import QueryLanguage

tuple_query = conn.prepareTupleQuery(QueryLanguage.SPARQL, query_string)
result = tuple_query.evaluate()

Below we illustrate one method for extracting the values from a
binding set, indexed by the name of the corresponding column variable
in the SELECT clause.

In [28]:
with result:
    for binding_set in result:
        s = binding_set.getValue("s")
        p = binding_set.getValue("p")
        o = binding_set.getValue("o")
        print("%s %s %s" % (s, p, o))

<http://example.org/people/alice> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/ontology/Person>
<http://example.org/people/alice> <http://example.org/ontology/name> "Alice"
<http://example.org/people/bob> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/ontology/Person>
<http://example.org/people/bob> <http://example.org/ontology/name> "Bob"


Note that we have wrapped the whole result processing in a with
statement. The reason is that result objects must be closed after
processing to release resources. The most convenient way to ensure
this is the with statement, but it is also possible to explicitly
call [close()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.queryresult.QueryResult.close)
(e.g. in a finally block).



In [29]:
conn.close()


<a id='example4'></a>

# Example 4: Statement matching

The [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) method of the connection object provides a
simple way to perform unsophisticated queries. This method lets you
enter a mix of required values and wildcards, and retrieve all
matching triples. (If you need to perform sophisticated tests and
comparisons you should use a SPARQL query instead.)
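
The matching rule itself is simple: a position either holds a required value or is a wildcard. Here is a minimal in-memory sketch of that rule, with None as the wildcard. The function `match` and the short string values are hypothetical; the real method runs on the server and returns Statement objects.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every non-None position."""
    pattern = (s, p, o)
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]

data = [
    ("alice", "type", "Person"),
    ("alice", "name", "Alice"),
    ("bob", "type", "Person"),
    ("bob", "name", "Bob"),
]
print(match(data, s="alice"))  # everything about alice
print(match(data, p="name"))   # every name triple
```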

Below, we illustrate two kinds of [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) calls. The
first mimics traditional RDF4J syntax, and returns a [Statement](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.model.html#franz.openrdf.model.Statement)
object at each iteration. We will reuse data from previous
examples to create a connection object and populate the repository
with four triples describing Bob and Alice. We’re going to search for
triples that mention Alice, so we have to create an “Alice” URI to use
in the search pattern:

In [30]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

In [31]:
conn.addData("""
    @base <http://example.org/> .

    <people/alice> a <ontology/Person> ;
                   <ontology/name> "Alice" .
    <people/bob> a <ontology/Person> ;
                 <ontology/name> "Bob" .
""")
alice = conn.createURI("http://example.org/people/alice")

Now we search for triples with Alice’s URI in the subject
position. The None values are wildcards for the predicate and
object positions of the triple.

In [32]:
statements = conn.getStatements(alice, None, None)

The [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) method returns a RepositoryResult object
(bound to the variable statements in this case). This object can
be iterated over, exposing one result statement at a time. It is
sometimes desirable to screen the results for duplicates, using the
[enableDuplicateFilter()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.repositoryresult.RepositoryResult.enableDuplicateFilter) method. Note, however, that duplicate
filtering can be expensive. Our example does not contain any
duplicates, but it is possible for them to occur.
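
One way to see why duplicate filtering can be expensive is that the filter has to remember every distinct statement it has already produced. The generator below is a sketch of that idea over plain tuples. The name `without_duplicates` is hypothetical, and this is not necessarily how the client or server implements the filter.

```python
def without_duplicates(statements):
    """Yield each statement once, in first-seen order."""
    seen = set()  # grows with the number of distinct statements
    for statement in statements:
        if statement not in seen:
            seen.add(statement)
            yield statement

rows = [
    ("alice", "name", "Alice"),
    ("alice", "name", "Alice"),
    ("bob", "name", "Bob"),
]
print(list(without_duplicates(rows)))
```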

In [33]:
with statements:
    statements.enableDuplicateFilter()
    for statement in statements:
        print(statement)

(<http://example.org/people/alice>, <http://example.org/ontology/name>, "Alice")
(<http://example.org/people/alice>, <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, <http://example.org/ontology/Person>)


This prints out the two matching triples for “Alice.”

Notice how we used the with keyword to ensure that the
RepositoryResult object is closed after the results are
fetched. This is necessary to release resources used during result
retrieval. The same goal could be accomplished by calling the
[RepositoryResult.close()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.repositoryresult.RepositoryResult.close) method (preferably in a finally
block to ensure exception safety).




<a id='example5'></a>

# Example 5: Literal values

The next example illustrates some variations on what we have seen so
far. The example creates and asserts plain, data-typed, and
language-tagged literals, and then conducts searches for them in three
ways:

- [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) search, which is an efficient way to match a
  single triple pattern.  
- SPARQL direct match, for efficient multi-pattern search.  
- SPARQL filter match, for sophisticated filtering such as
  performing range matches.  
  

The [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) and SPARQL direct searches return exactly
the datatype you ask for. The SPARQL filter queries can sometimes
return multiple datatypes. This behavior will be one focus of this
section.

If you are not explicit about the datatype of a value, either when
asserting the triple or when writing a search pattern, AllegroGraph
will deduce an appropriate datatype and use it. This is another focus
of this section. This helpful behavior can sometimes surprise you with
unanticipated results.

## Setup

We begin by obtaining a connection object and removing all existing data
from the repository:

In [2]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

For the sake of coding efficiency, it is good practice to create variables
for namespace strings. We’ll use this namespace again and again in the
following example. We have made the URIs in this example very short to
keep the result displays compact.

In [3]:
exns = "ex://"
conn.setNamespace('ex', exns)

Namespace handling, including the [setNamespace()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.setNamespace) method, is
described in [Example 11: Namespaces](#example11).

The example will use an artificial data set consisting of eight
statements, each illustrating a different kind of literal. The subject
will describe the nature of the literal used as the object, while the
predicate will always be `<ex://p>`. The example shows how to enter
a full URI string, or alternately how to combine a namespace with a
local resource name.
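
Combining a namespace with a local name is plain string concatenation, so both ways of writing a URI yield the same value. The sketch below shows this with a small prefix table. The helper `expand` and the `prefix:local` notation are assumptions made for illustration and are not part of the client API.

```python
namespaces = {"ex": "ex://"}

def expand(qname, namespaces):
    """Turn 'prefix:local' into a full URI string using the prefix table."""
    prefix, _, local = qname.partition(":")
    return namespaces[prefix] + local

print(expand("ex:long", namespaces))
print(expand("ex:long", namespaces) == "ex://" + "long")
```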

In [4]:
ex_integer = conn.createURI("ex://integer")
ex_double = conn.createURI("ex://double")
ex_int = conn.createURI("ex://int")
ex_long = conn.createURI(
    namespace=exns, localname="long")
ex_float = conn.createURI(
    namespace=exns, localname="float")
ex_decimal = conn.createURI(
    namespace=exns, localname="decimal")
ex_string = conn.createURI(
    namespace=exns, localname="string")
ex_plain = conn.createURI(
    namespace=exns, localname="plain")

The predicate for all our statements will be the same.

In [5]:
pred = conn.createURI(namespace=exns, localname="p")

Now we construct the objects, illustrating various kinds of RDF
literals.

In [6]:
from franz.openrdf.vocabulary.xmlschema import XMLSchema

# Type will be XMLSchema.INTEGER
forty_two = conn.createLiteral(42)
# Type will be XMLSchema.DOUBLE
forty_two_double = conn.createLiteral(42.0)
forty_two_int = conn.createLiteral(
    '42', datatype=XMLSchema.INT)
forty_two_long = conn.createLiteral(
    '42', datatype=XMLSchema.LONG)
forty_two_float = conn.createLiteral(
    '42', datatype=XMLSchema.FLOAT)
forty_two_decimal = conn.createLiteral(
    '42', datatype=XMLSchema.DECIMAL)
forty_two_string = conn.createLiteral(
    '42', datatype=XMLSchema.STRING)
# Creates a plain (untyped) literal.
forty_two_plain = conn.createLiteral('42')

In five of these statements, we explicitly identified the datatype of
the value in order to create an INT, a LONG, a FLOAT, a DECIMAL and a
STRING. This is the best practice.

In three other statements, we just handed AllegroGraph numeric-looking
values to see what it would do with them. As we will see in a moment,
42 creates an INTEGER, 42.0 becomes a DOUBLE, and '42'
becomes a “plain” (untyped) literal value.

Note that plain literals are not *quite* the same thing
as typed literal strings. A search for a plain literal
will not always match a typed string, and *vice versa*.
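The defaults described above can be summarized in a small stand-alone sketch. It does not call AllegroGraph: `inferred_datatype` is a hypothetical helper that only mirrors the mapping this tutorial reports for `createLiteral()` when no `datatype` is given (the boolean case is the one shown later in the Booleans section).

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

def inferred_datatype(value):
    """Hypothetical helper: datatype reported in this tutorial for an untyped Python value."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return XSD + "boolean"
    if isinstance(value, int):
        return XSD + "integer"
    if isinstance(value, float):
        return XSD + "double"
    if isinstance(value, str):
        return None  # plain (untyped) literal
    raise TypeError("not covered by this sketch: {!r}".format(type(value)))

for sample in (42, 42.0, "42", True):
    print(repr(sample), "->", inferred_datatype(sample))
```

Passing an explicit `datatype`, as recommended above, removes any dependence on this inference.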

Now we will assemble the URIs and values into [statements](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.model.html#franz.openrdf.model.Statement) (which are client-side triples):

In [7]:
stmt1 = conn.createStatement(ex_integer, pred, forty_two)
stmt2 = conn.createStatement(ex_double, pred, forty_two_double)
stmt3 = conn.createStatement(ex_int, pred, forty_two_int)
stmt4 = conn.createStatement(ex_long, pred, forty_two_long)
stmt5 = conn.createStatement(ex_float, pred, forty_two_float)
stmt6 = conn.createStatement(ex_decimal, pred, forty_two_decimal)
stmt7 = conn.createStatement(ex_string, pred, forty_two_string)
stmt8 = conn.createStatement(ex_plain, pred, forty_two_plain)

And then add the statements to the triple store on the AllegroGraph
server. We can use either [add()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.add) or [addStatement()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addStatement) for this
purpose.

In [8]:
conn.add(stmt1)
conn.add(stmt2)
conn.add(stmt3)
conn.addStatement(stmt4)
conn.addStatement(stmt5)
conn.addStatement(stmt6)
conn.addStatement(stmt7)
conn.addStatement(stmt8)

Now we’ll complete the round trip to see what triples we get back from
these assertions. Here we use [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) to
retrieve and display the triples:

In [9]:
print("Showing all triples using getStatements(). Eight matches.")
conn.getStatements(None, pred, None, output=True)

Showing all triples using getStatements(). Eight matches.
<ex://plain> <ex://p> "42" .
<ex://string> <ex://p> "42"^^<http://www.w3.org/2001/XMLSchema#string> .
<ex://decimal> <ex://p> "42.0"^^<http://www.w3.org/2001/XMLSchema#decimal> .
<ex://float> <ex://p> "4.2E1"^^<http://www.w3.org/2001/XMLSchema#float> .
<ex://long> <ex://p> "42"^^<http://www.w3.org/2001/XMLSchema#long> .
<ex://int> <ex://p> "42"^^<http://www.w3.org/2001/XMLSchema#int> .
<ex://double> <ex://p> "4.2E1"^^<http://www.w3.org/2001/XMLSchema#double> .
<ex://integer> <ex://p> "42"^^<http://www.w3.org/2001/XMLSchema#integer> .


This code prints out all triples from the store. The output
parameter causes the result to be printed on stdout (it is also
possible to pass a file name or a file-like object as the value of
this parameter to print to other destinations). Without output the
result would have been returned as a RepositoryResult object.

Note that the retrieved literals are of eight types: an int (a 32-bit
integer), an integer (arbitrary precision), a decimal, a long, a
float, a double, a string, and a “plain literal.”

If you ask for a specific datatype, you will get it. If you leave the
decision up to AllegroGraph, you might get something unexpected such as
a plain literal value.

## Numeric literal values

### Matching 42 without explicit type

This section explores [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) and SPARQL matches against
numeric triples. We ask AllegroGraph to find an untyped number,
42.

In [10]:
print('getStatements():')
conn.getStatements(None, pred, 42, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s ?p 42 .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery(
    'SELECT ?s ?p ?o WHERE {?s ?p ?o . filter (?o = 42)}',
    output=True)
print()

getStatements():
<ex://integer> <ex://p> "42"^^<http://www.w3.org/2001/XMLSchema#integer> .

SPARQL direct match
--------------
| s          |
| ex:integer |
--------------

SPARQL filter match
-----------------------------
| s          | p    | o     |
| ex:integer | ex:p | 42    |
| ex:double  | ex:p | 4.2E1 |
| ex:int     | ex:p | 42    |
| ex:long    | ex:p | 42    |
| ex:float   | ex:p | 4.2E1 |
| ex:decimal | ex:p | 42.0  |
-----------------------------



We use the executeTupleQuery() method to retrieve the result of a
SPARQL query. Like [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements), it accepts an output
parameter that causes the result to be printed (instead of being
returned as a [TupleQueryResult](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.queryresult.TupleQueryResult) object).

The [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) query returned triples containing integers
only. The SPARQL direct match treated the numeric literal as if it had
the type of <http://www.w3.org/2001/XMLSchema#integer> (see the
SPARQL [specification](https://www.w3.org/TR/sparql11-query/#QSynLiterals) for
information on how literals are parsed in queries) and returned only
triples with exactly the same type. The SPARQL filter match, however,
opened the doors to matches of multiple numeric types, and returned
integers, ints, longs, floats, doubles and decimals.
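The difference between the two kinds of match can be illustrated without a server. The sketch below is a toy model, not AllegroGraph's implementation: `Lit`, `STORE`, `exact_match` and `value_match` are hypothetical names, and numeric values are compared through `Decimal` purely for simplicity.

```python
from collections import namedtuple
from decimal import Decimal

# Toy literal: lexical form plus a short datatype name (not the franz Literal class).
Lit = namedtuple("Lit", ["lexical", "datatype"])

# The eight sample objects, using the lexical forms printed by the server above.
STORE = {
    "ex:integer": Lit("42", "integer"),
    "ex:double": Lit("4.2E1", "double"),
    "ex:int": Lit("42", "int"),
    "ex:long": Lit("42", "long"),
    "ex:float": Lit("4.2E1", "float"),
    "ex:decimal": Lit("42.0", "decimal"),
    "ex:string": Lit("42", "string"),
    "ex:plain": Lit("42", None),
}
NUMERIC = {"integer", "double", "int", "long", "float", "decimal"}

def exact_match(query):
    """Term equality: lexical form and datatype must both be identical."""
    return [s for s, lit in STORE.items() if lit == query]

def value_match(query):
    """Value equality: compare numeric values and ignore the datatype."""
    target = Decimal(query.lexical)
    return [s for s, lit in STORE.items()
            if lit.datatype in NUMERIC and Decimal(lit.lexical) == target]

print(exact_match(Lit("42", "integer")))
print(value_match(Lit("42", "integer")))
```

In this model `exact_match` plays the role of getStatements() and the SPARQL direct match, while `value_match` plays the role of the filter.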

### Matching 42.0 without explicit type

Now we will try the same queries using 42.0.

In [11]:
print('getStatements():')
conn.getStatements(None, pred, 42.0, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s <ex://p> 42.0 .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery(
    'SELECT ?s ?p ?o WHERE {?s ?p ?o . filter (?o = 42.0)}',
    output=True)
print()

getStatements():
<ex://double> <ex://p> "4.2E1"^^<http://www.w3.org/2001/XMLSchema#double> .

SPARQL direct match
--------------
| s          |
| ex:decimal |
--------------

SPARQL filter match
-----------------------------
| s          | p    | o     |
| ex:integer | ex:p | 42    |
| ex:double  | ex:p | 4.2E1 |
| ex:int     | ex:p | 42    |
| ex:long    | ex:p | 42    |
| ex:float   | ex:p | 4.2E1 |
| ex:decimal | ex:p | 42.0  |
-----------------------------



The [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) search returned a double but not the similar
float. Direct SPARQL match treated 42.0 as a decimal (in
accordance with the SPARQL specification). The filter match returned
all numeric types that were equal to 42.0.
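The parsing rule behind the direct match can be sketched in plain Python. The regular expressions below follow the numeric token shapes of the SPARQL 1.1 grammar (digits only, digits with a decimal point, digits with an exponent); `sparql_numeric_type` is a hypothetical helper written for illustration and is not part of `agraph-python`.

```python
import re

# Numeric token shapes from the SPARQL 1.1 grammar, with an optional sign.
_INTEGER = re.compile(r"[+-]?[0-9]+")
_DECIMAL = re.compile(r"[+-]?[0-9]*\.[0-9]+")
_DOUBLE = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)[eE][+-]?[0-9]+")

def sparql_numeric_type(token):
    """Hypothetical helper: datatype a SPARQL parser assigns to a bare number."""
    if _INTEGER.fullmatch(token):
        return "xsd:integer"
    if _DECIMAL.fullmatch(token):
        return "xsd:decimal"
    if _DOUBLE.fullmatch(token):
        return "xsd:double"
    raise ValueError("not a SPARQL numeric literal: {!r}".format(token))

for token in ("42", "42.0", "4.2E1"):
    print(token, "->", sparql_numeric_type(token))
```

This is why a bare 42.0 in a query matches the decimal triple, whereas the Python float 42.0 passed to getStatements() matches the double.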

### Matching “42”^^xsd:int

The next section shows the results obtained when querying for a
literal with explicitly specified type. Note that doing this with
[getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) requires passing in a [Literal](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.model.html#franz.openrdf.model.Literal) object,
not a raw value.

In [12]:
print('getStatements():')
conn.getStatements(None, pred, forty_two_int, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s ?p "42"^^xsd:int .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?p ?o WHERE {
       ?s ?p ?o .
       filter (?o = "42"^^xsd:int)
    }''',
    output=True)
print()

getStatements():
<ex://int> <ex://p> "42"^^<http://www.w3.org/2001/XMLSchema#int> .

SPARQL direct match
----------
| s      |
| ex:int |
----------

SPARQL filter match
-----------------------------
| s          | p    | o     |
| ex:integer | ex:p | 42    |
| ex:double  | ex:p | 4.2E1 |
| ex:int     | ex:p | 42    |
| ex:long    | ex:p | 42    |
| ex:float   | ex:p | 4.2E1 |
| ex:decimal | ex:p | 42.0  |
-----------------------------



Both getStatements() and the SPARQL direct match returned only the
xsd:int triple, while the filter match returned every numeric value
equal to 42.

We would get similar results when asking for any other typed literal
(forty_two_long, forty_two_float, …).

## Numeric strings and plain literals

At this point we are transitioning from tests of numeric matches to
tests of string matches, but there is a gray zone to be explored
first. What do we find if we search for strings that contain numbers?
In particular, what about “plain literal” values that are almost, but
not quite, strings?

### Matching “42” as a typed string

Let’s start with a typed string literal.

In [13]:
print('getStatements():')
conn.getStatements(None, pred, forty_two_string, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s ?p "42"^^xsd:string .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?p ?o WHERE {
       ?s ?p ?o .
       filter (?o = "42"^^xsd:string)
    }''',
    output=True)
print()

getStatements():
<ex://string> <ex://p> "42"^^<http://www.w3.org/2001/XMLSchema#string> .

SPARQL direct match
-------------
| s         |
| ex:plain  |
| ex:string |
-------------

SPARQL filter match
------------------------------------------------------------------
| s         | p    | o                                           |
| ex:string | ex:p | 42^^http://www.w3.org/2001/XMLSchema#string |
| ex:plain  | ex:p | 42                                          |
------------------------------------------------------------------



SPARQL matched both plain and typed strings, but a
[getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) search returned only typed matches. In both
cases numeric literals were ignored.

### Matching “42” as a plain literal

Next we try to match a plain (untyped) string value:

In [14]:
print('getStatements():')
conn.getStatements(None, pred, forty_two_plain, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s ?p "42" .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?p ?o WHERE {
       ?s ?p ?o .
       filter (?o = "42")
    }''',
    output=True)
print()

getStatements():
<ex://plain> <ex://p> "42" .

SPARQL direct match
-------------
| s         |
| ex:string |
| ex:plain  |
-------------

SPARQL filter match
------------------------------------------------------------------
| s         | p    | o                                           |
| ex:string | ex:p | 42^^http://www.w3.org/2001/XMLSchema#string |
| ex:plain  | ex:p | 42                                          |
------------------------------------------------------------------



The results are consistent with what we saw in the typed case.
In SPARQL both kinds of string literals were matched, while
[getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) returned only direct matches.

## Matching strings

In this section we’ll set up a variety of string triples and
experiment with matching them using [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) and SPARQL.

[Example 12: Free Text indexing](#example12) is a different topic. In this section we’re
doing simple matches of whole strings.

### Sample data

For these examples we will use a different data set.

In [15]:
name = conn.createURI('ex://name')
upper_g = conn.createLiteral('Galadriel')
lower_g = conn.createLiteral('galadriel')
typed_g = conn.createLiteral('Galadriel', XMLSchema.STRING)
lang_g = conn.createLiteral('Galadriel', language='sjn')
upper_a = conn.createLiteral('Artanis')
lower_a = conn.createLiteral('artanis')
typed_a = conn.createLiteral('Artanis', XMLSchema.STRING)
lang_a = conn.createLiteral('Artanis', language='qya')
conn.addTriple('<ex://upper_g>', name, upper_g)
conn.addTriple('<ex://lower_g>', name, lower_g)
conn.addTriple('<ex://typed_g>', name, typed_g)
conn.addTriple('<ex://lang_g>', name, lang_g)
conn.addTriple('<ex://upper_a>', name, upper_a)
conn.addTriple('<ex://lower_a>', name, lower_a)
conn.addTriple('<ex://typed_a>', name, typed_a)
conn.addTriple('<ex://lang_a>', name, lang_a)

We have two literals, each in four variants:

- Upper case (plain literal)
- Lower case (plain literal)
- Typed
- Tagged with a [BCP47](https://tools.ietf.org/html/bcp47) language tag appropriate for its language
  (Quenya or Sindarin) according to the [registry](https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry)

### Matching a plain string

We’ve seen a similar case when looking at matches for "42", but
this time we have more similar literals in the store.

In [16]:
print('getStatements():')
conn.getStatements(None, name, upper_g, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s <ex://name> "Galadriel" .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?o WHERE {
       ?s <ex://name> ?o .
       filter (?o = "Galadriel")
    }''',
    output=True)
print()

getStatements():
<ex://upper_g> <ex://name> "Galadriel" .

SPARQL direct match
--------------
| s          |
| ex:typed_g |
| ex:upper_g |
--------------

SPARQL filter match
-------------------------------------------------------------------
| s          | o                                                  |
| ex:typed_g | Galadriel^^http://www.w3.org/2001/XMLSchema#string |
| ex:upper_g | Galadriel                                          |
-------------------------------------------------------------------



We can see that the match is case-sensitive and ignores the
language-tagged literal in all cases. As usual [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements)
matches only the exact kind of literal that we’ve provided, while
SPARQL is more liberal.
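One plausible way to picture these results is that SPARQL compares plain and xsd:string literals as the same term (RDF 1.1 treats a plain literal as an xsd:string), while language-tagged literals stay distinct. The sketch below is a toy model built on that assumption; `Lit`, `NAMES`, `strict_match` and `liberal_match` are hypothetical names and nothing here contacts AllegroGraph.

```python
from collections import namedtuple

Lit = namedtuple("Lit", ["lexical", "datatype", "language"])
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# The four "Galadriel" variants from the sample data.
NAMES = {
    "ex:upper_g": Lit("Galadriel", None, None),
    "ex:lower_g": Lit("galadriel", None, None),
    "ex:typed_g": Lit("Galadriel", XSD_STRING, None),
    "ex:lang_g": Lit("Galadriel", None, "sjn"),
}

def normalized(lit):
    """Treat a plain literal as xsd:string; keep language-tagged literals distinct."""
    datatype = lit.datatype
    if datatype is None and lit.language is None:
        datatype = XSD_STRING
    return Lit(lit.lexical, datatype, lit.language)

def strict_match(query):
    """Exact term comparison, in the spirit of getStatements()."""
    return sorted(s for s, lit in NAMES.items() if lit == query)

def liberal_match(query):
    """Comparison after normalization, in the spirit of the SPARQL matches."""
    return sorted(s for s, lit in NAMES.items()
                  if normalized(lit) == normalized(query))

plain = Lit("Galadriel", None, None)
print(strict_match(plain))
print(liberal_match(plain))
```

Neither function lowercases the text, so the lower-case variant is never matched, and the language-tagged variant only matches a query that carries the same tag.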

### Matching a language-tagged string

To retrieve the language-tagged variant we can ask for it explicitly:

In [17]:
print('getStatements():')
conn.getStatements(None, name, lang_g, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s <ex://name> "Galadriel"@sjn .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?o WHERE {
       ?s <ex://name> ?o .
       filter (?o = "Galadriel"@sjn)
    }''',
    output=True)
print()

getStatements():
<ex://lang_g> <ex://name> "Galadriel"@sjn .

SPARQL direct match
-------------
| s         |
| ex:lang_g |
-------------

SPARQL filter match
-----------------------------
| s         | o             |
| ex:lang_g | Galadriel@sjn |
-----------------------------



Unsurprisingly, we get exactly what we have asked for.

You may be wondering how to perform a string match where language and
capitalization don’t matter. You can do that with a SPARQL filter
query using the str() function, which extracts the string
portion of a literal and discards the datatype or language
tag. Then the fn:lower-case() function eliminates case issues:

In [18]:
conn.executeTupleQuery('''
    SELECT ?s ?o WHERE {
       ?s <ex://name> ?o .
       filter (fn:lower-case(str(?o)) = "artanis")
    }''',
    output=True)

-----------------------------------------------------------------
| s          | o                                                |
| ex:lang_a  | Artanis@qya                                      |
| ex:typed_a | Artanis^^http://www.w3.org/2001/XMLSchema#string |
| ex:lower_a | artanis                                          |
| ex:upper_a | Artanis                                          |
-----------------------------------------------------------------


This query returns all variants of the selected literal.

Remember that the SPARQL filter queries are powerful, but they are
also the slowest queries. SPARQL direct queries and getStatements()
queries are faster.
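If the case-insensitive filter is needed for several names, the query text can be generated by a small helper. This is only a sketch: `case_insensitive_query` is a hypothetical function, it escapes nothing beyond backslashes and double quotes, and it assumes the `fn:` prefix is available on the server as in the cell above.

```python
def case_insensitive_query(predicate_uri, text):
    """Build a SPARQL filter query matching `text` regardless of case, type or language tag."""
    # Escape backslashes and double quotes so the text stays a valid SPARQL string.
    escaped = text.lower().replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?s ?o WHERE {\n"
        "  ?s <" + predicate_uri + "> ?o .\n"
        '  filter (fn:lower-case(str(?o)) = "' + escaped + '")\n'
        "}"
    )

query = case_insensitive_query("ex://name", "Artanis")
print(query)
```

The resulting string can be passed to executeTupleQuery() with output=True, exactly like the hand-written query above.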

## Booleans

Boolean values in SPARQL are represented by literals of type
<http://www.w3.org/2001/XMLSchema#boolean>. There are two ways to
create such literals in Python:

1. From corresponding Python boolean values (True and False):

In [19]:
true1 = conn.createLiteral(True)
false1 = conn.createLiteral(False)
print(type(true1))
print(type(false1))

<class 'franz.openrdf.model.literal.Literal'>
<class 'franz.openrdf.model.literal.Literal'>


2. By creating a typed literal with the value of "true" or
"false". The type must be xsd:boolean:

In [20]:
true2 = conn.createLiteral("true", datatype=XMLSchema.BOOLEAN)
false2 = conn.createLiteral("false", datatype=XMLSchema.BOOLEAN)



Both ways of creating boolean literals produce equivalent results:

In [21]:
print(true1)
print(true2)

"true"^^<http://www.w3.org/2001/XMLSchema#boolean>
"true"^^<http://www.w3.org/2001/XMLSchema#boolean>


As we can see the literals are identical.

Let us add some boolean data to the store:

In [22]:
conn.addData("""
    <ex://f> <ex://p>
        "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
    # In Turtle 'true' is the same as '"true"^^xsd:boolean'
    <ex://t> <ex://p> true .
""")

When querying for boolean values using SPARQL one can use the literals
true and false as a shorthand for
"true"^^<http://www.w3.org/2001/XMLSchema#boolean> and
"false"^^<http://www.w3.org/2001/XMLSchema#boolean>. The code
below illustrates various ways of querying for boolean values:

In [23]:
print('getStatements():')
conn.getStatements(None, None, true1, output=True)
print()

print('SPARQL direct match (true)')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s ?p true.}',
    output=True)
print()

print('SPARQL direct match ("false"^^xsd:boolean)')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s ?p "false"^^xsd:boolean .}',
    output=True)
print()

print('SPARQL filter match ("false"^^xsd:boolean)')
conn.executeTupleQuery('''
    SELECT ?s ?o WHERE {
       ?s ?p ?o .
       filter (?o = "false"^^xsd:boolean)
    }''',
    output=True)
print()

getStatements():
<ex://t> <ex://p> "true"^^<http://www.w3.org/2001/XMLSchema#boolean> .

SPARQL direct match (true)
--------
| s    |
| ex:t |
--------

SPARQL direct match ("false"^^xsd:boolean)
--------
| s    |
| ex:f |
--------

SPARQL filter match ("false"^^xsd:boolean)
----------------
| s    | o     |
| ex:f | false |
----------------




## Dates and times

SPARQL represents dates and times using three literal types:
xsd:date, xsd:time and xsd:dateTime. These can be created
either explicitly from strings in the [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) format or from
Python datetime.date, datetime.time and datetime.datetime
values.

Let’s create a few sample literals:

In [24]:
from datetime import date, time, datetime
import iso8601

d = conn.createLiteral(date(1944, 8, 1))
t = conn.createLiteral(time(15, 0, 0))
dt = conn.createLiteral('1944-08-01T17:00:00+02:00',
                        datatype=XMLSchema.DATETIME)

Creating time and datetime literals from Python values can
yield somewhat unexpected results if time zones are involved:

In [25]:
surprise = conn.createLiteral(iso8601.parse_date(
    '1944-08-01T17:00:00+02:00'))
# Should be the same...
print(dt)
print(surprise)

"1944-08-01T17:00:00+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>
"1944-08-01T15:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>


The time has been converted to UTC. While both dt and surprise
refer to the same moment in time, this conversion might still lead to
problems if the user is not aware that it takes place.
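The same conversion can be reproduced with the standard library alone, which makes it easy to check what a Python value will look like once it is expressed in UTC. This sketch uses `datetime.fromisoformat` instead of the `iso8601` package used above and does not contact AllegroGraph.

```python
from datetime import datetime, timezone

local = datetime.fromisoformat("1944-08-01T17:00:00+02:00")
as_utc = local.astimezone(timezone.utc)

print(local.isoformat())   # keeps the +02:00 offset
print(as_utc.isoformat())  # same instant, expressed in UTC
print(local == as_utc)     # aware datetimes compare by instant, not by text
```

The two values compare equal as Python objects even though their textual forms differ, and it is the textual form that ends up in the literal.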

We will now add the newly created literals to the store:

In [26]:
conn.addTriple('<ex://d>', '<ex://p>', d)
conn.addTriple('<ex://t>', '<ex://p>', t)
conn.addTriple('<ex://dt>', '<ex://p>', dt)

The following sections illustrate how date and time values behave
during queries.

### Matching dates

Let’s try the usual mix of query methods and see what is returned:

In [27]:
print('getStatements():')
conn.getStatements(None, None, d, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s ?p "1944-08-01"^^xsd:date .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?o WHERE {
       ?s ?p ?o .
       filter (?o = "1944-08-01"^^xsd:date)
    }''',
    output=True)
print()

getStatements():
<ex://d> <ex://p> "1944-08-01"^^<http://www.w3.org/2001/XMLSchema#date> .

SPARQL direct match
--------
| s    |
| ex:d |
--------

SPARQL filter match
---------------------
| s    | o          |
| ex:d | 1944-08-01 |
---------------------



The result is not surprising. It is worth noting that the datetime
value has not been returned, even though it refers to the same date.

### Matching times

Times can be queried in a similar fashion.

In [28]:
print('getStatements():')
conn.getStatements(None, None, t, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery(
    'SELECT ?s WHERE {?s ?p "15:00:00Z"^^xsd:time .}',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?o WHERE {
       ?s ?p ?o .
       filter (?o = "15:00:00Z"^^xsd:time)
    }''',
    output=True)
print()

getStatements():
<ex://t> <ex://p> "15:00:00Z"^^<http://www.w3.org/2001/XMLSchema#time> .

SPARQL direct match
--------
| s    |
| ex:t |
--------

SPARQL filter match
--------------------
| s    | o         |
| ex:t | 15:00:00Z |
--------------------



Again, only the value of the appropriate type is returned.

### Matching datetimes

Datetimes work just like times and dates:

In [29]:
print('getStatements():')
conn.getStatements(None, None, dt, output=True)
print()

print('SPARQL direct match')
conn.executeTupleQuery('''
    SELECT ?s WHERE {
       ?s ?p "1944-08-01T17:00:00+02:00"^^xsd:dateTime .
    }''',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?o WHERE {
       ?s ?p ?o .
       filter (?o = "1944-08-01T17:00:00+02:00"^^xsd:dateTime)
    }''',
    output=True)
print()

getStatements():
<ex://dt> <ex://p> "1944-08-01T17:00:00+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> .

SPARQL direct match
---------
| s     |
| ex:dt |
---------

SPARQL filter match
-------------------------------------
| s     | o                         |
| ex:dt | 1944-08-01T17:00:00+02:00 |
-------------------------------------




### Matching datetimes with offsets

We saw that times created from Python values are converted to UTC. So
what happens when we query for Zulu time, while the value in the store
is still in CEST? An error is expected for the getStatements() query,
so we wrap that call in a try block.


In [31]:
try:
    zulu = conn.createLiteral("1944-08-01T15:00:00Z",
                              datatype=XMLSchema.DATETIME)
    print('getStatements():')
    conn.getStatements(None, None, zulu, output=True)
    print()
except Exception:
    # The exact-match lookup is expected to raise here; report it and carry on.
    print('No response')

print('SPARQL direct match')
conn.executeTupleQuery('''
    SELECT ?s WHERE {
       ?s ?p "1944-08-01T15:00:00Z"^^xsd:dateTime .
    }''',
    output=True)
print()

print('SPARQL filter match')
conn.executeTupleQuery('''
    SELECT ?s ?o WHERE {
       ?s ?p ?o .
       filter (?o = "1944-08-01T15:00:00Z"^^xsd:dateTime)
    }''',
    output=True)
print()

getStatements():
204 
No response
SPARQL direct match
---------
| s     |
| ex:dt |
---------

SPARQL filter match
-------------------------------------
| s     | o                         |
| ex:dt | 1944-08-01T17:00:00+02:00 |
-------------------------------------



AllegroGraph still finds our value when using SPARQL.

When evaluating SPARQL queries AllegroGraph treats datetime
objects that refer to the same point in time as equivalent, regardless
of the timezone used in their representation. [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements)
performs exact matching, so it will not return a value with a different
timezone.




<a id='example6'></a>

# Example 6: Importing triples

AllegroGraph can import files in multiple RDF [formats](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.rio.html#franz.openrdf.rio.rdfformat.RDFFormat), such as [Turtle](https://www.w3.org/TR/turtle/) or [N-Triples](https://www.w3.org/TR/n-triples/). The example below
calls the connection object’s [add()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.add) method to load an N-Triples
file, and [addFile()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addFile) to load an RDF/XML file. Both methods work,
but the best practice is to use [addFile()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addFile).

The RDF/XML file contains a short
list of v-cards (virtual business cards).
Save this file in ./data/vcards.rdf (or choose another path
and adjust the code below).

The N-Triples file contains
a graph of resources describing the Kennedy family, the places where
they were each born, their colleges, and their professions.
Save the file to ./data/kennedy.ntriples.

Note that AllegroGraph can segregate triples into contexts (subgraphs)
by treating them as quads, but the N-Triples and RDF/XML formats
cannot include context information (unlike e.g. [N-Quads](https://www.w3.org/TR/n-quads/) or
[Trig](https://www.w3.org/TR/trig/)). They deal with triples only, so there is no place to store a
fourth field in those formats. In the case of the [add()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.add) call, we
have omitted the context argument so the triples are loaded into the
default graph (sometimes called the “null context”). The
[addFile()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addFile) call includes an explicit context setting, so the
fourth field of each VCard triple will be the context named
http://example.org#vcards. The connection [size()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.size) method
takes an optional context argument. With no argument, it returns the
total number of triples in the repository. Below, it returns the
number 16 for the context argument, and the number
1214 for the default graph (selected by passing 'null').
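To make the idea of a fourth field concrete, here is a toy model of quads in plain Python. The data is invented for illustration, `None` stands in for the default graph, and `size_by_graph` is a hypothetical helper rather than anything provided by `agraph-python`.

```python
from collections import Counter

# Toy quads: (subject, predicate, object, graph). None stands for the default graph.
QUADS = [
    ("ex:card1", "ex:name", "A", "http://example.org#vcards"),
    ("ex:card2", "ex:name", "B", "http://example.org#vcards"),
    ("ex:person1", "ex:born", "X", None),
]

def size_by_graph(quads):
    """Count how many triples live in each graph."""
    return Counter(graph for _s, _p, _o, graph in quads)

counts = size_by_graph(QUADS)
print(counts["http://example.org#vcards"])  # triples in the named context
print(counts[None])                         # triples in the default graph
print(sum(counts.values()))                 # every triple, whatever its graph
```

The three printed numbers correspond to the three ways of asking for a size described above: a specific context, the default graph, and no context at all.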

In [67]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

The variables path1 and path2 are bound to the RDF/XML and
N-Triples files, respectively.

In [68]:
import os.path

# We assume that our data files live in this directory.
DATA_DIR = 'data'
path1 = os.path.join(DATA_DIR, 'vcards.rdf')
path2 = os.path.join(DATA_DIR, 'kennedy.ntriples')

The triples about the VCards will be added to a specific context, so
naturally we need a URI to identify that context.

In [69]:
context = conn.createURI("http://example.org#vcards")

In the next step we use [addFile()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addFile) to load the VCard triples into
the #vcards context:

In [70]:
from franz.openrdf.rio.rdfformat import RDFFormat

conn.addFile(path1, None, format=RDFFormat.RDFXML, context=context)

Then we use [add()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.add) to load the Kennedy family tree into the
default context:

In [71]:
conn.add(path2, base=None, format=RDFFormat.NTRIPLES, contexts=None)

Now we’ll ask AllegroGraph to report on how many triples it sees in
the default context and in the #vcards context:

In [72]:
print('VCard triples (in {context}): {count}'.format(
      count=conn.size(context), context=context))

print('Kennedy triples (default graph): {count}'.format(
      count=conn.size('null')))

VCard triples (in <http://example.org#vcards>): 16
Kennedy triples (default graph): 1214



<a id='example7'></a>

# Example 7: Querying multiple contexts

The purpose of this example is to see how data imported into multiple
contexts (like that from [Example 6: Importing triples](#example6)) behaves when queried using
various methods. This example covers only the results of basic
queries. The subject is explored in more detail in [Example 10: Graphs in SPARQL](#example10).

Let us start by creating a connection:

In [73]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

and adding a few triples in the default context:

In [74]:
from franz.openrdf.query.query import QueryLanguage

conn.addData("""
   <ex://default1> <ex://p1> 1 .
   <ex://default2> <ex://p2> 2 .
   <ex://default3> <ex://p3> 3 .""")

We can add data to another context by using the optional context
parameter of [addData()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addData):

In [75]:
context = conn.createURI('ex://context')
conn.addData("""
   <ex://context1> <ex://p1> 1 .
   <ex://context2> <ex://p2> 2 .
   <ex://context3> <ex://p3> 3 .""",
   context=context)

Let’s try a [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) call first:

In [77]:
p1 = conn.createURI('ex://p1')
with conn.getStatements(None, p1, None, None) as result:
    for row in result:
        print(row.getSubject())

<ex://context1>
<ex://default1>


This loop prints out a mix of triples from the default context and
from the named context.

SPARQL queries behave in a different way. When a graph clause is
present, as in the following code, triples that are not in a named
context will not be examined:

In [78]:
query_string = """
    SELECT DISTINCT ?s WHERE {
      graph ?g { ?s ?p ?o filter(?o > 2).
    }} order by ?s"""
tuple_query = conn.prepareTupleQuery(
    QueryLanguage.SPARQL, query_string)
with tuple_query.evaluate() as result:
    for bindings in result:
        print(bindings[0])

<ex://context3>


Only the context3 triple is printed.

What happens if we issue a trivial query without mentioning graph?

In [79]:
query_string = """
    SELECT DISTINCT ?s WHERE {
      ?s ?p ?o .
    } order by ?s"""
tuple_query = conn.prepareTupleQuery(
    QueryLanguage.SPARQL, query_string)
with tuple_query.evaluate() as result:
    for bindings in result:
        print(bindings[0])

<ex://context1>
<ex://context2>
<ex://context3>
<ex://default1>
<ex://default2>
<ex://default3>


This prints all triples, just like a [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) call.

But this behavior can be altered by setting a query option.
AllegroGraph allows such options to be set by defining
a prefix.

In [80]:
query_string = """
    PREFIX franzOption_defaultDatasetBehavior: <franz:rdf>
    SELECT DISTINCT ?s WHERE {
      ?s ?p ?o .
    } order by ?s"""
tuple_query = conn.prepareTupleQuery(
    QueryLanguage.SPARQL, query_string)
with tuple_query.evaluate() as result:
    for bindings in result:
        print(bindings[0])

<ex://default1>
<ex://default2>
<ex://default3>


Now only the default context is matched by simple patterns (i.e. ones
not wrapped in graph ?g { ... }).
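
Because the option is just a prefix declaration, it can be prepended to any query string. The helper below is a minimal sketch of that idea. The name `with_query_option` is hypothetical and not part of the AllegroGraph client; it only manipulates text, so it runs without a server.

```python
def with_query_option(query, option, value):
    """Prepend an AllegroGraph-style query option to a SPARQL query string.

    Hypothetical helper: it only builds text and never contacts a server.
    """
    prefix = 'PREFIX franzOption_{0}: <franz:{1}>'.format(option, value)
    return prefix + '\n' + query.strip()


base_query = """
    SELECT DISTINCT ?s WHERE {
      ?s ?p ?o .
    } order by ?s"""

# Same option name and value as in the query shown earlier in this example.
query_string = with_query_option(base_query, 'defaultDatasetBehavior', 'rdf')
print(query_string)
```

The resulting string can be passed to prepareTupleQuery() in the same way as a hand-written query.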


<a id='example8'></a>

# Example 8: Exporting triples

This example shows how to serialize contents of a repository to a
file. As usual we’ll start with obtaining a connection to the
repository:

In [81]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

Now let’s import some data:

In [78]:
conn.addData("""
  <ex://s> <ex://p1> <ex://o1> , <ex://o2> ;
           <ex://p2> <ex://o3> .""")

Data can be exported by passing a file name or a file-like object as
the output parameter of [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements). In this case we’ll
want to print all statements to standard output. We can do this by
passing True as the output file:

In [79]:
from franz.openrdf.rio.rdfformat import RDFFormat

conn.getStatements(output=True, output_format=RDFFormat.NTRIPLES)

<ex://s> <ex://p2> <ex://o3> .
<ex://s> <ex://p1> <ex://o2> .
<ex://s> <ex://p1> <ex://o1> .


We can see that results are printed in the specified format.

We can also use other arguments of [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) to constrain
the set of exported triples:

In [80]:
conn.getStatements(None, conn.createURI('ex://p1'), None,
                   output=True,
                   output_format=RDFFormat.NTRIPLES)

<ex://s> <ex://p1> <ex://o1> .
<ex://s> <ex://p1> <ex://o2> .


As expected, the result contains only two triples.

A file path can also be passed as the output argument:

In [81]:
import os
import sys

conn.getStatements(output='example8.nt')
with open('example8.nt', 'r') as f:
    sys.stdout.write(f.read())

os.remove('example8.nt')

<ex://s> <ex://p2> <ex://o3> .
<ex://s> <ex://p1> <ex://o2> .
<ex://s> <ex://p1> <ex://o1> .


This outputs data read from the file.


<a id='example9'></a>

# Example 9: Exporting query results

The [previous example](#example8) showed how to serialize
statements to a file or a stream. It is also possible to perform a
similar operation on the result of a query.

As usual, we’ll start by opening a connection:

In [82]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

and importing sample data - in this case containing birth and
(when applicable) coronation dates of the sons of Henry II.

In [83]:
conn.addData("""
  @prefix : <ex://> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

  :Henry :born "1155-02-28"^^xsd:date .
  :Richard :born "1157-09-08"^^xsd:date .
  :Geoffrey :born "1158-09-23"^^xsd:date .
  :John :born "1166-12-24"^^xsd:date .

  :Henry :crowned "1170-06-14"^^xsd:date .  # sort of...
  :Richard :crowned "1189-09-03"^^xsd:date .
  :John :crowned "1199-05-27"^^xsd:date .""")

Query results can be exported by passing a file name or a file-like
object as the output parameter of the [TupleQuery.evaluate()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.query.TupleQuery.evaluate)
method of the query object. In this case we’ll want to print all kings
born in or after 1156 from our dataset to standard output (we can use
True as the file name to indicate stdout):

In [84]:
from franz.openrdf.query.query import QueryLanguage
from franz.openrdf.rio.tupleformat import TupleFormat

query = conn.prepareTupleQuery(
    QueryLanguage.SPARQL,
    """
    select ?name ?crowned {
       ?name <ex://born> ?birth .
       ?name <ex://crowned> ?crowned .
       filter(?birth >= "1156-01-01"^^xsd:date) .
    }""")
query.evaluate(output=True,
               output_format=TupleFormat.CSV)

name,crowned
"ex://Richard","1189-09-03"
"ex://John","1199-05-27"


We can see that results are printed in the specified format.
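
To make the logic of the query easier to follow, here is a plain-Python sketch of the same selection: keep everyone born on or after the cutoff who also has a coronation date. It uses only the standard library and a hand-copied version of the sample data, so it is an illustration of the filter, not of how AllegroGraph evaluates it.

```python
from datetime import date

# Sample data copied from the addData() cell (xsd:date literals as dates).
born = {
    'Henry': date(1155, 2, 28),
    'Richard': date(1157, 9, 8),
    'Geoffrey': date(1158, 9, 23),
    'John': date(1166, 12, 24),
}
crowned = {
    'Henry': date(1170, 6, 14),
    'Richard': date(1189, 9, 3),
    'John': date(1199, 5, 27),
}

cutoff = date(1156, 1, 1)

# A row needs both a birth date on or after the cutoff and a coronation date.
rows = [(name, crowned[name].isoformat())
        for name, birth in born.items()
        if birth >= cutoff and name in crowned]

for name, when in rows:
    print('{0},{1}'.format(name, when))
```

Geoffrey is dropped because he has no coronation triple, and Henry because he was born before the cutoff.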

We can export the result of a CONSTRUCT or DESCRIBE query in a
similar fashion. The difference is that we need to supply an
[RDFFormat](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.rio.html#franz.openrdf.rio.rdfformat.RDFFormat) instead of a [TupleFormat](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.rio.html#franz.openrdf.rio.tupleformat.TupleFormat), since the
result is a set of triples.

In [85]:
from franz.openrdf.rio.rdfformat import RDFFormat

query = conn.prepareGraphQuery(
    QueryLanguage.SPARQL, "describe <ex://Richard> where {}")
query.evaluate(output=True,
               output_format=RDFFormat.TURTLE) # NTRIPLES

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix err: <http://www.w3.org/2005/xqt-errors#> .
@prefix fn: <http://www.w3.org/2005/xpath-functions#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix fti: <http://franz.com/ns/allegrograph/2.2/textindex/> .
@prefix keyword: <http://franz.com/ns/keyword#> .
@prefix nd: <http://franz.com/ns/allegrograph/5.0/geo/nd#> .
@prefix ndfn: <http://franz.com/ns/allegrograph/5.0/geo/nd/fn#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xs: <http://www.w3.org/2001/XMLSchema#> .
<ex://Richard> <ex://born> "1157-09-08"^^xs:date ; 
  <ex://crowned> "1189-09-03"^^xs:date . 


As expected, the result contains two triples.

A file path can also be passed as the output argument:

In [86]:
import os
import sys

query = conn.prepareTupleQuery(
    QueryLanguage.SPARQL,
    """
    select ?name ?birth ?coronation {
      ?name <ex://born> ?birth ;
            <ex://crowned> ?coronation .
    }""")
query.evaluate(output='example9.csv',
               output_format=TupleFormat.CSV)
with open('example9.csv', 'r') as f:
    sys.stdout.write(f.read())

os.remove('example9.csv')

name,birth,coronation
"ex://Henry","1155-02-28","1170-06-14"
"ex://Richard","1157-09-08","1189-09-03"
"ex://John","1166-12-24","1199-05-27"


This outputs data read from the file.
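
An exported CSV file can be consumed with Python's standard `csv` module. The sketch below parses the text shown in the output above from an in-memory string (an assumption made so the snippet runs without a server or a file); with a real export you would open the file instead.

```python
import csv
import io

# Text copied from the output above; a real script would read the
# exported file instead of this string.
exported = '''name,birth,coronation
"ex://Henry","1155-02-28","1170-06-14"
"ex://Richard","1157-09-08","1189-09-03"
"ex://John","1166-12-24","1199-05-27"
'''

rows = list(csv.DictReader(io.StringIO(exported)))
for row in rows:
    # Drop the namespace part to get a short display name.
    short_name = row['name'].rsplit('/', 1)[-1]
    print('{0}: born {1}, crowned {2}'.format(
        short_name, row['birth'], row['coronation']))
```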


<a id='example10'></a>

# Example 10: Graphs in SPARQL

In [Example 6: Importing triples](#example6) and [Example 7: Querying multiple contexts](#example7) we’ve seen how to import data
to a non-default context and run queries against such data. In this
example we’ll explore facilities for handling multiple contexts
provided by SPARQL and the AllegroGraph Python client.

We’ll start by opening a connection:

In [82]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

Now we will create two URIs that will represent named contexts.

In [83]:
context1 = conn.createURI("ex://context1")
context2 = conn.createURI("ex://context2")

The first context will be filled using the [addData()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addData) method:

In [84]:
conn.addData("""
    @prefix : <ex://> .
    :alice a :person ;
           :name "Alice" .""",
    context=context1)

The second context will be filled using [addTriple()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addTriple). Notice how
we use a constant defined in the RDF class to obtain the URI of
the type predicate:

In [85]:
from franz.openrdf.vocabulary.rdf import RDF

bob = conn.createURI('ex://bob')
bob_name = conn.createLiteral('Bob')
name = conn.createURI('ex://name')
person = conn.createURI('ex://person')
conn.addTriple(bob, RDF.TYPE, person,
               contexts=[context2])
conn.addTriple(bob, name, bob_name,
               contexts=[context2])

Finally we’ll add two triples to the default context using
[addStatement()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addStatement):

In [86]:
from franz.openrdf.model import Statement

ted = conn.createURI('ex://ted')
ted_name = conn.createLiteral('Ted')
stmt1 = Statement(ted, name, ted_name)
stmt2 = Statement(ted, RDF.TYPE, person)
conn.addStatement(stmt1)
conn.addStatement(stmt2)

The [Statement](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.model.html#franz.openrdf.model.Statement) object contains a context field.
This field is *ignored* by [addStatement()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addStatement). If you
wish to add a statement object to a specific context, use
the contexts parameter.

As we’ve seen already in [Example 7: Querying multiple contexts](#example7), a call to
[getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) will return triples from all contexts:

In [87]:
with conn.getStatements() as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size()))

getStatements(): 6
size(): 6


[size()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.size) will also process all contexts by default.

Both [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) and [size()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.size) accept a contexts
parameter that can be used to limit processing to a specified list of
graphs:

In [88]:
contexts = [context1, context2]
with conn.getStatements(contexts=contexts) as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size(contexts=contexts)))

getStatements(): 4
size(): 4


As expected, triples from the default context are not processed.

To include the default graph when using the contexts parameter use
None as a graph URI:

In [89]:
contexts = [context1, None]
with conn.getStatements(contexts=contexts) as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size(contexts=contexts)))

getStatements(): 4
size(): 4


Now triples from the default context and from one of our named
contexts are processed.

## SPARQL using FROM, FROM DEFAULT, and FROM NAMED

In many of our examples we have used a simple SPARQL query to retrieve
triples from AllegroGraph’s default graph. This has been very
convenient but it is also misleading. As soon as we tell SPARQL to
search a specific graph, we lose the ability to search AllegroGraph’s
default graph! Triples from the null graph vanish from the search
results. Why is that?

It is important to understand that AllegroGraph and SPARQL use the
phrase “default graph” to identify two very different
things.

- AllegroGraph’s default graph, or null context, is simply the set
  of all triples that have null in the fourth field of the
  “triple.” The *default graph* is an unnamed subgraph of the
  AllegroGraph triple store.
- SPARQL uses *default graph* to describe something that is very
  different. In SPARQL, the *default graph* is a temporary pool of
  triples imported from one or more *named* graphs. SPARQL’s
  *default graph* is constructed and discarded in the service of a
  single query. Standard SPARQL was designed for named graphs
  only, and has no syntax to identify a truly unnamed
  graph. AllegroGraph’s SPARQL, however, has been extended to allow
  the unnamed graph to participate in multi-graph queries.

We can use AllegroGraph’s SPARQL to search specific subgraphs in three
ways.

- We can create a temporary *default graph* using the FROM
  operator.
- We can put AllegroGraph’s unnamed graph into SPARQL’s default
  graph using FROM DEFAULT.
- Or we can target specific named graphs using the FROM NAMED
  operator.

Here’s an example of a query that accesses the unnamed graph explicitly:

In [95]:
query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s FROM DEFAULT {
        ?s ?p ?o
    }""")
query.evaluate(output=True)

------------
| s        |
| ex://ted |
------------


This will not process any of the triples in named contexts.

Here’s an example of a query that uses FROM. It instructs SPARQL
to regard context1 as the default graph for the purposes of this
query.

In [96]:
query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s FROM <ex://context1> {
        ?s ?p ?o
    }""")
query.evaluate(output=True)

--------------
| s          |
| ex://alice |
--------------


Now only one context is processed.

The next example changes FROM to FROM NAMED in the same query:

In [97]:
query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s FROM NAMED <ex://context1> {
        ?s ?p ?o
    }""")
query.evaluate(output=True)

-----
| s |
=====
-----


There are no matches now! The pattern { ?s ?p ?o . } only matches
the SPARQL default graph. We declared context1 to be a *named*
graph, so it is no longer the default graph.

To match triples in named graphs, SPARQL requires a GRAPH pattern:

In [98]:
query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s ?g FROM NAMED <ex://context1> {
        GRAPH ?g { ?s ?p ?o }
    }""")
query.evaluate(output=True)

------------------------------
| s          | g             |
| ex://alice | ex://context1 |
------------------------------


This time the graph is printed as well.

We can also combine all the forms presented above:

In [90]:
query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s ?g
    FROM DEFAULT
    FROM <ex://context1>
    FROM NAMED <ex://context2> {
        { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } }
    }""")
query.evaluate(output=True)

------------------------------
| s          | g             |
| ex://ted   | ---           |
| ex://alice | ---           |
| ex://bob   | ex://context2 |
------------------------------


This query puts AllegroGraph’s unnamed graph and the context1
graph into SPARQL’s default graph, where the triples can be found by
using a simple {?s ?p ?o . } query.  Then it identifies
context2 as a named graph, which can be searched using a GRAPH
pattern.  In the final line, we used a UNION operator to combine
the matches of the simple and GRAPH patterns.

The query finds all three subjects.

## SPARQL with [Dataset](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.dataset.Dataset) object

A [Dataset](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.dataset.Dataset) object is a construct that contains two lists of
named graphs. There is one list of graphs that will become the SPARQL
default graph, just like using FROM in the query. There is a
second list of graphs that will be *named graphs* in the query, just
like using FROM NAMED. To use the dataset, we put the graph URIs into
the dataset object, and then add the dataset to the query object. When
we evaluate the query, the results will be confined to the graphs
listed in the dataset.

In [91]:
from franz.openrdf.query.dataset import Dataset

dataset = Dataset()
dataset.addDefaultGraph(context1)
dataset.addNamedGraph(context2)
query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s ?g {
      { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } }
    }""")
query.setDataset(dataset)
query.evaluate(output=True)

------------------------------
| s          | g             |
| ex://alice | ---           |
| ex://bob   | ex://context2 |
------------------------------


Note that, since we’re explicitly specifying graphs (through a dataset
object), we need a GRAPH pattern to match triples from the named
graphs. Triples from the unnamed graph are not matched at all, since
that graph is not a part of the dataset.
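
The way a dataset decides which triples each kind of pattern can see can be modelled in a few lines of plain Python. This is a toy sketch under stated assumptions: quads are tuples, `None` stands for AllegroGraph's unnamed graph, and the function `subjects` is a hypothetical stand-in for the UNION query used above. It is not how the server evaluates SPARQL.

```python
# Quads as (subject, predicate, object, graph); None marks the unnamed graph.
# Predicates and objects do not matter for this sketch.
QUADS = [
    ('ex://alice', 'rdf:type', 'ex://person', 'ex://context1'),
    ('ex://alice', 'ex://name', 'Alice', 'ex://context1'),
    ('ex://bob', 'rdf:type', 'ex://person', 'ex://context2'),
    ('ex://bob', 'ex://name', 'Bob', 'ex://context2'),
    ('ex://ted', 'rdf:type', 'ex://person', None),
    ('ex://ted', 'ex://name', 'Ted', None),
]


def subjects(quads, default_graphs, named_graphs):
    """Toy model of { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } }."""
    rows = set()
    for s, p, o, g in quads:
        if g in default_graphs:
            rows.add((s, None))   # simple pattern: ?g stays unbound
        if g in named_graphs:
            rows.add((s, g))      # GRAPH pattern: ?g is bound to the graph
    return sorted(rows, key=lambda row: (row[0], row[1] or ''))


# Like the Dataset above: context1 as default graph, context2 as named graph.
print(subjects(QUADS, ['ex://context1'], ['ex://context2']))

# Like FROM DEFAULT + FROM <ex://context1> + FROM NAMED <ex://context2>.
print(subjects(QUADS, [None, 'ex://context1'], ['ex://context2']))
```

The first call leaves out `ex://ted` because the unnamed graph is not listed anywhere in the dataset; the second call includes it through the default-graph list.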


<a id='example11'></a>

# Example 11: Namespaces

A *namespace* is that portion of a URI that precedes the last #,
/, or : character, inclusive. The remainder of a URI is called
the localname. For example, with respect to the URI
http://example.org/people/alice, the namespace is
http://example.org/people/ and the localname is alice. When
writing SPARQL queries, it is convenient to define prefixes or
nicknames for the namespaces, so that abbreviated URIs can be
specified. For example, if we define ex to be a nickname for
http://example.org/people/, then the string ex:alice is a
recognized abbreviation for http://example.org/people/alice. This
abbreviation is called a *qname* (qualified name).
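
The definitions above can be sketched in plain Python. The two helpers below (`split_uri` and `expand_qname`) are hypothetical names used only for illustration; they are not part of the AllegroGraph client, which handles this mapping for you.

```python
def split_uri(uri):
    """Split a URI after the last '#', '/' or ':' into (namespace, localname)."""
    cut = max(uri.rfind(ch) for ch in '#/:')
    return uri[:cut + 1], uri[cut + 1:]


def expand_qname(qname, prefixes):
    """Expand 'prefix:local' using a dict that maps prefixes to namespaces."""
    prefix, local = qname.split(':', 1)
    return prefixes[prefix] + local


namespace, localname = split_uri('http://example.org/people/alice')
print(namespace)   # http://example.org/people/
print(localname)   # alice

prefixes = {'ex': 'http://example.org/people/'}
print(expand_qname('ex:alice', prefixes))  # http://example.org/people/alice
```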

In the SPARQL query used in this example we see two qnames,
rdf:type and ex:Person. Ordinarily, we would expect to see
PREFIX declarations in SPARQL that define namespaces for the
rdf and ex nicknames. However, the connection and query
machinery can do that job for you. The mapping of prefixes to
namespaces includes the built-in prefixes rdf, rdfs, xsd,
and owl. Hence, we can write rdf:type in a SPARQL query, and
the system already knows its meaning. In the case of the ex
prefix, we need to instruct it. The [setNamespace()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.setNamespace) method of the
connection object registers a new namespace.

It is legal, although not recommended, to redefine the
built-in prefixes (rdf, xsd, etc.).

We start by opening a connection

In [92]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

and creating two URIs. Note how [createURI()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.createURI) allows us to compose
URIs from namespaces and local names.

In [93]:
exns = "http://example.org/people/"
alice = conn.createURI(namespace=exns, localname="alice")
person = conn.createURI(namespace=exns, localname="Person")

Now we can assert Alice’s RDF:TYPE triple.

In [94]:
from franz.openrdf.vocabulary.rdf import RDF

conn.add(alice, RDF.TYPE, person)

Now we register the exns namespace with the connection object, so
we can use it in a SPARQL query. The query looks for triples that have
rdf:type in the predicate position, and ex:Person in the
object position.

In [95]:
conn.setNamespace('ex', exns)
conn.executeTupleQuery("""
    SELECT ?s ?p ?o WHERE {
        ?s ?p ?o .
        FILTER (?p = rdf:type && ?o = ex:Person)
    }""", output=True)

-----------------------------------
| s        | p        | o         |
| ex:alice | rdf:type | ex:Person |
-----------------------------------


The output shows the single triple that we expected to find. This
demonstrates that the qnames in the SPARQL query successfully matched
the fully-expanded URIs in the triple. Note that the namespace prefix
is also used in the output table.

It should be mentioned here that the prefix of a namespace can be an
empty string. This allows the resulting qnames to be very concise and
readable:

In [96]:
conn.setNamespace('', 'http://a-long-and-often-used-namespace/')
conn.executeUpdate('insert data { :this :looks :nice }')
conn.executeTupleQuery('select ?s { ?s :looks :nice }',
                       output=True)

---------
| s     |
| :this |
---------



<a id='example12'></a>

# Example 12: Free Text indexing

It is common for users to build RDF applications that combine some
form of “keyword search” with their queries. For example, a user might
want to retrieve all triples for which the string “Alice” appears as a
word within the third (object) field of the triple. AllegroGraph
provides a capability for including free text matching within a SPARQL
query, and also by using the [evalFreeTextSearch()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.evalFreeTextSearch) method of the
connection object. It requires, however, that you create and configure
indexes appropriate to the searches you want to pursue.

First let’s open a connection

In [97]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

We will start this example by importing some sample data

In [98]:
conn.addData("""
    @prefix : <ex://> .

    :alice a :Person ;
             :fullname "Alice B. Toklas" .
    :book1 a :Book ;
             :title "Alice in Wonderland" ;
             :author :carroll .

    :carroll a :Person ;
               :fullname "Lewis Carroll" .""")

We have to create an index. AllegroGraph lets you create any number of
text indexes, each for a specific purpose. In this case we are
indexing the literal values we find in the fullname predicate,
which we have used in resources that describe people. The
[createFreeTextIndex()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.createFreeTextIndex) method has many configurable
parameters. Their default settings are appropriate to this
situation. All we have to provide is a name for the index and the URI
of the predicate (or predicates) that contain the text to be indexed.

In [99]:
fullname = conn.createURI(namespace='ex://',
                         localname='fullname')
conn.createFreeTextIndex(
    "index1", predicates=[fullname])

We can view the index configuration using the
[getFreeTextIndexConfiguration()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getFreeTextIndexConfiguration) method:

In [100]:
config = conn.getFreeTextIndexConfiguration("index1")
for key, value in config.items():
    print('{key}: {value}'.format(key=key, value=value))

predicates: [<ex://fullname>]
indexLiterals: True
indexResources: False
indexFields: ['object']
minimumWordSize: 3
stopWords: ['and', 'are', 'but', 'for', 'into', 'not', 'such', 'that', 'the', 'their', 'then', 'there', 'these', 'they', 'this', 'was', 'will', 'with']
wordFilters: []
innerChars: []
borderChars: []
tokenizer: default


This configuration says that index1 will operate on the literal
values it finds in the object position of the <ex://fullname>
predicate. It ignores words shorter than three characters. It will
also ignore the words in its stopWords list. If it encounters a
resource URI in the object position, it will ignore it. This index
doesn’t use any wordFilters, which are sometimes used to remove
accented letters and to perform stemming on indexed text and search
strings.
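
To get a feel for what minimumWordSize and stopWords do, here is a simplified, purely illustrative tokenizer in plain Python. It is an assumption-laden sketch (lower-casing, splitting on non-alphanumeric characters, a shortened stop word list) and does not reproduce AllegroGraph's actual tokenizer.

```python
import re

# Settings taken from the configuration output above (stop word list shortened).
MINIMUM_WORD_SIZE = 3
STOP_WORDS = {'and', 'are', 'but', 'for', 'the', 'this', 'was', 'with'}


def index_terms(text):
    """Return the lower-cased words a simplified index would keep."""
    words = re.findall(r'[A-Za-z0-9]+', text.lower())
    return [w for w in words
            if len(w) >= MINIMUM_WORD_SIZE and w not in STOP_WORDS]


print(index_terms('Alice B. Toklas'))       # ['alice', 'toklas']
print(index_terms('Alice in Wonderland'))   # ['alice', 'wonderland']
```

The single letter "B" and the two-letter word "in" fall below the minimum word size, so only the longer words remain searchable in this model.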

The text match occurs through a “magic” predicate called fti:match.
This predicate has two arguments. One is the subject URI of the
resources to search. The other is the string pattern to search for,
such as “Alice”. Only full-word matches will be found.

In [101]:
query = conn.prepareTupleQuery(query="""
    SELECT ?s WHERE {
        ?s fti:match "Alice" .
    }""")
query.evaluate(output=True)

--------------
| s          |
| ex://alice |
--------------


There is no need to include a prefix declaration for the fti
namespace. That is because fti is included among the built-in
namespace mappings in AllegroGraph.

When we execute our SPARQL query, it matches the "Alice" within the literal "Alice B. Toklas" because that literal occurs in a triple having the fullname predicate, but it does not match the “Alice” in the literal "Alice in Wonderland" because the title predicate was not included in our index.

By default fti:match searches in all text indexes. It is possible
to specify a single index name when searching. We’ll illustrate this
by creating another index, this time on the title predicate:

In [102]:
title = conn.createURI(namespace='ex://',
                       localname='title')
conn.createFreeTextIndex(
    "index2", predicates=[title])

query = conn.prepareTupleQuery(query="""
    SELECT ?s WHERE {
        ?s fti:match ( "Alice" "index2" ) .
    }""")
query.evaluate(output=True)

--------------
| s          |
| ex://book1 |
--------------


This time only the book title will match our query.

Another way of searching text indexes is the
[evalFreeTextSearch()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.evalFreeTextSearch) method:

In [103]:
for triple in conn.evalFreeTextSearch(
        "Alice", index="index1"):
    print(triple[0])

<ex://alice>


This works just like our first query. Note that
[evalFreeTextSearch()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.evalFreeTextSearch) returns a list of lists of strings (in
N-Triples format), not a list of [Statement](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.model.html#franz.openrdf.model.Statement) objects.

Yay for consistency!

The text index supports simple wildcard queries. The asterisk (*)
may be appended to the end of the pattern to indicate “any number of
additional characters.” For instance, this query looks for whole words
that begin with “Ali”:

In [104]:
for triple in conn.evalFreeTextSearch("Ali*"):
    print(triple[0])

<ex://book1>
<ex://alice>


This search runs across both indexes, so it will find both the
:title and the :fullname triples.

There is also a single-character wildcard, the question mark. It will
match any single character. You can add as many question marks as you
need to the string pattern. This query looks for a word of five or
more letters that has “l” in the second position and “c” in the
fourth position (the trailing asterisk allows additional characters):

In [105]:
for triple in conn.evalFreeTextSearch("?l?c?*"):
    print(triple[0])

<ex://book1>
<ex://alice>


The result is the same as for the previous query.
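
The two wildcards behave much like shell-style patterns, so their effect can be explored offline with Python's standard `fnmatch` module. This is only an analogy under the assumption of case-insensitive, whole-word matching; `matching_words` is a hypothetical helper and not part of the AllegroGraph client.

```python
from fnmatch import fnmatchcase


def matching_words(pattern, text):
    """Return the words of text that match a '*' / '?' wildcard pattern."""
    return [word for word in text.split()
            if fnmatchcase(word.lower(), pattern.lower())]


print(matching_words('Ali*', 'Alice in Wonderland'))   # ['Alice']
print(matching_words('?l?c?*', 'Alice B. Toklas'))     # ['Alice']
print(matching_words('?l?c?', 'Lewis Carroll'))        # []
```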

Text indexes are not the only way of matching text values available in
SPARQL. One may also filter results using regular expressions. This
approach is more flexible, but at the price of performance. Regular
expression filters do not use any form of indexing to speed up the
query.

In [106]:
query = conn.prepareTupleQuery(query="""
    SELECT ?s ?p ?o WHERE {
        ?s ?p ?o .
        FILTER regex(?o, "lic|oll")
    }""")
query.evaluate(output=True)

------------------------------------------------------
| s            | p             | o                   |
| ex://alice   | ex://fullname | Alice B. Toklas     |
| ex://book1   | ex://title    | Alice in Wonderland |
| ex://carroll | ex://fullname | Lewis Carroll       |
------------------------------------------------------


Note how this search matches the provided pattern inside words.
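
The difference from whole-word matching is easy to reproduce with Python's `re` module. For a simple alternation like this one, Python's regular expressions behave the same way as the SPARQL filter; this sketch applies the pattern to the three literals from the sample data.

```python
import re

literals = ['Alice B. Toklas', 'Alice in Wonderland', 'Lewis Carroll']
pattern = re.compile('lic|oll')

for text in literals:
    match = pattern.search(text)
    # search() looks anywhere in the string, so partial-word hits count.
    print('{0!r}: {1}'.format(text, match.group(0) if match else None))
```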

In addition to indexing literal values, AllegroGraph can also index
resource URIs. index3 is an index that looks for URIs in the
object position of the author predicate, and then indexes only the
local name of the resource (the characters following the rightmost
/, # or : in the URI). This lets us avoid indexing
highly-repetitive namespace strings, which would fill the index with
data that would not be very useful.

In [107]:
author = conn.createURI(namespace='ex://',
                        localname='author')

conn.createFreeTextIndex(
    "index3", predicates=[author],
    indexResources="short", indexFields=["object"])

for triple in conn.evalFreeTextSearch("carroll",
                                      index="index3"):
    print(triple[0])

<ex://book1>


The text search located the triple that has carroll in the URI in
the object position.


<a id='example13'></a>

# Example 13: SPARQL query forms

SPARQL provides alternatives to the standard SELECT query. This
example exercises these alternatives to show how AllegroGraph Server
and the Python client handle them.

Let’s connect to the database:

In [108]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

We’ll need some sample data to illustrate all the query types. Our
dataset will contain information about rulers of 17th century England.

In [109]:
conn.addData("""
    @prefix : <ex://> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    :james_i :reigned_from "1603-03-24"^^xsd:date ;
             :reigned_to "1625-03-27"^^xsd:date .
    :charles_i :reigned_from "1625-03-27"^^xsd:date ;
               :reigned_to "1649-01-30"^^xsd:date ;
               :child_of :james_i .
    :charles_ii :reigned_from "1649-01-30"^^xsd:date ;
             :reigned_to "1685-02-06"^^xsd:date ;
             :child_of :charles_i .
    :james_ii :reigned_from "1685-02-06"^^xsd:date ;
             :reigned_to "1688-12-11"^^xsd:date ;
             :child_of :charles_i .
    :mary_ii :reigned_from "1689-02-13"^^xsd:date ;
             :reigned_to "1694-12-28"^^xsd:date ;
             :child_of :james_ii .
    :william_iii :reigned_from "1672-07-04"^^xsd:date ;
                 :reigned_to "1702-03-08"^^xsd:date .
    :anne :reigned_from "1707-05-01"^^xsd:date ;
          :reigned_to "1714-08-01"^^xsd:date ;
          :child_of :james_ii .
""")

## SELECT

This kind of query returns a sequence of tuples, binding variables to
matching elements of a search pattern. SELECT queries are created
using [prepareTupleQuery()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.prepareTupleQuery) and return results of type
[TupleQueryResult](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.queryresult.TupleQueryResult). Query results can also be serialized in a
supported [TupleFormat](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.rio.html#franz.openrdf.rio.tupleformat.TupleFormat); in previous examples we used
`output=True` and relied on the default `TupleFormat.TABLE`.

Here’s a sample query which locates all rulers whose grandchildren
inherited the crown:

In [110]:
conn.setNamespace('', 'ex://')
query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?name WHERE {
        ?grandchild :child_of/:child_of ?name .
    } ORDER BY ?name """)

with query.evaluate() as result:
    for bindings in result:
        print(bindings.getValue('name'))

<ex://charles_i>
<ex://james_i>


Two names are returned.

We can also serialize the output instead of processing the result
object. This time let us reverse the query and ask for rulers whose
grandparents are also in the dataset:

In [111]:
from franz.openrdf.rio.tupleformat import TupleFormat

query = conn.prepareTupleQuery(query="""
   SELECT DISTINCT ?name WHERE {
      ?name :child_of/:child_of ?grandparent .
   } ORDER BY ?name """)

query.evaluate(output=True, output_format=TupleFormat.CSV)

name
"ex://anne"
"ex://charles_ii"
"ex://james_ii"
"ex://mary_ii"


We get four results, serialized as CSV.

## ASK

The ASK query returns a Boolean, depending on whether the triple
pattern matched any triples. Queries of this type are created using
[prepareBooleanQuery()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.prepareBooleanQuery).

Let’s check if there were any co-regencies in the time period
described by our dataset:

In [112]:
query = conn.prepareBooleanQuery(query="""

    ASK { ?ruler1 :reigned_from ?r1from ;
                  :reigned_to ?r1to .
          ?ruler2 :reigned_from ?r2from ;
                  :reigned_to ?r2to .
          FILTER (?ruler1 != ?ruler2 &&
                  ?r1from >= ?r2from &&
                  ?r1from < ?r2to)
    }""")

print(query.evaluate())

True


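The answer is True. The match we had in mind is the joint reign of William and Mary, although with the dates given in the sample data the reign of :william_iii also overlaps those of :charles_ii and :james_ii.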
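
An ASK query only reports whether a match exists, not which bindings matched. The sketch below evaluates the same FILTER condition in plain Python over the dates from the sample data, to show which pairs satisfy it. The names `reigns` and `overlapping_pairs` are made up for this illustration and are not part of the AllegroGraph client.

```python
from datetime import date
from itertools import permutations

# Reign dates copied from the sample data above.
reigns = {
    "james_i": (date(1603, 3, 24), date(1625, 3, 27)),
    "charles_i": (date(1625, 3, 27), date(1649, 1, 30)),
    "charles_ii": (date(1649, 1, 30), date(1685, 2, 6)),
    "james_ii": (date(1685, 2, 6), date(1688, 12, 11)),
    "mary_ii": (date(1689, 2, 13), date(1694, 12, 28)),
    "william_iii": (date(1672, 7, 4), date(1702, 3, 8)),
    "anne": (date(1707, 5, 1), date(1714, 8, 1)),
}


def overlapping_pairs(reigns):
    """Return (ruler1, ruler2) pairs where ruler1's reign starts during ruler2's."""
    pairs = []
    # permutations() yields every ordered pair of distinct rulers,
    # which corresponds to ?ruler1 != ?ruler2 in the FILTER.
    for r1, r2 in permutations(reigns, 2):
        r1_from, _ = reigns[r1]
        r2_from, r2_to = reigns[r2]
        if r2_from <= r1_from < r2_to:
            pairs.append((r1, r2))
    return pairs


pairs = overlapping_pairs(reigns)
ask_result = bool(pairs)  # ASK is true as soon as one match exists
print(ask_result)
for r1, r2 in pairs:
    print(r1, "started to reign during the reign of", r2)
```

Turning the ASK into a SELECT over `?ruler1 ?ruler2` would be the way to obtain the same list from the server.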

## CONSTRUCT

The CONSTRUCT query creates triples by instantiating the provided
templates with values resulting from matching a pattern. Queries of
this kind are created using [prepareGraphQuery()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.prepareGraphQuery) and return a
[RepositoryResult](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.repositoryresult.RepositoryResult) - which is an iterator over the constructed
triples.

Executing a CONSTRUCT query will *not* add any triples to the
store. To insert the data we have to iterate over the result and
add each triple using [addStatement()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.addStatement) (or use an INSERT
query).

Let us consider a query that calculates a :sibling_of
relationship:

In [113]:
print('Size before: {0}'.format(conn.size()))
query = conn.prepareGraphQuery(query="""
   CONSTRUCT {
       ?person1 :sibling_of ?person2 .
   } WHERE {
       ?person1 :child_of ?parent .
       ?person2 :child_of ?parent .
       filter (?person1 != ?person2) .
   }""")
for stmt in query.evaluate():
    print('{0} <-> {1}'.format(stmt.getSubject(),
                               stmt.getObject()))
print('Size after: {0}'.format(conn.size()))

Size before: 19
<ex://mary_ii> <-> <ex://anne>
<ex://anne> <-> <ex://mary_ii>
<ex://charles_ii> <-> <ex://james_ii>
<ex://james_ii> <-> <ex://charles_ii>
Size after: 19


The returned object is an iterator over [Statement](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.model.html#franz.openrdf.model.Statement) objects. We can
also see that no data has been added to the repository.
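
If the constructed triples should end up in the store, one option is to iterate over the result and add each statement. The helper below is a minimal sketch: `materialize` is a hypothetical name, and it relies only on `evaluate()` and `addStatement()` as they are used elsewhere in this tutorial.

```python
def materialize(conn, graph_query):
    """Evaluate a CONSTRUCT query and store every resulting triple.

    `conn` is assumed to be a repository connection and `graph_query`
    an object created with conn.prepareGraphQuery().
    Returns the number of statements passed to addStatement().
    """
    added = 0
    for stmt in graph_query.evaluate():
        conn.addStatement(stmt)
        added += 1
    return added
```

Called as `materialize(conn, query)` with the objects from the cell above, this would be expected to store the four sibling triples, after which `conn.size()` should report a larger number.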

We can also serialize the result using any of the supported
[RDFFormats](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.rio.html#franz.openrdf.rio.rdfformat.RDFFormat):

In [114]:
from franz.openrdf.rio.rdfformat import RDFFormat

query.evaluate(output=True,
               output_format=RDFFormat.NTRIPLES)

<ex://mary_ii> <ex://sibling_of> <ex://anne> .
<ex://anne> <ex://sibling_of> <ex://mary_ii> .
<ex://charles_ii> <ex://sibling_of> <ex://james_ii> .
<ex://james_ii> <ex://sibling_of> <ex://charles_ii> .


Here we use the [N-Triples](https://www.w3.org/TR/n-triples/) format. This happens to be the default,
so we could have omitted the output_format argument.

## DESCRIBE

The DESCRIBE query returns triples that ‘describe’ a given set of
resources. Such queries are created using [prepareGraphQuery()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.prepareGraphQuery)
and return [RepositoryResult](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.repositoryresult.RepositoryResult) objects.

The set of resources to be processed is specified by a query
pattern. The SPARQL standard does not say what triples constitute a
‘description’ of a particular resource. AllegroGraph will return the
[Concise Bounded Description](https://www.w3.org/Submission/CBD/) of the queried resources.

Let’s use a DESCRIBE query to see what data we have regarding
the children of Charles I:

In [115]:
query = conn.prepareGraphQuery(query="""
    DESCRIBE ?child WHERE {
        ?child :child_of :charles_i
    }""")
for stmt in query.evaluate():
    print(stmt)

(<ex://james_ii>, <ex://child_of>, <ex://charles_i>)
(<ex://james_ii>, <ex://reigned_to>, "1688-12-11"^^<http://www.w3.org/2001/XMLSchema#date>)
(<ex://james_ii>, <ex://reigned_from>, "1685-02-06"^^<http://www.w3.org/2001/XMLSchema#date>)
(<ex://charles_ii>, <ex://child_of>, <ex://charles_i>)
(<ex://charles_ii>, <ex://reigned_to>, "1685-02-06"^^<http://www.w3.org/2001/XMLSchema#date>)
(<ex://charles_ii>, <ex://reigned_from>, "1649-01-30"^^<http://www.w3.org/2001/XMLSchema#date>)


In this case AllegroGraph simply returns all triples whose subject
is in the specified set.

DESCRIBE queries can be useful for exploring a dataset and learning
what properties a certain object might have. The results of such
queries can be serialized to any supported [RDFFormat](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.rio.html#franz.openrdf.rio.rdfformat.RDFFormat):

In [116]:
query.evaluate(output=True,
               output_format=RDFFormat.NTRIPLES)

<ex://james_ii> <ex://child_of> <ex://charles_i> .
<ex://james_ii> <ex://reigned_to> "1688-12-11"^^<http://www.w3.org/2001/XMLSchema#date> .
<ex://james_ii> <ex://reigned_from> "1685-02-06"^^<http://www.w3.org/2001/XMLSchema#date> .
<ex://charles_ii> <ex://child_of> <ex://charles_i> .
<ex://charles_ii> <ex://reigned_to> "1685-02-06"^^<http://www.w3.org/2001/XMLSchema#date> .
<ex://charles_ii> <ex://reigned_from> "1649-01-30"^^<http://www.w3.org/2001/XMLSchema#date> .



<a id='example14'></a>

# Example 14: Parametric queries

In previous examples our SPARQL queries were always fixed strings. In
practice it is often necessary to include some variable elements
(e.g. user input, results from another query, …) in the query
strings.

To illustrate, let us create a connection

In [118]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

and populate the repository with sample data:

In [119]:
conn.addData(r"""
  @prefix : <ex://> .

  :cipher42 :label "RSA" ; :code 1 .
  :hedge1 :label "\\/\\/\\/\\/\\/\\/\\" ; :code 2 .
  :hedge2 :label "/\\/\\/\\/\\/\\/\\//" .  # No code
  :has_no_label :secret "squeamish ossifrage" .
""")

Suppose that we need a function that will take a search text and find
all subjects that have a label containing a given search pattern. To
make the query a little more interesting we’ll also print the value of
the :code predicate for all subjects found.

In [120]:
conn.setNamespace('', 'ex://')
query = conn.prepareTupleQuery(query="""
    SELECT ?s ?code WHERE {
       ?s :label ?o .
       FILTER contains(?o, ?search)
       OPTIONAL { ?s <ex://code> ?code } .
    }""")

def print_labelled_subjects(search_text):
   query.setBinding('search', search_text)
   query.evaluate(output=True)

We have created a query object with a variable (?search) in place of the search pattern. To use it we have to provide a value for the variable. We do that using the [setBinding()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.query.html#franz.openrdf.query.query.Query.setBinding) method.

Let’s check if our function works as expected:

In [121]:
print_labelled_subjects(r'\/\/')

------------------
| s       | code |
| :hedge1 | 2    |
| :hedge2 | ---  |
------------------


This returns the ‘hedge’ subjects.

## String formatting

Another way to achieve our goal would be to use formatting or string
concatenation, like this:

In [122]:
conn.setNamespace('', 'ex://')
def print_labelled_subjects(search_text):
    query = conn.prepareTupleQuery(query="""
        SELECT ?s ?code WHERE {
            ?s :label ?o .
            FILTER contains(?o, "%s")
            OPTIONAL { ?s <ex://code> ?code } .
        }""" % search_text)
    query.evaluate(output=True)

print_labelled_subjects('RS')

--------------------
| s         | code |
| :cipher42 | 1    |
--------------------


This seems to work, but attempting to use a trickier input reveals a problem:

In [131]:
print_labelled_subjects(r'\/\/')

400 MALFORMED QUERY: Line 4, Invalid literal "\"\\/\\/\"". Note that single backslashes in literals must be escaped. I.e., use '\\x', not '\x'


RequestError: Server returned 400: Line 4, Invalid literal "\"\\/\\/\"". Note that single backslashes in literals must be escaped. I.e., use '\\x', not '\x'

The query is now invalid.

A [devious user](https://xkcd.com/327/) could take advantage of this
bug to access data that is not supposed to be available:

In [123]:
print_labelled_subjects(
    r'S") optional { ?x <ex://secret> ?code } # ')

-----------------------------------
| s         | code                |
| :cipher42 | squeamish ossifrage |
-----------------------------------


It should not be possible to reveal this literal by searching labels,
and yet the injected pattern exposes it.

We can work around this by ensuring proper escaping:

In [124]:
def print_labelled_subjects(search_text):
    search_lit = conn.createLiteral(search_text)
    query = conn.prepareTupleQuery(query="""
        SELECT ?s ?code WHERE {
            ?s :label ?o .
            FILTER contains(?o, %s)
            OPTIONAL { ?s <ex://code> ?code } .
        }""" % search_lit.toNTriples())
    query.evaluate(output=True)

print_labelled_subjects(r'\/\/')

------------------
| s       | code |
| :hedge1 | 2    |
| :hedge2 | ---  |
------------------


The function now works as expected.
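
To see why escaping helps, here is a simplified plain-Python model of turning arbitrary text into a quoted string literal. `quote_literal` is a hypothetical stand-in written for this illustration; it is not the implementation behind `createLiteral()` or `toNTriples()`, whose exact output may differ.

```python
def quote_literal(text):
    """Render text as a double-quoted string literal (simplified model)."""
    escapes = {
        "\\": "\\\\",  # a backslash must itself be escaped
        '"': '\\"',    # a quote must not terminate the literal early
        "\n": "\\n",
        "\r": "\\r",
        "\t": "\\t",
    }
    return '"' + "".join(escapes.get(ch, ch) for ch in text) + '"'


# The input that broke the naive version: every backslash is doubled.
print(quote_literal(r'\/\/'))

# The injection attempt: the embedded quote is escaped, so the whole
# input stays inside a single string literal.
print(quote_literal(r'S") optional { ?x <ex://secret> ?code } # '))
```

In real code it is safer to rely on `setBinding()` or on the client's own literal serialization than on a hand-written escaper like this one.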


<a id='example15'></a>

# Example 15: Range queries

In many of the previous examples we have used the
[getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) method to find all triples conforming to a given
pattern. The patterns we have used so far matched each triple
component against a single value. It is possible to use more complex
patterns that can match a range of values for each component. To
illustrate this let us first create a connection:

In [125]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

and construct some data:

In [126]:
conn.addData("""
    @prefix : <ex://> .

    :mercury a :planet ; :moons 0 .
    :venus a :planet ; :moons 0 .
    :earth a :planet ; :moons 1 .
    :mars a :planet ; :moons 2 .
    :jupiter a :planet ; :moons 67 .
    :saturn a :planet ; :moons 62 .
    :uranus a :planet ; :moons 27 .
    :neptune a :planet ; :moons 14 .
    :pluto a :dwarf_planet ; :moons 5 .
""")

Suppose that we want to locate all planets that have at least one, but
no more than five moons. To issue such a query we need to create a
Range object:

In [127]:
one_to_five = conn.createRange(1, 5)

We can pass the range object to [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements):

In [128]:
moons = conn.createURI('ex://moons')
with conn.getStatements(
        None, moons, one_to_five) as result:
    for statement in result:
        print(statement.getSubject())

<ex://pluto>
<ex://mars>
<ex://earth>


This finds two planets and one dwarf planet, as expected.

The arguments to [createRange()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.createRange) can be either RDF terms or
regular Python values that will be converted to typed literals. In our example we have used values of type int,
which will be mapped to literals of type
<http://www.w3.org/2001/XMLSchema#integer>. Range queries will
only match values of exactly the same type. For instance if we add
another triple to our store:

In [129]:
conn.addData("""
    @prefix : <ex://> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    :coruscant a :planet ; :moons "4"^^xsd:long .
""")

And then reissue our query:

In [130]:
with conn.getStatements(
        None, moons, one_to_five) as result:
    for statement in result:
        print(statement.getSubject())

<ex://pluto>
<ex://mars>
<ex://earth>


We will find that the result has not changed, because the new value has type `xsd:long` rather than `xsd:integer`.

Range queries can also be performed with SPARQL, using FILTER:

In [131]:
conn.executeTupleQuery('''
    SELECT ?planet {
        ?planet <ex://moons> ?moons .
        filter (?moons <= 5 && ?moons >= 1)
    }''', output=True)

------------------
| planet         |
| ex://coruscant |
| ex://pluto     |
| ex://mars      |
| ex://earth     |
------------------


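This time `ex://coruscant` is returned as well: unlike a range passed to getStatements(), the SPARQL comparison operators compare numeric values regardless of their exact datatype.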

When the filter expression is a simple set of inequalities, as it is
in this case, the query engine will use indices to optimize the query
execution, similarly to the way [getStatements()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection.getStatements) does for range
queries.
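
The difference between the two kinds of range query can be modelled in plain Python. This is only a sketch of the matching rules described above, with made-up helper names (`typed_range`, `numeric_filter`); it does not call the AllegroGraph client.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# (subject, value, datatype) rows for the :moons predicate, as in the data above.
moons = [
    ("mercury", 0, XSD + "integer"),
    ("venus", 0, XSD + "integer"),
    ("earth", 1, XSD + "integer"),
    ("mars", 2, XSD + "integer"),
    ("jupiter", 67, XSD + "integer"),
    ("saturn", 62, XSD + "integer"),
    ("uranus", 27, XSD + "integer"),
    ("neptune", 14, XSD + "integer"),
    ("pluto", 5, XSD + "integer"),
    ("coruscant", 4, XSD + "long"),
]


def typed_range(rows, low, high, datatype):
    """Match values in [low, high] whose datatype is exactly `datatype`."""
    return [s for s, v, dt in rows if dt == datatype and low <= v <= high]


def numeric_filter(rows, low, high):
    """Match values in [low, high] by numeric value, ignoring the datatype."""
    return [s for s, v, _ in rows if low <= v <= high]


print(typed_range(moons, 1, 5, XSD + "integer"))  # models the Range object
print(numeric_filter(moons, 1, 5))                # models the SPARQL FILTER
```

Only the second function picks up `coruscant`, whose value is stored as `xsd:long`.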




<a id='example16'></a>

# Example 16: Federated repositories

AllegroGraph lets you split up your triples among repositories on
multiple servers and then search them all in parallel. To do this we
query a single “federated” repository that automatically distributes
the queries to the secondary repositories and combines the
results. From the point of view of your Python code, it looks like you
are working with a single repository.

To illustrate this, let us first create two repositories and import
some data. The data will represent the integers from 1 to 15. The
first repository will contain all Fibonacci numbers in that range,
while the second one will contain all other numbers.

In [2]:
from franz.openrdf.connect import ag_connect

with ag_connect('python_fib', create=True, clear=True) as conn:
    conn.addData("""
        @prefix : <ex://> .

        :one :value 1 .
        :two :value 2 .
        :three :value 3 .
        :five :value 5 .
        :eight :value 8 .
        :thirteen :value 13 .
    """)

with ag_connect('python_boring', create=True, clear=True) as conn:
    conn.addData("""
        @prefix : <ex://> .

        :four :value 4 .
        :six :value 6 .
        :seven :value 7 .
        :nine :value 9 .
        :ten :value 10 .
        :eleven :value 11 .
        :twelve :value 12 .
        :fourteen :value 14 .
        :fifteen :value 15 .
    """)

To create a federated repository, we first have to connect to the
server that will be used to aggregate results. We do this by creating
an [AllegroGraphServer](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.sail.html#franz.openrdf.sail.allegrographserver.AllegroGraphServer) instance.

In [3]:
from franz.openrdf.sail.allegrographserver import AllegroGraphServer

server = AllegroGraphServer()

We are using the default server address and credentials, as described in
the [Setting the environment for the tutorial](setup.ipynb#setup) section of the tutorial.

The next step is to use the [openFederated()](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.sail.html#franz.openrdf.sail.allegrographserver.AllegroGraphServer.openFederated)
method to create a federated session. We will pass the list of
repositories to federate as an argument. Elements of this list could
be

- [Repository](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repository.Repository) objects  
- [RepositoryConnection](https://franz.com/agraph/support/documentation/6.4.0/python/_gen/franz.openrdf.repository.html#franz.openrdf.repository.repositoryconnection.RepositoryConnection) objects  
- strings (naming a store in the root catalog, or the URL of a
  store)  
- (storename, catalogname) tuples.  
  

We’ll use the third option:

In [4]:
conn = server.openFederated(['python_fib', 'python_boring'])

Now we can query the combined repository.

In [7]:
query = conn.prepareTupleQuery(query="""
    select (avg(?v) as ?avg)
           (min(?v) as ?min)
           (max(?v) as ?max) where {
       ?number <ex://value> ?v .
    }""")
query.evaluate(output=True)

-------------------
| avg | min | max |
| 8.0 | 1   | 15  |
-------------------


As we can see, data from both repositories has been returned and
aggregates have been correctly computed over the whole dataset.
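
It matters that the aggregates are computed over the combined data rather than per repository. The plain-Python sketch below illustrates the arithmetic with the two value sets from above. It says nothing about how AllegroGraph implements federation internally, and the helper names `partial` and `combine` are invented for the example.

```python
fib = [1, 2, 3, 5, 8, 13]                   # values in python_fib
boring = [4, 6, 7, 9, 10, 11, 12, 14, 15]   # values in python_boring


def partial(values):
    """Summarise one partition so that partitions can be combined later."""
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
    }


def combine(parts):
    """Combine per-partition summaries into aggregates over all values."""
    total = sum(p["sum"] for p in parts)
    count = sum(p["count"] for p in parts)
    return {
        "avg": total / count,
        "min": min(p["min"] for p in parts),
        "max": max(p["max"] for p in parts),
    }


combined = combine([partial(fib), partial(boring)])
print(combined)

# Averaging the two per-repository averages gives a different number,
# because the repositories hold different numbers of values.
naive_avg = (sum(fib) / len(fib) + sum(boring) / len(boring)) / 2
print(naive_avg)
```

The combined average matches the 8.0 reported by the federated query, while the naive average of averages does not.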

Another example of using federated repositories, this time with
multiple server machines, can be found in [Running AG on AWS EC2](fedex.ipynb#fedex).



# Example 17: Triple Attributes

Triples offer a way of describing model elements and relationships between them. In some cases, however, it is also convenient to be able to store data that is associated with a triple as a whole rather than with a particular element. For instance one might wish to record the source from which a triple has been imported or the access level necessary to include it in query results. Traditional solutions to this problem include using graphs, RDF reification or triple IDs. All of these approaches suffer from various flexibility and performance issues. For this reason AllegroGraph offers an alternative: triple attributes.

Attributes are key-value pairs associated with a triple. Keys refer to attribute definitions that must be added to the store before they are used. Values are strings. The set of legal values of an attribute can be constrained by the definition of that attribute. It is possible to associate multiple values of a given attribute with a single triple.

Possible uses for triple attributes include:

* Access control: It is possible to instruct AllegroGraph to prevent a user from accessing triples with certain attributes.
* Sharding: Attributes can be used to ensure that related triples are always placed in the same shard when AllegroGraph acts as a distributed triple store.

Like all other triple components, attribute values are immutable. They must be provided when the triple is added to the store and cannot be changed or removed later.

To illustrate the use of triple attributes we will construct an artificial data set containing a log of information about contacts detected by a submarine at a single moment in time.

## Managing attribute definitions

Before we can add triples with attributes to the store we must create appropriate attribute definitions.

First let’s open a connection:

In [3]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

Attribute definitions are represented by **AttributeDefinition** objects. Each definition has a name, which must be unique, and a few optional properties (that can also be passed as constructor arguments):

* `allowed_values`: a list of strings. If this property is set then only the values from this list can be used for the defined attribute.
* `ordered`: a boolean. If true then attribute value comparisons will use the ordering defined by `allowed_values`. The default is false.
* `minimum_number`, `maximum_number`: integers that can be used to constrain the cardinality of an attribute. By default there are no limits.

Let’s define a few attributes that we will later use to demonstrate various attribute-related capabilities of AllegroGraph. To do this, we will use the **setAttributeDefinition()** method of the connection object.

In [4]:
from franz.openrdf.repository.attributes import AttributeDefinition

# A simple attribute with no constraints governing the set
# of legal values or the number of values that can be
# associated with a triple.
tag = AttributeDefinition(name='tag')

# An attribute with a limited set of legal values.
# Every bit of data can come from multiple sources.
# We encode this information in triple attributes,
# since it refers to the triple as a whole. Another
# way of achieving this would be to use triple ids
# or RDF reification.
source = AttributeDefinition(
    name='source',
    allowed_values=['sonar', 'radar', 'esm', 'visual'])

# Security level - notice that the values are ordered
# and each triple *must* have exactly one value for
# this attribute. We will use this to prevent some
# users from accessing classified data.
level = AttributeDefinition(
    name='level',
    allowed_values=['low', 'medium', 'high'],
    ordered=True,
    minimum_number=1,
    maximum_number=1)

# An attribute like this could be used for sharding.
# That would ensure that data related to a particular
# contact is never partitioned across multiple shards.
# Note that this attribute is required, since without
# it an attribute-sharded triple store would not know
# what to do with a triple.
contact = AttributeDefinition(
    name='contact',
    minimum_number=1,
    maximum_number=1)

# So far we have created definition objects, but we
# have not yet sent those definitions to the server.
# Let's do this now.
conn.setAttributeDefinition(tag)
conn.setAttributeDefinition(source)
conn.setAttributeDefinition(level)
conn.setAttributeDefinition(contact)

# This line is not strictly necessary, because our
# connection operates in autocommit mode.
# However, it is important to note that attribute
# definitions have to be committed before they can
# be used by other sessions.
conn.commit()

It is possible to retrieve the list of attribute definitions from a repository by using the **getAttributeDefinitions()** method:

In [5]:
# we want to ignore system attributes and only
# look for attributes we've added
expected = ["tag", "source", "level", "contact"]
for attr in conn.getAttributeDefinitions():
    if attr.name not in expected: continue
    print('Name: {0}'.format(attr.name))
    if attr.allowed_values:
        print('Allowed values: {0}'.format(
            ', '.join(attr.allowed_values)))
        print('Ordered: {0}'.format(
            'Y' if attr.ordered else 'N'))
    print('Min count: {0}'.format(attr.minimum_number))
    print('Max count: {0}'.format(attr.maximum_number))
    print()

Name: tag
Min count: 0
Max count: 1152921504606846975

Name: source
Allowed values: sonar, radar, esm, visual
Ordered: N
Min count: 0
Max count: 1152921504606846975

Name: level
Allowed values: low, medium, high
Ordered: Y
Min count: 1
Max count: 1

Name: contact
Min count: 1
Max count: 1



Notice that in cases where the maximum cardinality has not been explicitly defined, the server replaced it with a default value. In practice this value is high enough to be interpreted as ‘no limit’.
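
The constraints carried by a definition can be modelled in a few lines of plain Python. This is a sketch of the rules as described above (allowed values, minimum and maximum count). `check_attribute` is a hypothetical helper, and the real validation is performed by the server.

```python
def check_attribute(values, allowed_values=None,
                    minimum_number=0, maximum_number=None):
    """Return a list of problems with the values given for one attribute."""
    if isinstance(values, str):
        values = [values]  # a single value may be given as a plain string
    problems = []
    if len(values) < minimum_number:
        problems.append("too few values")
    if maximum_number is not None and len(values) > maximum_number:
        problems.append("too many values")
    if allowed_values is not None:
        for value in values:
            if value not in allowed_values:
                problems.append("value not allowed: " + value)
    return problems


levels = ["low", "medium", "high"]
sources = ["sonar", "radar", "esm", "visual"]

# 'level' requires exactly one value from the list.
print(check_attribute("low", levels, minimum_number=1, maximum_number=1))
print(check_attribute([], levels, minimum_number=1, maximum_number=1))

# 'source' accepts any number of values, but only from the list.
print(check_attribute(["sonar", "lidar"], sources))
```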

Attribute definitions can be removed (provided that the attribute is not used by the static attribute filter, which will be discussed later) by calling **deleteAttributeDefinition()**:

In [6]:
conn.deleteAttributeDefinition('tag')
possible = ["tag", "source", "level", "contact"]
defs = conn.getAttributeDefinitions()
# there may be system attributes and we want
# to ignore them and only look at the ones
# we've added
filtered_defs = [attr for attr in defs if attr.name in possible]
print(', '.join(sorted(a.name for a in filtered_defs)))

contact, level, source


## Adding triples with attributes

Now that the attribute definitions have been established we can demonstrate the process of adding triples with attributes. This can be achieved using various methods. A common element of all these methods is the way in which triple attributes are represented. In all cases dictionaries with attribute names as keys and strings or lists of strings as values are used.

When **addTriple()** is used it is possible to pass attributes in a keyword parameter, as shown below:

In [7]:
ex = conn.namespace('ex://')
conn.addTriple(ex.S1, ex.cls, ex.Udaloy, attributes={
    'source': 'sonar',
    'level': 'low',
    'contact': 'S1'
})

The **addStatement()** method works in a similar way. Note that it is not possible to include attributes in the **Statement** object itself.

In [8]:
from franz.openrdf.model import Statement

s = Statement(ex.M1, ex.cls, ex.Zumwalt)
conn.addStatement(s, attributes={
    'source': ['sonar', 'esm'],
    'level': 'medium',
    'contact': 'M1'
})

When adding multiple triples with **addTriples()** one can add a fifth element to each tuple to represent attributes. Let us illustrate this by adding an aircraft to our dataset.

In [9]:
conn.addTriples(
    [(ex.R1, ex.cls, ex['Ka-27'], None,
      {'source': 'radar',
       'level': 'low',
       'contact': 'R1'}),
     (ex.R1, ex.altitude, 200, None,
      {'source': 'radar',
       'level': 'medium',
       'contact': 'R1'})])

When all or most of the added triples share the same attribute set it might be convenient to use the attributes keyword parameter. This provides default values, but is completely ignored for all tuples that already contain attributes (the dictionaries are not merged). In the example below we add a triple representing an aircraft carrier and a few more triples that specify its position. Notice that the first triple has a lower security level and multiple sources. The common ‘contact’ attribute could be used to ensure that all this data will remain on a single shard.

In [10]:
conn.addTriples(
    [(ex.M2, ex.cls, ex.Kuznetsov, None, {
        'source': ['sonar', 'radar', 'visual'],
        'contact': 'M2',
        'level': 'low',
     }),
     (ex.M2, ex.position, ex.pos343),
     (ex.pos343, ex.x, 430.0),
     (ex.pos343, ex.y, 240.0)],
    attributes={
       'contact': 'M2',
       'source': 'radar',
       'level': 'medium'
    })
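
The fallback rule for the `attributes` keyword (defaults are used only for tuples that carry no attributes of their own, and nothing is merged) can be sketched in plain Python. `effective_attributes` is a hypothetical helper that models the rule as described above, and the triple components are plain strings here rather than RDF terms.

```python
def effective_attributes(triples, defaults=None):
    """Pair each (s, p, o) with the attributes it would be stored with."""
    result = []
    for t in triples:
        s, p, o = t[:3]
        # The fifth element, when present, holds the tuple's own attributes.
        own = t[4] if len(t) > 4 else None
        result.append(((s, p, o), own if own is not None else defaults))
    return result


triples = [
    ("M2", "cls", "Kuznetsov", None,
     {"source": ["sonar", "radar", "visual"], "contact": "M2", "level": "low"}),
    ("M2", "position", "pos343"),
]
defaults = {"contact": "M2", "source": "radar", "level": "medium"}

for triple, attrs in effective_attributes(triples, defaults):
    print(triple, attrs["level"], attrs["source"])
```

The first triple keeps its own `level` and `source` untouched, while the second one receives the defaults in full.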

Another method of adding triples with attributes is to use the NQX file format. This works both with **addFile()** and **addData()** (illustrated below):

In [11]:
from franz.openrdf.rio.rdfformat import RDFFormat

conn.addData('''
    <ex://S2> <ex://cls> <ex://Alpha> \
    {"source": "sonar", "level": "medium", "contact": "S2"} .
    <ex://S2> <ex://depth> "300" \
    {"source": "sonar", "level": "medium", "contact": "S2"} .
    <ex://S2> <ex://speed_kn> "15.0" \
    {"source": "sonar", "level": "medium", "contact": "S2"} . ''', rdf_format=RDFFormat.NQX)

When importing from a format that does not support attributes, it is possible to provide a common set of attribute values with a keyword parameter:

In [12]:
from franz.openrdf.rio.rdfformat import RDFFormat

conn.addData('''
    <ex://V1> <ex://cls> <ex://Walrus> ;
              <ex://altitude> 100 ;
              <ex://speed_kn> 12.0e+8 .
    <ex://V2> <ex://cls> <ex://Walrus> ;
              <ex://altitude> 200 ;
              <ex://speed_kn> 12.0e+8 .
    <ex://V3> <ex://cls> <ex://Walrus> ;
              <ex://altitude> 300;
              <ex://speed_kn> 12.0e+8 .
    <ex://V4> <ex://cls> <ex://Walrus> ;
              <ex://altitude> 400 ;
              <ex://speed_kn> 12.0e+8 .
    <ex://V5> <ex://cls> <ex://Walrus> ;
              <ex://altitude> 500 ;
              <ex://speed_kn> 12.0e+8 .
    <ex://V6> <ex://cls> <ex://Walrus> ;
              <ex://altitude> 600 ;
              <ex://speed_kn> 12.0e+8 .
''', attributes={
    'source': 'visual',
    'level': 'high',
    'contact': 'a therapist'})

The data above represents six visually observed Walrus-class submarines, flying at different altitudes and well above the speed of light. It has been highly classified to conceal the fact that someone has clearly been drinking while on duty - after all there are only four Walrus-class submarines currently in service, so the observation is obviously incorrect.

## Retrieving attribute values

We will now print all the data we have added to the store, including attributes, to verify that everything worked as expected. The only way to do that is through a SPARQL query using the appropriate [magic property](https://franz.com/ns/allegrograph/6.2.0/attributes) to access the attributes. The query below binds a literal containing a JSON representation of triple attributes to the _?a_ variable:

In [13]:
import json

r = conn.executeTupleQuery('''
   PREFIX attr: <http://franz.com/ns/allegrograph/6.2.0/>
   SELECT ?s ?p ?o ?a {
       ?s ?p ?o .
       ?a attr:attributes (?s ?p ?o) .
   } ORDER BY ?s ?p ?o''')
with r:
    for row in r:
        print(row['s'], row['p'], row['o'])
        print(json.dumps(json.loads(row['a'].label),
                         sort_keys=True,
                         indent=4))

<ex://M1> <ex://cls> <ex://Zumwalt>
{
    "contact": "M1",
    "level": "medium",
    "source": [
        "esm",
        "sonar"
    ]
}
<ex://M2> <ex://cls> <ex://Kuznetsov>
{
    "contact": "M2",
    "level": "low",
    "source": [
        "visual",
        "radar",
        "sonar"
    ]
}
<ex://M2> <ex://position> <ex://pos343>
{
    "contact": "M2",
    "level": "medium",
    "source": "radar"
}
<ex://R1> <ex://altitude> "200"^^<http://www.w3.org/2001/XMLSchema#integer>
{
    "contact": "R1",
    "level": "medium",
    "source": "radar"
}
<ex://R1> <ex://cls> <ex://Ka-27>
{
    "contact": "R1",
    "level": "low",
    "source": "radar"
}
<ex://S1> <ex://cls> <ex://Udaloy>
{
    "contact": "S1",
    "level": "low",
    "source": "sonar"
}
<ex://S2> <ex://cls> <ex://Alpha>
{
    "contact": "S2",
    "level": "medium",
    "source": "sonar"
}
<ex://S2> <ex://depth> "300"
{
    "contact": "S2",
    "level": "medium",
    "source": "sonar"
}
<ex://S2> <ex://speed_kn> "15.0"
{
    "conta

The result contains all the expected triples with pretty-printed attributes.
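
As the output shows, an attribute with one value is serialized as a JSON string, while an attribute with several values becomes a list. A small helper can normalise both shapes. `attribute_values` is a hypothetical name, and the JSON strings below are copied from the output above.

```python
import json


def attribute_values(attr_json, name):
    """Return the values of one attribute as a list (empty if absent)."""
    value = json.loads(attr_json).get(name, [])
    return [value] if isinstance(value, str) else list(value)


multi = '{"contact": "M1", "level": "medium", "source": ["esm", "sonar"]}'
single = '{"contact": "S1", "level": "low", "source": "sonar"}'

print(attribute_values(multi, "source"))
print(attribute_values(single, "source"))
print(attribute_values(single, "tag"))
```

Inside the loop above, the first argument would presumably be `row['a'].label`.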

## Attribute filters

Triple attributes can be used to provide fine-grained access control. This can be achieved by using static attribute filters.

Static attribute filters are simple expressions that control which triples are visible to a query based on triple attributes. Each repository has a single, global attribute filter that can be modified using **setAttributeFilter()**. The values passed to this method must be either strings (the syntax is described in the documentation of static attribute filters) or filter objects.

Filter objects are created by applying set operators to ‘attribute sets’. These can then be combined using filter operators.

An attribute set can be one of the following:

* a string or a list of strings: represents a constant set of values.
* _TripleAttribute.name_: represents the value of the name attribute associated with the currently inspected triple.
* _UserAttribute.name_: represents the value of the name attribute associated with the current query. User attributes will be discussed in more detail later.

Available set operators are shown in the table below. All classes and functions mentioned here can be imported from the `franz.openrdf.repository.attributes` package:

| **Syntax** | **Meaning** |
|:------------|:-------------|
|      `Empty(x)`      |       True if the specified attribute set is empty.      |
|     `Overlap(x, y)`       |       True if there is at least one matching value between the two attribute sets.      |
|       `Subset(x, y)`, `x << y`     |        True if every element of _x_ can be found in _y_     |
|       `Superset(x, y)`, `x >> y`     |        True if every element of _y_ can be found in _x_     |
|      `Equal(x, y)`, `x == y`      |       True if _x_ and _y_ have exactly the same contents.      |
|     `Lt(x, y)`, `x < y`       |       True if both sets are singletons, at least one of the sets refers to a triple or user attribute, the attribute is ordered and the value of the single element of _x_ occurs before the single value of _y_ in the `allowed_values` list of the attribute.      |
|      `Le(x, y)`, `x <= y`      |       True if _y_ < _x_ is false.      |
|      `Eq(x, y)`      |      True if both _x_ < _y_ and _y_ < _x_ are false. Note that using the `==` Python operator translates to `Equal`, not `Eq`.       |
|      `Ge(x, y)`, `x >= y`      |      True if _x_ < _y_ is false.       |
|     `Gt(x, y)`, `x > y`       |       True if _y_ < _x_.      |


Note that the overloaded operators only work if at least one of the attribute sets is a `UserAttribute` or `TripleAttribute` reference - if both arguments are strings or lists of strings the default Python semantics for each operator are used. The prefix syntax always produces filters.
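The meaning of the set operators in the table can be illustrated with ordinary Python sets. The sketch below is only a model of the semantics, not the client library's implementation: the helper names (`as_set`, `empty`, `overlap`, `subset`, `superset`, `equal`) are made up for this illustration, and nothing here talks to a server.

```python
# Illustrative model only: plain Python sets stand in for attribute sets.
def as_set(x):
    """Treat a single string as a one-element set and a list as a set of values."""
    return {x} if isinstance(x, str) else set(x)

def empty(x):
    return len(as_set(x)) == 0

def overlap(x, y):
    return bool(as_set(x) & as_set(y))

def subset(x, y):
    return as_set(x) <= as_set(y)

def superset(x, y):
    return as_set(x) >= as_set(y)

def equal(x, y):
    return as_set(x) == as_set(y)

# "source" values of one triple, copied from the query output above.
source = ["visual", "radar", "sonar"]

print(superset(source, "sonar"))                    # every element of y is in x
print(subset(source, ["sonar"]))                    # not every element of x is in y
print(overlap(source, ["esm", "radar"]))            # "radar" is shared
print(equal(source, ["sonar", "radar", "visual"]))  # order does not matter
print(empty([]))
```

Read this way, `TripleAttribute.source >> 'sonar'` accepts a triple whenever `'sonar'` is among its `source` values, regardless of any other values present.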

Filters can be combined using the following operators:

| **Syntax** | **Meaning** |
|:------------|:-------------|
|       `Not(x)`, `~x`     |        Negates the meaning of the filter.     |
|      `And(x, y, ...)`, `x & y`      |       True if all subfilters are true.      |
|     `Or(x, y, ...)`, `x \| y`       |       True if at least one subfilter is true.      |

Filter operators also work with raw strings, but overloaded operators will only be recognized if at least one argument is a filter object.

## Using filters and user attributes

The example below displays all classes of vessels from the dataset after establishing a static attribute filter which ensures that only sonar contacts are visible:

In [14]:
from franz.openrdf.repository.attributes import *

conn.setAttributeFilter(TripleAttribute.source >> 'sonar')
conn.executeTupleQuery(
    'select ?class { ?s <ex://cls> ?class } order by ?class',
    output=True)

------------------
| class          |
| ex://Alpha     |
| ex://Kuznetsov |
| ex://Udaloy    |
| ex://Zumwalt   |
------------------


The output contains neither the visually observed Walruses nor the radar-detected ASW helicopter.

To avoid having to set a static filter before each query (which would be inefficient and cause concurrency issues) we can employ user attributes. User attributes are specific to a particular connection and are sent to the server with each query. The static attribute filter can refer to these and compare them with triple attributes. Thus we can use the code presented below to create a filter which ensures that a connection only accesses data at or below the chosen clearance level.

In [15]:
conn.setUserAttributes({'level': 'low'})
conn.setAttributeFilter(
    TripleAttribute.level <= UserAttribute.level)
conn.executeTupleQuery(
    'select ?class { ?s <ex://cls> ?class } order by ?class',
    output=True)

------------------
| class          |
| ex://Ka-27     |
| ex://Kuznetsov |
| ex://Udaloy    |
------------------


We can see that the output here contains only contacts with the access level of low. It omits the destroyer and Alpha submarine (these require medium level) as well as the top-secret Walruses.

The main advantage of the code presented above is that the filter can be set globally during the application setup and access control can then be achieved by varying user attributes on connection objects.
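The idea of one fixed filter combined with varying user attributes can be modelled without a server. The sketch below is a toy, not the AllegroGraph implementation: `LEVELS` is an assumed ordering of the `level` attribute (lowest clearance first), `visible_classes` is a hypothetical helper, and only the `ex://cls` triples visible in the query output earlier in this example are included.

```python
# Toy model of the filter "TripleAttribute.level <= UserAttribute.level".
# LEVELS is an assumed ordering of allowed values, lowest clearance first.
LEVELS = ["low", "medium", "high"]

# (class, level) pairs copied from the attribute listing shown earlier.
TRIPLES = [
    ("ex://Zumwalt", "medium"),
    ("ex://Kuznetsov", "low"),
    ("ex://Ka-27", "low"),
    ("ex://Udaloy", "low"),
    ("ex://Alpha", "medium"),
]

def visible_classes(user_level):
    """Return the classes a user with the given clearance level may see."""
    rank = LEVELS.index(user_level)
    return sorted(cls for cls, level in TRIPLES if LEVELS.index(level) <= rank)

# The filter itself never changes; only the per-connection attribute does.
print(visible_classes("low"))
print(visible_classes("medium"))
```

With `"low"` the model yields the same three classes as the query output above; raising the user attribute to `"medium"` also reveals `ex://Alpha` and `ex://Zumwalt`.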

Let us now remove the attribute filter to prevent it from interfering with other examples. We will use the **clearAttributeFilter()** method.

```
conn.clearAttributeFilter()
```

It might be useful to change a connection’s user attributes temporarily for the duration of a single code block and restore the prior attributes after that. This can be achieved using the **temporaryUserAttributes()** method, which returns a context manager. The example below illustrates its use. It also shows how to use **getUserAttributes()** to inspect user attributes.

In [16]:
with conn.temporaryUserAttributes({'level': 'high'}):
    print('User attributes inside the block:')
    for k, v in conn.getUserAttributes().items():
        print('{0}: {1}'.format(k, v))
    print()
print('User attributes outside the block:')
for k, v in conn.getUserAttributes().items():
    print('{0}: {1}'.format(k, v))

User attributes inside the block:
level: high

User attributes outside the block:
level: low
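For readers unfamiliar with context managers, the save-and-restore behaviour seen above can be sketched in plain Python. `FakeConnection` is a hypothetical stand-in written for this illustration; it borrows the method names used in this example, but the real client's implementation may differ.

```python
from contextlib import contextmanager

class FakeConnection:
    """Hypothetical stand-in for a repository connection (no server needed)."""

    def __init__(self):
        self._user_attributes = {}

    def setUserAttributes(self, attributes):
        self._user_attributes = dict(attributes)

    def getUserAttributes(self):
        return dict(self._user_attributes)

    @contextmanager
    def temporaryUserAttributes(self, attributes):
        saved = self.getUserAttributes()
        self.setUserAttributes(attributes)
        try:
            yield self
        finally:
            # Restore the previous attributes even if the block raised.
            self.setUserAttributes(saved)

fake = FakeConnection()
fake.setUserAttributes({"level": "low"})
with fake.temporaryUserAttributes({"level": "high"}):
    inside = fake.getUserAttributes()
outside = fake.getUserAttributes()
print(inside, outside)
```

The `try`/`finally` pair is the important design choice: the earlier attributes come back even when the code inside the `with` block fails.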


# Example 18: Pandas support

The SPARQL query language has somewhat limited capabilities when it comes to advanced numerical data analysis, data mining and other similar tasks. In these cases it is best to only use SPARQL to extract, filter and normalize data (perhaps coming from diverse sources - the ability to work with such data is one of the key advantages of the RDF data model) and rely on other tools to perform further analysis. One of the more popular tools that can be used in this context is the [Pandas framework](https://pandas.pydata.org/). The AllegroGraph Python client contains basic support for processing query results with this library. Let us see how this support can be leveraged in a simplified scenario.

As usual, we will start by opening a connection.

In [5]:
from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

Now we will add some data. The first data set describes per capita cheese consumption in the United States in years 2000-2009 according to USDA. The values are expressed in pounds:

In [6]:
conn.addData('''
    prefix ex: <ex://>
    ex:c2000 ex:year 2000; ex:cheese 29.8 .
    ex:c2001 ex:year 2001; ex:cheese 30.1 .
    ex:c2002 ex:year 2002; ex:cheese 30.5 .
    ex:c2003 ex:year 2003; ex:cheese 30.6 .
    ex:c2004 ex:year 2004; ex:cheese 31.3 .
    ex:c2005 ex:year 2005; ex:cheese 31.7 .
    ex:c2006 ex:year 2006; ex:cheese 32.6 .
    ex:c2007 ex:year 2007; ex:cheese 33.1 .
    ex:c2008 ex:year 2008; ex:cheese 32.7 .
    ex:c2009 ex:year 2009; ex:cheese 32.8 .
''')

Our second set of samples is derived from NSF data and describes the number of civil engineering doctorates awarded each year.

In [7]:
conn.addData('''
    prefix ex: <ex://>
    ex:d2000 ex:year 2000; ex:doctorates 480 .
    ex:d2001 ex:year 2001; ex:doctorates 501 .
    ex:d2002 ex:year 2002; ex:doctorates 540 .
    ex:d2003 ex:year 2003; ex:doctorates 552 .
    ex:d2004 ex:year 2004; ex:doctorates 547 .
    ex:d2005 ex:year 2005; ex:doctorates 622 .
    ex:d2006 ex:year 2006; ex:doctorates 655 .
    ex:d2007 ex:year 2007; ex:doctorates 701 .
    ex:d2008 ex:year 2008; ex:doctorates 712 .
    ex:d2009 ex:year 2009; ex:doctorates 708 .
''')

We can use a SPARQL query to extract all this data and create a Pandas DataFrame from it:

In [8]:
query = '''
prefix ex: <ex://>
select ?year ?cheese ?doctorates {
    _:b1 ex:year ?year ; ex:cheese ?cheese .
    _:b2 ex:year ?year ; ex:doctorates ?doctorates .
} order by ?year'''
with conn.executeTupleQuery(query) as result:
    df = result.toPandas()
print(df)

   year cheese  doctorates
0  2000   29.8         480
1  2001   30.1         501
2  2002   30.5         540
3  2003   30.6         552
4  2004   31.3         547
5  2005   31.7         622
6  2006   32.6         655
7  2007   33.1         701
8  2008   32.7         712
9  2009   32.8         708


Notice that the DataFrame can be used after the result has been discarded, since all required data has been copied.

At this point the **TupleQueryResult.toPandas()** method does not allow fine-grained control over types of the returned columns. The `'cheese'` column contains decimal values, but floats would be more convenient for further computation. Thus we will modify the dataframe and convert the data:

In [9]:
df['cheese'] = df['cheese'].astype(float)
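Why the conversion helps can be shown with the standard library alone. The sketch assumes the column holds Python `decimal.Decimal` objects (the text above only says that it contains decimal values); the three sample values are taken from the cheese data.

```python
from decimal import Decimal

cheese = [Decimal("29.8"), Decimal("30.1"), Decimal("30.5")]

# Mixing Decimal with float arithmetic is rejected by Python.
try:
    cheese[0] * 1.5
    mixed_ok = True
except TypeError:
    mixed_ok = False

# After conversion the values behave like ordinary floats.
as_floats = [float(value) for value in cheese]
scaled = [value * 1.5 for value in as_floats]
print(mixed_ok, as_floats)
```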

Now that we have the data in a form suitable for Pandas, we can perform some analysis. To keep this tutorial simple we will just measure the correlation between the number of civil engineering doctorates awarded and per capita cheese consumption:

In [10]:
correlation = df.corr()['cheese']['doctorates']
print("Correlation: %.5f" % correlation)

Correlation: 0.97433
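The coefficient can be cross-checked without a server or Pandas. The sketch below recomputes the Pearson correlation from the two data sets loaded above using only the standard library; `pearson` is a helper written for this illustration.

```python
from math import sqrt

# Values copied from the two addData() calls above.
cheese = [29.8, 30.1, 30.5, 30.6, 31.3, 31.7, 32.6, 33.1, 32.7, 32.8]
doctorates = [480, 501, 540, 552, 547, 622, 655, 701, 712, 708]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(cheese, doctorates)
print("Correlation: %.5f" % r)
```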


The interpretation of this result is left as an exercise to the reader.

# Example 19: Using JSON-LD

JSON-LD is described pretty well at [https://json-ld.org/](https://json-ld.org/) and the specification can be found at [https://json-ld.org/latest/json-ld/](https://json-ld.org/latest/json-ld/).

The website [https://json-ld.org/playground/](https://json-ld.org/playground/) is also useful.

There are many reasons for working with JSON-LD. The major search engines such as Google require ecommerce companies to mark up their websites with a systematic description of their products and more and more companies use it as an easy serialization format to share data.

The benefit for your organization is that you can now combine your documents with graphs, graph search and graph algorithms. Normally when you store documents in a document store you set up your documents in such a way that they are optimized for direct retrieval queries. Doing complex joins across multiple types of documents, or even finding a shortest path through a mass of objects (or object types), is however very complicated. Storing JSON-LD objects in AllegroGraph gives you all the benefits of a document store, and you can also semantically link objects together, do complex joins and even graph search.

A second benefit is that, as an application developer, you do not have to learn the entire semantic technology stack, especially the part where developers have to create individual triples or edges. You can work with the JSON data serialization format that application developers usually prefer.

In the following you will first learn about JSON-LD as a syntax for semantic graphs. After that we will talk more about using JSON-LD with AllegroGraph as a document-graph-store.

In [12]:
with ag_connect('repo') as conn:
    print(conn.size())

0


If the script runs successfully, a new repository named `repo` will be created.

## JSON-LD setup

We next define some utility functions which are somewhat different from what we have used before in order to work better with JSON-LD. **createdb()** creates and opens a new repository and **opendb()** opens an existing repo (modify the values of host, port, user, and password arguments in the definitions if necessary). Both return repository connections which can be used to perform repository operations. **showtriples()** displays triples in a repository.

In [8]:
import os
import json, requests, copy

from franz.openrdf.sail.allegrographserver import AllegroGraphServer
from franz.openrdf.connect import ag_connect
from franz.openrdf.vocabulary.xmlschema import XMLSchema
from franz.openrdf.rio.rdfformat import RDFFormat

# Functions to create/open a repo and return a RepositoryConnection

def createdb(name):
    return ag_connect(name, create=True, clear=True)

def opendb(name):
    return ag_connect(name, create=False)

def showtriples(limit=100):
    # Prints up to `limit` triples from the module-level `conn`.
    statements = conn.getStatements(limit=limit)
    with statements:
        for statement in statements:
            print(statement)

Finally we call our **createdb** function to create a repository and return a _RepositoryConnection_ to it:

In [9]:
conn=createdb('jsonplay')

## Some Examples of Using JSON-LD

In the following we try things out with some JSON-LD objects that are defined in the JSON-LD playground.

The first object we will create is an event dict. Although it is a Python dict, it is also valid JSON notation. (But note that not all Python dictionaries are valid JSON. For example, JSON uses `null` where Python would use `None`, and there is no magic to automatically handle that.) This object has one key called `@context` which specifies how to translate keys and values into predicates and objects. The following `@context` says that every time you see `ical:` it should be replaced by `http://www.w3.org/2002/12/cal/ical#`, `xsd:` by `http://www.w3.org/2001/XMLSchema#`, and that if you see `ical:dtstart` as a key then the value should be treated as an `xsd:dateTime`.

In [10]:
event = {
  "@context": {
    "ical": "http://www.w3.org/2002/12/cal/ical#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ical:dtstart": { "@type": "xsd:dateTime" }
      },
    "ical:summary": "Lady Gaga Concert",
    "ical:location": "New Orleans Arena, New Orleans, Louisiana, USA",
    "ical:dtstart": "2011-04-09T20:00:00Z"
}

Let us try it out (the subjects are blank nodes so you will see different values):

In [11]:
conn.addData(event)
showtriples()

(_:bF080D75Fx1, <http://www.w3.org/2002/12/cal/ical#summary>, "Lady Gaga Concert")
(_:bF080D75Fx1, <http://www.w3.org/2002/12/cal/ical#location>, "New Orleans Arena, New Orleans, Louisiana, USA")
(_:bF080D75Fx1, <http://www.w3.org/2002/12/cal/ical#dtstart>, "2011-04-09T20:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
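The prefix substitution that the `@context` describes can be imitated in a few lines of plain Python. This is a deliberately simplified sketch and not a JSON-LD processor: `expand` is a hypothetical helper that handles only string-valued prefix entries.

```python
def expand(term, context):
    """Expand a compact IRI such as 'ical:summary' using prefixes in context."""
    prefix, sep, local = term.partition(":")
    base = context.get(prefix)
    if sep and isinstance(base, str):
        return base + local
    return term  # not a known prefix: leave the term unchanged

context = {
    "ical": "http://www.w3.org/2002/12/cal/ical#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ical:dtstart": {"@type": "xsd:dateTime"},
}

# The key becomes the predicate; its declared "@type" becomes the datatype.
predicate = expand("ical:dtstart", context)
datatype = expand(context["ical:dtstart"]["@type"], context)
print(predicate)
print(datatype)
```

The two printed IRIs correspond to the predicate and the datatype of the third triple in the output above.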


## Adding an @id and @type to Objects

In the above we see that the JSON-LD was correctly translated into triples but there are two immediate problems: first each subject is a blank node, the use of which is problematic when linking across repositories; and second, the object does not have an RDF type. We solve these problems by adding an `@id` to provide an IRI as the subject and adding a `@type` for the object (those are at the lines just after the @context definition):

In [12]:
event = {
  "@context": {
      "ical": "http://www.w3.org/2002/12/cal/ical#",
      "xsd": "http://www.w3.org/2001/XMLSchema#",
      "ical:dtstart": { "@type": "xsd:dateTime" }
        },
      "@id": "ical:event-1",
      "@type": "ical:Event",
      "ical:summary": "Lady Gaga Concert",
      "ical:location": "New Orleans Arena, New Orleans, Louisiana, USA",
      "ical:dtstart": "2011-04-09T20:00:00Z"
  }

We also create a test function to test our JSON-LD objects. It is more powerful than needed right now (here we just need _conn_, _addData(event)_ and _showtriples()_), but **test** will be useful in most later examples. Note the _allow_external_references=True_ argument to _addData()_. Again, it is not needed in this example, but later examples use external contexts, so this argument is required for those. Note that external references will not work on the free version of the cloud AllegroGraph servers. Please skip the next section if that is how you are connecting to a repository.

In [23]:
def test(object, json_ld_context=None, rdf_context=None, maxPrint=100, conn=conn):
    conn.clear()
    conn.addData(object, allow_external_references=True)
    showtriples(limit=maxPrint)

Here we apply test to event:

In [24]:
test(event)

(<http://www.w3.org/2002/12/cal/ical#event-1>, <http://www.w3.org/2002/12/cal/ical#summary>, "Lady Gaga Concert")
(<http://www.w3.org/2002/12/cal/ical#event-1>, <http://www.w3.org/2002/12/cal/ical#location>, "New Orleans Arena, New Orleans, Louisiana, USA")
(<http://www.w3.org/2002/12/cal/ical#event-1>, <http://www.w3.org/2002/12/cal/ical#dtstart>, "2011-04-09T20:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
(<http://www.w3.org/2002/12/cal/ical#event-1>, <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, <http://www.w3.org/2002/12/cal/ical#Event>)


Note in the above that we now have a proper subject and a type.
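The effect of `@id` and `@type` on the generated triples can be mimicked with a small pure-Python model. It is a sketch under strong simplifications (one level of nesting, string prefixes only), `expand` and `to_triples` are hypothetical helpers, and the blank node labels it invents are unrelated to the ones the server generates.

```python
import itertools

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
_blank_ids = itertools.count(1)

def expand(term, prefixes):
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return term

def to_triples(obj):
    """Flatten a one-level JSON-LD-like dict into (subject, predicate, object)."""
    prefixes = {k: v for k, v in obj["@context"].items() if isinstance(v, str)}
    # Without "@id" the subject is a fresh blank node label.
    if "@id" in obj:
        subject = expand(obj["@id"], prefixes)
    else:
        subject = "_:b%d" % next(_blank_ids)
    triples = []
    for key, value in obj.items():
        if key in ("@context", "@id"):
            continue
        if key == "@type":
            triples.append((subject, RDF_TYPE, expand(value, prefixes)))
        else:
            triples.append((subject, expand(key, prefixes), value))
    return triples

ICAL = {"ical": "http://www.w3.org/2002/12/cal/ical#"}
anonymous = {"@context": ICAL, "ical:summary": "Lady Gaga Concert"}
named = {"@context": ICAL, "@id": "ical:event-1", "@type": "ical:Event",
         "ical:summary": "Lady Gaga Concert"}

for triple in to_triples(anonymous) + to_triples(named):
    print(triple)
```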

## Referencing an External Context Via a URL

The next object we add to AllegroGraph is a person object. This time the `@context` is not specified as a JSON object but as a link to a context that is stored at [http://schema.org/](http://schema.org/). Also, in the definition of the function **test** above we passed this parameter to `addData`: `allow_external_references=True`. Requiring that argument explicitly is a security feature. One should use external references only if the context at that URL is trusted (as it is in this case).

In [21]:
person = {
  "@context": "http://schema.org/",
  "@type": "Person",
  "@id": "foaf:person-1",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "telephone": "(425) 123-4567",
  "url": "http://www.janedoe.com"
}

Here is the output:

In [None]:
test(person)

## Improving Performance by Adding Lists

Adding one person at a time requires an interaction with the server for each person. It is much more efficient to add lists of objects all at once rather than one at a time. Note that `addData` will take a list of dicts and still do the right thing. So let us add 1000 persons at the same time, each person being a copy of the above person but with a different `@id`:

In [None]:
x = [copy.deepcopy(person) for i in range(1000)]
len(x)

c = 0
for el in x:
    el['@id']= "http://franz.com/person-" + str(c)
    c= c + 1

test(x,maxPrint=10)

conn.size()
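The list construction can also be packaged as a small reusable function. The sketch below needs no server; `numbered_copies` and its `id_prefix` parameter are names invented for this illustration, and the template is a shortened version of the person object above.

```python
import copy

person = {
    "@context": "http://schema.org/",
    "@type": "Person",
    "@id": "foaf:person-1",
    "name": "Jane Doe",
}

def numbered_copies(template, count, id_prefix="http://franz.com/person-"):
    """Return `count` independent copies of template with distinct @id values."""
    people = []
    for index in range(count):
        clone = copy.deepcopy(template)  # the template itself stays unchanged
        clone["@id"] = id_prefix + str(index)
        people.append(clone)
    return people

people = numbered_copies(person, 1000)
print(len(people), people[0]["@id"], people[-1]["@id"])
```

The resulting list can be handed to `test(people, maxPrint=10)` in the same way as `x` above, so the whole batch travels to the server in one request.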

## Adding a Context Directly to an Object

You can download a context directly in Python, modify it and then add it to the object you want to store. As an illustration we load a person context from json-ld.org (actually a fragment of the schema.org context) and insert it in a person object:

In [None]:
context=requests.get("https://json-ld.org/contexts/person.jsonld").json()['@context']
# The next produces lots of output, uncomment if desired
#context

person = {
  "@context": context,
  "@type": "Person",
  "@id": "foaf:person-1",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "telephone": "(425) 123-4567",
}
test(person)

## Building a Graph of Objects

We start by forcing a key’s value to be stored as a resource, which is what `{"@type": "@id"}` declares for `foaf:child` and `foaf:brotherOf`. We saw above that we could specify the value of a key to be a date using the `xsd:dateTime` specification. We now do it again for `foaf:birthdate`. Then we create several linked objects and show the connections using Gruff.

In [None]:
context = { "foaf:child": {"@type":"@id"},
            "foaf:brotherOf": {"@type":"@id"},
            "foaf:birthdate": {"@type":"xsd:dateTime"}}

p1 = {
    "@context": context,
    "@type":"foaf:Person",
    "@id":"foaf:person-1",
    "foaf:birthdate": "1958-04-09T20:00:00Z",
    "foaf:child": ['foaf:person-2', 'foaf:person-3']
}

p2 = {
    "@context": context,
    "@type":"foaf:Person",
    "@id":"foaf:person-2",
    "foaf:brotherOf": "foaf:person-3",
    "foaf:birthdate": "1992-04-09T20:00:00Z",
}

p3 = {"@context": context,
    "@type":"foaf:Person",
    "@id":"foaf:person-3",
    "foaf:birthdate": "1994-04-09T20:00:00Z",
}

test([p1,p2,p3])

The following shows the graph that we created in Gruff. Note that this is what JSON-LD is all about: connecting objects together.

![person-graph.png](https://franz.com/agraph/support/documentation/current/python/_images/person-graph.png)
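Without Gruff at hand, the links between the three objects can be listed with a few lines of plain Python. The sketch works on trimmed copies of `p1`, `p2` and `p3` (birthdates and contexts omitted), and `edges` is a hypothetical helper, not part of the client.

```python
# Keys whose values reference other objects (declared with "@type": "@id").
LINK_KEYS = ("foaf:child", "foaf:brotherOf")

people = [
    {"@id": "foaf:person-1", "foaf:child": ["foaf:person-2", "foaf:person-3"]},
    {"@id": "foaf:person-2", "foaf:brotherOf": "foaf:person-3"},
    {"@id": "foaf:person-3"},
]

def edges(objects):
    """List (subject, predicate, target) links between the objects."""
    result = []
    for obj in objects:
        for key in LINK_KEYS:
            targets = obj.get(key, [])
            if isinstance(targets, str):
                targets = [targets]  # a single reference is a one-element list
            result.extend((obj["@id"], key, target) for target in targets)
    return result

for edge in edges(people):
    print(edge)
```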

## JSON-LD Keyword Directives can be Added at any Level

Here is an example from the wild. The URL [https://www.ulta.com/antioxidant-facial-oil?productId=xlsImpprod18731241](https://www.ulta.com/antioxidant-facial-oil?productId=xlsImpprod18731241) goes to a web page advertising a facial oil. (We make no claims or recommendations about this product. We are simply showing how JSON-LD appears in many places.) Look at the source of the page and you’ll find a JSON-LD object similar to the following. Note that `@` directives can appear at any level of nesting. We added an _@id_ key.

In [None]:
hippieoil = {"@context":"http://schema.org",
 "@type":"Product",
 "@id":"http://franz.com/hippieoil",
 "aggregateRating":
    {"@type":"AggregateRating",
     "ratingValue":4.6,
     "reviewCount":73},
     "description":"""Make peace with your inner hippie while hydrating & protecting against photoaging....Mad Hippie's preservative-free Antioxidant Facial Oil is truly the most natural way to moisturize.""",
     "brand":"Mad Hippie",
     "name":"Antioxidant Facial Oil",
     "image":"https://images.ulta.com/is/image/Ulta/2530018",
     "productID":"2530018",
     "offers":
        {"@type":"Offer",
         "availability":"http://schema.org/InStock",
         "price":"24.99",
         "priceCurrency":"USD"}}


test(hippieoil)

![hippie oil](https://franz.com/agraph/support/documentation/current/python/_images/hippieoil.png)

## JSON-LD @graphs

One can put one or more JSON-LD objects in an RDF named graph. This means that the fourth element of each triple generated from a JSON-LD object will have the specified graph name. Let us show this with an example.

In [None]:
context = {
        "name": "http://schema.org/name",
        "description": "http://schema.org/description",
        "image": {
            "@id": "http://schema.org/image", "@type": "@id" },
        "geo": "http://schema.org/geo",
        "latitude": {
            "@id": "http://schema.org/latitude", "@type": "xsd:float" },
        "longitude": {
            "@id": "http://schema.org/longitude",  "@type": "xsd:float" },
        "xsd": "http://www.w3.org/2001/XMLSchema#"
    }

place = {
    "@context": context,
    "@id": "http://franz.com/place1",
    "@graph": {
        "@id": "http://franz.com/place1",
        "@type": "http://franz.com/Place",
        "name": "The Empire State Building",
        "description": "The Empire State Building is a 102-story landmark in New York City.",
        "image": "http://www.civil.usherbrooke.ca/cours/gci215a/empire-state-building.jpg",
        "geo": {
               "latitude": "40.75",
               "longitude": "73.98" }
        }}

test(place, maxPrint=3)

Note that the fourth element (graph) of each of the triples is <http://franz.com/place1>. If you don’t add the @id the triples will be put in the default graph.
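The rule that the top-level `@id` names the graph can be modelled in plain Python. This is a simplified sketch rather than real JSON-LD processing: `to_quads` is a hypothetical helper, keys are left unexpanded, and `None` stands for the default graph.

```python
def to_quads(doc):
    """Turn {"@id": graph, "@graph": node(s)} into (s, p, o, graph) tuples."""
    graph_name = doc.get("@id")  # None stands for the default graph
    nodes = doc["@graph"]
    if isinstance(nodes, dict):
        nodes = [nodes]  # a single node is treated as a one-element list
    quads = []
    for node in nodes:
        for key, value in node.items():
            if key != "@id":
                quads.append((node["@id"], key, value, graph_name))
    return quads

node = {"@id": "http://franz.com/place1", "name": "The Empire State Building"}
named = to_quads({"@id": "http://franz.com/place1", "@graph": node})
default = to_quads({"@graph": node})
print(named[0][3], default[0][3])
```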

Here is a slightly more complex example:

In [None]:
library = {
  "@context": {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/vocab#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex:contains": {
      "@type": "@id"
    }
  },
  "@id": "http://franz.com/mygraph1",
  "@graph": [
    {
      "@id": "http://example.org/library",
      "@type": "ex:Library",
      "ex:contains": "http://example.org/library/the-republic"
    },
    {
      "@id": "http://example.org/library/the-republic",
      "@type": "ex:Book",
      "dc:creator": "Plato",
      "dc:title": "The Republic",
      "ex:contains": "http://example.org/library/the-republic#introduction"
    },
    {
      "@id": "http://example.org/library/the-republic#introduction",
      "@type": "ex:Chapter",
      "dc:description": "An introductory chapter on The Republic.",
      "dc:title": "The Introduction"
    }
  ]
}

test(library, maxPrint=3)

![library](https://franz.com/agraph/support/documentation/current/python/_images/library-graph.png)

## JSON-LD as a Document Store

So far we have treated JSON-LD as a syntax to create triples. Now let us look at the way we can start using AllegroGraph as a combination of a document store and graph database at the same time. And also keep in mind that we want to do it in such a way that you as a Python developer can add documents such as dictionaries and also retrieve values or documents as dictionaries.

## Setup

The **Python source file jsonld_tutorial_helper.py** contains various definitions useful for the remainder of this example. Once it is downloaded, do the following (after adding the path to the filename):

In [None]:
conn=createdb("docugraph")
from jsonld_tutorial_helper import *
addNamespace(conn,"jsonldmeta","http://franz.com/ns/allegrograph/6.4/load-meta#")
addNamespace(conn,"ical","http://www.w3.org/2002/12/cal/ical#")

Let’s use our event structure again and see how we can store this JSON document in the store as a document. Note that the `addData` call includes the keyword: `json_ld_store_source=True`.

In [None]:
event = {
  "@context": {
    "ical": "http://www.w3.org/2002/12/cal/ical#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ical:dtstart": { "@type": "xsd:dateTime" }
  },
  # "@id" and "@type" describe the object itself, so they belong at the
  # top level rather than inside "@context".
  "@id": "ical:event1",
  "@type": "ical:Event",
  "ical:summary": "Lady Gaga Concert",
  "ical:location": "New Orleans Arena, New Orleans, Louisiana, USA",
  "ical:dtstart": "2011-04-09T20:00:00Z"
}

conn.addData(event, allow_external_references=True,json_ld_store_source=True)

The _jsonld_tutorial_helper.py_ file defines the function `store` as a simple wrapper around `addData` that always saves the JSON source. For ease of experimentation it also has a parameter `fresh` that clears out the repository first.

In [None]:
from jsonld_tutorial_helper import store

store(conn,event, fresh=True)
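For readers who do not have the helper file at hand, a wrapper with the behaviour described above might look roughly like the sketch below. `store_sketch` is a hypothetical reconstruction, not the helper's actual source, and `RecordingConnection` is a stub that only records calls so the example runs without a server.

```python
def store_sketch(conn, data, fresh=False):
    """Hypothetical equivalent of the helper's store(); not its actual source."""
    if fresh:
        conn.clear()  # empty the repository first
    conn.addData(data,
                 allow_external_references=True,
                 json_ld_store_source=True)

class RecordingConnection:
    """Stub that records calls so the sketch can run without a server."""

    def __init__(self):
        self.calls = []

    def clear(self):
        self.calls.append(("clear",))

    def addData(self, data, **kwargs):
        self.calls.append(("addData", data, kwargs))

stub = RecordingConnection()
store_sketch(stub, {"@id": "ex:doc"}, fresh=True)
print([call[0] for call in stub.calls])
```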

If we look at the triples in Gruff we see that the JSON source is stored as well, on the root (top-level _@id_) of the JSON object.

![event](https://franz.com/agraph/support/documentation/current/python/_images/event-store-source.png)

For the following part of the tutorial we want a little bit more data in our repository, so please look at the helper file _jsonld_tutorial_helper.py_, where you will see that at the end we have a dictionary named `obs` with about 9 diverse objects, mostly borrowed from the json-ld.org site: a person, an event, a place, a recipe, a group of persons, a product, and our hippieoil.

First let us store all the objects in a fresh repository. Then we check the size of the repo. Finally, we create a freetext index for the JSON sources.

In [None]:
store(conn,[v for k,v in obs.items()], fresh=True)
print(conn.size())
conn.createFreeTextIndex("source",['<http://franz.com/ns/allegrograph/6.4/load-meta#source>'])

## Retrieving values with SPARQL

To simply retrieve values in objects but not the objects themselves, regular SPARQL queries will suffice. But because we want to make sure that Python developers only need to deal with regular Python structures such as lists and dictionaries, we created a simple wrapper around SPARQL (see helper file). The name of the wrapper is `runSparql`.

Here is an example. Let us find all the roots (top-level _@ids_) of objects and their types. Some objects do not have roots, so `None` stands for a blank node.

In [None]:
pprint(runSparql(conn,"select ?s ?type { ?s a ?type }"))

[{'s': 'cocktail1', 'type': 'Cocktail'},
 {'s': None, 'type': 'Individual'},
 {'s': None, 'type': 'Vehicle'},
 {'s': 'tesla', 'type': 'Offering'},
 {'s': 'place1', 'type': 'Place'},
 {'s': None, 'type': 'Offer'},
 {'s': None, 'type': 'AggregateRating'},
 {'s': 'hippieoil', 'type': 'Product'},
 {'s': 'person-3', 'type': 'Person'},
 {'s': 'person-2', 'type': 'Person'},
 {'s': 'person-1', 'type': 'Person'},
 {'s': 'person-1000', 'type': 'Person'},
 {'s': 'event1', 'type': 'Event'}]

We do not see the full URIs for ?s and ?type. You can see them by adding an appropriate _format_ argument to **runSparql**, but the default is `terse`.

In [None]:
pprint(runSparql(conn,"select ?s ?type { ?s a ?type } limit 2",format='ntriples'))

## Retrieving a Dictionary or Object

`retrieve` is another function defined (in _jsonld_tutorial_helper.py_) for this tutorial. It is a wrapper around SPARQL to help extract objects. Here we see how we can use it. The sole purpose of `retrieve` is to retrieve the JSON-LD/dictionary based on a SPARQL pattern.

In [None]:
retrieve(conn,"{?this a ical:Event}")

Ok, for a final fun (if you like expensive cars) example: Let us find a thing that is “fast and furious”, that is worth more than $80,000 and that we can pay for in cash:

In [None]:
addNamespace(conn,"gr","http://purl.org/goodrelations/v1#")

x = retrieve(conn, """{ ?this fti:match 'fast furious*';
                          gr:acceptedPaymentMethods gr:Cash ;
                          gr:hasPriceSpecification ?price .
                    ?price gr:hasCurrencyValue ?value ;
                           gr:hasCurrency "USD" .
                    filter ( ?value > 80000.0 ) }""")

pprint(x)

In [None]:
x[0]['@id']

# Example 20: Reasoning

AllegroGraph supports the following RDFS and OWL predicates:

* `rdf:type`;
* `rdfs:domain`;
* `rdfs:range`;
* `rdfs:subClassOf`;
* `rdfs:subPropertyOf`;
* `owl:inverseOf`;
* `owl:sameAs`;
* `owl:SymmetricProperty`;
* `owl:TransitiveProperty`.

A more detailed description of reasoning support can be found in the [Reasoner Tutorial](https://franz.com/agraph/support/documentation/current/reasoner-tutorial.html) chapter of AllegroGraph LISP documentation and is not repeated here for brevity. This tutorial only contains Python setup and querying examples.

## Setup

In order to enable reasoning, a connection to the server must be constructed by passing a repository spec to the **reason()** function and creating a session from the resulting spec:

```conn = server.openSession(reason(repo))```

This connection must be used instead of a regular connection as an entry-point to execute queries on inferred statements. Here is a complete example of a setup that enables RDFS++ reasoning over a regular repository `repo`:

In [6]:
from franz.openrdf.sail.allegrographserver import AllegroGraphServer
from franz.openrdf.sail.spec import reason

server = AllegroGraphServer()

# Create repository 'repo' in the root catalog.
server.openCatalog().createRepository('repo')

# Open session with reasoning enabled on repository 'repo'.
conn = server.openSession(reason('<repo>'))

## Reasoner examples

The examples below assume that you already have an RDFS++ repository created as described in the Setup section. Each example removes all of the triples in the store, adds some new triples, and makes some queries to illustrate the different sorts of reasoning that AllegroGraph supports.

In order to demonstrate the results, we will use the following function to print triples matching a simple pattern:

In [4]:
def ptl(conn, s, p, o):
    """
    Get statements matching subject, predicate and object,
    which can be either entities represented by strings
    of the form '<prefix>:<name>' or wildcard None values.
    """
    namespaces = {
        'rdf': conn.namespace('http://www.w3.org/1999/02/22-rdf-syntax-ns#'),
        'ex': conn.namespace('ex://')
    }

    def part_to_uri(part):
        # Split part string on ':' and get a full URI from a corresponding
        # namespace object:
        if part:
            prefix, name = part.split(':', 1)
            return getattr(namespaces[prefix], name)
        else:
            return None

    us, up, uo = (part_to_uri(part) for part in (s, p, o))
    for rs, rp, ro, _ in conn.getStatements(us, up, uo):
        print("{} {} {}.".format(rs, rp, ro))

It is also possible to ask for inferred statements without creating a reasoning session by setting the value of the `includeInferred` argument to **getStatements()** to `True` as in the following example:
```
conn.getStatements(s, p, o, includeInferred=True)
```
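
One way to see what the reasoner contributes is to compare the results with and without `includeInferred`. The sketch below assumes `conn` is a regular (non-reasoning) connection and that `s`, `p` and `o` are URI objects or `None`; the helper name `inferred_only` is hypothetical.

```python
def inferred_only(conn, s=None, p=None, o=None):
    """
    Return (subject, predicate, object) string triples that match the
    pattern only when inference is switched on.
    Assumes `conn` is a regular (non-reasoning) connection.
    """
    def fetch(include_inferred):
        statements = conn.getStatements(
            s, p, o, includeInferred=include_inferred)
        # Each statement unpacks into subject, predicate, object and graph,
        # in the same way as in ptl().
        return {(str(rs), str(rp), str(ro)) for rs, rp, ro, _ in statements}

    # Set difference: statements that exist only thanks to inference.
    return sorted(fetch(True) - fetch(False))
```

The result is a sorted list of string triples, so it can be printed or compared directly.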

### inverseOf

In [7]:
conn.addData("""
   @prefix owl: <http://www.w3.org/2002/07/owl#>.
   @prefix ex: <ex://>.

   ex:Jans    ex:owns       ex:Birra.
   ex:ownedBy owl:inverseOf ex:owns.
   ex:has     owl:inverseOf ex:ownedBy. """)

In [None]:
ptl(conn, "ex:Birra", "ex:ownedBy", None)

In [None]:
ptl(conn, None, "ex:ownedBy", None)

In [None]:
ptl(conn, None, "ex:ownedBy", "ex:Jans")

In [None]:
ptl(conn, "ex:Jans", "ex:has", None)

In [None]:
ptl(conn, None, "ex:has", None)

In [None]:
ptl(conn, None, "ex:has", "ex:Birra")
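
To show which statements `owl:inverseOf` entails for this data, here is a toy, pure-Python sketch of the rule. It is not how AllegroGraph implements reasoning; the function name and the use of short strings instead of URIs are simplifications for illustration.

```python
def apply_inverse_of(facts, inverse_pairs):
    """Close `facts` under: if p is the inverse of q, (x p y) <=> (y q x)."""
    # owl:inverseOf is symmetric, so record each pair in both directions.
    inverses = set(inverse_pairs) | {(q, p) for p, q in inverse_pairs}
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            for a, b in inverses:
                if a == p and (o, b, s) not in closed:
                    closed.add((o, b, s))
                    changed = True
    return closed

facts = {('Jans', 'owns', 'Birra')}
inverse_pairs = [('ownedBy', 'owns'), ('has', 'ownedBy')]
for triple in sorted(apply_inverse_of(facts, inverse_pairs)):
    print(triple)
```

Because `ex:has` is the inverse of an inverse, it ends up relating the same subject and object as `ex:owns`.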

### subPropertyOf

In [None]:
conn.addData("""
   @prefix owl: <http://www.w3.org/2002/07/owl#>.
   @prefix ex: <ex://>.
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

   ex:Jans   ex:hasPet          ex:Birra.
   ex:Birra  ex:friendOf        ex:Samira.
   ex:hasPet rdfs:subPropertyOf ex:owns. """)

In [None]:
ptl(conn, "ex:Jans", "ex:owns", "ex:Birra")

In [None]:
ptl(conn, "ex:Jans", "ex:owns", None)

In [None]:
ptl(conn, None, "ex:owns", "ex:Birra")

In [None]:
ptl(conn, "ex:Jans", "ex:hasPet", "ex:Birra")

In [None]:
ptl(conn, "ex:Jans", "ex:hasPet", None)

In [None]:
ptl(conn, None, "ex:hasPet", "ex:Birra")
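
Unlike `owl:inverseOf`, `rdfs:subPropertyOf` works in one direction only: a statement about the subproperty entails the same statement about the superproperty, but not the reverse. The toy sketch below illustrates this with plain tuples; it is an illustration of the rule with a hypothetical function name, not AllegroGraph's implementation.

```python
def apply_sub_property_of(facts, sub_properties):
    """Close `facts` under: if p is a subproperty of q, (x p y) => (x q y)."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            for sub, sup in sub_properties:
                if sub == p and (s, sup, o) not in closed:
                    closed.add((s, sup, o))
                    changed = True
    return closed

facts = {('Jans', 'hasPet', 'Birra'), ('Birra', 'friendOf', 'Samira')}
sub_properties = [('hasPet', 'owns')]
for triple in sorted(apply_sub_property_of(facts, sub_properties)):
    print(triple)
```

The `friendOf` statement is untouched because no subproperty declaration mentions it.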

### inverseOf with subPropertyOf
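
The queries in this subsection refer to `ex:petOf`, which none of the earlier cells define. The sketch below shows data that would support them, on the assumption that it matches the inverseOf and subPropertyOf statements of the combined sameAs example further down. The `load_example` helper and the constant name are hypothetical; `conn.clear()` and `conn.addData()` are the regular connection methods.

```python
# Assumed data for this subsection: the inverseOf and subPropertyOf
# statements from the combined sameAs example, without the owl:sameAs
# and ex:age triples.
INVERSE_SUBPROPERTY_DATA = """
  @prefix owl: <http://www.w3.org/2002/07/owl#>.
  @prefix ex: <ex://>.
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

  ex:Jans    ex:hasPet          ex:Birra.
  ex:ownedBy owl:inverseOf      ex:owns.
  ex:has     owl:inverseOf      ex:ownedBy.
  ex:hasPet  rdfs:subPropertyOf ex:owns.
  ex:petOf   owl:inverseOf      ex:hasPet.
"""

def load_example(conn, turtle_data):
    """Hypothetical helper: empty the repository, then load one example."""
    conn.clear()
    conn.addData(turtle_data)
```

Calling `load_example(conn, INVERSE_SUBPROPERTY_DATA)` before the queries would start this subsection from a known state.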

In [None]:
# direct triples
ptl(conn, "ex:Jans", "ex:hasPet", "ex:Birra")

In [None]:
ptl(conn, None, "ex:hasPet", "ex:Birra")

In [None]:
ptl(conn, "ex:Jans", "ex:hasPet", None)

In [None]:
# inverse of ex:hasPet
ptl(conn, "ex:Birra", "ex:petOf", "ex:Jans")

In [None]:
ptl(conn, None, "ex:petOf", "ex:Jans")

In [None]:
ptl(conn, "ex:Birra", "ex:petOf", None)

In [None]:
# subproperty
ptl(conn, "ex:Jans", "ex:owns", "ex:Birra")

In [None]:
ptl(conn, "ex:Jans", "ex:owns", None)

In [None]:
ptl(conn, None, "ex:owns", "ex:Birra")

In [None]:
# inverse of subproperty
ptl(conn, "ex:Birra", "ex:ownedBy", "ex:Jans")

In [None]:
ptl(conn, None, "ex:ownedBy", "ex:Jans")

In [None]:
ptl(conn, "ex:Birra", "ex:ownedBy", None)

In [None]:
# inverse of inverse
ptl(conn, "ex:Jans", "ex:has", "ex:Birra")

In [None]:
ptl(conn, None, "ex:has", "ex:Birra")

In [None]:
ptl(conn, "ex:Jans", "ex:has", None)

### sameAs

In [None]:
conn.addData("""
  @prefix owl: <http://www.w3.org/2002/07/owl#>.
  @prefix ex: <ex://>.
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

  ex:Jans   ex:owns    ex:Birra.
  ex:Jans   owl:sameAs ex:Jannes.
  ex:Aasman owl:sameAs ex:Jannes.
  ex:Birra  owl:sameAs ex:SonOfSamira.
""")

In [None]:
ptl(conn, "ex:Aasman", "ex:owns", "ex:SonOfSamira")

In [None]:
ptl(conn, "ex:Aasman", "ex:owns", None)

In [None]:
ptl(conn, None, "ex:owns", "ex:SonOfSamira")

In [None]:
ptl(conn, None, "ex:owns", None)
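
`owl:sameAs` groups names into equivalence classes, and any statement about one member of a class holds for every other member. The toy sketch below shows that idea for the data above. It uses plain strings, hypothetical function names, and applies aliases to subjects and objects only; it is not AllegroGraph's implementation.

```python
from itertools import product

def same_as_classes(pairs):
    """Group names into equivalence classes from owl:sameAs pairs."""
    classes = {}
    for a, b in pairs:
        # Merge the classes of both names and point every member at the result.
        merged = classes.get(a, {a}) | classes.get(b, {b})
        for name in merged:
            classes[name] = merged
    return classes

def apply_same_as(facts, pairs):
    """Restate every fact for each alias of its subject and object."""
    classes = same_as_classes(pairs)
    closed = set()
    for s, p, o in facts:
        for s2, o2 in product(classes.get(s, {s}), classes.get(o, {o})):
            closed.add((s2, p, o2))
    return closed

facts = {('Jans', 'owns', 'Birra')}
same_as = [('Jans', 'Jannes'), ('Aasman', 'Jannes'), ('Birra', 'SonOfSamira')]
for triple in sorted(apply_same_as(facts, same_as)):
    print(triple)
```

`Jans` and `Aasman` are never declared equal directly; they end up in one class because both are the same as `Jannes`.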

### sameAs with inverseOf and subPropertyOf

In [None]:
conn.addData("""
  @prefix owl: <http://www.w3.org/2002/07/owl#>.
  @prefix ex: <ex://>.
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

  ex:Jans    ex:hasPet ex:Birra.
  ex:ownedBy owl:inverseOf ex:owns.
  ex:has     owl:inverseOf ex:ownedBy.
  ex:hasPet  rdfs:subPropertyOf ex:owns.
  ex:petOf   owl:inverseOf ex:hasPet.
  ex:Birra   ex:age ex:twelve.

  ex:Jans   owl:sameAs ex:Jannes.
  ex:Aasman owl:sameAs ex:Jannes.
  ex:Birra  owl:sameAs ex:SonOfSamira.
""")

In [None]:
# direct triples
ptl(conn, "ex:Aasman", "ex:hasPet", "ex:SonOfSamira")

In [None]:
ptl(conn, None, "ex:hasPet", "ex:SonOfSamira")

In [None]:
ptl(conn, "ex:Aasman", "ex:hasPet", None)

In [None]:
# inverse of ex:hasPet
ptl(conn, "ex:SonOfSamira", "ex:petOf", "ex:Aasman")

In [None]:
ptl(conn, None, "ex:petOf", "ex:Aasman")

In [None]:
ptl(conn, "ex:SonOfSamira", "ex:petOf", None)

In [None]:
# inverse of inverse
ptl(conn, "ex:Aasman", "ex:has", "ex:SonOfSamira")

In [None]:
ptl(conn, None, "ex:has", "ex:SonOfSamira")

In [None]:
ptl(conn, "ex:Aasman", "ex:has", None)

In [None]:
# subproperty
ptl(conn, "ex:Aasman", "ex:owns", "ex:SonOfSamira")

In [None]:
ptl(conn, "ex:Aasman", "ex:owns", None)

In [None]:
ptl(conn, None, "ex:owns", "ex:SonOfSamira")

In [None]:
# inverse of subproperty
ptl(conn, "ex:SonOfSamira", "ex:ownedBy", "ex:Aasman")

In [None]:
ptl(conn, None, "ex:ownedBy", "ex:Aasman")

In [None]:
ptl(conn, "ex:SonOfSamira", "ex:ownedBy", None)

### type with subClassOf

In [None]:
conn.addData("""
  @prefix owl: <http://www.w3.org/2002/07/owl#>.
  @prefix ex: <ex://>.
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

  ex:Mammal rdfs:subClassOf ex:Animal.
  ex:Human  rdfs:subClassOf ex:Mammal.
  ex:Man    rdfs:subClassOf ex:Human.
  ex:Jans   rdf:type        ex:Man.
  ex:Jans   owl:sameAs      ex:Jannes.
  ex:Aasman owl:sameAs      ex:Jannes.
""")

In [None]:
ptl(conn, "ex:Jans", "rdf:type", "ex:Man")

In [None]:
ptl(conn, "ex:Jans", "rdf:type", "ex:Human")

In [None]:
ptl(conn, "ex:Jans", "rdf:type", None)

In [None]:
ptl(conn, "ex:Aasman", "rdf:type", "ex:Man")

In [None]:
ptl(conn, "ex:Aasman", "rdf:type", "ex:Human")

In [None]:
ptl(conn, "ex:Aasman", "rdf:type", None)

In [None]:
ptl(conn, None, "rdf:type", "ex:Man")

In [None]:
ptl(conn, None, "rdf:type", "ex:Human")

In [None]:
ptl(conn, None, "rdf:type", None)
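
With `rdfs:subClassOf`, an instance of a class is also an instance of every class further up the chain. The toy sketch below walks the hierarchy from the example data; the function name is hypothetical and the `owl:sameAs` statements are left out to keep the sketch focused on the class chain.

```python
def superclasses(cls, sub_class_of):
    """Return `cls` plus every class reachable through rdfs:subClassOf."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in sub_class_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found

sub_class_of = [('Mammal', 'Animal'), ('Human', 'Mammal'), ('Man', 'Human')]

# ex:Jans is declared to be an ex:Man, so these are all of his types.
print(sorted(superclasses('Man', sub_class_of)))
```

The `found` set doubles as a visited list, so the walk also terminates on hierarchies that contain cycles.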

### type with range

In [None]:
conn.addData("""
  @prefix owl: <http://www.w3.org/2002/07/owl#>.
  @prefix ex: <ex://>.
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

  ex:Jans   ex:hasPet      ex:Birra.
  ex:hasPet rdfs:range      ex:Pet.
  ex:Pet    rdfs:subClassOf ex:Mammal.
  ex:Fatcat owl:sameAs      ex:Birra.
""")

In [None]:
ptl(conn, "ex:Birra", "rdf:type", "ex:Pet")

In [None]:
ptl(conn, "ex:Birra", "rdf:type", None)

In [None]:
ptl(conn, None, "rdf:type", "ex:Pet")

In [None]:
ptl(conn, "ex:Birra", "rdf:type", "ex:Mammal")

In [None]:
ptl(conn, "ex:Fatcat", "rdf:type", "ex:Mammal")

### type with domain

In [None]:
conn.addData("""
  @prefix owl: <http://www.w3.org/2002/07/owl#>.
  @prefix ex: <ex://>.
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

  ex:Jans   ex:hasPet      ex:Birra.
  ex:hasPet rdfs:domain     ex:Human.
  ex:Human  rdfs:subClassOf ex:Mammal.
  ex:Jans   owl:sameAs      ex:Aasman.
""")

In [None]:
ptl(conn, "ex:Jans", "rdf:type", "ex:Human")

In [None]:
ptl(conn, "ex:Jans", "rdf:type", None)

In [None]:
ptl(conn, None, "rdf:type", "ex:Human")

In [None]:
ptl(conn, None, "rdf:type", None)
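
`rdfs:domain` and `rdfs:range` both produce `rdf:type` statements from ordinary triples: the domain types the subject of a property, the range types its object. The toy sketch below applies both rules to the `ex:hasPet` statement used in this and the previous subsection. It assumes at most one domain and one range per property, uses a hypothetical function name, and does not continue up the subclass chain.

```python
def infer_types(facts, domains=(), ranges=()):
    """
    Derive (resource, class) pairs:
    rdfs:domain types the subject of a property, rdfs:range types its object.
    """
    domain_of = dict(domains)  # property -> class of its subjects
    range_of = dict(ranges)    # property -> class of its objects
    types = set()
    for s, p, o in facts:
        if p in domain_of:
            types.add((s, domain_of[p]))
        if p in range_of:
            types.add((o, range_of[p]))
    return types

facts = [('Jans', 'hasPet', 'Birra')]
print(sorted(infer_types(facts,
                         domains=[('hasPet', 'Human')],
                         ranges=[('hasPet', 'Pet')])))
```

In the examples above, the derived type is then extended further by `rdfs:subClassOf` and `owl:sameAs`, which is why `ex:Fatcat` can be queried as an `ex:Mammal`.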

### Transitivity with sameAs

In [None]:
conn.addData("""
  @prefix owl: <http://www.w3.org/2002/07/owl#>.
  @prefix ex: <ex://>.
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.

  ex:contains    rdf:type    owl:TransitiveProperty.
  ex:USA         ex:contains ex:California.
  ex:GoldenState ex:contains ex:ContraCosta.
  ex:ContraCosta ex:contains ex:Moraga.

  ex:USA        owl:sameAs ex:UncleSam.
  ex:Moraga     owl:sameAs ex:MyTown.
  ex:California owl:sameAs ex:GoldenState.
""")

In [None]:
ptl(conn, "ex:USA", "ex:contains", "ex:Moraga")

In [None]:
ptl(conn, "ex:UncleSam", "ex:contains", "ex:MyTown")

In [None]:
ptl(conn, "ex:GoldenState", "ex:contains", "ex:Moraga")

In [None]:
ptl(conn, "ex:California", "ex:contains", "ex:Moraga")

In [None]:
ptl(conn, "ex:California", "ex:contains", "ex:MyTown")

In [None]:
ptl(conn, "ex:USA", "ex:contains", None)

In [None]:
ptl(conn, "ex:UncleSam", "ex:contains", None)

In [None]:
ptl(conn, None, "ex:contains", "ex:Moraga")

In [None]:
ptl(conn, None, "ex:contains", "ex:MyTown")
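
In this data, `ex:USA` reaches `ex:Moraga` only because `ex:California` and `ex:GoldenState` are declared the same: without that link the chain of `ex:contains` statements is broken. The toy sketch below shows this with a plain transitive closure over pairs. Choosing one representative name per `owl:sameAs` group is a simplification made for this illustration, not a description of AllegroGraph's implementation.

```python
def transitive_closure(edges):
    """Return all (a, c) pairs reachable by chaining (a, b) and (b, c)."""
    closed = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            for c, d in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed

# One representative per owl:sameAs group (a simplification).
canonical = {'UncleSam': 'USA', 'MyTown': 'Moraga', 'GoldenState': 'California'}

def canon(name):
    """Map a name to its group's representative; other names are unchanged."""
    return canonical.get(name, name)

contains = [('USA', 'California'),
            ('GoldenState', 'ContraCosta'),
            ('ContraCosta', 'Moraga')]

# Closure over the raw names versus closure after merging sameAs groups.
raw_closure = transitive_closure(set(contains))
merged_closure = transitive_closure({(canon(a), canon(b)) for a, b in contains})

print(('USA', 'Moraga') in raw_closure)
print((canon('UncleSam'), canon('MyTown')) in merged_closure)
```

The first check fails and the second succeeds, which is the combined effect of transitivity and `owl:sameAs` that the queries above probe.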