Merge pull request #42 from scossu/development
Alpha 12 development merge.
scossu committed Apr 7, 2018
2 parents 4ba31dc + 74e2852 commit d683573
Showing 16 changed files with 774 additions and 292 deletions.
11 changes: 11 additions & 0 deletions .travis.yml
@@ -6,3 +6,14 @@ install:
- pip install -e .
script:
- python setup.py test

deploy:
provider: pypi
user: "scossu"
password:
  secure: "ANSqNv9T5AjDh2hkcWtikwxGu+MVmUC1K8s0QUZwGFfaLoNhwAe+Ol+a12It/oSQumZZQTPImpqvJ2dp6KaUXVvury9AI6La48lTinHNlZkRgLKhdqg0XV2ByxKkBxL0lmixtS+o0Ynv5CVX76iBxoaFTKU/eRMF9Pja6UvjNC7CZM+uh3C5/MUg82RdOS01R7m7SmM9uMTIoMzWb87837stTBmL8FiN3BkX25Weo4NDrLDamKl8QlFx2ozqkOj9SYJLO/HHhPv3HfSJeWNC6fsbNud9OAvKu+ZckPdVw1yNgjeTqpxhL7S/K0GuqZJ/efdwwPZLlsP+dSMSB3ftpUucpp3cBNOOjCvE+KHUWbHvIKJijwkMbVp/N/RWgfSzzwVlpy28JFzZirgvI0VGOovYI1NOW+kwe6aAffM0C00WA16bGZxxCDXeK2CeNDOpjXb0UhtwJTEayfpcRXEiginOaoUXISahPLnhVQoGLuyM+UG6oFg8RURAziXNOfaI6VgzcOF6EcfBhQlLs10RDVnfl9giP1kQ6twko/+n3bbRURDe1YXxk9HLwlzOszv8KGFU0G5UjRaX76RtMh5Y+a8wqni7g8ti74QiDmgG8a7aGZu9VUrLGnl1iRrM+xmoogYSuB7OxeUu+k+2mOJTHNz9qP+0+/FEeKazHoH8SmQ="
on:
tags: true
branch: master
distributions: "bdist_wheel"

2 changes: 1 addition & 1 deletion data/bootstrap/rsrc_centric_layout.sparql
@@ -11,7 +11,7 @@ INSERT DATA {
GRAPH <info:fcsystem/graph/admin/> {
<info:fcres/> a
fcrepo:RepositoryRoot , fcrepo:Resource , fcrepo:Container ,
ldp:Container , ldp:BasicContainer , ldp:RDFSource ;
ldp:Resource , ldp:Container , ldp:BasicContainer , ldp:RDFSource ;
fcrepo:created "$timestamp"^^xsd:dateTime ;
fcrepo:lastModified "$timestamp"^^xsd:dateTime ;
.
Binary file added docs/assets/lsup_sparql_query_ui.png
72 changes: 72 additions & 0 deletions docs/discovery.rst
@@ -0,0 +1,72 @@
Resource Discovery & Query
==========================

LAKEsuperior offers several ways to programmatically discover resources and
data.

LDP Traversal
-------------

The method compatible with the standard Fedora implementation and other LDP
servers is to simply traverse the LDP tree. While this offers the broadest
compatibility, it is quite expensive for the client, the server and the
developer.

For this method, please consult the dedicated `LDP specifications
<https://www.w3.org/TR/ldp/>`__ and `Fedora API specs
<https://wiki.duraspace.org/display/FEDORA4x/RESTful+HTTP+API+-+Containers>`__.

SPARQL Query
------------

A `SPARQL <https://www.w3.org/TR/sparql11-query/>`__ endpoint is available in
LAKEsuperior both as an API and a Web UI.

.. figure:: assets/lsup_sparql_query_ui.png
:alt: LAKEsuperior SPARQL Query Window

LAKEsuperior SPARQL Query Window

The UI is based on `YASGUI <http://about.yasgui.org/>`__.

Note that:

#. The SPARQL endpoint only supports the SPARQL 1.1 Query language.
SPARQL updates are not, and will not be, supported.
#. The LAKEshore data model has an added layer of structure that is not exposed
through the LDP layer. The SPARQL endpoint exposes this low-level structure
and it is beneficial to understand its layout. See :doc:`model` for details
in this regard.
#. Most of the underlying RDF structure is stored in RDF named graphs. Querying
only triples, without a ``GRAPH`` clause, gives a fairly uncluttered view of
the data, as close to the LDP representation as possible.
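
As a rough sketch of how a client might address the endpoint as an API, the snippet below builds (but does not send) a query request using only the Python standard library. It assumes the endpoint accepts the query-via-GET form of the SPARQL 1.1 Protocol; the URL in ``SPARQL_ENDPOINT`` and the helper name ``build_sparql_request`` are placeholders, not values taken from the LAKEsuperior documentation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Hypothetical endpoint URL: replace with the one exposed by your deployment.
SPARQL_ENDPOINT = 'http://localhost:8000/query/sparql'


def build_sparql_request(query, endpoint=SPARQL_ENDPOINT):
    """Build, without sending, a SPARQL 1.1 Protocol query-via-GET request."""
    # The query text travels URL-encoded in the ``query`` parameter.
    url = endpoint + '?' + urlencode({'query': query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={'Accept': 'application/sparql-results+json'})


# Query triples only (no GRAPH clause), as suggested in the notes above.
query = 'SELECT ?p ?o { <info:fcres/my-uid> ?p ?o . }'
req = build_sparql_request(query)
print(req.get_method(), req.full_url)
```

Passing ``req`` to ``urllib.request.urlopen()`` would send the query to a running server.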

SPARQL Caveats
~~~~~~~~~~~~~~

The SPARQL query facility has not yet been tested thoroughly. The RDFLib
implementation that it is based upon can be quite efficient for certain
queries but has some downsides. For example, do **not** attempt the following
query in a graph with more than a few thousand resources::

SELECT ?p ?o {
GRAPH ?g {
<info:fcres/my-uid> ?p ?o .
}
}

The RDFLib implementation goes over every single graph in the repository and
performs the ``?s ?p ?o`` query on each of them. Since LAKEsuperior creates
several graphs per resource, this can run for a very long time on any decently
sized data set.

To avoid this, omit the ``GRAPH`` clause from the query, use a term search, or
use a native Python method if applicable.
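
The cost described above can be illustrated with a toy model. This is not RDFLib's actual code: plain dictionaries and tuples stand in for the store, the graph names and predicates are invented, and three graphs per resource is an arbitrary assumption. It only shows that a per-graph scan visits every graph in the store regardless of how few triples match.

```python
# Toy model only: plain dicts and tuples stand in for the triple store.
# Graph names and predicates below are made up for illustration.

def build_store(n_resources, graphs_per_resource=3):
    """Return {graph_name: set of (s, p, o)} with several graphs per resource."""
    store = {}
    for i in range(n_resources):
        subject = f'info:fcres/res{i}'
        for j in range(graphs_per_resource):
            graph_name = f'urn:example:graph{j}/res{i}'
            store[graph_name] = {(subject, f'urn:example:p{j}', f'value {j}')}
    return store


def query_every_graph(store, subject):
    """Mimic ``GRAPH ?g { <subject> ?p ?o }`` by scanning each graph in turn."""
    graphs_visited = 0
    matches = []
    for triples in store.values():
        graphs_visited += 1
        matches.extend((p, o) for s, p, o in triples if s == subject)
    return matches, graphs_visited


store = build_store(1000)
matches, graphs_visited = query_every_graph(store, 'info:fcres/res42')
print(len(matches), 'matches after visiting', graphs_visited, 'graphs')
```

In this model the number of graphs visited grows with the size of the repository, not with the size of the result.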

Term Search
-----------

This feature has not yet been implemented. It is meant to provide a discovery
tool based on simple term match, and possibly comparison. It should be more
efficient and predictable than SPARQL.

13 changes: 13 additions & 0 deletions docs/fcrepo4_deltas.rst
@@ -76,6 +76,19 @@ identifiers will be different).
This seems to break Hyrax at some point, but might have been fixed. This
needs to be verified further.

Allow PUT requests with empty body on existing resources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FCREPO4 returns a ``409 Conflict`` if a PUT request with no payload is sent
to an existing resource.

LAKEsuperior allows this operation, which results in the deletion of all the
user-provided properties in that resource.

If the original resource is an LDP-NR, however, the operation will raise a
``415 Unsupported Media Type`` because the resource will be treated as an empty
LDP-RS, which cannot replace an existing LDP-NR.
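
The three cases above can be summarized as a small decision function. This is only a restatement of the prose, not code from either server; the function name and the string returned for the accepted case are illustrative, since this section does not state the success status code.

```python
def empty_put_outcome(server, existing_type):
    """Outcome of a PUT with an empty body on an existing resource.

    ``server`` is 'fcrepo4' or 'lakesuperior'; ``existing_type`` is
    'LDP-RS' or 'LDP-NR'. Hypothetical helper, for illustration only.
    """
    if server == 'fcrepo4':
        # FCREPO4 rejects the request outright.
        return '409 Conflict'
    if existing_type == 'LDP-NR':
        # The empty payload is treated as an empty LDP-RS, which cannot
        # replace an existing LDP-NR.
        return '415 Unsupported Media Type'
    # LAKEsuperior accepts the request and removes user-provided properties.
    return 'accepted: user-provided properties deleted'


for server in ('fcrepo4', 'lakesuperior'):
    for existing_type in ('LDP-RS', 'LDP-NR'):
        print(server, existing_type, '->', empty_put_outcome(server, existing_type))
```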

Non-standard client breaking changes
------------------------------------

1 change: 1 addition & 0 deletions docs/index.rst
@@ -30,6 +30,7 @@ Indices and tables
:maxdepth: 3
:caption: User Reference

Discovery & Query <discovery>
Divergences from Fedora 4 <fcrepo4_deltas>
Messaging <messaging>
Migration Guide <migration>
107 changes: 106 additions & 1 deletion docs/usage.rst
@@ -114,4 +114,109 @@ Immediately forget a resource
Python API
----------

**TODO**
Set up the environment
~~~~~~~~~~~~~~~~~~~~~~

Before using the API, either do::

>>> import lakesuperior.env_setup

Or, to specify an alternative configuration::

>>> from lakesuperior.config_parser import parse_config
>>> from lakesuperior.env import env
>>> from lakesuperior.globals import AppGlobals
>>> env.config, test_config = parse_config('/my/custom/config_dir')
Reading configuration at /my/custom/config_dir
>>> env.app_globals = AppGlobals(env.config)

Create and replace resources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create an LDP-RS (RDF resource) by providing a Graph object::

>>> from rdflib import Graph, URIRef
>>> from lakesuperior.api import resource as rsrc_api
>>> uid = '/rsrc_from_graph'
>>> gr = Graph().parse(data='<> a <http://ex.org/type#A> .',
... format='text/turtle', publicID=nsc['fcres'][uid])
>>> rsrc_api.create_or_replace(uid, init_gr=gr)

Issuing a ``create_or_replace()`` on an existing UID will replace the existing
property set with the provided one (PUT style).

Create an LDP-NR (non-RDF source)::

>>> from io import BytesIO
>>> uid = '/test_ldpnr01'
>>> data = b'Hello. This is some dummy content.'
>>> rsrc_api.create_or_replace(
... uid, stream=BytesIO(data), mimetype='text/plain')
'_create_'

Create under a known parent, providing a slug (POST style)::

>>> rsrc_api.create('/rsrc_from_stream', 'res1')


Retrieve Resources
~~~~~~~~~~~~~~~~~~

Retrieve a resource::

>>> rsrc = rsrc_api.get('/rsrc_from_stream')
>>> rsrc.uid
'/rsrc_from_stream'
>>> rsrc.uri
rdflib.term.URIRef('info:fcres/rsrc_from_stream')
>>> set(rsrc.metadata)
{(rdflib.term.URIRef('info:fcres/rsrc_from_stream'),
rdflib.term.URIRef('http://fedora.info/definitions/v4/repository#created'),
rdflib.term.Literal('2018-04-06T03:30:49.460274+00:00', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#dateTime'))),
[...]

Retrieve non-RDF content::

>>> ldpnr = rsrc_api.get('/test_ldpnr01')
>>> ldpnr.content.read()
b'Hello. This is some dummy content.'

See the :doc:`API docs <api>` for more details on resource methods.

Update Resources
~~~~~~~~~~~~~~~~

Using a SPARQL update string::

>>> from rdflib import Literal
>>> uid = '/test_delta_patch_wc'
>>> uri = nsc['fcres'][uid]
>>> init_trp = {
... (URIRef(uri), nsc['rdf'].type, nsc['foaf'].Person),
... (URIRef(uri), nsc['foaf'].name, Literal('Joe Bob')),
... (URIRef(uri), nsc['foaf'].name, Literal('Joe Average Bob')),
... }

>>> update_str = '''
... DELETE {}
... INSERT { <> foaf:name "Joe Average 12oz Bob" . }
... WHERE {}
... '''

Using add/remove triple sets::

>>> remove_trp = {
... (URIRef(uri), nsc['foaf'].name, None),
... }
>>> add_trp = {
... (URIRef(uri), nsc['foaf'].name, Literal('Joan Knob')),
... }

>>> gr = Graph()
>>> gr += init_trp
>>> rsrc_api.create_or_replace(uid, graph=gr)
>>> rsrc_api.update_delta(uid, remove_trp, add_trp)

Note that wildcards can be used, but only in the remove triple set. Wherever
``None`` is used, all matches will be removed (in this example, all values of
``foaf:name``).
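
The wildcard semantics can be sketched in plain Python. This is a conceptual model, not LAKEsuperior's implementation: strings stand in for RDFLib terms, and ``apply_delta`` is a hypothetical helper that treats ``None`` in a remove pattern as "match anything in this position".

```python
def apply_delta(triples, remove_trp, add_trp):
    """Return a new triple set after removing pattern matches and adding triples."""
    def matches(pattern, triple):
        # A None term in the pattern matches any term in that position.
        return all(p is None or p == t for p, t in zip(pattern, triple))

    kept = {
        trp for trp in triples
        if not any(matches(pattern, trp) for pattern in remove_trp)
    }
    return kept | set(add_trp)


uri = 'info:fcres/test_delta_patch_wc'
init_trp = {
    (uri, 'rdf:type', 'foaf:Person'),
    (uri, 'foaf:name', 'Joe Bob'),
    (uri, 'foaf:name', 'Joe Average Bob'),
}
remove_trp = {(uri, 'foaf:name', None)}
add_trp = {(uri, 'foaf:name', 'Joan Knob')}

result = apply_delta(init_trp, remove_trp, add_trp)
for trp in sorted(result):
    print(trp)
```

Both original ``foaf:name`` values are removed by the single wildcard pattern, while the ``rdf:type`` triple is left untouched.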

Generally speaking, the delta approach, which provides a set of triples to
remove and/or a set of triples to add, is more convenient than SPARQL for
simple changes. SPARQL is a better fit for complex query/update scenarios.
71 changes: 42 additions & 29 deletions lakesuperior/api/resource.py
@@ -7,14 +7,14 @@

import arrow

from rdflib import Literal
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import XSD

from lakesuperior.config_parser import config
from lakesuperior.exceptions import (
InvalidResourceError, ResourceNotExistsError, TombstoneError)
from lakesuperior.env import env
from lakesuperior.globals import RES_DELETED
from lakesuperior.globals import RES_DELETED, RES_UPDATED
from lakesuperior.model.ldp_factory import LDP_NR_TYPE, LdpFactory
from lakesuperior.store.ldp_rs.lmdb_store import TxnManager

@@ -77,7 +77,7 @@ def _wrapper(*args, **kwargs):
with TxnManager(env.app_globals.rdf_store, write=write) as txn:
ret = fn(*args, **kwargs)
if len(env.app_globals.changelog):
job = Thread(target=process_queue)
job = Thread(target=_process_queue)
job.start()
delattr(env, 'timestamp')
delattr(env, 'timestamp_term')
@@ -86,18 +86,18 @@ def _wrapper(*args, **kwargs):
return _transaction_deco


def process_queue():
def _process_queue():
"""
Process the message queue on a separate thread.
"""
lock = Lock()
lock.acquire()
while len(env.app_globals.changelog):
send_event_msg(*env.app_globals.changelog.popleft())
_send_event_msg(*env.app_globals.changelog.popleft())
lock.release()


def send_event_msg(remove_trp, add_trp, metadata):
def _send_event_msg(remove_trp, add_trp, metadata):
"""
Send messages about a changed LDPR.
@@ -199,7 +199,8 @@ def create(parent, slug, **kwargs):
:param str parent: UID of the parent resource.
:param str slug: Tentative path relative to the parent UID.
:param \*\*kwargs: Other parameters are passed to the
:meth:`LdpFactory.from_provided` method.
:py:meth:`~lakesuperior.model.ldp_factory.LdpFactory.from_provided`
method.
:rtype: str
:return: UID of the new resource.
@@ -214,31 +215,19 @@ def create(parent, slug, **kwargs):


@transaction(True)
def create_or_replace(uid, stream=None, **kwargs):
def create_or_replace(uid, **kwargs):
r"""
Create or replace a resource with a specified UID.
If the resource already exists, all user-provided properties of the
existing resource are deleted. If the resource exists and the provided
content is empty, an exception is raised (not sure why, but that's how
FCREPO4 handles it).
:param string uid: UID of the resource to be created or updated.
:param BytesIO stream: Content stream. If empty, an empty container is
created.
:param \*\*kwargs: Other parameters are passed to the
:meth:`LdpFactory.from_provided` method.
:py:meth:`~lakesuperior.model.ldp_factory.LdpFactory.from_provided`
method.
:rtype: str
:return: Event type: whether the resource was created or updated.
"""
rsrc = LdpFactory.from_provided(uid, stream=stream, **kwargs)

if not stream and rsrc.is_stored:
raise InvalidResourceError(rsrc.uid,
'Resource {} already exists and no data set was provided.')

return rsrc.create_or_replace()
return LdpFactory.from_provided(uid, **kwargs).create_or_replace()


@transaction(True)
@@ -248,19 +237,43 @@ def update(uid, update_str, is_metadata=False):
:param string uid: Resource UID.
:param string update_str: SPARQL-Update statements.
:param bool is_metadata: Whether the resource metadata is being updated.
If False, and the resource being updated is a LDP-NR, an error is
raised.
:param bool is_metadata: Whether the resource metadata are being updated.
:raise InvalidResourceError: If ``is_metadata`` is False and the resource
being updated is a LDP-NR.
"""
rsrc = LdpFactory.from_stored(uid)
# FCREPO is lenient here and Hyrax requires it.
rsrc = LdpFactory.from_stored(uid, handling='lenient')
if LDP_NR_TYPE in rsrc.ldp_types and not is_metadata:
raise InvalidResourceError(uid)
raise InvalidResourceError(
'Cannot use this method to update an LDP-NR content.')

rsrc.sparql_update(update_str)
delta = rsrc.sparql_delta(update_str)
rsrc.modify(RES_UPDATED, *delta)

return rsrc


@transaction(True)
def update_delta(uid, remove_trp, add_trp):
"""
Update a resource graph (LDP-RS or LDP-NR) with sets of add/remove triples.
A set of triples to add and/or a set of triples to remove may be provided.
:param string uid: Resource UID.
:param set(tuple(rdflib.term.Identifier)) remove_trp: Triples to
remove, as 3-tuples of RDFLib terms.
:param set(tuple(rdflib.term.Identifier)) add_trp: Triples to
add, as 3-tuples of RDFLib terms.
"""
rsrc = LdpFactory.from_stored(uid)
remove_trp = rsrc.check_mgd_terms(remove_trp)
add_trp = rsrc.check_mgd_terms(add_trp)

return rsrc.modify(RES_UPDATED, remove_trp, add_trp)


@transaction(True)
def create_version(uid, ver_uid):
"""