# XML-RPC connection with *pyexistdb*

2019-12-21 djb

## Notes

* This tutorial assumes that you have installed and launched eXist-db and installed the Demo Apps. We run it on our local
machine, but you can use any running eXist-db instance that has the app files.
* We focus on browsing collections, reading files, and executing queries. We do not write into the database
(and therefore do not authenticate).
* We will use REST, rather than XML-RPC, for the NEH Institute.

## Import the library

Importing the package with `import pyexistdb` does not expose the `db` module, so we import it directly.

The import sometimes (!) raises an error (“yacc table file version is out of date”). We ignore this because
we don’t understand it.

In [1]:
from pyexistdb.db import ExistDB

## Create a connection and verify that it works

From the docs: “Construction doesn't initiate server communication, only store
information about where the server is, to be used in later
communications.”

In [2]:
test = ExistDB(server_url = 'http://localhost:8080/exist')
print(test)

<pyexistdb.db.ExistDB object at 0x7ffd6847c4a8>


## Look at the XML data files in a collection

This assumes that you have installed the *eXist-db Demo Apps* from the eXist-db repo. You can do this from the
package manager if you are using an eXist-db instance where you have dba privileges. If not, you can browse any
collection to which you have access.

The connection is at */db/*, so paths are relative to that.

In [3]:
test.getCollectionDescription('apps/demo/data')

{'owner': 'demo',
 'collections': ['addresses', 'binary', 'i18n'],
 'documents': [{'name': 'r_and_j.xml',
   'owner': 'demo',
   'type': 'XMLResource',
   'permissions': 509,
   'group': 'demo'},
  {'name': 'mondial.xml',
   'owner': 'demo',
   'type': 'XMLResource',
   'permissions': 509,
   'group': 'demo'},
  {'name': 'macbeth.xml',
   'owner': 'demo',
   'type': 'XMLResource',
   'permissions': 509,
   'group': 'demo'},
  {'name': 'hamlet.xml',
   'owner': 'demo',
   'type': 'XMLResource',
   'permissions': 509,
   'group': 'demo'},
  {'name': 'shakes.xsl',
   'owner': 'demo',
   'type': 'XMLResource',
   'permissions': 509,
   'group': 'demo'}],
 'created': '1576494179903',
 'permissions': 509,
 'name': '/db/apps/demo/data',
 'group': 'demo'}
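
The description is an ordinary Python dictionary, so we can pull values out of it directly. The sketch below works on a trimmed copy of the output above (so that it runs without a server). It rests on two assumptions that we have not confirmed in the pyexistdb docs: that `permissions` is a Unix-style mode reported in decimal, and that `created` is milliseconds since the Unix epoch.

```python
import stat
from datetime import datetime, timezone

# Trimmed copy of the dictionary printed above, so this runs without a server.
description = {
    'owner': 'demo',
    'collections': ['addresses', 'binary', 'i18n'],
    'documents': [
        {'name': 'r_and_j.xml', 'type': 'XMLResource', 'permissions': 509},
        {'name': 'hamlet.xml', 'type': 'XMLResource', 'permissions': 509},
    ],
    'created': '1576494179903',
    'permissions': 509,
    'name': '/db/apps/demo/data',
}

# Names of the documents in the collection.
names = [doc['name'] for doc in description['documents']]
print(names)

# Assumption: 'permissions' is a Unix-style mode reported in decimal.
mode = description['permissions']
print(oct(mode))                # 0o775
print(stat.filemode(mode)[1:])  # rwxrwxr-x (leading file-type character dropped)

# Assumption: 'created' is milliseconds since the Unix epoch.
created = datetime.fromtimestamp(int(description['created']) // 1000, tz=timezone.utc)
print(created.isoformat())
```

If both assumptions hold, the collection is readable and executable by everyone and writable by its owner and group, and it was created a few days before this notebook was written.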

## What can pyexistdb do?

In [4]:
help(ExistDB)

Help on class ExistDB in module pyexistdb.db:

class ExistDB(builtins.object)
 |  ExistDB(server_url=None, username=None, password=None, resultType=None, encoding='UTF-8', verbose=False, keep_alive=None, timeout=<object object at 0x7ffd68374790>)
 |  
 |  Connect to an eXist database, and manipulate and query it.
 |  
 |  Construction doesn't initiate server communication, only store
 |  information about where the server is, to be used in later
 |  communications.
 |  
 |  :param server_url: The eXist server URL.  New syntax (as of 0.20)
 |      expects primary eXist url and *not* the ``/xmlrpc`` endpoint;
 |      for backwards compatibility, urls that include `/xmlrpc``
 |      are still handled, and will be parsed to set exist server path
 |      as well as username and password if specified.  Note that username
 |      and password parameters take precedence over username
 |      and password in the server url if both are specified.
 |  :param username: exist username, if any
 |  :

## Retrieve a resource (XML file)

If the document contains a processing instruction for an XSLT transformation, as is the case with *Hamlet* in the demo apps, the result *after* the transformation is returned. See below for accessing the raw XML.

In [5]:
hamlet = test.getDocument('apps/demo/data/hamlet.xml')
print(hamlet[:1000])
type(hamlet)

b'<?xml version="1.0" encoding="UTF-8"?>\n<div xmlns:exist="http://exist.sourceforge.net/NS/exist">\n    <h1>The Tragedy of Hamlet, Prince of Denmark</h1>\n    <h3><em>HAMLET</em></h3>\n    <blockquote>\n        <tt>ASCII text placed in the public domain by Moby Lexical Tools, 1992.</tt><br />\n        <tt>SGML markup by Jon Bosak, 1992-1994.</tt><br />\n        <tt>XML version by Jon Bosak, 1996-1999.</tt><br />\n        <tt>The XML markup in this version is Copyright \xc2\xa9 1999 Jon Bosak.\nThis work may freely be distributed on condition that it not be\nmodified or altered in any way.</tt><br />\n    </blockquote>\n    <p><b>Table of Contents</b></p>\n    <ul>\n        <li><a href="#d14e18">Dramatis Personae</a></li>\n        <ul></ul>\n        <li><a href="#d14e94">ACT I</a></li>\n        <ul>\n            <li><a href="#d14e98">SCENE I.  Elsinore. A platform before the castle.</a></li>\n            <li><a href="#d14e828">SCENE II.  A room of state in the castle.</a></li>\n       

bytes
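
Because the result is a `bytes` object, printing it shows the escaped representation: newlines appear as `\n` and the copyright sign as `\xc2\xa9`. A minimal sketch of decoding it into a readable string, using a short excerpt copied from the output above as a stand-in for the value returned by `getDocument()`:

```python
# Excerpt copied from the output above; a stand-in for the bytes that
# getDocument() returns, so this runs without a server.
sample = (
    b'<tt>The XML markup in this version is Copyright \xc2\xa9 1999 Jon Bosak.\n'
    b'This work may freely be distributed on condition that it not be\n'
    b'modified or altered in any way.</tt>'
)

text = sample.decode('UTF-8')  # bytes -> str
print(type(text))
print(text)
```

The decoded string prints with real line breaks and a real copyright sign.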

## To render HTML instead of bytes, use the IPython package

In [6]:
from IPython.display import HTML

In [None]:
HTML(hamlet.decode('UTF-8'))

## To query a document or collection in the database

### Simple XPath path expression

In [7]:
q = '''
doc('apps/demo/data/hamlet.xml')/descendant::SPEECH
'''
first_speech = test.query(q).results[0]
first_speech

<Element SPEECH at 0x7ffd7838fdc8>

Uh oh. It’s an *lxml* element. Better import:

In [8]:
from lxml.etree import tostring

In [9]:
tostring(first_speech)

b'<SPEECH xmlns:exist="http://exist.sourceforge.net/NS/exist">\n                <SPEAKER>BERNARDO</SPEAKER>\n                <LINE>Who\'s there?</LINE>\n            </SPEECH>\n    '
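
Serializing is one option, but the element can also be navigated directly. The sketch below re-parses the serialized output above with the standard library's `xml.etree.ElementTree`, as a stand-in for the *lxml* element, so that it runs without a server. *lxml* elements offer the same `findtext()` and `iter()` methods, so the last two statements should work unchanged on `first_speech`.

```python
import xml.etree.ElementTree as ET

# Stand-in for first_speech: the serialized element shown above, re-parsed
# with the standard library instead of lxml.
speech = ET.fromstring(
    b'<SPEECH xmlns:exist="http://exist.sourceforge.net/NS/exist">\n'
    b'                <SPEAKER>BERNARDO</SPEAKER>\n'
    b'                <LINE>Who\'s there?</LINE>\n'
    b'            </SPEECH>'
)

speaker = speech.findtext('SPEAKER')                  # text of the first SPEAKER child
lines = [line.text for line in speech.iter('LINE')]   # text of every LINE descendant
print(speaker, lines)
```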

### XQuery FLWOR

#### Run the query

By default the `query()` method returns the first 10 results. We’ll leave that default in place for now. `query()` returns a `QueryResult` object, whose `results` attribute contains a list of the results that were returned (here, the first 10).

In [10]:
q = '''
let $ham := doc('apps/demo/data/hamlet.xml')
let $speakers := distinct-values($ham/descendant::SPEAKER)
for $speaker in $speakers
order by $speaker
return <speaker>{$speaker}</speaker>
'''
all_speakers = test.query(q)
print('The type of the result is', type(all_speakers))
print('There are', all_speakers.count, 'results')

The type of the result is <class 'pyexistdb.db.QueryResult'>
There are 10 results


#### Output the results

In [11]:
for i in all_speakers.results:
    print(tostring(i))

b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">All</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">BERNARDO</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">CORNELIUS</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">Captain</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">Danes</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">FRANCISCO</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">First Ambassador</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">First Clown</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">First Player</speaker>\n    '
b'<speaker xmlns:exist="http://exist.sourceforge.net/NS/exist">First Priest</speaker>\n'
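
The list stops at *First Priest* because of the 10-result default. The sketch below shows one way to collect everything by paging. It assumes that `query()` accepts `start` and `how_many` arguments and that the result object has `count` and `hasMore()`, which is our reading of the library and worth confirming with `help(ExistDB.query)`. The name `fetch_all` is our own, not part of pyexistdb.

```python
def fetch_all(db, xquery, page_size=50):
    """Collect every result of xquery by requesting one page at a time.

    Assumes db.query() accepts start and how_many, and that the object it
    returns has results, count, and hasMore(). fetch_all is a hypothetical
    helper, not a pyexistdb function.
    """
    items = []
    start = 1  # result positions are 1-based
    while True:
        page = db.query(xquery, start=start, how_many=page_size)
        items.extend(page.results)
        # Stop when the server reports nothing further (or returns nothing).
        if page.count == 0 or not page.hasMore():
            return items
        start += page.count

# With the connection and query from above:
# all_speakers = fetch_all(test, q)
```

For a one-off query it may be simpler to pass a larger `how_many` directly to `query()`.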


### Passing parameters to a script within the database

eXist-db can store queries inside the database (much like stored procedures) and execute them with parameters supplied at run time. So far we haven’t found a way to pass parameters into a stored query using XML-RPC. We posted an inquiry about this on eXist-open on 2019-12-22.