In [5]:
import cordra
from lucenequerybuilder import Q

from pathlib import Path
import json

# Initialize interactions

The CordraClient class handles interactions with a Cordra database. The class stores the host URL and authentication settings. If the username and/or password are not given, a prompt asks for them.

Thoughts: integrate token usage during initialization with a "usetokens" bool option? Everything else would then be handled internally.
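A rough sketch of how the "usetokens" idea might look, not the actual CordraClient code. `AuthSettings`, `use_tokens`, and `fetch_token` are hypothetical names, and the `Bearer` header is assumed as the usual convention for token authentication. The token fetcher is injected so the example runs without a server.

```python
import base64
from typing import Callable, Dict, Optional


class AuthSettings:
    """Hypothetical helper: chooses between token and basic authentication."""

    def __init__(self, username: str, password: str, use_tokens: bool = False,
                 fetch_token: Optional[Callable[[str, str], str]] = None):
        self.username = username
        self.password = password
        self.use_tokens = use_tokens
        self._fetch_token = fetch_token
        self._token: Optional[str] = None

    def headers(self) -> Dict[str, str]:
        """Return the Authorization header for the next request."""
        if self.use_tokens:
            if self._fetch_token is None:
                raise ValueError("use_tokens=True requires a fetch_token callable")
            if self._token is None:
                # Fetch the token once and reuse it for later requests
                self._token = self._fetch_token(self.username, self.password)
            return {"Authorization": f"Bearer {self._token}"}
        # Fall back to HTTP basic authentication
        raw = f"{self.username}:{self.password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Stand-in for a real token request, used only to make the sketch runnable
def fake_fetch_token(username: str, password: str) -> str:
    return f"token-for-{username}"


token_auth = AuthSettings("admin", "secret", use_tokens=True, fetch_token=fake_fetch_token)
basic_auth = AuthSettings("admin", "secret")
print(token_auth.headers())
print(basic_auth.headers())
```

With this shape the rest of the client would only ever call `headers()`, which is one way to keep the token handling internal.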

In [2]:
db = cordra.CordraClient("https://localhost:8443", username='admin', verify=False)

Enter password for admin @ https://localhost:8443:········


## Data as a Python object

Here, a Python object is defined to generate the Cordra object content.  This is a very simple example that could be refined and improved on.

Cordra-centric improvements:

- Need a Base class to build on.
- Since each class represents a specific schema, the schema name should be an attribute.
- Integrate payload handling? Maybe an "add_payload()" method on the Base class?
- Use pydantic rather than a dict for building the JSON.

Broader design ideas:

- Python object should primarily represent the Python interpretation of the data. Here, the representation is extremely simple but things can quickly get complicated.
- Then, methods should exist to generate records based on the object content.
- Following from this, the "json()" method might be renamed "cordra()", or a separate cordra() method could call json().
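The ideas above could be sketched roughly as follows. This is only an illustration under assumptions: `CordraObject` and `DocumentRecord` are hypothetical names, and standard-library dataclasses are used in place of pydantic to keep the example dependency-free. Unlike the simple class in this notebook, it drops only fields that are `None`, not every empty value.

```python
import json
from dataclasses import asdict, dataclass
from typing import ClassVar, Optional


@dataclass
class CordraObject:
    """Hypothetical base class; subclasses set `schema` to their Cordra type name."""

    schema: ClassVar[str] = ""

    def to_dict(self) -> dict:
        # Leave out fields that were never set
        return {k: v for k, v in asdict(self).items() if v is not None}

    def json(self) -> str:
        return json.dumps(self.to_dict())

    def cordra(self) -> str:
        # For now the Cordra record is simply the JSON form of the object
        return self.json()


@dataclass
class DocumentRecord(CordraObject):
    schema: ClassVar[str] = "Document"
    name: Optional[str] = None
    description: Optional[str] = None


example = DocumentRecord(name="Test", description="A test record")
print(example.schema, example.cordra())
```

Keeping `schema` as a class attribute means a create call could read the type name from the object instead of taking it as a separate argument.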

In [2]:
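class Document:
    """Minimal Python representation of a Cordra 'Document' record."""

    def __init__(self, name=None, description=None):
        self.name = name
        self.description = description

    def json(self):
        """Return the record content as a JSON string, skipping empty fields."""
        out = {}
        if self.name:
            out['name'] = self.name
        if self.description:
            out['description'] = self.description

        return json.dumps(out)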

In [3]:
# Create Document object
doc = Document('Test', 'A test for uploading a document and payload')

In [6]:
# Show that Cordra json is automatically generated
doc.json()

'{"name": "Test", "description": "A test for uploading a document and payload"}'

## Payload object as well?

Payloads is a utility class that helps support the upload of data files to the Cordra repository by generating the appropriate file content. 

Still simple, but a Payloads object:

- Takes an equal number of names and filenames
- Allows new payloads to be added after init
- Has a "json()" method that generates the params['files'] content for the request.
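The behavior listed above might be sketched like this. `PayloadSet` is a hypothetical stand-in written for illustration and is not the real `cordra.Payloads`. Its `files()` method plays the role of the "json()" method mentioned above, and it assumes the request is sent with the `requests` library, whose `files` argument accepts a list of `(field_name, (filename, content))` tuples.

```python
import os
import tempfile
from typing import List, Tuple


class PayloadSet:
    """Hypothetical stand-in for cordra.Payloads, for illustration only."""

    def __init__(self, names, filenames):
        # Accept a single name/filename pair or two parallel sequences
        if isinstance(names, str):
            names = [names]
        if isinstance(filenames, (str, os.PathLike)):
            filenames = [filenames]
        if len(names) != len(filenames):
            raise ValueError("names and filenames must have the same length")
        self._items: List[Tuple[str, str]] = []
        for name, filename in zip(names, filenames):
            self.add(name, filename)

    def add(self, name, filename) -> None:
        """Register another payload after construction."""
        self._items.append((name, os.fspath(filename)))

    def files(self):
        """Read each file and build a list shaped for the `files` argument of requests."""
        out = []
        for name, filename in self._items:
            with open(filename, "rb") as handle:
                out.append((name, (os.path.basename(filename), handle.read())))
        return out


# Write a small temporary CSV so the sketch runs on its own
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example-data.csv")
    with open(path, "wb") as handle:
        handle.write(b"a,b\n1,2\n")
    payload_set = PayloadSet("p1", path)
    payload_files = payload_set.files()

print(payload_files)
```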


In [5]:
payloads = cordra.Payloads('p1', "example-data.csv")

## Create

The CordraClient.create() method creates a new record in the Cordra instance from:

- A Python dict or an object with a json() method. This allows for either raw content or organized content.
- The schema style. This could be incorporated into the object as mentioned above.
- Payloads, given as an object here. These could also be incorporated into the object as mentioned above.

NOTE: I'm missing the ACL info, so records are not "public". The option is there, but it has not been tested.
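Accepting either a dict or an object with a json() method could be handled with a small normalizing step. The helper below is a minimal sketch under that assumption; `as_content` and `Note` are hypothetical names and are not taken from the cordra module.

```python
import json
from typing import Any, Dict


def as_content(obj: Any) -> Dict[str, Any]:
    """Hypothetical helper: turn a dict or an object with json() into a dict."""
    if isinstance(obj, dict):
        return obj
    if callable(getattr(obj, "json", None)):
        # Organized content: ask the object for its JSON string and parse it
        return json.loads(obj.json())
    raise TypeError("expected a dict or an object with a json() method")


class Note:
    """Small example class with a json() method."""

    def __init__(self, name):
        self.name = name

    def json(self):
        return json.dumps({"name": self.name})


print(as_content({"name": "raw"}))
print(as_content(Note("organized")))
```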

In [6]:
db.create(doc, 'Document', payloads=payloads)

{'id': 'test/7d9ce113ebfe6665b2f5',
 'name': 'Test',
 'description': 'A test for uploading a document and payload'}

## Find

I don't know enough about Cordra yet, but should/can some sort of lucenequerybuilder support be built into the class?
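One lightweight option, sketched here as an assumption rather than a design decision, is to let find() accept either a plain string or any query object whose str() form is the query. `to_query_string` and `FieldQuery` are hypothetical; `FieldQuery` only stands in for a query-builder object so the example runs without lucenequerybuilder installed.

```python
def to_query_string(query) -> str:
    """Hypothetical helper: accept a string or any object whose str() is a query."""
    if isinstance(query, str):
        return query
    return str(query)


class FieldQuery:
    """Tiny stand-in for a query-builder object; not part of lucenequerybuilder."""

    def __init__(self, field, term):
        self.field = field
        self.term = term

    def __str__(self):
        return f"{self.field}:{self.term}"


print(to_query_string("metadata"))
print(to_query_string(FieldQuery("name", "Test")))
```

With a step like this inside find(), callers would no longer need the explicit str() call.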

In [7]:
q = Q('metadata')
my_results = db.find(str(q))
my_results

{'pageNum': 0,
 'pageSize': -1,
 'size': 2,
 'results': [{'id': 'test/f1f0188d7e74ce2e9b39',
   'name': 'example 1',
   'description': 'an example of metadata for CSV payload'},
  {'id': 'test/329512e07aba172ad269',
   'name': 'example 1',
   'description': 'an example of metadata for CSV payload'}]}
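As a small follow-up, the returned dict can be unpacked with ordinary Python. The snippet below copies the result shown above into a variable named `find_result` (a name chosen here for illustration) and collects the record ids.

```python
# Result copied from the find() output above
find_result = {
    'pageNum': 0,
    'pageSize': -1,
    'size': 2,
    'results': [
        {'id': 'test/f1f0188d7e74ce2e9b39',
         'name': 'example 1',
         'description': 'an example of metadata for CSV payload'},
        {'id': 'test/329512e07aba172ad269',
         'name': 'example 1',
         'description': 'an example of metadata for CSV payload'},
    ],
}

# Collect the id of every matching record
record_ids = [record['id'] for record in find_result['results']]
print(record_ids)
```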

## Retrieve

In [4]:
record_id = 'test/249b3da4b04b22620bc4'  # renamed to avoid shadowing the built-in id()

In [8]:
db.retrieve('test/7d9ce113ebfe6665b2f5', text=False)

{'id': 'test/7d9ce113ebfe6665b2f5',
 'name': 'Test',
 'description': 'A test for uploading a document and payload'}