Working with documents

bstefanescu edited this page May 10, 2011 · 7 revisions

Here you will discover the repository API. Enjoy!

Repository object model

The main objects you need to manipulate when working with documents are:

  • RepositoryManager - manages repositories. Mainly used to get repository instances.

  • Repository - a repository instance. Mainly used to open new sessions.

  • CoreSession - a user session on a repository. When working with documents you must always open a session first. This object provides the repository API.

  • DocumentModel - represents a document instance. It is mainly a data object, but may also provide some logic.

  • DocumentRef - represents a reference to a document. When passing documents as arguments to methods you will usually use references rather than document instances. A reference is either a path reference or a UID reference. You can also get a reference from a document instance. Examples:

    • create a UID reference: new IdRef("the_doc_id")
    • create a path reference: new PathRef("/the/doc/path")
    • get the reference from a document instance: docModel.getRef()
  • Property - represents a property on a document.

  • Blob - represents a blob property value.

Getting the repository

To get a repository instance you use the org.eclipse.ecr.core.api.repository.RepositoryManager service:

RepositoryManager rm = Framework.getService(RepositoryManager.class);
Repository repo = rm.getDefaultRepository();

An application may define multiple repositories. Each repository has a unique name that you can use to look it up. The getDefaultRepository method returns the first registered repository.

Example of getting a repository by name:

RepositoryManager rm = Framework.getService(RepositoryManager.class);
Repository repo = rm.getRepository("myrepo");

Opening a session

Now that we have a repository instance we can open a new session.

Note that before opening a session you must log in if you are not already logged in. We assume here that the code is running in a web request, so the login was already done by a filter. See Authentication for how to perform logins. The same applies to transactions: we assume that the code is executed in a transaction. Modifications to documents are physically stored in the repository only when the transaction is committed.

RepositoryManager rm = Framework.getService(RepositoryManager.class);
Repository repo = rm.getDefaultRepository();
CoreSession session = repo.open();
try {
    // do something with the session
    ...
} finally {
    // close the session when it is no longer needed, even if an exception was thrown
    Repository.close(session);
}

In the following examples we assume that a session was already opened and is available as the 'session' variable.

Browsing the repository

Here is a method that prints the repository tree rooted at the given document:

public static void printTree(CoreSession session, DocumentModel root) throws ClientException {
  // print the current document, then recurse into each of its children
  System.out.println(root.getPathAsString());
  for (DocumentModel doc : session.getChildren(root.getRef())) {
     printTree(session, doc);
  }
}

Now we can use that method to print the entire repository tree by passing the root of the repository as the argument:

// get the repository root
DocumentModel root = session.getRootDocument();
// print the tree rooted in 'root'
printTree(session, root);

We learned here some basic methods from CoreSession useful for document tree traversal:

  • CoreSession.getRootDocument() - get the root of the repository
  • CoreSession.getChildren(DocumentRef parentRef) - get the children of a document as a list of documents. If the document has no children an empty list is returned.

CoreSession provides many other similar methods to retrieve repository documents. We will not explain each of them here since their usage is similar. For example, there are methods to get a child document by its local name, to get an iterator over the children, or to get a filtered list of children.
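The recursive traversal used by printTree can be tried out without a running repository. The following is only a sketch: TreeNode is a hypothetical in-memory stand-in, not an ECR class, and it mimics just the two things the traversal relies on (a path and a list of children). It also indents each path by its depth, which makes the hierarchy easier to read.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch: TreeNode is a hypothetical in-memory type, NOT part of the ECR API.
final class TreeSketch {

    static final class TreeNode {
        final String path;
        final List<TreeNode> children = new ArrayList<TreeNode>();

        TreeNode(String path) {
            this.path = path;
        }

        // Adds a child and returns this node so calls can be chained.
        TreeNode add(TreeNode child) {
            children.add(child);
            return this;
        }
    }

    // Depth-first rendering, parents before children, one path per line,
    // indented by two spaces per level.
    static String render(TreeNode root) {
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        return out.toString();
    }

    private static void render(TreeNode node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.path).append('\n');
        for (TreeNode child : node.children) {
            // recurse on the child, never on the node itself
            render(child, depth + 1, out);
        }
    }
}
```

With a real session the same shape applies: the path would come from getPathAsString() and the children from getChildren(), as in printTree.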

Fetching documents by reference

Usually you don't want to iterate over the repository to find a document when you already know its path or UID. There is a method for this: CoreSession.getDocument(DocumentRef ref). Example:

DocumentModel doc = session.getDocument(new PathRef("/my/doc"));

Accessing document properties

Now that we know how to fetch a document we can start looking inside. A document instance exposes the full set of properties it contains, plus some basic information such as the document ID, its local name (the name of the document as stored by its parent), its path, its reference, its parent reference and of course its type.

  • Getting the document type: doc.getType() or doc.getDocumentType(). The difference is that the first method returns the type name while the second one returns the type instance (i.e. a DocumentType object).
  • Getting the document UID: doc.getId()
  • Getting the document name: doc.getName() - the document name corresponds to the last segment of its path.
  • Getting the document path: doc.getPathAsString() or doc.getPath(). The difference is that the first method returns the document path as a string while the second one returns a Path object. The Path object is immutable and can be used to easily construct other paths relative to this one.
  • Getting the document schemas: doc.getSchemas() - returns the list of schema names implemented by the document type.
  • Getting the document facets: doc.getFacets() - returns the facets of the document type.

Getting the document data

As we know, the document type defines the structure of the document, that is, which properties can be set on it. The list of possible properties and their types is defined by the schemas implemented by the document type.

For example if your document type is implementing the dublincore schema you will be able to set the "dc:title" property on the document:

doc.setPropertyValue("dc:title", "My Document");

Also you can set the dc:expired date property. To set date properties you can use Calendar or Date objects.

doc.setPropertyValue("dc:expired", new Date());

To retrieve property values you simply use the getter method:

Calendar expired = (Calendar)doc.getPropertyValue("dc:expired");

As you can see, getting a date property returns a Calendar object. The mapping from property types to Java types is discussed below.
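Since date values may be handled as either Date or Calendar objects, a small conversion helper can keep calling code uniform. This is a sketch using only the JDK; DateValues and its method names are hypothetical and not part of the ECR API. It assumes that an unset property may come back as null.

```java
import java.util.Calendar;
import java.util.Date;

// Hypothetical helper (not part of the ECR API): converts between Date and Calendar.
final class DateValues {

    // Wraps a Date in a Calendar that uses the JVM default time zone and locale.
    static Calendar toCalendar(Date date) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(date);
        return cal;
    }

    // Accepts whatever a date property getter returned and yields a Date.
    // Assumption: an unset property is represented by null, which is passed through.
    static Date toDate(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Calendar) {
            return ((Calendar) value).getTime();
        }
        if (value instanceof Date) {
            return (Date) value;
        }
        throw new IllegalArgumentException("Not a date value: " + value.getClass().getName());
    }
}
```

A possible use is DateValues.toDate(doc.getPropertyValue("dc:expired")), which avoids the explicit cast to Calendar shown above.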

Java types mapping

Removing properties

Using complex properties

Using list properties

Using array properties

Using blob properties

Introspecting properties

There are situations where you want to discover dynamically which properties are already set on a document instance, or which properties a document may have.

Creating documents

Let's say we have defined a document type named "Folder" and want to create a new "Folder" document named "MyFolder" inside the document located at "/my/space" (so the path of the new document will be "/my/space/MyFolder"):

DocumentModel parent = session.getDocument(new PathRef("/my/space"));
// create an empty document model (this object is created in memory - it is not yet stored in the repository)
DocumentModel newDoc = session.createDocumentModel("/my/space", "MyFolder", "Folder");
// now set some basic properties like a title and a description.
newDoc.setPropertyValue("dc:title", "My Folder");
newDoc.setPropertyValue("dc:description", "This is My Folder!");
// store the document in the repository
newDoc = session.createDocument(newDoc);

Notes:

  1. After the document is stored, the createDocument method returns another instance of the document. You must always use this instance if you want to keep working with the document, because when a document is persisted some of its properties are automatically updated by the repository (for example dc:created, which represents the document creation time).

  2. The parent document in which you want to create a new document must exist. The hierarchy will not be created automatically for you.
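Because missing parents are not created automatically, it can help to list the ancestors of a target path in the order they would need to exist. The helper below is a sketch using plain string handling; PathSketch and ancestorsOf are hypothetical names, not part of the ECR API, and the code assumes absolute paths separated by "/".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the ECR API).
final class PathSketch {

    // Returns the ancestor paths of an absolute path, outermost first,
    // excluding the root "/" and the path itself.
    static List<String> ancestorsOf(String path) {
        if (path == null || !path.startsWith("/")) {
            throw new IllegalArgumentException("Expected an absolute path: " + path);
        }
        List<String> result = new ArrayList<String>();
        // every "/" after the leading one ends an ancestor path
        int idx = path.indexOf('/', 1);
        while (idx > 0) {
            result.add(path.substring(0, idx));
            idx = path.indexOf('/', idx + 1);
        }
        return result;
    }
}
```

For "/my/space/MyFolder" this yields "/my" and then "/my/space". Each entry could be wrapped in a PathRef and checked or created in that order before the final createDocument call.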

Modifying documents

Modifying documents is similar to creating documents. You set the document properties, then call the CoreSession.saveDocument(DocumentModel) method to save the changes made to the document object. In a transactional environment the modifications are physically stored in the repository only when the transaction is committed.

DocumentModel doc = session.getDocument(new PathRef("/my/space/MyFolder"));
// set some basic properties like a title and a description.
doc.setPropertyValue("dc:title", "My Folder 2");
doc.setPropertyValue("dc:description", "This is my modified Folder!");
// store the changes in the repository
doc = session.saveDocument(doc);

As with creation, saveDocument returns an up-to-date document object (i.e. with all automatic properties filled in).

Locking documents

Let's lock our Folder document:

session.setLock(new PathRef("/my/space/MyFolder"));

And now let's remove the lock:

session.removeLock(new PathRef("/my/space/MyFolder"));

Deleting documents

We will now remove the MyFolder document we created above:

session.removeDocument(new PathRef("/my/space/MyFolder"));

Searching documents

To search for documents you need to create an NXQL query and execute it. The search result is returned as a list of documents.

DocumentModelList result = session.query("SELECT * FROM Folder");

This will return all the documents of type "Folder".
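Queries are plain strings, so they are often assembled from a type name and a condition. The sketch below builds a query restricted to a subtree. NxqlSketch is a hypothetical helper, not part of the ECR API, and it assumes that your NXQL version supports the ecm:path STARTSWITH condition. It refuses inputs containing quotes or backslashes instead of guessing at escaping rules.

```java
// Hypothetical helper (not part of the ECR API): builds an NXQL query string.
final class NxqlSketch {

    // Selects all documents of the given type located under the given path.
    // Assumption: the NXQL dialect in use supports "ecm:path STARTSWITH".
    static String selectUnderPath(String type, String path) {
        requireSafe(type);
        requireSafe(path);
        return "SELECT * FROM " + type + " WHERE ecm:path STARTSWITH '" + path + "'";
    }

    // Rejects values that would need escaping inside the query text.
    private static void requireSafe(String value) {
        if (value == null || value.length() == 0) {
            throw new IllegalArgumentException("Value must not be empty");
        }
        if (value.indexOf('\'') >= 0 || value.indexOf('\\') >= 0) {
            throw new IllegalArgumentException("Unsupported character in: " + value);
        }
    }
}
```

The resulting string can then be passed to session.query(...) in the same way as the literal query above.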

Changing documents life cycle

TODO

Versioning documents

TODO

Publishing documents

TODO

Writing a repository listener

TODO

Example 1 - automatic generating properties

TODO

Example 2 - implementing an audit on document changes

TODO

What is the Session.save() method?

Initializing the repository tree

When the repository is first started, only the root document is present. Usually, however, you want to initialize the repository at first start with a default structure needed by your application's business logic.

Let's say for example that your application needs to partition the repository into 'domains', each 'domain' document being used to store the documents specific to that domain.

So, let's say we want the following initial structure:

/domain1
/domain2

To achieve that you need to install a repository initialization handler. The initialization handler will be called the first time the repository is started (i.e. after the root document is created).

So first you need to write your initialization handler:

public class MyRepositoryInitializationHandler extends RepositoryInitializationHandler {

    @Override
    public void doInitializeRepository(CoreSession session) throws ClientException {
        // This method gets called as a system user
        // so we have all needed rights to do the check and the creation

        // both domains are created directly under the repository root
        DocumentModel root = session.getRootDocument();
        String rootPath = root.getPathAsString();

        // create the first domain
        DocumentModel domain1 = session.createDocumentModel(rootPath, "domain1", "Folder");
        domain1.setPropertyValue("dc:title", "Domain 1");
        // store the document in the repository
        domain1 = session.createDocument(domain1);

        // create the second domain
        DocumentModel domain2 = session.createDocumentModel(rootPath, "domain2", "Folder");
        domain2.setPropertyValue("dc:title", "Domain 2");
        // store the document in the repository
        domain2 = session.createDocument(domain2);

        session.save();
    }

}

Then to install it you need to create an ECR component that will install your handler when the component is activated:

public class MyComponent extends DefaultComponent {

    private MyRepositoryInitializationHandler initializationHandler;

    @Override
    public void activate(ComponentContext context) {
        initializationHandler = new MyRepositoryInitializationHandler();
        initializationHandler.install();
    }

    @Override
    public void deactivate(ComponentContext context) throws Exception {
        if (initializationHandler != null) {
            initializationHandler.uninstall();
            initializationHandler = null;
        }
    }
}

Don't forget to register your component as you do for any ECR component. Define the component XML description, let's say in OSGI-INF/mycomponent.xml, and reference it in the MANIFEST.MF:

Nuxeo-Component: OSGI-INF/mycomponent.xml

Here is the component XML file:

<?xml version="1.0"?>
<component name="org.eclipse.ecr.sample.MyComponent">
  <implementation
      class="org.eclipse.ecr.sample.MyComponent"/>
</component>

At the first start of the repository your registered initialization handler will be called and the repository initialized with your initial structure.

Creating a document adapter

TODO

Example of a document adapter

TODO