MLJSON - A JSON Facade on top of MarkLogic

The MLJSON project is a set of libraries and REST endpoints to enable the MarkLogic Server to become an advanced JSON store.

MarkLogic

  • High-performance, scalable database for unstructured information
  • "NoSQL" datastore (no tables, rows, columns) - just documents and unique IDs (URIs).
  • Uses an XML data model for documents, queryable via XQuery, XSLT and XPath
  • Uses search-engine techniques to efficiently expose real-time fulltext search
  • ACID-compliant CRUD (Create, Read, Update, Delete)

JSON

  • JavaScript Object Notation
  • A lightweight data-encoding and interchange format
  • Native to JavaScript, now widely utilized across languages
  • Commonly used for passing data to web browsers

Design goal

Enable developers to store and search/query JSON inside MarkLogic (without knowledge of XQuery, XSLT, or XPath)

Design considerations:

  1. Approach things from a JSON angle
    • Create the XML to match the JSON, not vice-versa
  2. Make good use of MarkLogic indexes
    • Craft the XML so it's fast to query
  3. XML representation of JSON is an implementation detail - users only need think in terms of JSON

Overview

MLJSON exposes REST endpoints that allow a developer to easily store and retrieve JSON documents from the database (CRUD). It also exposes a very powerful query interface that uses a native JSON syntax:

Query using native JSON syntax

  • Don't expose the XML internals to users
  • Support full range of MarkLogic indexes

This query interface allows the user to find documents via "path" expressions as well as full text search expressions. For those familiar with MarkLogic, it exposes all of the functionality found in the CTS search functions.

Presentation

Here are some slides from a presentation on MLJSON given at XML Prague 2011.


Installation

Installing MLJSON is fairly simple:

  1. If you don't have an HTTP server configured in MarkLogic, create one
  2. Set the URL rewriter for the HTTP server to: /data/lib/rewriter.xqy
  3. Download the MLJSON source and unzip it underneath the document directory that you configured in the MarkLogic HTTP server

Feel free to remove the README and LICENSE files along with the test directory, but keep the config and data directories structured as they are. You can extend MLJSON by writing your own XQuery and having it live alongside the MLJSON files.

The URL rewriter is configured in the config/endpoints.xqy file. You can change the URL structure or add more rules there if need be.

Files relevant to the end user

  • data/lib/rewriter.xqy - A URL rewriter for the REST calls
  • config/endpoints.xqy - Configuration for the REST endpoints
  • data/lib/json.xqy - Has two public functions:
    • jsonToXML - parses a JSON string into XML that can be stored in MarkLogic
    • xmlToJSON - serializes the generated XML back into a JSON string
  • data/lib/json-query.xqy - Tinkering with ways to query the stored JSON

REST Capabilities

Document management

Insert a JSON document

  • Request type: PUT or POST

  • Request body should be the JSON document

  • Example: /data/store/foo/bar.json - Will insert the document in the database with a uri of "/foo/bar.json"

  • Optional: When inserting a document you can set permissions, properties, collections and a document quality.

    • /data/store/foo/bar.json?permission=public:read&permission=admin:write
    • /data/store/foo/bar.json?property=key:value&property=published:false
    • /data/store/foo/bar.json?collection=public&collection=published
    • /data/store/foo/bar.json?quality=10
    • /data/store/foo/bar.json?permission=public:read&collection=public&quality=10
  • Notes:

    • You can set multiple permissions, properties and collections by including multiple definitions in your request, as shown above
    • Permissions must follow a role:capability pattern where capability is one of read, update or execute
    • Properties must follow a key:value pattern where the key is alphanumeric and starts with a letter
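
As a rough client-side illustration of the insert call described above, the following Python sketch builds a PUT request with optional metadata parameters. The base URL, the function name build_insert_request and the sample document are assumptions made for the example; they are not part of MLJSON. The request is only constructed, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: host and port of the MarkLogic HTTP server running MLJSON.
BASE_URL = "http://localhost:8003"


def build_insert_request(uri, document, permissions=(), properties=(),
                         collections=(), quality=None):
    """Build (but do not send) a PUT request that stores `document` at `uri`."""
    params = []
    # Repeated parameters carry multiple permissions, properties and collections.
    params += [("permission", f"{role}:{capability}")
               for role, capability in permissions]
    params += [("property", f"{key}:{value}") for key, value in properties]
    params += [("collection", name) for name in collections]
    if quality is not None:
        params.append(("quality", str(quality)))

    url = f"{BASE_URL}/data/store{uri}"
    if params:
        # Keep ":" unescaped so the query string looks like the README examples.
        url += "?" + urlencode(params, safe=":")
    body = json.dumps(document).encode("utf-8")
    return Request(url, data=body, method="PUT")


request = build_insert_request(
    "/foo/bar.json",
    {"title": "Hello"},  # hypothetical document
    permissions=[("public", "read")],
    collections=["public"],
    quality=10,
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it to a running server. Any authentication the HTTP server requires is left out of this sketch.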

Get a JSON document

  • Request type: GET
  • Example: /data/store/foo/bar.json - Get the document with a uri of "/foo/bar.json"
  • Optional: To fetch metadata associated with the document, specify what you'd like to include in the response.
    • /data/store/foo/bar.json?include=content - Simply returns the document as supplied via the PUT (default)
    • /data/store/foo/bar.json?include=permissions - Returns the permissions on the document
    • /data/store/foo/bar.json?include=collections - Returns the collections on the document
    • /data/store/foo/bar.json?include=properties - Returns the properties on the document
    • /data/store/foo/bar.json?include=quality - Returns the quality of the document
    • /data/store/foo/bar.json?include=content&include=permissions&include=quality - Returns the content, permissions and quality of the document
    • /data/store/foo/bar.json?include=all - Returns the content along with all of its metadata
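
The include parameter can be repeated, which is easy to get wrong when assembling URLs by hand. A minimal Python sketch, assuming a hypothetical base URL and helper name (build_get_request), might look like this. It only constructs the request; the shape of the response is not covered here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: host and port of the MarkLogic HTTP server running MLJSON.
BASE_URL = "http://localhost:8003"


def build_get_request(uri, include=()):
    """Build (but do not send) a GET request for the document at `uri`.

    `include` lists the parts to return, e.g. "content", "permissions",
    "collections", "properties", "quality" or "all".
    """
    url = f"{BASE_URL}/data/store{uri}"
    if include:
        # One include=... pair per requested part.
        url += "?" + urlencode([("include", part) for part in include])
    return Request(url, method="GET")


request = build_get_request(
    "/foo/bar.json", include=["content", "permissions", "quality"]
)
print(request.full_url)
```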

Delete a JSON document

  • Request type: DELETE
  • Example: /data/store/foo/bar.json - Delete the document with a uri of "/foo/bar.json"

Set a property on a document

  • Request type: POST
  • Properties are not held inside the JSON document; they are stored outside of the document and don't affect the stored document at all. They are best thought of as metadata about the document, but should be avoided if possible due to storage overhead.
  • Example: /data/store/foo/bar.json?property=publishState:final&property=needsEditorial:false

Set permissions on a document

  • Request type: POST
  • When setting permissions on a document, all of the existing permissions are overwritten.
  • Example: /data/store/foo/bar.json?permission=public:read&permission=admin:write

Set collections on a document

  • Request type: POST
  • When setting collections on a document, all of the existing collections are overwritten.
  • Example: /data/store/foo/bar.json?collection=public&collection=published

Set the quality of a document

  • Request type: POST
  • Example: /data/store/foo/bar.json?quality=10
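
The four metadata calls above share one shape: a POST to the document's URL with the settings in the query string. The Python sketch below captures that shape under a few assumptions: the base URL and the helper name build_metadata_request are made up for the example, and the POST is assumed to need no request body. Remember that setting permissions or collections overwrites the existing ones.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: host and port of the MarkLogic HTTP server running MLJSON.
BASE_URL = "http://localhost:8003"

# Query parameters used by the metadata calls in this section.
ALLOWED_SETTINGS = {"permission", "property", "collection", "quality"}


def build_metadata_request(uri, settings):
    """Build (but do not send) a POST that sets metadata on the document at `uri`.

    `settings` is a list of (parameter, value) pairs, for example
    [("collection", "public"), ("collection", "published")].
    """
    if not settings:
        raise ValueError("at least one setting is required")
    for name, _ in settings:
        if name not in ALLOWED_SETTINGS:
            raise ValueError(f"unknown setting: {name}")
    # Keep ":" unescaped so role:capability and key:value stay readable.
    query = urlencode([(name, str(value)) for name, value in settings], safe=":")
    # Assumption: an empty body is sufficient for a metadata-only POST.
    return Request(f"{BASE_URL}/data/store{uri}?{query}", data=b"", method="POST")


request = build_metadata_request(
    "/foo/bar.json",
    [("property", "publishState:final"), ("property", "needsEditorial:false")],
)
print(request.get_method(), request.full_url)
```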

Key/Value queries

The key/value query endpoint allows you to easily grab the first document that contains the key/value combination. Multiple keys are AND'd together and multiple values for the same key are OR'd together.

  • Request type: GET
  • Examples:
    • /data/kvquery?foo=bar - Document that has a 'foo' key with a value of 'bar'
    • /data/kvquery?foo=bar&baz=yaz - Document that has a 'foo' key with a value of 'bar' and a 'baz' key with a value of 'yaz'
    • /data/kvquery?foo=bar&foo=baz - Document that has a 'foo' key with a value of 'bar' or 'baz'
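
To make the AND/OR rule concrete, here is a small Python sketch. The helper names, the base URL and the sample documents are assumptions for illustration. The matches function is a local model of the rule applied to top-level keys of a plain dict; it is not how MLJSON evaluates the query on the server.

```python
from urllib.parse import urlencode

# Assumption: host and port of the MarkLogic HTTP server running MLJSON.
BASE_URL = "http://localhost:8003"


def build_kvquery_url(criteria):
    """Build a key/value query URL from a dict mapping each key to a list of values."""
    # doseq=True expands each list into repeated key=value pairs.
    return f"{BASE_URL}/data/kvquery?" + urlencode(criteria, doseq=True)


def matches(document, criteria):
    """Local model of the rule: every key must match (AND), any listed value will do (OR)."""
    return all(document.get(key) in values for key, values in criteria.items())


criteria = {"foo": ["bar", "baz"], "baz": ["yaz"]}
print(build_kvquery_url(criteria))
print(matches({"foo": "baz", "baz": "yaz"}, criteria))  # both keys satisfied
print(matches({"foo": "bar"}, criteria))                # 'baz' key missing
```

Note that the endpoint itself returns only the first matching document, whereas this local check can be applied to any number of candidate documents.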

Server information

Information about the MarkLogic server version, hardware and index settings can be obtained with an info request.

  • Request type: GET
  • Example: /data/info

TODO

  • Move a document
  • Copy a document