Better docs for datastore
Link to docstrings via autodoc to avoid duplication. Check api reference
for consistency. Other minor changes.
amercader committed Oct 12, 2012
1 parent 19edd37 commit 5b808e7
Showing 3 changed files with 108 additions and 143 deletions.
105 changes: 73 additions & 32 deletions ckanext/datastore/logic/action.py
@@ -12,13 +12,19 @@
def datastore_create(context, data_dict):
'''Adds a new table to the datastore.
:param resource_id: resource id that the data is going to be stored under.
The datastore_create action allows a user to post JSON data to be
stored against a resource. This endpoint also supports altering tables,
aliases and indexes and bulk insertion.
See :ref:`fields` and :ref:`records` for details on how to lay out records.
:param resource_id: resource id that the data is going to be stored against.
:type resource_id: string
:param aliases: names for read only aliases to the resource.
:param aliases: names for read only aliases of the resource.
:type aliases: list or comma separated string
:param fields: fields/columns and their extra metadata.
:type fields: list of dictionaries
:param records: the data, eg: [{"dob": "2005", "some_stuff": ['a', b']}]
:param records: the data, eg: [{"dob": "2005", "some_stuff": ["a", "b"]}]
:type records: list of dictionaries
:param primary_key: fields that represent a unique key
:type primary_key: list or comma separated string
@@ -58,16 +64,29 @@ def datastore_create(context, data_dict):
def datastore_upsert(context, data_dict):
'''Updates or inserts into a table in the datastore
The datastore_upsert API action allows a user to add or edit records in
an existing DataStore resource. In order for the *upsert* and *update*
methods to work, a unique key has to be defined via the datastore_create
action. The available methods are:
*upsert*
Update if record with same key already exists, otherwise insert.
Requires unique key.
*insert*
Insert only. This method is faster than upsert, but will fail if any
inserted record matches an existing one. Does *not* require a unique
key.
*update*
Update only. An exception will occur if the key that should be updated
does not exist. Requires unique key.
:param resource_id: resource id that the data is going to be stored under.
:type resource_id: string
:param records: the data, eg: [{"dob": "2005", "some_stuff": ['a', b']}]
:param records: the data, eg: [{"dob": "2005", "some_stuff": ["a","b"]}]
:type records: list of dictionaries
:param method: the method to use to put the data into the datastore
possible options: upsert (default), insert, update
:param upsert: update if record with same key already exists,
otherwise insert
:param insert: insert only, faster because checks are omitted
:param update: update only, exception if key does not exist
:param method: the method to use to put the data into the datastore.
Possible options are: upsert (default), insert, update
:type method: string
:returns: the newly created data object.
@@ -97,12 +116,13 @@ def datastore_upsert(context, data_dict):


def datastore_delete(context, data_dict):
'''Deletes a table from the datastore.
'''Deletes a table or a set of records from the datastore.
:param resource_id: resource id that the data will be deleted from.
:type resource_id: string
:param filter: filter to do deleting on over (eg {'name': 'fred'}).
:param filters: filters to apply before deleting (eg {"name": "fred"}).
If missing delete whole table and all dependent views.
:type filters: dictionary
:returns: original filters sent.
:rtype: dictionary
@@ -134,9 +154,11 @@ def datastore_delete(context, data_dict):
def datastore_search(context, data_dict):
'''Search a datastore table.
:param resource_id: id or alias of the data that is going to be selected.
The datastore_search action allows a user to search data in a resource.
:param resource_id: id or alias of the resource to be searched against.
:type resource_id: string
:param filters: matching conditions to select.
:param filters: matching conditions to select, e.g. {"key1": "a", "key2": "b"}
:type filters: dictionary
:param q: full text query
:type q: string
@@ -146,24 +168,31 @@ def datastore_search(context, data_dict):
:type language: string
:param limit: maximum number of rows to return (default: 100)
:type limit: int
:param offset: offset the number of rows
:param offset: offset this number of rows
:type offset: int
:param fields: fields to return
(default: all fields in original order)
:param fields: fields to return (default: all fields in original order)
:type fields: list or comma separated string
:param sort: comma separated field names with ordering
eg: "fieldname1, fieldname2 desc"
e.g.: "fieldname1, fieldname2 desc"
:type sort: string
:returns: a dictionary containing the search parameters and the
search results.
keys: fields: same as datastore_create accepts
offset: query offset value
limit: query limit value
filters: query filters
total: number of total matching records
records: list of matching results
:rtype: dictionary
**Results:**
The result of this action is a dict with the following keys:
:rtype: A dictionary with the following keys
:param fields: fields/columns and their extra metadata
:type fields: list of dictionaries
:param offset: query offset value
:type offset: int
:param limit: query limit value
:type limit: int
:param filters: query filters
:type filters: list of dictionaries
:param total: number of total matching records
:type total: int
:param records: list of matching results
:type records: list of dictionaries
'''
res_id = _get_or_bust(data_dict, 'resource_id')
@@ -190,15 +219,27 @@ def datastore_search(context, data_dict):

@logic.side_effect_free
def datastore_search_sql(context, data_dict):
'''Execute SQL-Queries on the datastore.
'''Execute SQL queries on the datastore.
The datastore_search_sql action allows a user to search data in a resource
or connect multiple resources with join expressions. The underlying SQL
engine is the
`PostgreSQL engine <http://www.postgresql.org/docs/9.1/interactive/sql/.html>`_
.. note:: This action is only available when using PostgreSQL 9.X and using a read-only user on the database.
:param sql: a single sql select statement
:type sql: string
:returns: a dictionary containing the search results.
keys: fields: columns for results
records: results from the query
:rtype: dictionary
**Results:**
The result of this action is a dict with the following keys:
:rtype: A dictionary with the following keys
:param fields: fields/columns and their extra metadata
:type fields: list of dictionaries
:param records: list of matching results
:type records: list of dictionaries
'''
sql = _get_or_bust(data_dict, 'sql')
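The three methods documented for datastore_upsert in the diff above can be illustrated with a small in-memory model. This is only a sketch and not the extension's implementation: `apply_records` is a hypothetical helper, the table is a plain dict keyed by a single unique field, and merging fields on update is an assumption. Note also that the documented *insert* method does not require a unique key, whereas this model uses one to detect duplicates.

```python
def apply_records(table, records, key, method="upsert"):
    """Apply records to an in-memory table (dict of key value -> record).

    Hypothetical model of the documented semantics:
    upsert: update if the key exists, otherwise insert
    insert: insert only, fail if the key already exists
    update: update only, fail if the key does not exist
    """
    if method not in ("upsert", "insert", "update"):
        raise ValueError("unknown method: %r" % (method,))

    for record in records:
        key_value = record[key]
        exists = key_value in table

        if method == "insert" and exists:
            raise ValueError("duplicate key: %r" % (key_value,))
        if method == "update" and not exists:
            raise KeyError(key_value)

        if exists:
            # Assumption: an update merges new fields into the old record.
            table[key_value] = {**table[key_value], **record}
        else:
            table[key_value] = dict(record)
    return table


table = {}
apply_records(table, [{"id": 1, "name": "a"}], "id", method="insert")
apply_records(table, [{"id": 1, "name": "b"}, {"id": 2, "name": "c"}], "id")
```

After the two calls, record 1 has been updated and record 2 inserted, which is the behaviour the docstring describes for the default *upsert* method.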
2 changes: 2 additions & 0 deletions doc/apiv3.rst
@@ -18,6 +18,8 @@ If you don't specify the version number then you will default to version 1 of th
* ``http://ckan.net/api/util`` (version 1)
* ``http://ckan.net/api/action`` (version 3)

.. _action-api:

Action API
~~~~~~~~~~

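Since the DataStore actions are reached through the Action API with parameters sent as a JSON object, a request could be assembled roughly as follows. This is a minimal sketch under stated assumptions: `build_action_request` is a hypothetical helper, the action name is assumed to be appended to the ``http://ckan.net/api/action`` base URL shown above, the resource id is a placeholder, and passing an API key in the ``Authorization`` header is an assumption about the target CKAN instance. The request is only built here, not sent.

```python
import json
import urllib.request


def build_action_request(base_url, action, params, api_key=None):
    """Build (but do not send) a JSON POST request for an API action."""
    # Assumption: the action name is appended to the Action API base URL.
    url = base_url.rstrip("/") + "/" + action
    body = json.dumps(params).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the instance reads the key from this header.
        headers["Authorization"] = api_key
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_action_request(
    "http://ckan.net/api/action",
    "datastore_search",
    {
        "resource_id": "my-resource-id",  # placeholder id
        "filters": {"key1": "a", "key2": "b"},
        "limit": 5,
    },
)
# urllib.request.urlopen(request) would send it; omitted in this sketch.
```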
144 changes: 33 additions & 111 deletions doc/datastore.rst
@@ -6,7 +6,8 @@ The CKAN DataStore provides a database for structured storage of data together
with a powerful Web-accessible Data API, all seamlessly integrated into the CKAN
interface and authorization system.

.. note:: The DataStore requires PostgreSQL 9.0 or later. It is possible to use the DataStore on versions prior to 9.0 (for example 8.4). However, the :ref:`datastore_search_sql` will not be available and the set-up is slightly different. Make sure, you read :ref:`old_pg` for more details.
.. note:: The DataStore requires PostgreSQL 9.0 or later. It is possible to use the DataStore on versions prior to 9.0 (for example 8.4). However, the :meth:`~ckanext.datastore.logic.action.datastore_search_sql` action will not be available and the set-up is slightly different. Make sure you read :ref:`old_pg` for more details.


.. warning:: The DataStore does not support hiding resources in a private dataset.

@@ -82,7 +83,7 @@ Once the DataStore database and the users are created, the permissions on the Da

1. Use the **paster command** if CKAN and PostgreSQL are on the same server

To set the permissions, use this paster command after you've set the database urls::
To set the permissions, use this paster command after you've set the database urls (make sure to have your virtualenv activated)::

paster datastore set-permissions SQL_SUPER_USER

@@ -128,7 +129,7 @@ the records inserted above::
Legacy mode: use the DataStore with old PostgreSQL versions
-----------------------------------------------------------

The DataStore can be used with a PostgreSQL version prior to 9.0 in *legacy mode*. Due to the lack of some functionality, the :ref:`datastore_search_sql` and consequently the :ref:`datastore_search_htsql` cannot be used. The set-up for legacy mode is analogous to the normal set-up as described in :ref:`installation` with a few changes and consists of the following steps:
The DataStore can be used with a PostgreSQL version prior to 9.0 in *legacy mode*. Due to the lack of some functionality, the :meth:`~ckanext.datastore.logic.action.datastore_search_sql` and consequently the :ref:`datastore_search_htsql` cannot be used. The set-up for legacy mode is analogous to the normal set-up as described in :ref:`installation` with a few changes and consists of the following steps:

1. Enable the extension
#. Set-Up the database
@@ -173,7 +174,7 @@ The DataStore Data API
======================

The DataStore's Data API, which derives from the underlying data table,
is RESTful and JSON-based with extensive query capabilities.
is JSON-based with extensive query capabilities.

Each resource in a CKAN instance can have an associated DataStore 'table'. The
basic API for accessing the DataStore is outlined below. For a detailed
@@ -183,105 +184,15 @@ tutorial on using this API see :doc:`using-data-api`.
API Reference
-------------

.. note:: Lists can always be expressed in different ways. It is possible to use lists, comma separated strings or single items. These are valid lists: ``['foo', 'bar']``, ``'foo, bar'``, ``"foo", "bar"`` and ``'foo'``.


datastore_create
~~~~~~~~~~~~~~~~

The datastore_create API endpoint allows a user to post JSON data to be stored against a resource. This endpoint also supports altering tables, aliases and indexes and bulk insertion. The JSON must be in the following form::

{
resource_id: # the data is going to be stored against.
aliases: # list of names for read-only aliases to the resource
fields: # a list of dictionaries of fields/columns and their extra metadata.
records: # a list of dictionaries of the data, eg: [{"dob": "2005", "some_stuff": ['a', 'b']}, ..]
primary_key: # list of fields that represent a unique key
indexes: # indexes on table
}

See :ref:`fields` and :ref:`records` for details on how to lay out records.



datastore_delete
~~~~~~~~~~~~~~~~

The datastore_delete API endpoint allows a user to delete records from a resource. The JSON for searching must be in the following form::

{
resource_id: # the data that is going to be deleted.
filter: # dictionary of matching conditions to delete
# e.g {'key1': 'a', 'key2': 'b'}
# this will be equivalent to "delete from table where key1 = 'a' and key2 = 'b' "
}


datastore_upsert
~~~~~~~~~~~~~~~~

The datastore_upsert API endpoint allows a user to add or edit records in an existing DataStore resource. In order for the ``upsert`` and ``update`` methods to work, a unique key has to defined via the datastore_create API endpoint command.
The JSON for searching must be in the following form::

{
resource_id: # resource id that the data is going to be stored under.
records: # a list of dictionaries of the data, eg: [{"dob": "2005", "some_stuff": ['a', 'b']}, ..]
method: # the method to use to put the data into the datastore
# possible options: upsert (default), insert, update
}

``upsert``
Update if record with same key already exists, otherwise insert. Requires unique key.
``insert``
Insert only. This method is faster that upsert, but will fail if any inserted record matches an existing one. Does *not* require a unique key.
``update``
Update only. An exception will occur if the key that should be updated does not exist. Requires unique key.

.. _datastore_search:

datastore_search
~~~~~~~~~~~~~~~~
The DataStore-related API actions are accessed via CKAN's :ref:`action-api`. When POSTing
requests, parameters should be provided as JSON objects.

The datastore_search API endpoint allows a user to search data in a resource.
The JSON for searching must be in the following form::

{
resource_id: # the resource id to be searched against
filters : # dictionary of matching conditions to select e.g {'key1': 'a. 'key2': 'b'}
# this will be equivalent to "select * from table where key1 = 'a' and key2 = 'b' "
q: # full text query
plain: # treat as plain text query (default: true)
language: # language of the full text query (default: english)
limit: # limit the amount of rows to size (default: 100)
offset: # offset the amount of rows
fields: # list of fields return in that order, defaults (empty or not present) to all fields in fields order.
sort: # ordered list of field names as, eg: "fieldname1, fieldname2 desc"
}

.. _datastore_search_sql:

datastore_search_sql
~~~~~~~~~~~~~~~~~~~~

The datastore_search_sql API endpoint allows a user to search data in a resource or connect multiple resources with join expressions. The underlying SQL engine is the `PostgreSQL engine <http://www.postgresql.org/docs/9.1/interactive/sql/.html>`_. The JSON for searching must be in the following form::

{
sql: # a single sql select statement
}


.. _datastore_search_htsql:

datastore_search_htsql
~~~~~~~~~~~~~~~~~~~~~~
.. note:: Lists can always be expressed in different ways. It is possible to use lists, comma separated strings or single items. These are valid lists: ``['foo', 'bar']``, ``'foo, bar'``, ``"foo", "bar"`` and ``'foo'``.

.. note:: HTSQL is not in the core DataStore. To use it, it is necessary to install the ckanext-htsql extension available at https://github.com/okfn/ckanext-htsql.
.. automodule:: ckanext.datastore.logic.action
:members:

The datastore_search_htsql API endpoint allows a user to search data in a resource using the `HTSQL <http://htsql.org/doc/>`_ query expression language. The JSON for searching must be in the following form::

{
htsql: # a htsql query statement.
}

.. _fields:

@@ -375,19 +286,30 @@ Table aliases

A resource in the DataStore can have multiple aliases that are easier to remember than the resource id. Aliases can be created and edited with the datastore_create API endpoint. All aliases can be found in a special view called ``_table_metadata``.


.. _datastore_search_htsql:

HTSQL Support
=============


The `ckanext-htsql <https://github.com/okfn/ckanext-htsql>`_ extension adds an API action that allows a user to search data in a resource using the `HTSQL <http://htsql.org/doc/>`_ query expression language. Please refer to the extension documentation to learn more.



Comparison of different querying methods
----------------------------------------
========================================

The DataStore supports querying with multiple API endpoints. They are similar but support different features. The following list gives an overview of the different methods.

============================== ======================= =========================== =============================
.. :ref:`datastore_search` :ref:`datastore_search_sql` :ref:`datastore_search_htsql`
.. SQL HTSQL
============================== ======================= =========================== =============================
**Status** Stable Stable Available as extension
**Ease of use** Easy Complex Medium
**Flexibility** Low High Medium
**Query language** Custom (JSON) SQL HTSQL
**Connect multiple resources** No Yes Not yet
**Use aliases** Yes Yes Yes
============================== ======================= =========================== =============================
============================== ======================================================== ============================================================ =============================
.. :meth:`~ckanext.datastore.logic.action.datastore_search` :meth:`~ckanext.datastore.logic.action.datastore_search_sql` :ref:`datastore_search_htsql`
.. SQL HTSQL
============================== ======================================================== ============================================================ =============================
**Status** Stable Stable Available as extension
**Ease of use** Easy Complex Medium
**Flexibility** Low High Medium
**Query language** Custom (JSON) SQL HTSQL
**Connect multiple resources** No Yes Not yet
**Use aliases** Yes Yes Yes
============================== ======================================================== ============================================================ =============================
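
To relate the two stable query methods in the table, the removed reference text notes that datastore_search filters such as ``{'key1': 'a', 'key2': 'b'}`` are equivalent to ``select * from table where key1 = 'a' and key2 = 'b'``. The sketch below makes that correspondence concrete. It is illustrative only and not how the extension builds its queries: `filters_to_select` is a hypothetical helper, and the ``%s`` placeholder style is an assumption borrowed from common PostgreSQL drivers.

```python
def quote_identifier(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def filters_to_select(table, filters):
    """Translate a filters dict into a parameterised SELECT statement.

    Returns (sql, params). Values are kept out of the SQL string and
    returned separately so a database driver can bind them safely.
    """
    sql = "SELECT * FROM " + quote_identifier(table)
    clauses = []
    params = []
    # Sort field names so the generated statement is deterministic.
    for field in sorted(filters):
        clauses.append(quote_identifier(field) + " = %s")
        params.append(filters[field])
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = filters_to_select("my_table", {"key1": "a", "key2": "b"})
```

All conditions are joined with ``AND``, matching the equivalence quoted above. Anything beyond equality matching, such as joins across resources, is where datastore_search_sql is needed.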
