Skip to content

Commit

Permalink
[#1797,doc][s]: basic docs for webstore including how to enable and h…
Browse files Browse the repository at this point in the history
…ow it works.
  • Loading branch information
rufuspollock committed Feb 27, 2012
1 parent 9c04cbd commit ac46a4d
Showing 1 changed file with 95 additions and 0 deletions.
95 changes: 95 additions & 0 deletions doc/webstore.rst
@@ -0,0 +1,95 @@
========
Webstore
========

Webstore is a structured data store integrated into CKAN. It uses ElasticSearch_
as the persistence and query layer with CKAN wrapping this with a thin
authorization and authentication layer.

To use you will need to be using Nginx as your webserver as we utilize its
XSendfile_ feature to transparently hand off data requests to ElasticSeach
internally.

.. _ElasticSearch: http://www.elasticsearch.org/
.. _XSendfile: http://wiki.nginx.org/XSendfile

Using the Webstore
==================

Each resource in a CKAN instance will now have a Webstore 'table' associated
with it. This table will be accessible via a web interface at::

/api/resource/{id}/data

And also, for convenience, at (via a redirect)::

/dataset/{name}/resource/{resource-id}/data

This interface to this data is *exactly* the same as that provided by
ElasticSearch to documents of a specific type in one of its indices.

Installation and Configuration
=============================

1. Install ElasticSearch_
-------------------------

Please see the ElasticSearch_ documentation.

2. Configure Nginx
------------------

You must add to your Nginx CKAN site entry the following::

location /elastic/ {
internal;
# location of elastic search
proxy_pass http://0.0.0.0:9200/;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

.. note:: update the proxy_pass field value to point to your ElasticSearch
instance (if it is not localhost and default port).

3. Enable webstore features in CKAN
-----------------------------------

In your config file set::

ckan.webstore.enabled = 1

4. Test it
----------



Webstorer: Automatically Add Data to the Webstore
=================================================

Often, when you upload data you will want it to be automatically added to the
Webstore. This requires some processing, to extract the data from your files
and to add it to the Webstore in the format it understands. For more
information on the architecture see http://wiki.ckan.org/Storage.

This task of automatically parsing and then adding data to the webstore is
performed by a Webstorer, a queue process that runs asynchronously and can be
triggered by uploads or other activities. The Webstorer is an extension and can
be found, along with installation instructions, at:

https://github.com/okfn/ckanext-webstorer


How It Works (Technically)
==========================

1. Request arrives at e.g. /dataset/{id}/resource/{resource-id}/data
2. CKAN checks authentication and authorization.
3. (Assuming OK) CKAN hands (internally) to ElasticSearch which handles the
request

* To do this we use Nginx's Sendfile / Accel-Redirect feature. This allows
us to hand off a user request *directly* to ElasticSearch after the
authentication and authorization. This avoids the need to proxy the
request and results through CKAN code.

0 comments on commit ac46a4d

Please sign in to comment.