filestore.rst

FileStore and file uploads

CKAN allows users to upload files directly to it, either as the file for a resource or as images displayed for groups and organizations.

Changed in version 2.2: Previous versions of CKAN allowed uploads to remote cloud hosting, but this has been simplified to allow only local file uploads (see below for details on how to migrate). This gives CKAN more control over the files and makes access control possible.

Setup file uploads

To set up CKAN's FileStore with local file storage:

  1. Create the directory where CKAN will store uploaded files:

    sudo mkdir -p

  2. Add the following lines to your CKAN config file, after the [app:main] line:

    ckan.storage_path =

  3. Set the permissions of the ckan.storage_path. For example if you're running CKAN with Apache, then Apache's user (www-data on Ubuntu) must have read, write and execute permissions for the ckan.storage_path:

    sudo chown www-data
    sudo chmod u+rwx

  4. Restart your web server, for example to restart Apache:

FileStore web interface

File upload is integrated directly into the dataset creation and editing interface, and each uploaded file is associated with a resource.

FileStore API

Changed in version 2.2: The previous API has been deprecated, although it should still work if you were using local file storage.

The API is part of the :py:func:`~ckan.logic.action.create.resource_create` and :py:func:`~ckan.logic.action.update.resource_update` action API functions. You can post multipart/form-data to the API, and the key/value pairs will be treated as if they were a JSON object. The extra key upload is used to post the actual binary data.

curl automatically sets the multipart/form-data header when you use the --form option:

curl -H'Authorization: your-api-key' 'http://yourhost/api/action/resource_create' --form upload=@filetoupload --form package_id=my_dataset

The Python requests library uses the files parameter and also sets the multipart/form-data header automatically:

import requests

# Open the file in binary mode so the bytes are sent unchanged.
requests.post('http://0.0.0.0:5000/api/action/resource_create',
              data={"package_id": "my_dataset"},
              headers={"X-CKAN-API-Key": "21a47217-6d7b-49c5-88f9-72ebd5a4d4bb"},
              files=[('upload', open('/path/to/file/to/upload.csv', 'rb'))])
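The one-off call above does not close the file or inspect the response. The following is a minimal sketch of a slightly more defensive version, not a definitive client. The function names are hypothetical, and it assumes the action API replies with a JSON object carrying a boolean `success` key plus either `result` or `error`, which the text above does not state.

```python
import json


def resource_create_url(site_url):
    """Build the resource_create action URL for a CKAN site (hypothetical helper)."""
    return site_url.rstrip("/") + "/api/action/resource_create"


def unwrap_action_response(body):
    """Return the 'result' of an action API response, or raise on failure.

    Assumption: the response body is a JSON object with a boolean
    'success' key and either a 'result' or an 'error' key.
    """
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError("CKAN action failed: %r" % (payload.get("error"),))
    return payload["result"]


def upload_resource(site_url, api_key, package_id, path):
    """Upload the file at `path` as a new resource of `package_id`."""
    # Third-party library, imported here so the helpers above work without it.
    import requests

    # The context manager closes the file once the request has been sent.
    with open(path, "rb") as fh:
        response = requests.post(
            resource_create_url(site_url),
            data={"package_id": package_id},
            headers={"Authorization": api_key},
            files=[("upload", fh)],
        )
    return unwrap_action_response(response.text)
```

A call such as `upload_resource('http://yourhost', 'your-api-key', 'my_dataset', '/path/to/file/to/upload.csv')` would then return the created resource as a dictionary, or raise if the server reports a failure.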

With :py:func:`~ckan.logic.action.update.resource_update`, to replace an uploaded file you just need to set the upload field again:

curl -H'Authorization: your-api-key' 'http://yourhost/api/action/resource_update' --form upload=@newfiletoupload --form id=resourceid

If you want to clear the upload and replace it with a remote URL, use the special boolean field clear_upload:

curl -H'Authorization: your-api-key' 'http://yourhost/api/action/resource_update' --form url=http://example.com --form clear_upload=true --form id=resourceid
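The same request can be sketched with the Python requests library. Because there is no file to attach here, this sketch passes each field through the files parameter as a `(None, value)` tuple, which makes requests send multipart/form-data just as curl's --form does. The function names are hypothetical, and whether your CKAN version also accepts other encodings for this call is not covered by the text above.

```python
def clear_upload_parts(resource_id, remote_url):
    """Multipart fields that clear a resource's upload and point it at a URL.

    Each value is a (filename, content) tuple; a filename of None makes
    requests emit a plain form field rather than a file part.
    """
    return [
        ("id", (None, resource_id)),
        ("url", (None, remote_url)),
        ("clear_upload", (None, "true")),
    ]


def clear_upload(site_url, api_key, resource_id, remote_url):
    """Post the fields above to resource_update (hypothetical helper)."""
    # Third-party library, imported here so clear_upload_parts works without it.
    import requests

    return requests.post(
        site_url.rstrip("/") + "/api/action/resource_update",
        headers={"Authorization": api_key},
        files=clear_upload_parts(resource_id, remote_url),
    )
```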

It is also possible to have uploaded files (if they are in a suitable format) stored in the DataStore, which then provides an API to the data. See datapusher for more details.

Migration from 2.1 to 2.2

If you are using pairtree local file storage, you can keep your current settings without issue. The pairtree storage and the new storage can live side by side, but you are still encouraged to migrate. If you change your config options to the ones specified in these docs, you will need to run the migration below.

If you are running remote storage, all previous links will still be accessible, but if you want to move the remotely stored files to local storage you will also need to run the migration.

To migrate, make sure your CKAN instance is running, because the script requests the data from the instance through the API. Then run the following on the command line to do the migration:

paster db migrate-filestore

This may take a long time, especially if you have a lot of files stored remotely. If the remote hosting goes down or the job is interrupted, it is safe to run it again; it will retry the files that were not migrated successfully.