CKAN allows users to upload files directly to it, either as files attached to a resource or as images displayed for groups and organizations.
Changed in version 2.2: Previous versions of CKAN allowed uploads to remote cloud hosting, but this has been simplified to allow only local file uploads (see below for details on how to migrate). This gives CKAN more control over the files and makes access control possible.
To set up CKAN's FileStore with local file storage:
Create the directory where CKAN will store uploaded files:
sudo mkdir -p
Add the following line to your CKAN config file, after the [app:main] line:
ckan.storage_path =
Set the permissions of the ckan.storage_path directory. For example, if you're running CKAN with Apache, then Apache's user (www-data on Ubuntu) must have read, write and execute permissions for the ckan.storage_path directory:
sudo chown www-data
sudo chmod u+rwx
Restart your web server (for example, Apache) so that the new setting takes effect.
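Once the steps above are done, a quick check of the directory permissions can catch the most common setup mistake. The following is only a minimal sketch using the Python standard library; the function names `check_storage_path` and `missing_permissions` are hypothetical helpers, not part of CKAN. Note that `os.access` tests the permissions of the user running the script, so it would need to be run as the web server's user (for example www-data) to be meaningful.

```python
import os


def check_storage_path(path):
    """Report whether the current user can use `path` as a storage directory."""
    return {
        "is_dir": os.path.isdir(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        "executable": os.access(path, os.X_OK),
    }


def missing_permissions(path):
    """Return the names of the checks that failed, sorted alphabetically."""
    report = check_storage_path(path)
    return sorted(name for name, ok in report.items() if not ok)
```

An empty list from `missing_permissions` suggests the directory is usable by the current user; any entries name the permissions that still need to be granted.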
File uploads are integrated directly into the dataset creation and editing system, with files being associated with resources.
Changed in version 2.2: The previous API has been deprecated, although it should still work if you were using local file storage.
The API is part of the ckan.logic.action.create.resource_create and ckan.logic.action.update.resource_update action API functions. You can post multipart/form-data to the API, and the key-value pairs will be treated as if they were a JSON object. The extra key upload is used to post the binary data itself.
curl automatically sets the multipart/form-data header when using the --form option:
curl -H'Authorization: your-api-key' 'http://yourhost/api/action/resource_create' --form upload=@filetoupload --form package_id=my_dataset
The Python requests library uses the files parameter and automatically sets the multipart/form-data header too:
import requests

# Open the file in binary mode so it is sent unmodified.
with open('/path/to/file/to/upload.csv', 'rb') as upload:
    requests.post('http://0.0.0.0:5000/api/action/resource_create',
                  data={"package_id": "my_dataset"},
                  headers={"X-CKAN-API-Key": "21a47217-6d7b-49c5-88f9-72ebd5a4d4bb"},
                  files=[('upload', upload)])
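The call above does not inspect the reply. A slightly fuller sketch follows, assuming the requests library is installed and that the action API answers with a JSON object carrying a "success" flag plus either a "result" or an "error" key (the usual shape of CKAN action API responses). The function names `unwrap_action_response` and `upload_resource` are hypothetical, and the API key is sent in the Authorization header as in the curl example.

```python
def unwrap_action_response(payload):
    """Return the "result" of an action API reply, or raise if the action failed."""
    if not payload.get("success"):
        raise RuntimeError("CKAN action failed: %r" % (payload.get("error"),))
    return payload.get("result")


def upload_resource(base_url, api_key, package_id, path):
    """POST a local file to resource_create as multipart/form-data."""
    # Third-party dependency, imported lazily so the helper above needs nothing extra.
    import requests

    # The context manager closes the file once the request has been sent.
    with open(path, "rb") as handle:
        response = requests.post(
            base_url.rstrip("/") + "/api/action/resource_create",
            data={"package_id": package_id},
            headers={"Authorization": api_key},
            files=[("upload", handle)],
        )
    return unwrap_action_response(response.json())
```

Calling `upload_resource('http://yourhost', 'your-api-key', 'my_dataset', 'filetoupload')` would then either return the created resource or raise with the error details from the reply.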
With ckan.logic.action.update.resource_update, if you want to replace an uploaded file you just need to set the upload field again:
curl -H'Authorization: your-api-key' 'http://yourhost/api/action/resource_update' --form upload=@newfiletoupload --form id=resourceid
If you want to clear the upload and replace it with a remote URL, there is a special boolean field, clear_upload, for this:
curl -H'Authorization: your-api-key' 'http://yourhost/api/action/resource_update' --form url=http://example.com --form clear_upload=true --form id=resourceid
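The two resource_update cases above differ only in which form fields are sent. As an illustration, the sketch below builds the plain form fields for either case; `resource_update_fields` is a hypothetical helper, not a CKAN function, and the resulting dictionary would be passed as the data argument of a multipart POST (with the new file supplied separately under the upload key when replacing a file).

```python
def resource_update_fields(resource_id, remote_url=None):
    """Build the non-file form fields for a resource_update request.

    With no remote_url, only the id is sent and the caller attaches the new
    file under the "upload" key. With a remote_url, the stored upload is
    cleared and the resource points at that URL instead.
    """
    fields = {"id": resource_id}
    if remote_url is not None:
        fields["url"] = remote_url
        fields["clear_upload"] = "true"
    return fields
```

Keeping the field logic in one place makes it harder to send clear_upload together with a new file by accident.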
It is also possible to have uploaded files (if of a suitable format) stored in the DataStore, which then provides an API to the data. See datapusher for more details.
If you are using pairtree local file storage, you can keep your current settings without issue. The pairtree and new storage can live side by side, but you are still encouraged to migrate. If you change your config options to the ones specified in these docs, you will need to run the migration below.
If you are running remote storage, all previous links will still be accessible, but if you want to move the remote storage documents to local storage you will also need to run the migration.
To migrate, make sure your CKAN instance is running, as the script requests the data from the instance using its APIs. Run the following on the command line to do the migration:
paster db migrate-filestore
This may take a long time, especially if you have a lot of files stored remotely. If the remote hosting goes down or the job is interrupted, it is safe to run it again; it will retry all the files that were not migrated successfully.
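Because the migration talks to the running instance over its API, it can help to confirm that the site responds before starting a long run. The sketch below is one possible check, assuming the instance exposes the status_show action at /api/action/status_show; the function name `ckan_is_up` and the injectable `fetch` parameter are hypothetical and exist only to keep the example testable without a network.

```python
import json
from urllib.request import urlopen


def _default_fetch(url):
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def ckan_is_up(base_url, fetch=None):
    """Return True if the instance answers status_show with success."""
    fetch = fetch or _default_fetch
    url = base_url.rstrip("/") + "/api/action/status_show"
    try:
        payload = json.loads(fetch(url))
    except (OSError, ValueError):
        # Connection problems raise OSError subclasses; bad JSON raises ValueError.
        return False
    return isinstance(payload, dict) and bool(payload.get("success"))
```

If `ckan_is_up('http://yourhost')` returns False, it is probably worth fixing the instance before running the migration command.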