Merge pull request #7 from cezio/38_updates
38 updates
Tobia Di Pisa committed Jun 14, 2018
2 parents 4b1c11a + f54e827 commit c897bda
Showing 9 changed files with 366 additions and 0 deletions.
Binary file added source/images/reports/reports-view-licenses.png
Binary file added source/images/reports/reports-view-shortcut.png
Binary file added source/images/reports/reports-view.png
17 changes: 17 additions & 0 deletions source/install/install_ckan.rst
@@ -380,6 +380,23 @@ Edit the file ``/etc/ckan/default/production.ini``
ckan.auth.create_user_via_web = false
ckan.auth.roles_that_cascade_to_sub_groups = admin

- Misc settings::

    ckan.resource_proxy.max_file_size = 5242880

- External resources settings
  (see :ref:`Extras extension <ckanext-extras-extension>`)::

    ckanext.extras.local_sites =
        http://127.0.0.1
        http://localhost:8080
        # ..(other local url prefixes)

    ckanext.extras.external_sites =
        http://localhost:8080/geoserver
        # ..(other remote resources with local prefixes)
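
The multi-line values above only parse as intended when the continuation lines are indented. The following is a minimal sketch, assuming the file is read with a ConfigParser-style INI parser (as PasteDeploy does) and assuming the values are split on whitespace; the helper name ``read_prefixes`` and the splitting rule are illustrative and not taken from the extension.

```python
import configparser

# Hypothetical excerpt of production.ini; the section name follows the usual
# CKAN layout, the option names come from the settings above.
SAMPLE_INI = """
[app:main]
ckanext.extras.local_sites =
    http://127.0.0.1
    http://localhost:8080
ckanext.extras.external_sites =
    http://localhost:8080/geoserver
"""


def read_prefixes(ini_text, option, section="app:main"):
    """Return the URL prefixes stored in a (possibly multi-line) option."""
    parser = configparser.RawConfigParser()
    parser.read_string(ini_text)
    # Indented lines are joined to the option they follow; splitting on
    # whitespace is an assumption about how the value is consumed.
    return parser.get(section, option).split()


local_sites = read_prefixes(SAMPLE_INI, "ckanext.extras.local_sites")
external_sites = read_prefixes(SAMPLE_INI, "ckanext.extras.external_sites")
```

Without the indentation, a line such as ``http://127.0.0.1`` would be read as a separate option rather than as part of the list.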



The file ``who.ini`` (the *Repoze.who* configuration file) needs to be accessible
in the same directory as your CKAN config file, so create a symlink to it::
349 changes: 349 additions & 0 deletions source/install/install_ckan_otherext_provbz.rst
@@ -426,6 +426,352 @@ ckanext-shibboleth extension should be just appended to the existing ones ('it'
5 - Repeat the steps above for the 'de' locales and finally restart CKAN.


.. _ckanext-gsreports-extension:

=================
Reports extension
=================


`ckanext-gsreport` provides aggregated information about:

* resource formats used,
* license types used,
* resources that link to erroneous or invalid URLs.

This extension uses `ckanext-report`. Reports are generated periodically and are visible to admin users.

------------
Installation
------------

This extension requires `ckanext-report` and `owslib` to be installed prior to using `ckanext-gsreport`.


1. Install `ckanext-report` and initialize its database tables::

       $ git clone https://github.com/datagovuk/ckanext-report.git
       $ cd ckanext-report
       $ pip install -e .
       $ paster --plugin=ckanext-report report initdb --config=path/to/config.ini

2. Clone the repository and install the package::

       $ git clone https://github.com/geosolutions-it/ckanext-gsreport.git
       $ cd ckanext-gsreport
       $ pip install -r requirements.txt
       $ pip install -e .

3. Add `status_reports` to the plugins. **Note:** the order of entries matters; this plugin must be placed **before** the `report` plugin::

       ckan.plugins = .. status_reports report

4. Rebuild the Solr index (the license and resource format reports rely on special placeholders in Solr to find entries that have no value)::

       paster --plugin=ckan search-index rebuild_fast -c /path/to/config.ini

5. Generate the reports (see :ref:`ckanext-gsreports-usage` below).
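
Because the plugin order in step 3 is easy to get wrong, it can be checked before restarting CKAN. This is a small sketch, not part of either extension: the function name is hypothetical, and it only inspects the text of the ``ckan.plugins`` value.

```python
def status_reports_before_report(plugins_value):
    """Return True if `status_reports` is listed before `report`.

    `plugins_value` is the right-hand side of the `ckan.plugins` setting,
    a whitespace-separated list of plugin names.
    """
    plugins = plugins_value.split()
    if "status_reports" not in plugins or "report" not in plugins:
        return False
    return plugins.index("status_reports") < plugins.index("report")
```

For example, ``status_reports_before_report("stats status_reports report")`` is true, while the same call with the two names swapped is false.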

.. _ckanext-gsreports-usage:

-----
Usage
-----

Generating reports
^^^^^^^^^^^^^^^^^^

Reports can be generated in two ways:

* from the CLI (this can be used to set up a cron job):

  * generate all reports::

        $ paster --plugin=ckanext-report report generate --config=path/to/config.ini

  * generate one report::

        $ paster --plugin=ckanext-report report generate $report-name --config=path/to/config.ini

* from the UI: a report is generated when a user opens its page under the `/report` URL for the first time (that is, while the report has no data yet).

.. warning::

   Report generation can take a while to produce results. The `broken-links` report in particular may take a significant amount of time, because it checks each resource for availability.

.. note::

   Report generation speed depends on network speed, the response time of the resources, and the number of resources to check. It is therefore recommended to run report generation outside the web process, for example with cron.
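
One way to run the generation outside the web process is a small wrapper script that cron calls. The sketch below only assembles the ``paster`` command line documented above and runs it; the function names are hypothetical, and the config path and report name are whatever your installation uses.

```python
import subprocess


def build_generate_command(config_path, report_name=None):
    """Return the paster argv that generates all reports, or a single one."""
    command = ["paster", "--plugin=ckanext-report", "report", "generate"]
    if report_name is not None:
        # Restrict the run to one report, e.g. "broken-links".
        command.append(report_name)
    command.append("--config=" + config_path)
    return command


def generate_reports(config_path, report_name=None):
    """Run the generation; raises CalledProcessError if paster fails."""
    subprocess.check_call(build_generate_command(config_path, report_name))
```

The script has to run inside the CKAN virtual environment so that ``paster`` and the plugins can be found.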


Usage
^^^^^

The main reports view is available to admin users only. It can be reached at the `/reports` path or through the **Reports** link in the footer:

.. image:: ../images/reports/reports-view-shortcut.png


Main view
+++++++++

The main view lists the available reports:

.. image:: ../images/reports/reports-view.png


Broken links report view
++++++++++++++++++++++++

The broken links report view lists organizations, each with its total number of datasets and the number of datasets that have broken links:

.. image:: ../images/reports/reports-view-broken-links.png

Users can filter the list by organization, either by selecting one from the dropdown or by clicking one in the table. Once an organization is selected, the list becomes more detailed and shows the actual resources, grouped by dataset. Each entry links to the resource and its dataset and states the type of error (including the response, if available).

.. image:: ../images/reports/reports-view-broken-links-org.png

Licenses report view
++++++++++++++++++++

The licenses report view shows each license type and the number of datasets using it. Only public datasets are included in this report.

.. image:: ../images/reports/reports-view-licenses.png

License names link to a search page that lists the datasets using that license.

Resource type report view
+++++++++++++++++++++++++

The resource type report view shows the formats used by resources.

.. image:: ../images/reports/reports-view-resource-type.png

Each format name links to a detailed report listing all resources that use that format. The detailed report can additionally be filtered by organization.


.. image:: ../images/reports/reports-view-resource-type-type.png


Exporting data
++++++++++++++

Each report can be exported to either CSV or JSON format.

.. note::

   Exported data may contain more data than is shown in the report view.

.. note::

   Data export returns data only for the current view. For example, if the report view shows data filtered by organization, the export will also return data for that organization only.

.. _ckanext-extras-extension:

================
Extras extension
================

The `ckanext-extras` CKAN extension provides the `external_resource_list` action, which returns a list of public resources that are not local (that is, resources served by an external service).

------------
Installation
------------

Installing all the other extensions required

1. Activate your CKAN virtual environment, for example:

   .. code::

      . /usr/lib/ckan/default/bin/activate

2. Go into your CKAN path for extensions (like /usr/lib/ckan/default/src):

   .. code::

      git clone https://github.com/geosolutions-it/ckanext-extras.git
      cd ckanext-extras
      pip install -e .

3. Add `external_resource_list` to the `ckan.plugins` setting in your CKAN config file (by default the config file is located at `/etc/ckan/default/production.ini`).

4. Restart CKAN. For example if you've deployed CKAN with Apache on Ubuntu:

   .. code::

      sudo service apache2 reload

-------------
Configuration
-------------


This extension uses the `ckan.site_url` value to decide whether a URL is external. A URL that starts with the local site value is considered local.

However, a single site URL may not be sufficient. In that case you can add `ckanext.extras.local_sites` to the config. It can be a string or a list of strings with base URLs that should be considered local.

Additionally, URLs that start with a local site prefix may actually be external (proxied from external sites). For that case you can also set `ckanext.extras.external_sites`.

To decide whether a URL is external in such a scenario, it is checked against the external sites first (a URL that starts with an external site prefix is considered external at this point), then against the local sites (a URL that starts with a local site prefix is considered local). If neither check matches, the URL is considered external.

Example
^^^^^^^

Sample configuration:

.. code::

   ckan.site_url = http://public.address
   ckanext.extras.local_sites =
       http://localhost
       http://127.0.0.1
   ckanext.extras.external_sites =
       http://localhost/proxied
       http://public.address/remote/

.. csv-table::
   :header: "URL","Is external?"

   `http://public.address/index`,No
   `http://public.address/remote/index`,Yes
   `http://localhost/resource/001`,No
   `http://localhost/proxied/resource/001`,Yes
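
The resolution order described in this section can be sketched in a few lines. This is an illustration of the documented rules, not the extension's actual code: the function name and signature are hypothetical, and treating `ckan.site_url` as one more local prefix is an assumption drawn from the table above.

```python
def is_external(url, site_url, local_sites=(), external_sites=()):
    """Classify a URL using plain prefix matching, in the documented order."""
    # 1. External prefixes are checked first, so a proxied path under a
    #    local prefix is still reported as external.
    if any(url.startswith(prefix) for prefix in external_sites):
        return True
    # 2. Then the local prefixes; ckan.site_url counts as local (assumption).
    local_prefixes = (site_url,) + tuple(local_sites)
    if any(url.startswith(prefix) for prefix in local_prefixes):
        return False
    # 3. Nothing matched: the URL is considered external.
    return True


# The values from the sample configuration.
SITE_URL = "http://public.address"
LOCAL_SITES = ["http://localhost", "http://127.0.0.1"]
EXTERNAL_SITES = ["http://localhost/proxied", "http://public.address/remote/"]
```

Calling ``is_external(url, SITE_URL, LOCAL_SITES, EXTERNAL_SITES)`` for each URL in the table gives the answers listed there, and any URL matching no prefix falls through to step 3.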

-----------------------------
Accessing external links list
-----------------------------

The external resources list is available through the API, under the `api/action/external_resource_list` endpoint.

Sample response:

.. code:: json

   {"help": "http://localhost:5000/api/3/action/help_show?name=external_resource_list",
    "success": true,
    "result": {"count": 1,
               "data": [{"url": "https://ckan.org/documentation-and-api/",
                         "id": "5e26f241-d3f9-4f48-b342-03e3364ca16f",
                         "name": "Documentation",
                         "dataset":
                           {"url": "http://localhost:5000/dataset/ed66af9c-d8ee-4dd5-8a05-acbfc760a323",
                            "id": "ed66af9c-d8ee-4dd5-8a05-acbfc760a323",
                            "title": "Licensed dataset"}}],
               "limit": 50, "offset": 0}}

The external resource list accepts two parameters:

* `limit` - number of items returned on a page
* `offset` - zero-based index of the first item to return.
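
A client can walk the whole list by increasing `offset` in steps of `limit`. The sketch below is one possible way to do that: both function names are hypothetical, and ``fetch_page`` stands for any callable that requests one page and returns the decoded ``result`` object shown above. Stopping on a short page avoids relying on the exact meaning of ``count``, which this document does not define.

```python
from urllib.parse import urlencode


def external_resource_list_url(site_url, limit=50, offset=0):
    """Build the action URL for one page of results."""
    query = urlencode([("limit", limit), ("offset", offset)])
    return site_url.rstrip("/") + "/api/action/external_resource_list?" + query


def iter_external_resources(fetch_page, limit=50):
    """Yield every resource, requesting pages until one comes back short.

    `fetch_page(limit, offset)` must return the "result" dict of a response.
    """
    offset = 0
    while True:
        result = fetch_page(limit, offset)
        for item in result["data"]:
            yield item
        if len(result["data"]) < limit:
            break
        offset += limit
```

In practice ``fetch_page`` could open ``external_resource_list_url(...)`` with ``urllib.request.urlopen``, decode the JSON body, and return its ``"result"`` entry.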

==========
DataPusher
==========

Automatically add Data to the CKAN DataStore.

.. hint::
   Doc page at http://docs.ckan.org/projects/datapusher/en/latest/index.html

As ``root`` install the WSGI apache module::

    yum install mod_wsgi

As ``ckan``, create a brand new virtualenv, and install the datapusher app in it::

    virtualenv /usr/lib/ckan/datapusher
    mkdir /usr/lib/ckan/datapusher/src
    cd /usr/lib/ckan/datapusher/src
    git clone -b stable https://github.com/ckan/datapusher.git
    cd datapusher/
    . ../../bin/activate
    pip install -r requirements.txt
    python setup.py develop

Create configuration files::

    cp /usr/lib/ckan/datapusher/src/datapusher/deployment/datapusher_settings.py /etc/ckan/default/datapusher_settings.py
    cp /usr/lib/ckan/datapusher/src/datapusher/deployment/datapusher.wsgi /etc/ckan/default/datapusher.wsgi

Then edit ``/etc/ckan/default/datapusher.wsgi`` and adjust the settings path from::

    os.environ['JOB_CONFIG'] = '/etc/ckan/datapusher_settings.py'

to ::

    os.environ['JOB_CONFIG'] = '/etc/ckan/default/datapusher_settings.py'

Then create a file named ``/etc/httpd/conf.d/94-datapusher.conf`` and add these lines::

    Listen 8800
    <VirtualHost 0.0.0.0:8800>
        ServerName ckan
        # this is our app
        WSGIScriptAlias / /etc/ckan/default/datapusher.wsgi
        # pass authorization info on (needed for rest api)
        WSGIPassAuthorization On
        # Deploy as a daemon (avoids conflicts between CKAN instances)
        WSGIDaemonProcess datapusher display-name=demo processes=1 threads=15
        WSGIProcessGroup datapusher
        ErrorLog /var/log/httpd/datapusher.error.log
        CustomLog /var/log/httpd/datapusher.log combined
        <Directory "/" >
            Require all granted
        </Directory>
    </VirtualHost>

Now let's allow connections to port 8800 in SELinux::

    semanage port -a -t http_port_t -p tcp 8800

and restart httpd in order to load the new configuration::

    systemctl restart httpd

Test the datapusher entrypoint with a request like ::

    curl http://localhost:8800

on the same machine CKAN is running on.
You should get a response like this::

    {
    "help": "\n Get help at:\n http://ckan-service-provider.readthedocs.org/."
    }

Now let's make CKAN aware that the datapusher is available.

Edit the file ``/etc/ckan/default/production.ini`` and:

- add the ``datapusher`` plugin::

    ckan.plugins = [... other plugins...] datapusher

- remove the comments from the lines::

    ckan.datapusher.formats = csv xls xlsx tsv application/csv application/vnd.ms-excel application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
    ckan.datapusher.url = http://127.0.0.1:8800/

Finally, restart supervisord to make CKAN reload the configuration::

    systemctl restart supervisord


==================
Document changelog
==================
@@ -437,3 +783,6 @@ Document changelog
+---------+------+--------+---------------------------------------+
| 1.1     |      |        | Improve doc for installing shibboleth |
+---------+------+--------+---------------------------------------+
| 1.2     | 2018 | CS     | Updated information on additional     |
|         | 05-25|        | extensions                            |
+---------+------+--------+---------------------------------------+
