Merge pull request #8 from qld-gov-au/develop
Develop to master
ThrawnCA authored Mar 4, 2022
2 parents 362157d + 72d7299 commit b4c3e82
Showing 33 changed files with 843 additions and 353 deletions.
20 changes: 20 additions & 0 deletions .flake8
@@ -0,0 +1,20 @@
[flake8]
# @see https://flake8.pycqa.org/en/latest/user/configuration.html?highlight=.flake8

exclude =
    ckan
    scripts

# Extended output format.
format = pylint

# Show the source of errors.
show_source = True

max-complexity = 10
max-line-length = 127

# List ignore rules one per line.
ignore =
    C901
    W503
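
Multi-line values such as `exclude` and `ignore` depend on their continuation lines being indented. The following is a minimal sketch, using only Python's standard `configparser`, of how such a file can be read; the helper `read_list_option` is a hypothetical name introduced for illustration, and flake8's own option handling may differ in detail.

```python
import configparser

# Same options as the .flake8 file in this commit; continuation lines are
# indented so the INI parser treats them as part of the preceding option.
FLAKE8_CONFIG = """\
[flake8]
exclude =
    ckan
    scripts
format = pylint
show_source = True
max-complexity = 10
max-line-length = 127
ignore =
    C901
    W503
"""


def read_list_option(parser, section, option):
    """Split a multi-line INI value into a list of non-empty entries."""
    raw = parser.get(section, option)
    return [line.strip() for line in raw.splitlines() if line.strip()]


parser = configparser.ConfigParser()
parser.read_string(FLAKE8_CONFIG)

excluded = read_list_option(parser, "flake8", "exclude")
ignored = read_list_option(parser, "flake8", "ignore")
max_line_length = parser.getint("flake8", "max-line-length")
print(excluded, ignored, max_line_length)
```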
92 changes: 92 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,92 @@
---
name: Tests
on: [push, pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: '3.6'
      - name: Cache pip
        uses: actions/cache@v2
        with:
          # This path is specific to Ubuntu
          path: ~/.cache/pip
          # Look to see if there is a cache hit for the corresponding requirements file
          key: ${{ runner.os }}-pip-flake8-${{ hashFiles('requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-flake8-
            ${{ runner.os }}-
      - name: Install requirements
        run: pip install flake8 pycodestyle
      - name: Check syntax
        run: flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics --exclude ckan

  test:
    needs: lint
    strategy:
      matrix:
        ckan-version: [2.9, 2.9-py2, 2.8, 2.7]
      fail-fast: false

    name: CKAN ${{ matrix.ckan-version }}
    runs-on: ubuntu-latest
    container:
      image: openknowledge/ckan-dev:${{ matrix.ckan-version }}
    services:
      solr:
        image: ckan/ckan-solr-dev:${{ matrix.ckan-version }}
      postgres:
        image: ckan/ckan-postgres-dev:${{ matrix.ckan-version }}
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: postgres
        options: --health-cmd pg_isready --health-interval 10s --health-timeout 5s --health-retries 5
      redis:
        image: redis:3
    env:
      CKAN_SQLALCHEMY_URL: postgresql://ckan_default:pass@postgres/ckan_test
      CKAN_DATASTORE_WRITE_URL: postgresql://datastore_write:pass@postgres/datastore_test
      CKAN_DATASTORE_READ_URL: postgresql://datastore_read:pass@postgres/datastore_test
      CKAN_SOLR_URL: http://solr:8983/solr/ckan
      CKAN_REDIS_URL: redis://redis:6379/1

    steps:
      - uses: actions/checkout@v2

      - name: Cache pip
        uses: actions/cache@v2
        with:
          # This path is specific to Ubuntu
          path: ~/.cache/pip
          # Look to see if there is a cache hit for the corresponding requirements file
          key: ${{ runner.os }}-pip-${{ hashFiles('*requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
            ${{ runner.os }}-
      - name: Install requirements
        run: |
          pip install -r requirements-dev.txt
          pip install -e .
          # Replace default path to CKAN core config file with the one on the container
          sed -i -e 's/use = config:.*/use = config:\/srv\/app\/src\/ckan\/test-core.ini/' test.ini
      - name: Setup extension (CKAN >= 2.9)
        if: ${{ matrix.ckan-version != '2.7' && matrix.ckan-version != '2.8' }}
        run: |
          ckan -c test.ini db init
          ckan -c test.ini report initdb
          ckan -c test.ini report generate tagless-datasets
      - name: Setup extension (CKAN < 2.9)
        if: ${{ matrix.ckan-version == '2.7' || matrix.ckan-version == '2.8' }}
        run: |
          paster --plugin=ckan db init -c test.ini
          paster --plugin=ckanext-report report initdb -c test.ini
          paster --plugin=ckanext-report report generate tagless-datasets -c test.ini
      - name: Run tests
        run: pytest --ckan-ini=test.ini --cov=ckanext.report --disable-warnings ckanext/report/tests
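
The `sed` call in the "Install requirements" step rewrites the `use = config:` line of `test.ini` so that it points at the core config shipped in the container. As a rough illustration of what that substitution does, here is an equivalent sketch in Python; the function name and the sample INI text are assumptions made for this example, and only the target path comes from the workflow.

```python
import re

# Path to CKAN's core test config inside the ckan-dev container,
# taken from the sed expression in the workflow.
CONTAINER_CORE_INI = "/srv/app/src/ckan/test-core.ini"


def point_at_container_config(ini_text, core_ini=CONTAINER_CORE_INI):
    """Rewrite every 'use = config:...' line to reference core_ini."""
    # A function replacement avoids backslash-escape processing of the path.
    return re.sub(
        r"use = config:.*",
        lambda match: "use = config:" + core_ini,
        ini_text,
    )


# Hypothetical test.ini content, for illustration only.
sample = "[app:main]\nuse = config:../ckan/test-core.ini\nckan.plugins = report\n"
rewritten = point_at_container_config(sample)
print(rewritten)
```

Because `.` does not match a newline by default, the pattern stops at the end of the line, which mirrors the line-oriented behaviour of `sed`.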
42 changes: 30 additions & 12 deletions README.md
@@ -1,4 +1,6 @@
# ckanext-report
[![Tests](https://github.com/qld-gov-au/ckanext-report/actions/workflows/test.yml/badge.svg)](https://github.com/qld-gov-au/ckanext-report/actions/workflows/test.yml)
ckanext-report
====================

ckanext-report is a CKAN extension that provides a reporting infrastructure. Here are the features offered:

@@ -12,25 +14,32 @@ Example report:

![Demo report image](report-demo.png)

A number of extensions currently offer reports that rely on this extension, e.g. [ckanext-archiver](https://github.com/datagovuk/ckanext-archiver/blob/master/ckanext/archiver/reports.py), [ckanext-qa](https://github.com/datagovuk/ckanext-qa/blob/master/ckanext/qa/reports.py), [ckanext-dgu](https://github.com/datagovuk/ckanext-dgu/blob/master/ckanext/dgu/lib/reports.py).
A number of extensions currently offer reports that rely on this extension, e.g. [ckanext-archiver](https://github.com/ckan/ckanext-archiver/blob/master/ckanext/archiver/reports.py), [ckanext-qa](https://github.com/ckan/ckanext-qa/blob/master/ckanext/qa/reports.py), [ckanext-dgu](https://github.com/datagovuk/ckanext-dgu/blob/master/ckanext/dgu/lib/reports.py).

TODO:

* Stop a report from being generated multiple times in parallel (unnecessary waste) - use a queue?
* Stop more than one report being generated in parallel (high load for the server) - maybe use a queue.

Compatibility: Requires CKAN version 2.1 or later (but can be easily adapted for older versions).
## Compatibility:

Status: in production at data.gov.uk but since that uses its own CSS rather than core CKAN's, for others to use it CSS needs adding. For an example, see this branch: see https://github.com/yaditi/ckanext-report/tree/geoversion
| CKAN version | Compatibility |
| --------------- | ------------------- |
| 2.6 and earlier | yes |
| 2.7 | yes |
| 2.8 | yes |
| 2.9 | yes |

Author(s): David Read
Status: was in production at data.gov.uk around 2014-2016. Because that site uses its own CSS rather than core CKAN's, others will need to add CSS before using it. For an example, see this branch: https://github.com/GSA/ckanext-report/tree/geoversion

Author(s): David Read and contributors


## Install & setup

Install ckanext-report into your CKAN virtual environment in the usual way:

    (pyenv) $ pip install -e git+https://github.com/datagovuk/ckanext-report.git#egg=ckanext-report
    (pyenv) $ pip install -e git+https://github.com/ckan/ckanext-report.git#egg=ckanext-report

Initialize the database tables needed by ckanext-report:

@@ -43,7 +52,7 @@ Enable the plugin. In your config (e.g. development.ini or production.ini) add `

## Command-line interface

The following operations can be run from the command line using the ``paster --plugin=ckanext-report report`` command:
The following operations can be run from the command line using the ``paster --plugin=ckanext-report report`` or ``ckan report`` commands:

```
report list
@@ -56,14 +65,17 @@ The following operations can be run from the command line using the ``paster --p
Get the list of reports:

    (pyenv) $ paster --plugin=ckanext-report report list --config=mysite.ini
    (pyenv) $ ckan --config=mysite.ini report list

Generate all reports:

    (pyenv) $ paster --plugin=ckanext-report report generate --config=mysite.ini
    (pyenv) $ ckan --config=mysite.ini report generate

Generate a single report:

    (pyenv) $ paster --plugin=ckanext-report report generate <report name> --config=mysite.ini
    (pyenv) $ ckan --config=mysite.ini report generate <report name>

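The two command styles above differ only in the entry point and in where the config option goes. A small sketch of how a wrapper script might pick between them is given here; `build_report_command` is a hypothetical helper, not part of ckanext-report, and the version strings follow the matrix values used in the test workflow.

```python
def build_report_command(ckan_version, config, *args):
    """Return the argv list for a ckanext-report CLI call.

    ckan_version is a string such as '2.8' or '2.9-py2'.
    This helper is hypothetical and exists only for illustration.
    """
    major, minor = ckan_version.split("-")[0].split(".")[:2]
    if (int(major), int(minor)) >= (2, 9):
        # CKAN 2.9+ exposes extension commands through the `ckan` CLI.
        return ["ckan", "--config=" + config, "report"] + list(args)
    # Older releases use paster with an explicit plugin name.
    return (
        ["paster", "--plugin=ckanext-report", "report"]
        + list(args)
        + ["--config=" + config]
    )


print(build_report_command("2.8", "mysite.ini", "generate", "tagless-datasets"))
print(build_report_command("2.9-py2", "mysite.ini", "list"))
```

The returned list could be passed to `subprocess.run` in a cron wrapper, for example.
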

## Demo report - Tagless Datasets
@@ -85,7 +97,7 @@ ckanext-report.notes.dataset = ' '.join(('Unpublished' if asbool(pkg.extras.get(

A report has three key elements:

1. Report Code - a python function that produces the report.
1. Report Code - a python function that produces the report.
2. Template - HTML for displaying the report data.
3. Registration - containing the configuration of the report.

@@ -109,7 +121,7 @@ The returned data should be a dict like this:
'average_tags_per_package': 3.5,
}
```

There should be a `table` with the main body of the data, and any other totals or incidental pieces of data.

Note: the table is required because of the CSV download facility, and CSV demands a table. (The CSV download only includes the table, ignoring any other values in the data.) Although the data has to essentially be stored as a table, you do have the option to display it differently in the web page by using a clever template.
@@ -131,6 +143,9 @@ Report (snippet)
table - main data, as a list of rows, each row is a dict
data - other data values, as a dict
#}

{% set ckan_29_or_higher = h.ckan_version().split('.')[1] | int >= 9 %}
{% set dataset_read_route = 'dataset.read' if ckan_29_or_higher else 'dataset_read' %}
<ul>
<li>Datasets without tags: {{ table|length }} / {{ data['num_packages'] }} ({{ data['packages_without_tags_percent'] }})</li>
<li>Average tags per package: {{ data['average_tags_per_package'] }} tags</li>
@@ -149,7 +164,7 @@ data - other data values, as a dict
{% for row in table %}
<tr>
<td>
<a href="{{ h.url_for(controller='package', action='view', id=row.name) }}">
<a href="{{ h.url_for(dataset_read_route, id=row.name) }}">
{{ row.title }}
</a>
</td>
@@ -215,7 +230,10 @@ class TaglessReportPlugin(p.SingletonPlugin):
The last line refers to `tagless_report_info`, which is a dictionary with properties of the report. This is stored in `reports.py` together with the report code (see above). The info dict looks like this:

```python
from ckan.common import OrderedDict
try:
    from collections import OrderedDict  # from python 2.7
except ImportError:
    from sqlalchemy.util import OrderedDict
tagless_report_info = {
    'name': 'tagless-datasets',
    'description': 'Datasets which have no tags.',
@@ -256,4 +274,4 @@ To update template file with new translation added in the code or templates
run `python setup.py extract_messages` in the root plugin directory. Then run
`./ckanext/report/i18n/unique_pot.sh -v` to strip core ckan's translations.

To update translation files for locale "pl" with new template run `python setup.py update_catalog -l pl`.
To update translation files for locale "pl" with new template run `python setup.py update_catalog -l pl`.
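
The template change in this README diff selects the dataset route name from the CKAN version by looking at the second component of the version string. The same check can be sketched in Python as follows; the function name is an assumption for illustration, and the real template relies on Jinja's `int` filter rather than Python's `int()`.

```python
def dataset_read_route(ckan_version):
    """Mirror the template's check: compare only the minor version to 9."""
    minor = int(ckan_version.split(".")[1])
    return "dataset.read" if minor >= 9 else "dataset_read"


for version in ("2.7.12", "2.8.10", "2.9.5"):
    print(version, dataset_read_route(version))
```

Because only the minor component is examined, a version string such as `3.0.0` would fall into the older branch in this sketch; whether that matters depends on which CKAN releases are targeted.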
Empty file added ckanext/report/cli/__init__.py
Empty file.
70 changes: 70 additions & 0 deletions ckanext/report/cli/click_cli.py
@@ -0,0 +1,70 @@
# encoding: utf-8

import click

from ckanext.report.cli.command import Reporting

# Click commands for CKAN 2.9 and above


@click.group()
def report():
    """ Report commands
    """
    pass


@report.command()
def list():
    """ Lists the reports
    """
    cmd = Reporting()
    cmd.list()


@report.command()
def initdb():
    """ Initialize the database tables for this extension
    """
    cmd = Reporting()
    cmd.initdb()


@report.command()
@click.argument(u'report_names', required=False)
def generate(report_names):
    """
    Generate and cache reports - all of them unless you specify
    a comma separated list of them.
    """
    cmd = Reporting()
    # With no argument, pass None so that every report is generated.
    report_list = None
    if report_names:
        report_list = [s.strip() for s in report_names.split(',')]
    cmd.generate(report_list)


@report.command()
@click.argument(u'report_name')
@click.argument(u'options', nargs=-1)
def generate_for_options(report_name, options):
    """
    Generate and cache a report for one combination of option values.
    You can leave it with the defaults or specify options
    as more parameters: key1=value key2=value
    """
    cmd = Reporting()
    report_options = {}
    for option_arg in options:
        if '=' not in option_arg:
            # The second positional argument of BadParameter is a click
            # Context, so name the offending parameter via param_hint.
            raise click.BadParameter(
                'Option needs an "=" sign in it',
                param_hint='options')
        equal_pos = option_arg.find('=')
        key, value = option_arg[:equal_pos], option_arg[equal_pos + 1:]
        if value == '':
            value = None  # this is what the web i/f does with params
        report_options[key] = value
    cmd.generate_for_options(report_name, report_options)


def get_commands():
    return [report]
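
The `key=value` handling in `generate_for_options` can be exercised without click or CKAN. The sketch below restates that parsing as a plain function using `str.partition`; `parse_report_options` and the option names in the usage lines are hypothetical, chosen only to illustrate the behaviour described in the command's docstring.

```python
def parse_report_options(option_args):
    """Turn ['key1=value', 'key2='] into {'key1': 'value', 'key2': None}."""
    report_options = {}
    for option_arg in option_args:
        # partition splits on the first '=' only, so values may contain '='.
        key, sep, value = option_arg.partition("=")
        if not sep:
            raise ValueError('Option needs an "=" sign in it: %r' % option_arg)
        # An empty value becomes None, matching the comment in the command.
        report_options[key] = value or None
    return report_options


# Hypothetical option names, for illustration only.
parsed = parse_report_options(["organization=cabinet-office", "include_sub_organizations="])
print(parsed)
```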
51 changes: 51 additions & 0 deletions ckanext/report/cli/command.py
@@ -0,0 +1,51 @@
# encoding: utf-8

import logging
import time


class Reporting():
    """Shared implementation behind the paster and click report commands."""

    def __init__(self):
        self.log = logging.getLogger("ckanext.report.cli")

    def initdb(self):
        from ckanext.report import model
        model.init_tables()
        self.log.info('Report table is set up')

    def list(self):
        from ckanext.report.report_registry import ReportRegistry
        registry = ReportRegistry.instance()
        for plugin, report_name, report_title in registry.get_names():
            report = registry.get_report(report_name)
            date = report.get_cached_date()
            print('%s: %s %s' % (plugin, report_name,
                  date.strftime('%d/%m/%Y %H:%M') if date else '(not cached)'))

    def generate(self, report_list=None):
        from ckanext.report.report_registry import ReportRegistry
        # Maps each report name (or "All Reports") to elapsed seconds.
        timings = {}

        self.log.info("Running reports => %s", report_list)
        registry = ReportRegistry.instance()
        if report_list:
            for report_name in report_list:
                s = time.time()
                registry.get_report(report_name).refresh_cache_for_all_options()
                timings[report_name] = time.time() - s
        else:
            s = time.time()
            registry.refresh_cache_for_all_reports()
            timings["All Reports"] = time.time() - s

        self.log.info("Report generation complete %s", timings)

    def generate_for_options(self, report_name, options):
        from ckanext.report.report_registry import ReportRegistry
        self.log.info("Running report => %s", report_name)
        registry = ReportRegistry.instance()
        report = registry.get_report(report_name)
        # Fill in any options the caller did not supply.
        all_options = report.add_defaults_to_options(options,
                                                     report.option_defaults)
        report.refresh_cache(all_options)
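
The branching and timing logic of `Reporting.generate` can be tried outside CKAN with stand-in objects. In the sketch below, `StubReport` and `StubRegistry` are hypothetical replacements for the real registry classes, exposing only the three methods that `generate` calls; the report names are made up.

```python
import time


class StubReport(object):
    """Hypothetical stand-in for a registered report."""

    def __init__(self):
        self.refreshed = False

    def refresh_cache_for_all_options(self):
        self.refreshed = True


class StubRegistry(object):
    """Hypothetical stand-in for ReportRegistry, for illustration only."""

    def __init__(self, names):
        self.reports = dict((name, StubReport()) for name in names)

    def get_report(self, name):
        return self.reports[name]

    def refresh_cache_for_all_reports(self):
        for report in self.reports.values():
            report.refresh_cache_for_all_options()


def generate(registry, report_list=None):
    """Refresh the named reports, or every report when none are named."""
    timings = {}
    if report_list:
        for report_name in report_list:
            start = time.time()
            registry.get_report(report_name).refresh_cache_for_all_options()
            timings[report_name] = time.time() - start
    else:
        start = time.time()
        registry.refresh_cache_for_all_reports()
        timings["All Reports"] = time.time() - start
    return timings


registry = StubRegistry(["tagless-datasets", "broken-links"])
print(sorted(generate(registry, ["tagless-datasets"])))
print(sorted(generate(registry)))
```

An empty list and `None` take the same branch here, so either value results in all reports being refreshed.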