Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #1 from open-contracting/dev-pr
OCDS Kingfisher Process - first commit.
Showing 52 changed files with 2,016 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
[flake8] | ||
exclude = venv/, .ve/, data/, src/ | ||
max-line-length = 160 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
*.sqlite3 | ||
*.swp | ||
*.mo | ||
*~ | ||
.ve | ||
*.pyc | ||
__pycache__ | ||
media | ||
.coverage | ||
htmlcov | ||
docs/_build | ||
.cache/* | ||
.hypothesis/* | ||
.pytest_cache | ||
venv/ | ||
data/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
sudo: false | ||
addons: | ||
chrome: stable | ||
postgresql: "10" | ||
apt: | ||
packages: | ||
- postgresql-10 | ||
- postgresql-client-10 | ||
env: | ||
global: | ||
- PGPORT=5433 | ||
- KINGFISHER_PROCESS_DB_URI='postgres:///travis' | ||
services: | ||
- postgresql | ||
language: python | ||
python: | ||
- "3.5" | ||
|
||
install: | ||
- "pip install -r requirements.txt" | ||
- "pip install flake8" | ||
script: | ||
- "flake8 ocdskingfisherprocess/ ocdskingfisher-process-cli tests" | ||
- "py.test" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
BSD 3-Clause License | ||
|
||
Copyright (c) 2018, Open Contracting Data Standard | ||
All rights reserved. | ||
|
||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
|
||
* Redistributions of source code must retain the above copyright notice, this | ||
list of conditions and the following disclaimer. | ||
|
||
* Redistributions in binary form must reproduce the above copyright notice, | ||
this list of conditions and the following disclaimer in the documentation | ||
and/or other materials provided with the distribution. | ||
|
||
* Neither the name of the copyright holder nor the names of its | ||
contributors may be used to endorse or promote products derived from | ||
this software without specific prior written permission. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" | ||
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE | ||
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | ||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR | ||
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER | ||
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, | ||
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
# kingfisher-process | ||
# OCDS Kingfisher |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
Command line tool - check-collection option | ||
=========================================== | ||
|
||
This command checks all data so far in a collection. | ||
|
||
It can be run multiple times on a collection, and data already checked will not be rechecked. | ||
|
||
Pass the ID of the collection you want checked. Use :doc:`cli-list-collections` to look up the ID you want. | ||
|
||
.. code-block:: shell-session | ||
python ocdskingfisher-process-cli check-collection 17 | ||
TODO write about checking different schema versions here - but how that works is about to change, so no point documenting it now. |
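The re-run behaviour described in this file (data already checked is not rechecked) can be pictured with a minimal Python sketch. This is not the tool's implementation: `check_collection`, the `already_checked` set and the `check` callback are hypothetical names used only for illustration, and a real install would presumably record checked items in the database rather than in memory.

```python
def check_collection(items, already_checked, check):
    """Check each (item_id, data) pair once; skip IDs seen on earlier runs.

    All names here are hypothetical; this only illustrates the idea that
    repeated runs do not recheck data.
    """
    newly_checked = []
    for item_id, data in items:
        if item_id in already_checked:
            # Checked on a previous run, so do nothing.
            continue
        check(data)
        already_checked.add(item_id)
        newly_checked.append(item_id)
    return newly_checked


checked = set()
items = [(1, {"ocid": "a"}), (2, {"ocid": "b"})]

# The first run checks everything; the second run finds nothing left to do.
first_run = check_collection(items, checked, check=lambda data: None)
second_run = check_collection(items, checked, check=lambda data: None)
```

Under these assumptions, running the command again after more data has arrived would only process the new items.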
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
Command line tool - list-collections option | ||
=========================================== | ||
|
||
This command lists all the collections this install of the app knows about. | ||
|
||
.. code-block:: shell-session | ||
python ocdskingfisher-process-cli list-collections |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
Command line tool - upgrade-database option | ||
=========================================== | ||
|
||
This tool sets up the tables and structure in the PostgreSQL database from scratch, or updates them to the latest version. | ||
|
||
.. code-block:: shell-session | ||
python ocdskingfisher-process-cli upgrade-database | ||
If you want to delete all the existing tables before setting up empty tables, pass the `--deletefirst` flag. | ||
|
||
.. code-block:: shell-session | ||
python ocdskingfisher-process-cli upgrade-database --deletefirst |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
Command line tool | ||
================= | ||
|
||
|
||
You can use the tool with the provided CLI script. There are various sub commands. | ||
|
||
You can pass the `--verbose` flag before any sub command to get more output printed to the terminal. | ||
|
||
.. code-block:: shell-session | ||
python ocdskingfisher-process-cli --verbose run ... | ||
.. toctree:: | ||
|
||
|
||
cli-upgrade-database.rst | ||
cli-list-collections.rst | ||
cli-check-collection.rst | ||
|
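As a rough picture of how the sub commands documented here fit together, the following Python sketch builds an `argparse` parser with a global `--verbose` flag and the three documented sub commands. It is an assumption-laden illustration, not the actual `ocdskingfisher-process-cli` source: the `build_parser` function and the `collection_id` and `subcommand` attribute names are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented commands (illustrative only)."""
    parser = argparse.ArgumentParser(prog="ocdskingfisher-process-cli")
    # The global flag comes before the sub command on the command line.
    parser.add_argument("--verbose", action="store_true",
                        help="print more output to the terminal")
    subparsers = parser.add_subparsers(dest="subcommand")

    upgrade = subparsers.add_parser("upgrade-database")
    upgrade.add_argument("--deletefirst", action="store_true",
                         help="delete all existing tables first")

    subparsers.add_parser("list-collections")

    check = subparsers.add_parser("check-collection")
    # Hypothetical argument name for the numeric collection ID.
    check.add_argument("collection_id", type=int)
    return parser


args = build_parser().parse_args(["--verbose", "check-collection", "17"])
upgrade_args = build_parser().parse_args(["upgrade-database", "--deletefirst"])
```

Placing `--verbose` on the top-level parser is what lets one flag apply to every sub command.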
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
master_doc = 'index' | ||
|
||
project = 'OCDS Kingfisher Process Tool' | ||
copyright = '2018, Open Contracting Data Standard' | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
Configuration | ||
============= | ||
|
||
Database Configuration | ||
---------------------- | ||
|
||
Postgresql Database settings can be set using a `~/.config/ocdskingfisher-process/config.ini` file. A sample one is included in the | ||
main directory. | ||
|
||
|
||
.. code-block:: ini | ||
[DBHOST] | ||
HOSTNAME = localhost | ||
PORT = 5432 | ||
USERNAME = ocdsdata | ||
PASSWORD = FIXME | ||
DBNAME = ocdsdata | ||
It will also attempt to load the password from a `~/.pgpass` file, if one is present. | ||
|
||
You can also set the `KINGFISHER_PROCESS_DB_URI` environment variable to use a custom PostgreSQL server, for example | ||
`postgresql://user:password@localhost:5432/dbname`. | ||
|
||
The order of precedence is (from least-important to most-important): | ||
|
||
- config file | ||
- password from `~/.pgpass` | ||
- environment variable | ||
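The order of precedence listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's own configuration code: `resolve_db_uri` is a hypothetical helper, the ini text is passed in as a string rather than read from disk, and the `~/.pgpass` lookup is reduced to an optional `pgpass_password` argument.

```python
import configparser
import os


def resolve_db_uri(config_text, pgpass_password=None, environ=None):
    """Return a PostgreSQL URI using the documented order of precedence.

    Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ

    # Most important: the environment variable overrides everything else.
    env_uri = environ.get("KINGFISHER_PROCESS_DB_URI")
    if env_uri:
        return env_uri

    # Least important: values from the [DBHOST] section of the ini file.
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["DBHOST"]

    # A password found in ~/.pgpass replaces the PASSWORD from the ini file.
    if pgpass_password is not None:
        password = pgpass_password
    else:
        password = section.get("PASSWORD", "")

    return "postgresql://{}:{}@{}:{}/{}".format(
        section["USERNAME"], password,
        section["HOSTNAME"], section["PORT"], section["DBNAME"],
    )


SAMPLE_INI = """
[DBHOST]
HOSTNAME = localhost
PORT = 5432
USERNAME = ocdsdata
PASSWORD = FIXME
DBNAME = ocdsdata
"""

from_file = resolve_db_uri(SAMPLE_INI, environ={})
from_pgpass = resolve_db_uri(SAMPLE_INI, pgpass_password="s3cret", environ={})
from_env = resolve_db_uri(
    SAMPLE_INI, environ={"KINGFISHER_PROCESS_DB_URI": "postgresql://u:p@h:1/d"}
)
```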
|
||
Web Configuration | ||
----------------- | ||
|
||
TODO write up the API Key - notes: KINGFISHER_PROCESS_WEB_API_KEYS env var or [WEB] API_KEYS= in ini. Comma separated. | ||
|
||
Logging Configuration | ||
--------------------- | ||
|
||
This tool will provide additional logging information using the standard Python logging module, with loggers in the "ocdskingfisher" | ||
namespace. | ||
|
||
When using the command line tool, it can be configured by setting a `~/.config/ocdskingfisher-process/logging.json` file. | ||
A sample one is included in the main directory. | ||
|
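For the logging configuration, the sketch below shows one way a JSON file could configure loggers in the "ocdskingfisher" namespace. It assumes the `logging.json` file follows the standard `logging.config.dictConfig` schema, which this documentation does not confirm; the handler name `console` and the logger name `ocdskingfisher.example` are made up for the example. Check the sample file in the main directory for the real format.

```python
import json
import logging
import logging.config

# Assumed file contents, using the standard dictConfig schema.
LOGGING_JSON = """
{
  "version": 1,
  "disable_existing_loggers": false,
  "handlers": {
    "console": {"class": "logging.StreamHandler", "level": "DEBUG"}
  },
  "loggers": {
    "ocdskingfisher": {"handlers": ["console"], "level": "INFO"}
  }
}
"""

logging.config.dictConfig(json.loads(LOGGING_JSON))

# Any logger below the "ocdskingfisher" namespace inherits this configuration.
logger = logging.getLogger("ocdskingfisher.example")
logger.info("This message is emitted by the console handler.")
```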
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
Data Model | ||
========== | ||
|
||
Collections | ||
----------- | ||
|
||
A collection is a set of data that is handled separately from other collections. | ||
|
||
A collection is defined uniquely by a combination of all the variables listed below. | ||
|
||
* Name. A String. Can be anything you want. | ||
* Date. The date the collection started. | ||
* Sample. A Boolean flag. | ||
|
||
A collection is also given a numeric ID. | ||
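The identity rule above (name, date and sample flag together define a collection, which also gets a numeric ID) can be sketched in Python. This is an in-memory illustration only: `CollectionKey` and `CollectionRegistry` are hypothetical names, the real tool stores collections in PostgreSQL, and representing the date as a `datetime.date` is an assumption.

```python
from collections import namedtuple
from datetime import date

# The three variables that together identify a collection.
CollectionKey = namedtuple("CollectionKey", ["name", "date", "sample"])


class CollectionRegistry:
    """Hand out one numeric ID per unique (name, date, sample) combination."""

    def __init__(self):
        self._ids = {}

    def get_or_create_id(self, name, started, sample):
        key = CollectionKey(name, started, sample)
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]


registry = CollectionRegistry()
full_id = registry.get_or_create_id("example_source", date(2018, 10, 1), False)
sample_id = registry.get_or_create_id("example_source", date(2018, 10, 1), True)
same_id = registry.get_or_create_id("example_source", date(2018, 10, 1), False)
```

Changing any one of the three variables, here the sample flag, yields a different collection.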
|
||
Files | ||
----- | ||
|
||
Each collection contains one or more files. | ||
|
||
Each file is uniquely identified in a collection by its file name. | ||
|
||
Data Types for Files | ||
-------------------- | ||
|
||
When giving a file to this software to load, you must specify a data type. This can be: | ||
|
||
* record - the file is a record. | ||
* release - the file is a release. | ||
* record_package - the file is a record package. | ||
* release_package - the file is a release package. | ||
* record_package_json_lines - the file is JSON lines, and every line is a record package | ||
* release_package_json_lines - see last entry, but release packages. | ||
* record_package_list - the file is a list of record packages. eg [ { record-package-1 } , { record-package-2 } ] | ||
* release_package_list - see last entry, but release packages. | ||
* record_package_list_in_results - the file is a list of record packages in the results attribute. eg { 'results': [ { record-package-1 } , { record-package-2 } ] } | ||
* release_package_list_in_results - see last entry, but release packages. | ||
|
||
Items | ||
----- | ||
|
||
Each file contains one or more items, where an item is a piece of OCDS data - a release, record, release package or record package. | ||
|
||
Some files contain only one item. | ||
|
||
Some files contain many items. For example: | ||
|
||
* JSON Lines files | ||
* A file downloaded from an API where the file is a JSON object that contains a list of records. eg http://www.contratosabiertos.cdmx.gob.mx/api/contratos/array | ||
|
||
Each item has an integer number, which gives the order in which it appears in the file. | ||
|
||
Each item is uniquely identified in a file by its number. |
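The relationship between a file's data type and its items can be sketched in Python. This is an illustration of the data model described in this file, not the tool's loader: `iter_items` is a hypothetical function, and numbering items from 0 is an assumption, since the text only says that the number reflects the order of appearance.

```python
import json

SINGLE_ITEM_TYPES = {"record", "release", "record_package", "release_package"}
JSON_LINES_TYPES = {"record_package_json_lines", "release_package_json_lines"}
LIST_TYPES = {"record_package_list", "release_package_list"}
LIST_IN_RESULTS_TYPES = {
    "record_package_list_in_results",
    "release_package_list_in_results",
}


def iter_items(data_type, text):
    """Yield (number, item) pairs for the contents of one file."""
    if data_type in SINGLE_ITEM_TYPES:
        # The whole file is a single item.
        items = [json.loads(text)]
    elif data_type in JSON_LINES_TYPES:
        # Every non-empty line is one package.
        items = [json.loads(line) for line in text.splitlines() if line.strip()]
    elif data_type in LIST_TYPES:
        # The file is a JSON list of packages.
        items = json.loads(text)
    elif data_type in LIST_IN_RESULTS_TYPES:
        # The list of packages sits under the "results" attribute.
        items = json.loads(text)["results"]
    else:
        raise ValueError("Unknown data type: {}".format(data_type))

    # Assumption: items are numbered from 0 in order of appearance.
    for number, item in enumerate(items):
        yield number, item


single = list(iter_items("release_package", '{"releases": []}'))
lines = list(iter_items("record_package_json_lines", '{"a": 1}\n{"a": 2}\n'))
nested = list(iter_items("release_package_list_in_results",
                         '{"results": [{"a": 1}, {"a": 2}]}'))
```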
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
Development | ||
=========== | ||
|
||
Run tests | ||
--------- | ||
|
||
Run `py.test` from the root directory. | ||
|
||
The tests will drop and create the database, so you probably want to specify a special testing database with an environment variable - see :doc:`config`. | ||
|
||
|
||
Main Database - Postgresql | ||
-------------------------- | ||
|
||
Create DB Migrations with Alembic - http://alembic.zzzcomputing.com/en/latest/ | ||
|
||
.. code-block:: shell-session | ||
alembic --config=mainalembic.ini revision -m "message" | ||
Add your changes to the new migration, and make sure you update the database.py table structures and delete_tables too. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
OCDS Kingfisher Process tool | ||
============================ | ||
|
||
OCDS Kingfisher Process is a tool for storing and analysing data from publishers of the Open Contracting Data Standard. | ||
|
||
(It does not download data - for that, see the Scrape part of Kingfisher) | ||
|
||
.. toctree:: | ||
|
||
data-model.rst | ||
requirements-install.rst | ||
config.rst | ||
cli.rst | ||
web.rst | ||
development.rst |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
Requirements and Install | ||
======================== | ||
|
||
Requirements | ||
------------ | ||
|
||
Requirements: | ||
|
||
- Python v3.5 or higher | ||
- PostgreSQL v10 or higher | ||
|
||
Requirements for website | ||
------------------------ | ||
|
||
Requirements: | ||
|
||
- A Web Server capable of running a WSGI Python app | ||
|
||
Installation | ||
------------ | ||
|
||
Set up a venv and install requirements: | ||
|
||
.. code-block:: shell-session | ||
virtualenv -p python3 .ve | ||
source .ve/bin/activate | ||
pip install -r requirements.txt | ||
pip install -e . | ||
Database | ||
-------- | ||
|
||
You need to create a UTF8 Postgresql database and create a user with write access. | ||
|
||
Once you have created the database, you need to configure the tool to connect to the database. | ||
|
||
You can see one way of doing that in the example below, but for other options see :doc:`config`. | ||
|
||
You also have to run a command to create the tables in the database. | ||
|
||
You can see the command in the example below, but for more on that see :doc:`cli-upgrade-database`. | ||
|
||
Example of creating a database user, creating a database and setting up the schema: | ||
|
||
.. code-block:: shell-session | ||
sudo -u postgres createuser ocdskingfisher --pwprompt | ||
sudo -u postgres createdb ocdskingfisher -O ocdskingfisher --encoding UTF8 --template template0 --lc-collate en_US.UTF-8 --lc-ctype en_US.UTF-8 | ||
export KINGFISHER_PROCESS_DB_URI='postgres://ocdskingfisher:PASSWORD YOU CHOSE@localhost/ocdskingfisher' | ||
python ocdskingfisher-process-cli upgrade-database |
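If the password you chose contains characters such as spaces, `@` or `/`, it generally has to be percent-encoded before it is placed in a connection URI. The Python sketch below builds a value for `KINGFISHER_PROCESS_DB_URI` with the standard library; `build_db_uri` is a hypothetical helper, and whether the tool decodes percent-encoded passwords exactly this way is an assumption worth testing on your install.

```python
from urllib.parse import quote


def build_db_uri(user, password, host, dbname):
    """Build a PostgreSQL connection URI with percent-encoded credentials."""
    return "postgresql://{}:{}@{}/{}".format(
        quote(user, safe=""),      # encode reserved characters in the user
        quote(password, safe=""),  # encode "@", "/", spaces and so on
        host,
        dbname,
    )


uri = build_db_uri("ocdskingfisher", "p@ss word", "localhost", "ocdskingfisher")
print(uri)
```

The printed value can then be exported as `KINGFISHER_PROCESS_DB_URI` before running `upgrade-database`.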
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
## This file is a hack to make Read The Docs Work |