Merge 6308369 into ad29d7d
kindly committed Dec 11, 2020
2 parents ad29d7d + 6308369 commit 179682c
Showing 27 changed files with 97 additions and 51 deletions.
9 changes: 9 additions & 0 deletions docs/changelog.rst
@@ -1,6 +1,15 @@
Changelog
=========

2020-12-11
----------

Changed
~~~~~~~

- The ``field_list`` column is now a JSONB object in which keys are paths and values are ``NULL``


2020-12-09
----------

30 changes: 25 additions & 5 deletions docs/cli/use.rst
@@ -80,10 +80,10 @@ Use this option if:

.. _field-lists:

-Create array of all paths for each row in each summary table
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Calculate JSON paths in each JSON object in each summary table
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The ``--field_lists`` option adds a ``field_list`` column to each summary table, which contains a JSON array of all JSON paths (excluding array indices) in the object that the row describes. For example, a ``field_list`` value in the ``awards_summary`` table will contain the JSON paths in an award object.
+The ``--field_lists`` option adds a ``field_list`` column to each summary table, which contains all JSON paths (excluding array indices) in the object that the row describes. For example, a ``field_list`` value in the ``awards_summary`` table will contain the JSON paths in an award object. A ``field_list`` value is a JSONB object in which keys are paths and values are ``NULL``.

.. code-block:: bash
@@ -95,9 +95,29 @@ This can be used to check for the presence of multiple fields. For example, to

 .. code-block:: sql

-   SELECT count(*) FROM view_data_collection_1.awards_summary WHERE field_list @> '{documents/id, items/id}';
+   SELECT count(*) FROM view_data_collection_1.awards_summary WHERE field_list ?& ARRAY['documents/id', 'items/id'];

-The ``@>`` operator tests whether the left ARRAY value contains the right ARRAY values.
+This could also be written as:
+
+.. code-block:: sql
+
+   SELECT count(*) FROM view_data_collection_1.awards_summary WHERE field_list ? 'documents/id' AND field_list ? 'items/id';
+
+The ``?&`` operator tests whether *all* keys in the right-hand array exist in the left-hand object. The ``?`` operator tests whether one key exists in the left-hand object.
+
+To count the number of awards that have either at least one document with an ``id`` or at least one item with an ``id``, run:
+
+.. code-block:: sql
+
+   SELECT count(*) FROM view_data_collection_1.awards_summary WHERE field_list ?| ARRAY['documents/id', 'items/id'];
+
+This could also be written as:
+
+.. code-block:: sql
+
+   SELECT count(*) FROM view_data_collection_1.awards_summary WHERE field_list ? 'documents/id' OR field_list ? 'items/id';
+
+The ``?|`` operator tests whether *any* key in the right-hand array exists in the left-hand object.
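
As a rough mental model of the three operators described in this hunk, the sketch below emulates them in Python on a plain dict. It is not part of the commit: the sample ``field_list`` value and the helper names ``has_key``, ``has_all_keys`` and ``has_any_key`` are made up for illustration, and ``None`` stands in for the SQL ``NULL`` values.

```python
# Hypothetical field_list value for one award row: keys are JSON paths,
# values are None (standing in for the NULL values stored in the JSONB object).
field_list = {
    "id": None,
    "documents": None,
    "documents/id": None,
    "value": None,
    "value/amount": None,
}


def has_key(obj, key):
    """Emulate `jsonb ? text`: does the single key exist in the object?"""
    return key in obj


def has_all_keys(obj, keys):
    """Emulate `jsonb ?& text[]`: do all of the keys exist in the object?"""
    return all(key in obj for key in keys)


def has_any_key(obj, keys):
    """Emulate `jsonb ?| text[]`: does at least one of the keys exist?"""
    return any(key in obj for key in keys)


wanted = ["documents/id", "items/id"]
print(has_key(field_list, "documents/id"))  # single-key test
print(has_all_keys(field_list, wanted))     # "AND" of the single-key tests
print(has_any_key(field_list, wanted))      # "OR" of the single-key tests
```

Because only key membership matters, the values carry no information, which is consistent with the column storing ``NULL`` for every path.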

.. _remove:

2 changes: 1 addition & 1 deletion docs/definitions/award_documents_summary.csv
@@ -10,4 +10,4 @@ data_id,integer,"id for the ""data"" table in Kingfisher that holds the original
document,jsonb,JSONB of the document
documenttype,text,`documentType` field from the document object
format,text,`format` field from the document object
-field_list,ARRAY,"Array of JSON paths in the document object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the document object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/award_items_summary.csv
@@ -15,4 +15,4 @@ unit_currency,text,`currency` from the unit/value object
item_classification,text,Concatenation of classification/scheme and classification/id
item_additionalidentifiers_ids,jsonb,JSONB list of the concatenation of additionalClassification/scheme and additionalClassification/id
additional_classification_count,integer,Count of additional classifications
-field_list,ARRAY,"Array of JSON paths in the item object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the item object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/award_suppliers_summary.csv
@@ -16,4 +16,4 @@ supplier_additionalidentifiers_count,integer,Count of additional identifiers
link_to_parties,integer,"Does this buyer link to a party in the parties array using the `id` field from buyer object linking to the `id` field in a party object? If this is true then 1, otherwise 0"
link_with_role,integer,If there is a link does the parties object have `suppliers` in its roles list? If it does then 1 otherwise 0
party_index,bigint,Position of the party in the parties array
-field_list,ARRAY,"Array of JSON paths in the supplier object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the supplier object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/awards_summary.csv
@@ -22,4 +22,4 @@ documents_count,bigint,Number of documents in documents array
documenttype_counts,jsonb,JSONB object with the keys as unique documentTypes and the values as count of the appearances of that `documentType` in the `documents` array
items_count,bigint,Count of items
award,jsonb,JSONB of award object
-field_list,ARRAY,"Array of JSON paths in the award object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the award object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/buyer_summary.csv
@@ -15,4 +15,4 @@ buyer_additionalidentifiers_count,integer,Count of additional identifiers
link_to_parties,integer,"Does this buyer link to a party in the parties array using the `id` field from buyer object linking to the `id` field in a party object? If this is true then 1, otherwise 0"
link_with_role,integer,If there is a link does the parties object have `buyer` in its roles list? If it does then 1 otherwise 0
party_index,bigint,If there is a link what is the index of the party in the `parties` array then this can be used for joining to the `parties_summary` table
-field_list,ARRAY,"Array of JSON paths in the buyer object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the buyer object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/contract_documents_summary.csv
@@ -10,4 +10,4 @@ data_id,integer,"id for the ""data"" table in Kingfisher that holds the original
document,jsonb,JSONB of the document
documenttype,text,`documentType` field from the document object
format,text,`format` field from the document object
-field_list,ARRAY,"Array of JSON paths in the document object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the document object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
@@ -10,4 +10,4 @@ data_id,integer,"id for the ""data"" table in Kingfisher that holds the original
document,jsonb,JSONB of the document
documenttype,text,`documentType` field from the document object
format,text,`format` field from the document object
-field_list,ARRAY,"Array of JSON paths in the document object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the document object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
@@ -11,4 +11,4 @@ milestone,jsonb,JSONB of milestone object
type,text,`type` from milestone object
code,text,`code` from milestone object
status,text,`status` from milestone object
-field_list,ARRAY,"Array of JSON paths in the milestone object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the milestone object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
@@ -10,4 +10,4 @@ data_id,integer,"id for the ""data"" table in Kingfisher that holds the original
transaction_amount,numeric,`amount` field from the value object or the deprecated amount object
transaction_currency,text,`currency` field from the value object or the deprecated amount object
transaction,jsonb,JSONB of transaction object
-field_list,ARRAY,"Array of JSON paths in the transaction object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the transaction object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/contract_items_summary.csv
@@ -15,4 +15,4 @@ unit_currency,text,`currency` from the unit/value object
item_classification,text,Concatenation of classification/scheme and classification/id
item_additionalidentifiers_ids,jsonb,JSONB list of the concatenation of additionalClassification/scheme and additionalClassification/id
additional_classification_count,integer,Count of additional classifications
-field_list,ARRAY,"Array of JSON paths in the item object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the item object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/contract_milestones_summary.csv
@@ -11,4 +11,4 @@ milestone,jsonb,JSONB of milestone object
type,text,`type` from milestone object
code,text,`code` from milestone object
status,text,`status` from milestone object
-field_list,ARRAY,"Array of JSON paths in the milestone object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the milestone object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/contracts_summary.csv
@@ -29,4 +29,4 @@ implementation_documenttype_counts,jsonb,JSONB object with the keys as unique do
implementation_milestones_count,bigint,Number of documents in documents array
implementation_milestonetype_counts,jsonb,JSONB object with the keys as unique milestoneTypes and the values as count of the appearances of that `milestoneType` in the `milestone` array
contract,jsonb,JSONB of contract object
-field_list,ARRAY,"Array of JSON paths in the contract object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the contract object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/parties_summary.csv
@@ -13,4 +13,4 @@ unique_identifier_attempt,text,"The `id` from party object if it exists, otherwi
parties_additionalidentifiers_ids,jsonb,JSONB list of the concatenation of scheme and id of all additionalIdentifier objects
parties_additionalidentifiers_count,integer,Count of additional identifiers
party,jsonb,JSONB of party object
-field_list,ARRAY,"Array of JSON paths in the party object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the party object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/planning_documents_summary.csv
@@ -9,4 +9,4 @@ data_id,integer,"id for the ""data"" table in Kingfisher that holds the original
document,jsonb,JSONB of the document
documenttype,text,`documentType` field from the document object
format,text,`format` field from the document object
-field_list,ARRAY,"Array of JSON paths in the document object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the document object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/planning_milestones_summary.csv
@@ -10,4 +10,4 @@ milestone,jsonb,JSONB of milestone object
type,text,`type` from milestone object
code,text,`code` from milestone object
status,text,`status` from milestone object
-field_list,ARRAY,"Array of JSON paths in the milestone object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the milestone object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/planning_summary.csv
@@ -13,4 +13,4 @@ documenttype_counts,jsonb,JSONB object with the keys as unique documentTypes and
milestones_count,bigint,Count of milestones
milestonetype_counts,jsonb,JSONB object with the keys as unique milestoneTypes and the values as a count of the appearances of that `milestoneType` in the `milestones` array
planning,jsonb,JSONB of planning object
-field_list,ARRAY,"Array of JSON paths in the planning object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the planning object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/procuringEntity_summary.csv
@@ -14,4 +14,4 @@ procuringentity_additionalidentifiers_count,integer,Count of additional identifi
link_to_parties,integer,"Does this procuringEntity link to a party in the parties array using the `id` field from buyer object linking to the `id` field in a party object? If this is true then 1, otherwise 0"
link_with_role,integer,If there is a link does the parties object have `procuringEntity` in its roles list? If it does then 1 otherwise 0
party_index,bigint,If there is a link what is the index of the party in the `parties` array then this can be used for joining to the `parties_summary` table
-field_list,ARRAY,"Array of JSON paths in the procuringentity object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the procuringentity object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/release_summary.csv
@@ -59,4 +59,4 @@ release_check,jsonb,JSONB of Data Review Tool output which includes validation e
release_check11,jsonb,JSONB of Data Review Tool output run against 1.1 version of OCDS even if the data is from 1.0
record_check,jsonb,JSONB of Data Review Tool output which includes validation errors and additional field information
record_check11,jsonb,JSONB of Data Review Tool output run against 1.1 version of OCDS even if the data is from 1.0
-field_list,ARRAY,"Array of JSON paths in the release object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the release object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/tender_documents_summary.csv
@@ -9,4 +9,4 @@ data_id,integer,"id for the ""data"" table in Kingfisher that holds the original
document,jsonb,JSONB of the document
documenttype,text,`documentType` field from the document object
format,text,`format` field from the document object
-field_list,ARRAY,"Array of JSON paths in the document object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the document object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/tender_items_summary.csv
@@ -14,4 +14,4 @@ unit_currency,text,`currency` from the unit/value object
item_classification,text,Concatenation of classification/scheme and classification/id
item_additionalidentifiers_ids,jsonb,JSONB list of the concatenation of additionalClassification/scheme and additionalClassification/id
additional_classification_count,integer,Count of additional classifications
-field_list,ARRAY,"Array of JSON paths in the item object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the item object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/tender_milestones_summary.csv
@@ -10,4 +10,4 @@ milestone,jsonb,JSONB of milestone object
type,text,`type` from milestone object
code,text,`code` from milestone object
status,text,`status` from milestone object
-field_list,ARRAY,"Array of JSON paths in the milestone object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the milestone object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/tender_summary.csv
@@ -44,4 +44,4 @@ milestones_count,bigint,Count of milestones
milestonetype_counts,jsonb,JSONB object with the keys as unique milestoneTypes and the values as a count of the appearances of that `milestoneType` in the `milestones` array
items_count,bigint,Count of items
tender,jsonb,JSONB of tender object
-field_list,ARRAY,"Array of JSON paths in the tender object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the tender object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
2 changes: 1 addition & 1 deletion docs/definitions/tenderers_summary.csv
@@ -15,4 +15,4 @@ tenderer_additionalidentifiers_count,integer,Count of additional identifiers
link_to_parties,integer,"Does this tenderer link to a party in the parties array using the `id` field from buyer object linking to the `id` field in a party object? If this is true then 1, otherwise 0"
link_with_role,integer,If there is a link does the parties object have `tenderers` in its roles list? If it does then 1 otherwise 0
party_index,bigint,If there is a link what is the index of the party in the `parties` array. This can be used for joining to the `parties_summary` table
-field_list,ARRAY,"Array of JSON paths in the tenderer object, excluding array indices.This column is only available if the --field-lists option was used."
+field_list,jsonb,"JSONB object of JSON paths in the tenderer object, excluding array indices. Keys in the object are the paths and values are NULL. This column is only available if the --field-lists option was used."
5 changes: 3 additions & 2 deletions manage.py
@@ -521,7 +521,7 @@ def _add_field_list_column(summary_table, tables_only):
CREATE TABLE {summary_table.name}_field_list AS
SELECT
{summary_table.primary_keys},
-array_agg(path) AS field_list
+jsonb_object_agg(path, NULL) AS field_list
FROM
{summary_table.name}
CROSS JOIN
@@ -561,7 +561,8 @@ def _add_field_list_comments(summary_table, name):
for row in db.all(statement, {'schema': name, 'table': f'{summary_table.name}_no_field_list'}):
db.execute(f'COMMENT ON COLUMN {summary_table.name}.{row[0]} IS %(comment)s', {'comment': row[1]})

-comment = (f'Array of JSON paths in the {summary_table.data_field} object, excluding array indices. '
+comment = (f'All JSON paths in the {summary_table.data_field} object, excluding array indices, expressed as '
+f'a JSONB object in which keys are paths and values are NULL. '
f'This column is only available if the --field-lists option was used.')
db.execute(f'COMMENT ON COLUMN {summary_table.name}.field_list IS %(comment)s', {'comment': comment})
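
To illustrate the shape of value that the new column comment describes, here is a minimal Python sketch that turns one JSON object into a path-to-``None`` mapping while skipping array indices. It is an illustration only, not the query in ``manage.py``: the function name ``build_field_list`` and the sample award are hypothetical, and recording intermediate paths such as ``documents`` alongside ``documents/id`` is an assumption, since the part of the SQL that produces ``path`` is not shown in this diff.

```python
def build_field_list(obj):
    """Map every slash-separated path in `obj` to None, ignoring array indices."""
    paths = {}

    def walk(value, prefix):
        if isinstance(value, dict):
            for key, child in value.items():
                path = f"{prefix}/{key}" if prefix else key
                paths[path] = None  # the value is a placeholder, like NULL
                walk(child, path)
        elif isinstance(value, list):
            # Array indices are excluded: items share their parent's path.
            for item in value:
                walk(item, prefix)

    walk(obj, "")
    return paths


# Hypothetical award object with two documents.
award = {
    "id": "a1",
    "documents": [{"id": "d1"}, {"id": "d2", "title": "Notice"}],
}
result = build_field_list(award)
print(sorted(result))
```

A path that occurs in several array items ends up as a single key, because the keys of a mapping are unique.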
