Data collection guidance #1061

Merged
merged 12 commits into from
Oct 5, 2020
31 changes: 29 additions & 2 deletions docs/guidance/build.md
@@ -2,6 +2,15 @@

This phase is about creating a new IT system, or updating an existing IT system, to implement your [mapping](map) and publish OCDS data.

Alternatively, if you don't have the capacity to create or update an IT system, you can consider reusing an existing [data collection tool](build/data_collection_tools). If you're reusing an existing tool, this phase is about customizing that tool to meet your needs and working out how to combine and publish your data. The [OCDS Helpdesk](../../support/#ocds-helpdesk) can help you to consider options for collecting, combining and publishing data.

```eval_rst
.. toctree::
:hidden:

build/data_collection_tools
```

As you complete this phase, you can:

* Fill in the *Publication architecture* sub-section of your [Publication Plan](https://www.open-contracting.org/resources/ocds-publication-plan-template/).
@@ -39,6 +48,19 @@ Your choice of architecture can determine how frequently your data is updated, w
```
**Resource:** [Technical case studies: OCDS implementation insights report](https://www.open-contracting.org/resources/technical-case-studies-ocds-implementation-insights/) provides insights into the technical choices made in OCDS implementations in Paraguay, Zambia, Colombia, Moldova and Argentina's Road Agency Vialidad.

### Decide how to combine spreadsheet data

If you aren't creating or updating an IT system, but are instead collecting data from different individuals, departments or agencies using spreadsheets, then this step is about working out how to combine your data into a single file for publication. Combining your data makes it easier for users to analyze the whole dataset.

If you plan to publish your data infrequently, you have only a small number of spreadsheets, and those spreadsheets have identical headers, then copying and pasting the data into a single file for publication may be the easiest method.

Otherwise, you can consider the following methods:

* If you're comfortable using a command-line interface, you can use csvkit's [`in2csv` command](https://csvkit.readthedocs.io/en/latest/scripts/in2csv.html) to convert each sheet of a spreadsheet into a CSV file, and then use the [`csvstack` command](https://csvkit.readthedocs.io/en/latest/scripts/csvstack.html) to combine sets of CSV files with identical headers into single CSV files.
* If you're comfortable writing Visual Basic for Applications (VBA) or Google Apps Script code, you can write a macro for Microsoft Excel or Google Sheets to combine your data into a single file.
* If you're comfortable using spreadsheet formulae, you can use Google Sheets' [IMPORTRANGE](https://support.google.com/docs/answer/3093340?hl=en) or [QUERY](https://support.google.com/docs/answer/3093343?hl=en) functions to import data from multiple spreadsheets into a single sheet.
* If you aren't comfortable with the above methods, you can consider using a spreadsheet add-on for combining data from multiple sheets.
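
As a rough illustration of what combining CSV files with identical headers involves, the following Python sketch uses only the standard library. It is not how `csvstack` is implemented, and the function name `stack_csv_files` and the file names in the usage note are hypothetical. It assumes each spreadsheet has already been exported to a UTF-8 CSV file whose first row is the header.

```python
import csv


def stack_csv_files(input_paths, output_path):
    """Combine CSV files that share identical headers into one CSV file.

    Returns the number of data rows written (the header is not counted).
    """
    expected_header = None
    row_count = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # Skip completely empty files.
                if expected_header is None:
                    # The first non-empty file sets the header for the output.
                    expected_header = header
                    writer.writerow(header)
                elif header != expected_header:
                    # Stop rather than stack columns that do not line up.
                    raise ValueError(f"{path} has different headers: {header}")
                for row in reader:
                    writer.writerow(row)
                    row_count += 1
    return row_count
```

For example, `stack_csv_files(["agency_a.csv", "agency_b.csv"], "combined.csv")` would write one header row followed by the rows of both files. Raising an error on mismatched headers is a deliberate choice in this sketch: stacking misaligned columns without warning would corrupt the combined dataset.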

## Establish your publication formats and access methods

OCDS data can be published in different formats and accessed using different methods.
@@ -80,8 +102,13 @@ Having determined your system architecture, it's time to implement it. This is o
* To **make OCDS data available via an API**, you can use another component of [Kavure'i](https://gitlab.com/dncp-opendata/opendata-etl/-/blob/master/README_en.md) to load OCDS data into [ElasticSearch](https://www.elastic.co/), and then use [Pitogüé](https://gitlab.com/dncp-opendata/opendata-api-v3/blob/master/README_en.md) to make it available via an API. (Both tools are authored by Paraguay's Dirección Nacional de Contrataciones Públicas (DNCP).)
* If you intend to **publish [record packages](../../schema/record_package)**, [OCDS Merge](https://ocds-merge.readthedocs.io/en/latest/) is the best software library for creating OCDS [records](../../getting_started/releases_and_records). If you use the [Python](https://www.python.org/) programming language, you can use it directly. If not, you can use its [test cases](https://ocds-merge.readthedocs.io/en/latest/#test-cases) to test your implementation of the [merge routine](../../schema/merging), and you can read its [commented code](https://github.com/open-contracting/ocds-merge) as inspiration for your implementation.
* If you have [release packages](../../schema/release_package) and want to have [record packages](../../schema/record_package), if you have data that follows an older version of OCDS, or if you otherwise need to transform your OCDS data, you can use [OCDS Kit](https://ocdskit.readthedocs.io/) as a command-line tool or [Python](https://www.python.org/) library.
* If you are **authoring data from scratch**, you can use this tool to [enter data](https://github.com/INAImexico/Contrataciones_abiertas_v2), which also includes a web interface for users to access and explore the OCDS data. (This tool is authored by Mexico's Instituto Nacional de Transparencia, Acceso a la Información y Protección de Datos Personales (INAI).) (*Manuals are in Spanish.*)
* If you want to **collect data using a spreadsheet or without an internet connection**, you can develop a [spreadsheet input template](https://www.open-contracting.org/resources/simple-ocds-spreadsheet-template/).
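
To give a sense of how releases relate to records, the following Python sketch groups a list of releases by their `ocid`, which is the first step in building records. It is a simplified illustration under stated assumptions, not the merge routine: it does not create a compiled or versioned release, for which you would still use OCDS Merge or your own tested implementation. The function name `group_releases_into_records` is hypothetical, and the sketch assumes every release has `ocid` and `date` fields, with dates in the same ISO 8601 format and timezone.

```python
from collections import defaultdict


def group_releases_into_records(releases):
    """Group OCDS releases by ocid, giving one record per contracting process.

    Each returned record contains only `ocid` and `releases`; it has no
    compiled release, because no merging is performed here.
    """
    releases_by_ocid = defaultdict(list)
    for release in releases:
        releases_by_ocid[release["ocid"]].append(release)

    records = []
    for ocid, grouped in releases_by_ocid.items():
        # Assumption: dates share one ISO 8601 format and timezone, so they
        # sort chronologically when compared as strings.
        grouped.sort(key=lambda release: release["date"])
        records.append({"ocid": ocid, "releases": grouped})
    return records
```

Ordering the releases by date matters because the merge routine processes releases chronologically, so a later release's values replace earlier ones.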

If you aren't creating or updating an IT system, but are instead reusing an existing [data collection tool](build/data_collection_tools), you can customize it:

* The [data collection spreadsheet](https://www.open-contracting.org/resources/data-collection-spreadsheet/) includes instructions describing how to add fields and how to add and reformat sheets.
* The [data collection form](https://www.open-contracting.org/resources/ocds-data-collection-form/) includes instructions describing how to add fields and how to customize descriptions and guidance.

Contact the [OCDS Helpdesk](../../support/#ocds-helpdesk) for support and guidance on customizing a tool to meet your needs.

</div>

102 changes: 102 additions & 0 deletions docs/guidance/build/data_collection_tools.md
@@ -0,0 +1,102 @@
# Data collection tools

Where contracting data is managed using IT systems, implementing OCDS involves identifying how to extract, convert and publish that data.

Where contracting data is managed on paper or using unstructured electronic documents, implementing OCDS typically involves creating a bespoke IT system to collect data in a structured form.

If you need to collect structured data, but you don’t have the capacity to create an IT system, you can consider reusing an existing data collection tool. These tools offer fewer opportunities for customization than a bespoke IT system, but can provide a quick route to collecting and publishing data in OCDS format.

## Data collection form

The [data collection form](https://www.open-contracting.org/resources/ocds-data-collection-form/) is a web-based form for collecting OCDS data.

Data from the form is copied to a Google Sheet, which structures and formats it to conform to OCDS.

Data entered using the form can be checked and converted using the [OCDS Data Review Tool](https://standard.open-contracting.org/review/) and published in either spreadsheet or JSON format.

Consider using the form if:

* You don’t have the capacity to develop or install a software tool
* Your users have access to **reliable internet connections**
* Your users are **unfamiliar** with spreadsheets
* Data entry will be done by **many** different users
* You want to minimize the work required to collate data
* You **don’t need to update** the data about a contracting process after it is first entered

The form includes a subset of fields from the **tender** and **buyer** sections of OCDS. The OCDS Helpdesk can help you extend and adapt the form to suit your needs.

Read more about using and customizing the form in the [resource guide](https://www.open-contracting.org/resources/ocds-data-collection-form/). The OCDS Helpdesk can also provide guidance on using the form to collect data and help you to analyze the data that you collect.

## Data collection spreadsheet

The [data collection spreadsheet](https://www.open-contracting.org/resources/data-collection-spreadsheet/) is available in Google Sheets and for offline use in Microsoft Excel.

Data is entered directly into the template, which is structured and formatted to conform to OCDS.

Data entered into the template can be checked and converted using the [OCDS Data Review Tool](https://standard.open-contracting.org/review/) and published in either spreadsheet or JSON format.

Consider using the spreadsheet template to collect data if:

* You don’t have the capacity to develop or install a software tool
* You need to collect data **without** access to an internet connection
* Your users are **familiar** with using spreadsheets
* Data entry will be done by a **small number of** users
* You have the **capacity to collate** data entered in multiple spreadsheets
* You need to make **multiple updates** over the life of a contracting process

The template includes a subset of fields from the **planning**, **tender**, **award**, **contract** and **implementation** sections of OCDS.

The spreadsheet includes instructions describing how to enter data and how to customize the spreadsheet to suit your needs. Read more about developing data collection spreadsheets in our blog series on [prototyping OCDS data using spreadsheets](https://www.open-contracting.org/2020/04/24/prototyping-ocds-data-using-spreadsheets/).

The OCDS Helpdesk can:

* Provide guidance on using the template to collect data
* Help you to extend or adapt the template to meet your requirements
* Help you to analyze the data that you collect

## OpenContractR

[OpenContractR](https://github.com/patxiworks/opencontractr) is a WordPress plugin for collecting and publishing OCDS data.

The plugin adds an interface for entering data, a contracts search function, visualizations and an OCDS-format JSON API.

OpenContractR can be added to a new or existing WordPress site. There are many WordPress hosting providers to choose from if you do not already use WordPress.

Consider using OpenContractR if:

* You already have a **WordPress** website, or you have the capacity to set one up
* Your users have access to **reliable internet connections**
* Data entry will be done by **many** different users
* You want to minimize the work required to collate data
* You want users to be able to **edit or update** already entered data
* You want an online **search** interface for your data

Read more about OpenContractR in its [introduction](https://drive.google.com/file/d/18WHnQcnA6oESZtcZgS4rgaBLjd8BKvbM/view). If you’re interested in using OpenContractR, please get in touch with the OCDS Helpdesk.

## Contrataciones Abiertas tool

[Contrataciones Abiertas](https://github.com/INAImexico/Contrataciones_abiertas_v2) is a tool developed by the Mexican government. It is made up of two main modules:

* Information capture module
* Visualization module

The objective of the tool is to enable the monitoring of contracting processes by using and producing OCDS data.

Consider using Contrataciones Abiertas if:

* You have the resources and technical capacity to [set up the tool](https://github.com/INAImexico/Contrataciones_abiertas_v2/raw/master/Manual%20de%20instalaci%C3%B3n.docx)
* Data entry will be done by **many** different users
* You want to **validate** your data before publication
* You want to minimize the work required to collate and publish data
* You want a [**dashboard and visualizations**](https://www.open-contracting.org/2020/06/12/mexicos-inai-launches-new-open-source-tool-to-upload-and-visualize-open-contracting-data/) of your data
* **Spanish** is your (and your users') main language

The tool is open source and is [documented in Spanish](https://github.com/INAImexico/Contrataciones_abiertas_v2). You can also find an introduction to the tool in [our blog](https://www.open-contracting.org/2020/06/12/mexicos-inai-launches-new-open-source-tool-to-upload-and-visualize-open-contracting-data/).

## What next?

If you're collecting data using spreadsheets, then you need to [decide how to combine your data for publication](../../build#decide-how-to-combine-spreadsheet-data).

However you choose to collect and structure your data, you need to [establish your publication formats and access methods](../../build#establish-your-publication-formats-and-access-methods), and [publish your data](../../publish).

Consider how you will make it easy for users to discover the data you have published, for example by publishing it on an existing procurement portal, on your organization’s website, or on an open data portal.
2 changes: 2 additions & 0 deletions docs/guidance/design.md
@@ -30,6 +30,8 @@ The answers to these questions will help you to develop a robust publication pla

**Resource:** [Publication Plan Template](https://www.open-contracting.org/resources/ocds-publication-plan-template/)

If your contracting data is mostly on paper, in local spreadsheets or in unstructured electronic documents, and you don’t have the capacity to create a new IT system to collect data, consider reusing one of the existing [tools for collecting OCDS data](build/data_collection_tools).

## Set your goals and priorities

The most useful OCDS implementations are those that were designed around real-world goals and priorities. To achieve success, you need to understand your policy goals and user needs, and use them to inform the design of your OCDS implementation.
9 changes: 8 additions & 1 deletion docs/guidance/map.md
@@ -2,6 +2,8 @@

This phase is about documenting your sources of contracting data, and documenting how that data "maps" to OCDS – that is, identifying which [data elements](https://en.wikipedia.org/wiki/Data_element) within your data sources match which OCDS [fields](../../schema/reference) and [codes](../../schema/codelists). The mapping phase is one of the longest and most important steps in the implementation process.

If your contracting processes are managed on paper, using local spreadsheets or via unstructured electronic documents, and you’re reusing one of the existing [tools for collecting OCDS data](build/data_collection_tools), then please [get in touch with the OCDS Helpdesk](../../support/#ocds-helpdesk) for guidance on how to identify which OCDS fields match your local concepts.

Mapping data to OCDS is not always easy. Before writing any software, this phase is an opportunity to:

* Catch errors early on
@@ -24,6 +26,8 @@ As described in the [Field-Level Mapping Template Guidance](https://www.open-con

To implement OCDS you need to first identify which IT systems capture and store contracting data and related documents. You also need to identify how to connect data held in different systems, to get a complete picture of the contracting process. The [Technical Assessment Template](https://www.open-contracting.org/resources/ocds-technical-assessment-template/) guides you through this process.

If your contracting processes are managed on paper, using local spreadsheets or via unstructured electronic documents, you should use the template to identify those data sources, too.

Once complete, you can:

* Ask the [OCDS Helpdesk](../../support/index) to review your Technical Assessment.
@@ -37,14 +41,17 @@ To make this step easier we provide templates to list the data elements within y
* OCDS [fields](../../schema/reference), using the [Field-Level Mapping Template](https://www.open-contracting.org/resources/ocds-field-level-mapping-template/) ([read the tutorial](https://www.open-contracting.org/resources/ocds-1-1-mapping-template-guidance/))
* OCDS [codes](../../schema/codelists), using the [Codelist Mapping Template](https://www.open-contracting.org/resources/ocds-1-1-codelist-mapping-template/) ([read the tutorial](https://www.open-contracting.org/resources/ocds-1-1-codelist-mapping-template-guidance/))

If your contracting data is managed on paper or in unstructured electronic documents, you should use the templates to list the data elements in those data sources and map them to OCDS.

You can [contact the OCDS Helpdesk](../../support/#ocds-helpdesk) for support and guidance on using the mapping templates.

Before working on mapping individual fields and codes, consider whether to first [localize OCDS](map/localization) to your context. Localization can be useful when you need to map several different systems, or when multiple organizations will work on implementing OCDS in your country.

```eval_rst
.. toctree::
:hidden:

map/localization
```

### Mapping organization identifiers
