This repository has been archived by the owner on Aug 3, 2022. It is now read-only.

Update how to read the results.md
morchickit committed May 1, 2017
1 parent 61d696d commit 078b1b0
Showing 1 changed file with 14 additions and 17 deletions.
31 changes: 14 additions & 17 deletions content/how to read the results.md
@@ -7,9 +7,9 @@ The Global Open Data Index (GODI) is a tool to educate civil society and governm

Even though we try to make this assessment as understandable and transparent as possible, it is not always easy to interpret the results. While open data has a very strict definition, the scoring of any index is to some degree arbitrary. In real life, you can't be partly open - the published data either fits the criteria or it does not. So, what does the final score mean? How should scores such as 0%, 40% or 70% be interpreted? Here is some guidance on how best to read our results.

## What does a score of 0% mean?

The Index refers to clearly defined data categories (e.g. budget, air quality, or national maps). Each data category contains specific data points drawing on international standards, as well as open data initiatives like [Open Spending](next.openspending.org), [Open Contracting](http://www.open-contracting.org/), [Opening Parliament](https://www.openingparliament.org/) or [Open Corporates](https://opencorporates.com/).

Our data categories strike a balance between feasibility and relevance - broad enough that governments should be able to meet them, yet specific enough to describe information that is usable and relevant to the public. This balance allows countries to be compared against standard criteria, and gives a more realistic picture of the impact of open data publication (for more information, see our [methodology page](http://index.okfn.org/methodology/)).

@@ -19,25 +19,25 @@ This is different from previous editions. Back then, we applied our data definit

### Does GODI track improvements in open data publication?

Not necessarily. Our scoring (ranging from 0% open to 100% open) does not always reflect a gradual improvement. In fact, we assess very different degrees of data openness - which is why any score below 100 percent only indicates that a dataset is partially open. These levels of openness include public data, access-controlled data, as well as data gaps (see the index methodology). To understand the differences, we highly recommend reading each score together with our icons, which indicate different aspects of open data.


For instance, a score of 70% can mean that we found access-controlled, machine-readable data that cannot be downloaded in bulk. Any score below 100% means that the data falls under “no access”, “closed access” or “public access”. None of these is open data. Here are some real-world examples of data we found online, along with explanations of how our results can be better interpreted.

## Public access
Data is publicly accessible if the public can find it and view it online without any access restrictions. It does not imply that data can be downloaded, or that it is freely reusable. Often it means that data is presented in HTML format only on a website.


<img src="/images/searchinterface.png" alt="Search interface" class="img-responsive center-block" style="width: 700px;">


This image shows the search interface of a company register. It allows for targeted searches for individual companies, but does not enable the user to retrieve all data at once. Individual search results (non-bulk) are displayed in HTML format and can then be downloaded in PDF format (not machine-readable). Therefore, the score is 70%.


In the image below, we are able to view weather forecast data available in HTML. The data is publicly accessible, but cannot be downloaded without using a “scraper” tool that would automatically retrieve the values from the website. Some of the values, like the temperature graph, cannot be retrieved at all. In addition, the data is legally protected by copyright and cannot be reused. The scoring: 45% (not machine-readable, not downloadable, no open license).


<img src="/images/visualisationnotdata.png" alt="html scraper" class="img-responsive center-block" style="width: 700px;">


## Access-controlled data
@@ -47,7 +47,7 @@ Data is access-controlled if a provider regulates who, when, and how the data ca
* Data request forms, data sharing agreement (stipulating use cases);
* Ordering/purchasing data.

The reasons for controlled access are varied, including website traffic management and maintaining control over how data is used. It is debatable whether some registration/authentication requirements reduce the openness of data (especially when registration is automated). Requiring data request forms, on the other hand, is simply unacceptable for open data.

**Maximum score**: Up to 85%, indicating that all criteria of open data are met, but 15 points out of 100 are deducted because users have to register online to be able to download.
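
The deduction described above can be read as a weighted checklist. The following is a minimal sketch, not the Index's actual implementation: the criterion names and all weights are illustrative assumptions, except the 15-point registration deduction, which the text states.

```python
# Hypothetical weights summing to 100. Only the 15 points for
# "no_registration_required" come from the text; the rest are assumed.
WEIGHTS = {
    "openly_licensed": 20,
    "machine_readable": 20,
    "downloadable_in_bulk": 15,
    "up_to_date": 15,
    "no_registration_required": 15,
    "free_of_charge": 15,
}


def score(criteria_met):
    """Sum the weights of the criteria a dataset meets (0 to 100)."""
    met = set(criteria_met)
    unknown = met - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(WEIGHTS[name] for name in met)


# A dataset that meets every criterion except free access without registration.
print(score(set(WEIGHTS) - {"no_registration_required"}))  # prints 85
```

Under these assumed weights, a dataset that fails only the registration criterion cannot score above 85%, which matches the maximum stated above.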

@@ -58,23 +58,20 @@ This image shows a data request form (controlled access). The dataset is entirel

<img src="/images/report.png" alt="controlled access" class="img-responsive center-block" style="width: 700px;">

## How to read a score of 0%?

## Data gaps
A data gap can mean that governments do not publish any data in a given category. Sometimes a 0 percent score in the Index reflects such a data gap. This is the case, for instance, for West African countries that lack air quality monitoring systems, or for countries that have no established postcode system. Data gaps indicate that government information systems are not ready to produce open data, sometimes because resources are missing, sometimes because it is not a priority of government.

**Maximum score**: 0%

### Not granular
Since our criteria require a particular level of data granularity, we consider all datasets that do not meet this requirement as not granular, and therefore regard them as not available.
For example, Great Britain has published election results, but not at the polling station level, which is a crucial level for detecting voter fraud. Therefore, while there is some data for UK elections, it is not at the right level, and is considered non-existent.

**Maximum score**: 0%

### Do not fit our criteria
We are looking for particular datasets in the Index. When they lack any of the characteristics we are looking for, we consider them as not available.

**Maximum score**: 0%
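
Read together, the three zero-score cases (data gaps, insufficient granularity, and datasets that do not fit the criteria) act as gates that apply before any other scoring. A minimal sketch, assuming hypothetical flag names (`published`, `granular`, `fits_criteria`) that do not come from the Index itself:

```python
def final_score(raw_score, published, granular, fits_criteria):
    """Return 0 if any gate fails; otherwise keep the raw score.

    The three flags are illustrative stand-ins for the zero-score
    cases described in the text, not fields of the Index's data model.
    """
    if not (published and granular and fits_criteria):
        return 0
    return raw_score


# Election results published, but not at polling station level.
print(final_score(85, published=True, granular=False, fits_criteria=True))  # prints 0
```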


