Skip to content

martinbudden/coins

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

30 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Coins

This repository contains utilities to read HM Treasury COINS (Combined Online Information System) files, and to convert them to more usable formats.

Installation

Install the git version control system

Download this repository:

git clone git@github.com:martinbudden/coins.git

Install python, if you do not already have it.

Usage

Before running any of these utilities the HM Treasury COINS files should be downloaded, unzipped and placed in the "data" directory. After downloading and unzipping the COINS data released on 4th June 2010 the "data" directory should include the four files:

  • fact_table_extract_2009_10.txt
  • fact_table_extract_2008_09.txt
  • adjustment_table_extract_2009_10.txt
  • adjustment_table_extract_2008_09.txt

These files are in unicode format and need to be converted to UTF-8 format for easier usage. Run coins2utf8.py to do this:

python coins2utf8.py

This currently only operates on the fact_table_extract_*.txt files. It currently does three operations:

  • converts the file to UTF-8 csv format (field delimiter is still '@')
  • extracts only those records where the 'Value' field is non-zero

The output of coins2utf8.py is the files facts_2008_09_nz.csv and facts_2009_10_nz.csv. Note that 'nz' stands for 'non-zero'. The file facts_2009_10_s1000.csv is also produced - this is a small file of 1000 records which can be loaded into a spreadsheet to get a feel for the data. (Note that the field delimiter is '@' and this needs to be specified when loading the file, not also that one of the departmental_codes is 'DEC066', which may be interpreted as a date - be sure to set the 'Column type' for this column to be 'TEXT' when loading the file into a spreadsheet.)

Next run coinsdescriptions.py. This extracts varies COINS codes and descriptions (eg department_code and department_description) and places them in .csv files the data/desc directory. These files are standard .csv files and can be loaded into (say) Excel.

python coinsdescriptions.py

This completes the preliminary preprocessing of the coins files. You may now run coins2csv.py to produce .csv files that can be directly loaded into Excel, or coins2sqlite.py which produces an Sqlite database of the coins data.

###Producing COINS files that can be loaded into Excel

Run coins2csv.py. This uses the the facts_*_nz.csv files output from coins2utf8.py to produce a .csv file for the data_types 'outturn', 'forecasts', 'plans' and 'snapshots'.

python coins2csv.py

Most of these .csv files are small enough (less than 65,535 rows) to be loaded into Excel. These files are in the data/csv directory, and are named:

coin_<date_period>_<data_type>_nz.csv, eg coins_2009_10_outturn_nz.csv.

###Producing a COINS SQLite database

Run coins2sqlite.py. This uses the the facts_*_.csv files output from coins2utf8.py and the description .csv files produced by coinsdescription.py, to produce an SQLite database that can be queried.

python coins2sqlite.py

This produces the files coins_2008_09_sqlite.db and coins_2009_10_sqlite.db in the data/sqlite directory

###Example SQL queries

Run coinssqlexamples.py. This executes a number of example queries on the COINS SQLite datbase, coins_2009_10_sqlite.db.

python coinssqlexamples.py

###Recreating government PESA reports

Run coinssqlpesa.py. This produces HTML tables of the 2008-09 government departmental group spending figures. The tables include two empty columns than can be manually filled with the actual PESA data, so that the figures can be compared.

python coinssqlpesa.py

Data Source

Treasury Information

HM Treasury publish documents that may help to understand COINS and, and how that data may be used and aggregated. These include:

Slightly differing approaches are taken to some data recording and outputs for the Devolved Administrations. Some information on that, by means of concordats, is available here: Devolved Assemblies

HM Treasury publishes a number of documents based COINS data. Public Spending Statistics are available here: Statistics on Public Finance and Spending.

Checking Data

PESA 2010 section 1 - Budgets gives the outrun DEL, AME, Capital and Resource outrun figures for 2004-05 to 2008-09. Outrun figures from the COINS 2008/09 Fact table can be checked against these tables.

Public Expenditure Statistical Analyses 2009 gives the outrun DEL, AME, Capital and Resource outrun figures for 2003-04 to 2007-08 and the estimated outrun for 2008-09 and the planned outruns for 2009-10 and 2010-2011. Forecast figures from the COINS 2009/10 Fact table can be checked against these tables.

I've briefly outlined what some of the fields in the COINS data mean in the data/README.md file.

Useful links

Coding resources

About

Utilities to read HM Treasury COINS files, and to convert to more useful formats

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages