update and document GB postcode data #1225

Closed
2 changes: 2 additions & 0 deletions .gitignore
@@ -8,4 +8,6 @@ data/wiki_import.sql
data/wiki_specialphrases.sql
data/osmosischange.osc

data-sources/gb-postcodes/vendor

.vagrant
52 changes: 52 additions & 0 deletions data-sources/gb-postcodes/README.md
@@ -0,0 +1,52 @@
# GB Postcodes


The server [importing instructions](https://www.nominatim.org/release-docs/latest/admin/Import-and-Update/) optionally allow downloading [`gb_postcode_data.sql.gz`](https://www.nominatim.org/data/gb_postcode_data.sql.gz). This document explains how that file was created.

## GB vs UK

GB (Great Britain) is the more accurate name because the Ordnance Survey dataset doesn't contain postcodes from Northern Ireland.

## Importing separately after the initial import

If you forgot to download the file, or have a new version, you can import it separately:

1. Import the downloaded `gb_postcode_data.sql.gz` file.

2. Run `utils/setup.php --calculate-postcodes` from the build directory. This will copy data from the `gb_postcode` table to the `location_postcodes` table.
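
Step 1 above could look roughly like the following sketch. The helper name `import_gb_postcodes`, the `PSQL` override and the database name `nominatim` are assumptions made for illustration, not something this README prescribes; adjust them to your installation.

```shell
# Hypothetical helper: decompress the dump and feed it to psql.
# Assumptions: the database is called "nominatim" unless given as second
# argument, and psql can connect without further options.
import_gb_postcodes() {
    dump="${1:-gb_postcode_data.sql.gz}"
    db="${2:-nominatim}"
    # PSQL may be set to another command, e.g. for a dry run
    gzip -dc "$dump" | ${PSQL:-psql} -d "$db"
}
```

After `import_gb_postcodes gb_postcode_data.sql.gz` has finished, continue with step 2 from the build directory.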



## Converting Code-Point Open data

1. Download from [Code-Point® Open](https://www.ordnancesurvey.co.uk/business-and-government/products/code-point-open.html). It requires an email address to which a download link will be sent.

2. `unzip codepo_gb.zip`

Once unpacked, you'll see a directory of CSV files.

```
$ more codepo_gb/Data/CSV/n.csv
"N1 0AA",10,530626,183961,"E92000001","E19000003","E18000007","","E09000019","E05000368"
"N1 0AB",10,530559,183978,"E92000001","E19000003","E18000007","","E09000019","E05000368"
```

The coordinates are "Eastings" and "Northings" (third and fourth column, in that order) in the [OSGB 1936](http://epsg.io/1314) projection. They can be projected to WGS84 like this:

```
SELECT ST_AsText(ST_Transform(ST_SetSRID('POINT(530626 183961)'::geometry,27700), 4326));
POINT(-0.117872733220225 51.5394424719303)
```
[-0.117872733220225 51.5394424719303 on OSM map](https://www.openstreetmap.org/?mlon=-0.117872733220225&mlat=51.5394424719303&zoom=16)

3. Install packages

This reads `composer.json`, then downloads and installs the packages into the `vendor/` subdirectory.
```
cd data-sources/gb-postcodes
composer install
```

4. `cat codepo_gb/Data/CSV/*.csv | ./convert_codepoint.php > gb_postcode_data.sql`
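
Before running the conversion, the input can be sanity-checked with standard tools. The sketch below is not part of the conversion script; `codepoint_columns` is a hypothetical helper that assumes the CSV layout shown in step 2 (postcode in column 1, easting in column 3, northing in column 4). It prints those three fields tab-separated and, like the script, places a single space before the last three characters of the postcode.

```shell
# Hypothetical helper: print postcode, easting and northing (tab-separated)
# from Code-Point Open CSV lines read on stdin.
codepoint_columns() {
    awk -F',' '{
        pc = $1
        gsub(/"/, "", pc)   # strip the CSV quotes
        gsub(/ /, "", pc)   # drop any existing spaces
        n = length(pc)
        # re-insert one space before the last three characters
        pc = substr(pc, 1, n - 3) " " substr(pc, n - 2)
        printf "%s\t%s\t%s\n", pc, $3, $4
    }'
}
```

For example, `codepoint_columns < codepo_gb/Data/CSV/n.csv | head` shows the first few rows that the conversion would read.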

5 changes: 5 additions & 0 deletions data-sources/gb-postcodes/composer.json
@@ -0,0 +1,5 @@
{
"require": {
"proj4php/proj4php": "^2.0"
}
}
78 changes: 78 additions & 0 deletions data-sources/gb-postcodes/composer.lock

Generated file; the diff is not rendered.

52 changes: 52 additions & 0 deletions data-sources/gb-postcodes/convert_codepoint.php
@@ -0,0 +1,52 @@
#!/usr/bin/php
<?php

include('vendor/autoload.php');

use proj4php\Proj4php;
use proj4php\Proj;
use proj4php\Point;

$oProj4 = new Proj4php();
$oProjOSGB36 = new Proj('EPSG:27700', $oProj4);
$oProjWGS84 = new Proj('EPSG:4326', $oProj4);

echo <<< EOT

-- This data contains Ordnance Survey data © Crown copyright and database right 2010.
-- Code-Point Open contains Royal Mail data © Royal Mail copyright and database right 2010.
-- OS data may be used under the terms of the OS OpenData licence:
-- http://www.ordnancesurvey.co.uk/oswebsite/opendata/licence/docs/licence.pdf

SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;

COPY gb_postcode (id, postcode, x, y) FROM stdin;

EOT;

$iCounter = 0;
while ($sLine = fgets(STDIN)) {
$aColumns = str_getcsv($sLine);

// https://stackoverflow.com/questions/9144592/php-split-a-postcode-into-two-parts#comment11589150_9144834
// insert space before the third last position
$sPostcode = $aColumns[0];
$sPostcode = preg_replace('/\s*(...)$/', ' $1', $sPostcode);
// Code-Point Open stores eastings (x) in the third column and northings (y) in the fourth
$iEastings = $aColumns[2];
$iNorthings = $aColumns[3];

$oPointWGS84 = $oProj4->transform($oProjWGS84, new Point($iEastings, $iNorthings, $oProjOSGB36));
list($fLon, $fLat) = $oPointWGS84->toArray();

echo join("\t", array($iCounter, $sPostcode, $fLon, $fLat))."\n";

$iCounter = $iCounter + 1;
}

echo <<< EOT
\.
EOT;
5 changes: 2 additions & 3 deletions data/gb_postcode_table.sql
@@ -19,8 +19,7 @@ SET default_with_oids = false;
CREATE TABLE gb_postcode (
id integer,
postcode character varying(9),
- geometry geometry,
- CONSTRAINT enforce_dims_geometry CHECK ((st_ndims(geometry) = 2)),
- CONSTRAINT enforce_srid_geometry CHECK ((st_srid(geometry) = 4326))
+ x double precision,
+ y double precision
);

1 change: 1 addition & 0 deletions docs/CMakeLists.txt
@@ -14,6 +14,7 @@ ADD_CUSTOM_TARGET(doc
COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_SOURCE_DIR}/index.md ${CMAKE_CURRENT_BINARY_DIR}/index.md
COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_SOURCE_DIR}/extra.css ${CMAKE_CURRENT_BINARY_DIR}/extra.css
COMMAND ${CMAKE_COMMAND} -E create_symlink ${PROJECT_SOURCE_DIR}/data-sources/us-tiger/README.md ${CMAKE_CURRENT_BINARY_DIR}/data-sources/US-Tiger.md
COMMAND ${CMAKE_COMMAND} -E create_symlink ${PROJECT_SOURCE_DIR}/data-sources/gb-postcodes/README.md ${CMAKE_CURRENT_BINARY_DIR}/data-sources/GB-Postcodes.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Centos-7.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Centos-7.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-16.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-16.md
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-18.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-18.md
1 change: 1 addition & 0 deletions docs/mkdocs.yml
@@ -23,6 +23,7 @@ pages:
- 'External Data Sources':
- 'Overview' : 'data-sources/overview.md'
- 'US Census (Tiger)': 'data-sources/US-Tiger.md'
- 'GB Postcodes': 'data-sources/GB-Postcodes.md'
- 'Appendix':
- 'Installation on CentOS 7' : 'appendix/Install-on-Centos-7.md'
- 'Installation on Ubuntu 16' : 'appendix/Install-on-Ubuntu-16.md'
1 change: 1 addition & 0 deletions phpcs.xml
@@ -10,6 +10,7 @@

<exclude-pattern>./lib/template/*html*</exclude-pattern>
<exclude-pattern>./lib/template/includes/</exclude-pattern>
<exclude-pattern>./**/vendor/</exclude-pattern>
<exclude-pattern>./module/</exclude-pattern>
<exclude-pattern>./website/css</exclude-pattern>
<exclude-pattern>./website/js</exclude-pattern>
2 changes: 1 addition & 1 deletion sql/update-postcodes.sql
@@ -21,7 +21,7 @@ INSERT INTO tmp_new_postcode_locations (country_code, pc, centroid)
WHERE new.country_code = 'us' AND new.pc = u.postcode);
-- add extra UK postcodes
INSERT INTO tmp_new_postcode_locations (country_code, pc, centroid)
- SELECT 'gb', postcode, geometry FROM gb_postcode g
+ SELECT 'gb', postcode, ST_SetSRID(ST_Point(x,y),4326) FROM gb_postcode g
WHERE NOT EXISTS (SELECT 0 FROM tmp_new_postcode_locations new
WHERE new.country_code = 'gb' and new.pc = g.postcode);
