Historical Address Geocoder (HAG-GIS) 1.0.0
Konstantinos Daras, July 2015
The Historical Address Geocoder (HAG-GIS) is a Python 2.7 program that automates the geocoding process for the Digitising Scotland project. Geocoding here means fuzzy-matching historical records against contemporary addresses. The system also uses spatial information derived from historical administrative data, which improves the accuracy of the geocoded historical addresses and allows geographical boundaries to be produced at small administrative scales where none are available.
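To illustrate the fuzzy-matching idea described above, here is a minimal sketch that scores a historical address against a handful of contemporary candidates using a normalised edit distance. It is not the HAGGIS implementation: HAGGIS depends on the python-Levenshtein package, whereas this sketch uses a pure-Python stand-in so that it runs without extra packages. The function names (`levenshtein`, `similarity`, `best_match`), the 0.8 threshold and the sample addresses are all hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalised similarity in [0, 1]; 1.0 means identical after lowercasing."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - float(levenshtein(a, b)) / longest


def best_match(historical, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate, or None if below threshold."""
    scored = [(similarity(historical, c), c) for c in candidates]
    if not scored:
        return None
    score, candidate = max(scored)
    if score < threshold:
        return None
    return candidate, score


# Hypothetical example: a misspelt historical record against contemporary addresses.
contemporary = ["12 High Street", "21 High Street", "12 Hill Street"]
match = best_match("12 Hgh Street", contemporary)
```

A threshold keeps weak matches from being accepted as geocodes. In practice the cut-off would need tuning against known matches, and spatial information could be used to narrow the candidate list before scoring.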
The basic HAGGIS-1.0.0 installation requires Python 2.7, plus additional packages and libraries for spatial analysis and database management.
To install pip, securely download get-pip.py (https://bootstrap.pypa.io/get-pip.py). Then run the following (which may require administrator access):
> python get-pip.py
If setuptools is not already installed, get-pip.py will install setuptools for you. To upgrade an existing setuptools:
> pip install -U setuptools
Additional Python packages (pip is required):
> pip install wheel==0.24.0
> pip install PyYAML==3.11
> pip install python-Levenshtein==0.11.2
> pip install scipy==0.15.1
> pip install numpy==1.9.2
> pip install nose==1.3.7
> pip install tqdm==1.0
For the following libraries, follow the installation instructions on their official sites:
Qhull library (scipy dependency) at http://www.qhull.org/
SpatiaLite v4.0 library (sqlite dependency) at http://www.gaia-gis.it/gaia-sins/
Test of installed modules:
To test if the required libraries are installed in your Python distribution, start Python and try the following:
>>> from scipy.spatial import Voronoi
>>> from scipy.spatial import KDTree
>>> import Levenshtein
>>> import tqdm
>>> import yaml  # provided by the PyYAML package
>>> import numpy
None of these import commands should give you an error.
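The same check can be scripted instead of typed interactively. The sketch below is not part of the HAGGIS distribution, and the names `REQUIRED_MODULES` and `find_missing` are hypothetical. It tries to import each required module and reports any that fail.

```python
import importlib

# Module names as imported in Python; note that PyYAML is imported as "yaml".
REQUIRED_MODULES = [
    "scipy.spatial",
    "numpy",
    "Levenshtein",
    "tqdm",
    "yaml",
]


def find_missing(modules):
    """Return the names of the modules that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Collecting every failure before reporting means a single run shows all missing packages, rather than stopping at the first import error.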
Unpacking the archive creates a new directory named 'HAGGIS' containing all the necessary HAGGIS modules, plus additional files such as example data sets, documentation and testing programs.
Go into the 'haggis' sub-directory within 'HAGGIS' and run all tests using the corresponding command provided:
or run the tests individually (within 'HAGGIS/haggis/tests' folder), for example:
> python test_spatial.py
HAGGIS can be started using either of the following commands:
> python haggis.py
> python haggis.py <config file>
where <config file> is a given configuration file.
Problems and errors:
Please note that this is the initial distribution of HAGGIS-1.0.0, which has only been tested to a limited extent on a Windows platform (specifically Windows 7 and 8 with Python 2.7).
Please report any problems and bugs to: konstantinos.Daras@gmail.com
To receive updates and news on HAGGIS please visit the following open source lists at:
Historical Address Geocoder
- Free software: GPL 3.0 license
- Documentation: http://www.gnu.org/licenses/gpl.html
- Export Geocoded Historical addresses and RD polygon centroids [Priority]
- Use Q-Gram algorithm
- Use Jaro-Winkler algorithm
- Introduce weights in each token [Priority]
- Use Classification after address comparison [Priority]
- Create example data sets