Dedupe Geocoder

Demonstration app to show how Dedupe might be used as a geocoder

Part of the Dedupe.io cloud service and open source toolset for de-duplicating and finding fuzzy matches in your data.

Setup

Install OS-level dependencies:

  • Python 3.4
  • PostgreSQL 9.4+

Install app requirements

We recommend using virtualenv and virtualenvwrapper for working in a virtualized development environment. Read how to set up virtualenv.

Once you have virtualenvwrapper set up, run:

mkvirtualenv dedupe-geocoder
git clone https://github.com/datamade/dedupe-geocoder.git
cd dedupe-geocoder
pip install -r requirements.txt
cp geocoder/app_config.py.example geocoder/app_config.py

In app_config.py, put your Postgres user in DB_USER and password in DB_PW.
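
As a rough sketch, the edited part of app_config.py might look something like the following. Only DB_USER and DB_PW are named in this README; DB_HOST, DB_PORT, DB_NAME, and the connection_string helper are hypothetical and appear here only to illustrate how the credentials could be combined into a PostgreSQL connection URL. Check your copy of app_config.py for the names it actually uses.

```python
from urllib.parse import quote_plus

# Named in this README: your Postgres credentials.
DB_USER = "postgres"
DB_PW = "change-me"

# Hypothetical settings, assumed for illustration only.
DB_HOST = "localhost"
DB_PORT = 5432
DB_NAME = "geocoder"  # the database name used with `createdb` in this README


def connection_string(user=DB_USER, password=DB_PW, host=DB_HOST,
                      port=DB_PORT, name=DB_NAME):
    """Build a PostgreSQL URL, escaping special characters in the credentials."""
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, name)
```

Escaping the user and password matters when a password contains characters such as `@` or `/`, which would otherwise be misread as part of the URL.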

Afterwards, whenever you want to work on dedupe-geocoder, activate the environment:

workon dedupe-geocoder

Set up your database

Before we can run the website, we need to create a database.

createdb geocoder

Then, we run the loadAddresses.py script to download our data from the Cook County data portal.

python loadAddresses.py --download --load_data 

This command will take between 15 and 45 minutes, depending on your internet connection.

You can run loadAddresses.py again to get the latest data from Cook County, add more training data, or create a table of block keys for dedupe to use to match new records. Useful flags are:

 --download     Download fresh address data.
 --load_data    Load downloaded address data into database.
 --train        Add more training data and save settings file.
 --block        After training, create the block table used by dedupe for matching.
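
To make the flag semantics concrete, here is a minimal argparse sketch of how four on/off switches like these could be declared and combined. It is not the code in loadAddresses.py: build_parser and planned_steps are hypothetical names, and the step ordering (download, then load, then train, then block) is an assumption inferred from the flag descriptions above.

```python
import argparse


def build_parser():
    """Declare the four flags listed above as simple on/off switches."""
    parser = argparse.ArgumentParser(description="Load and prepare address data.")
    parser.add_argument("--download", action="store_true",
                        help="Download fresh address data.")
    parser.add_argument("--load_data", action="store_true",
                        help="Load downloaded address data into database.")
    parser.add_argument("--train", action="store_true",
                        help="Add more training data and save settings file.")
    parser.add_argument("--block", action="store_true",
                        help="After training, create the block table.")
    return parser


def planned_steps(argv):
    """Return the requested steps in the assumed order they would run."""
    args = build_parser().parse_args(argv)
    order = ["download", "load_data", "train", "block"]  # assumed ordering
    return [step for step in order if getattr(args, step)]


if __name__ == "__main__":
    import sys
    print(planned_steps(sys.argv[1:]))
```

Under this sketch, flags can be given in any combination and any order on the command line; the steps are still reported in the fixed order.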

Running Dedupe Geocoder

To run locally:

workon dedupe-geocoder
python runserver.py

Then navigate to http://localhost:5000/ in your browser.

Team

  • Eric van Zanten - developer
  • Derek Eder - developer
  • Forest Gregg - developer
  • Cathy Deng - developer

Errors / Bugs

If something is not behaving intuitively, it is a bug, and should be reported. Report it here: https://github.com/datamade/dedupe-geocoder/issues

Note on Patches/Pull Requests

  • Fork the project.
  • Make your feature addition or bug fix.
  • Commit, do not mess with rakefile, version, or history.
  • Send a pull request. Bonus points for topic branches.

Copyright

Copyright (c) 2015 DataMade. Released under the MIT License.
