Authoritative NYC Address Data for the Pelias Geocoder
julialucyhogan Update NYC PAD importer
Cleanup the importer, and make it add PAD info as pad_meta
Latest commit f0ecc53 Feb 13, 2019
Type Name Latest commit message Commit time
bin Update NYC PAD importer Feb 14, 2019
gui-test
lookup Get importer running inside a docker container Jan 11, 2018
stream Update NYC PAD importer Feb 14, 2019
.dockerignore first pass nycpad importer Jan 9, 2018
.gitignore first pass nycpad importer Jan 9, 2018
.jshintignore first pass nycpad importer Jan 9, 2018
.jshintrc use emtyDir to clear dir when downloading Jan 18, 2018
CONTRIBUTING.md add boilerplate files Jan 9, 2018
Dockerfile Update Dockerfile Jan 11, 2018
ISSUE_TEMPLATE.md add boilerplate files Jan 9, 2018
LICENSE.md
PULL_REQUEST_TEMPLATE.md add boilerplate files Jan 9, 2018
README.md Update NYC PAD importer Feb 14, 2019
index.js cleanup Jan 18, 2018
package-lock.json Update NYC PAD importer Feb 14, 2019
package.json Update NYC PAD importer Feb 14, 2019
pelias.json Update NYC PAD importer Feb 14, 2019
schema.js Get importer running inside a docker container Jan 11, 2018


labs-geosearch-pad-importer

A Pelias importer for authoritative NYC addresses. Part of the NYC Geosearch Geocoder Project.

Introduction

The NYC Geosearch API is built on Pelias, the open-source geocoding engine that powers Geocode.earth.


We treat normalization of the PAD data as a separate data workflow from the Pelias import. This script picks up the output of labs-geosearch-pad-normalize and imports it into the Pelias Elasticsearch database.

Requirements

To run the importer outside of docker-compose, you will need the following set up. For the simplest setup, however, we recommend using a docker-compose project.

  • Git
  • Node.js (with NPM)
  • An Elasticsearch database with the target index already created. The Elasticsearch host and index name can be specified in pelias.json:
      {
        "esclient": {
          "hosts": [{
            "host": "DESIRED_ES_HOST"
          }]
        },
        "schema": {
          "indexName": "DESIRED_INDEX_NAME"
        },
        ... other pelias configuration ...
      }
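
As a rough illustration of how these two settings might be read, the sketch below pulls the Elasticsearch host and index name out of an already-parsed pelias.json object. The helper name `resolveEsTarget` and the fallback values (`localhost`, `pelias`) are assumptions made for this example and are not taken from the importer's code.

```javascript
// Hypothetical helper: extract the Elasticsearch target from a parsed pelias.json.
// The fallback values are assumptions for illustration only.
function resolveEsTarget(config) {
  const hosts = (config.esclient && config.esclient.hosts) || [];
  const host = hosts.length > 0 && hosts[0].host ? hosts[0].host : 'localhost';
  const indexName = (config.schema && config.schema.indexName) || 'pelias';
  return { host, indexName };
}

// Example usage with placeholder values.
const target = resolveEsTarget({
  esclient: { hosts: [{ host: 'DESIRED_ES_HOST' }] },
  schema: { indexName: 'DESIRED_INDEX_NAME' }
});
console.log(target.host + ' / ' + target.indexName);
```

Only the first entry in `hosts` is inspected here; a real client would typically be handed the whole array.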

Using the Importer

Running the import creates Documents from the .csv rows of the normalized PAD source. The nycpad importer adds custom fields to each Document object under a pad_meta property. Newer versions of the Pelias schema definition specify dynamic: strict, so writes to Elasticsearch will fail under the default Pelias schema, which has no pad_meta field. For our approach to extending the Pelias schema to include the pad_meta fields, see our custom pelias docker compose project.
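
To make the dynamic: strict problem concrete, here is a minimal sketch that imitates the check on top-level fields and shows one way a mapping could be extended. Everything in it is hypothetical: the simplified mapping, the `bbl` field inside pad_meta, and the helper names `unmappedFields` and `withPadMeta` are invented for illustration, and the real extension lives in the custom docker compose project mentioned above.

```javascript
// Hypothetical, heavily simplified mapping; only top-level field names matter here.
const strictMapping = {
  dynamic: 'strict',
  properties: { name: {}, center_point: {}, address_parts: {} }
};

// List the top-level document fields that a strict mapping would reject.
function unmappedFields(mapping, doc) {
  if (mapping.dynamic !== 'strict') { return []; }
  return Object.keys(doc).filter(
    (field) => !Object.prototype.hasOwnProperty.call(mapping.properties, field)
  );
}

// Return a copy of the mapping with a pad_meta object field added.
// The field definition is an assumption, not the project's actual schema.
function withPadMeta(mapping) {
  return Object.assign({}, mapping, {
    properties: Object.assign({}, mapping.properties, {
      pad_meta: { type: 'object', dynamic: true }
    })
  });
}

// Hypothetical document with a custom pad_meta property.
const doc = { name: { default: '120 Broadway' }, pad_meta: { bbl: '1000477501' } };

console.log(unmappedFields(strictMapping, doc));              // fields the default mapping lacks
console.log(unmappedFields(withPadMeta(strictMapping), doc)); // after extending the mapping
```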

Example config:

imports.nycpad is required, with datapath and import defined as shown below:

{
  "imports": {
    "nycpad": {
      "datapath": "data/nycpad",
      "import": [{
        "filename":"labs-geosearch-pad-normalized.csv"
      }]
    }
  },
  ...
}

Import

npm start reads the downloaded CSV and imports each row into the Pelias Elasticsearch database. The importer looks for data at the location given by datapath + import.filename.
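
The datapath + import.filename lookup can be sketched as follows. This is an assumption about how the pieces combine, using Node's built-in path module; `resolveImportFiles` is a hypothetical name and not a function exported by the importer.

```javascript
const path = require('path');

// Hypothetical helper: build one CSV path per entry in imports.nycpad.import.
function resolveImportFiles(config) {
  const nycpad = config.imports.nycpad;
  return nycpad.import.map((entry) => path.join(nycpad.datapath, entry.filename));
}

// Example usage with the config values shown above.
const files = resolveImportFiles({
  imports: {
    nycpad: {
      datapath: 'data/nycpad',
      import: [{ filename: 'labs-geosearch-pad-normalized.csv' }]
    }
  }
});
console.log(files);
```

Using path.join rather than string concatenation avoids doubled or missing separators when datapath does or does not end with a slash.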

Dockerfile

A Dockerfile is included to enable easy integration of the importer into a docker-compose pelias project. You can read more about docker-compose pelias projects and see examples at https://github.com/pelias/docker
