This repository has been archived by the owner on Oct 10, 2023. It is now read-only.

Dataloading instructions

Damon McCullough edited this page Feb 3, 2023 · 2 revisions

See the 01_dataloading.sh script and the db-data-library repo for more details.

Capital Commitment Plan

  1. When OMB releases the latest Capital Commitment Plan, email FISA to request that they upload the latest Capital Commitment Plan data to their FTP server
  2. On a computer connected to DCP's intranet, use the program WinSCP to connect to FISA's FTP server (see Data Engineering team for setup and credentials)
  3. Download the file named AICP_OREQ_CAPPLN_PJCP.asc
  4. Convert the file to a .csv using Excel
  • Using the 'Import data from text' function (typically under the Data tab), import the .asc file with the following parameters:
    • The data does not have headers
    • Delimiters: Custom/Other |
    • Column data format: General (newer versions of Excel use Data Type Detection)
  • Newer versions of Excel may have slightly different steps or prompt you to Load the data. Load it, and you should see the data in Excel with numbered columns
  5. Save the file as AICP_OREQ_CAPPLN_PJCP.csv
  6. Send the .csv via email and cc Data Engineering staff
  7. Archive AICP_OREQ_CAPPLN_PJCP.csv using the data library process
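
The Excel conversion can also be approximated with a short script, which may help when the steps differ between Excel versions. The following is a minimal sketch, not part of the documented process. It assumes the .asc file is plain text with | as the only delimiter, no header row, and UTF-8 compatible encoding. The function name convert_asc_to_csv is hypothetical. Unlike Excel's General format or Data Type Detection, it copies every value as text, so the output may differ from an Excel export (for example, leading zeros are kept and no column names are added).

```python
import csv
from pathlib import Path


def convert_asc_to_csv(src: Path, dst: Path, encoding: str = "utf-8") -> int:
    """Copy a pipe-delimited, header-less text file to a comma-delimited CSV.

    Returns the number of rows written. Values are copied as text, so
    nothing is reinterpreted as a number or a date along the way.
    """
    rows_written = 0
    with open(src, newline="", encoding=encoding) as fin, \
            open(dst, "w", newline="", encoding=encoding) as fout:
        # Assumption: quote characters in the source are ordinary data,
        # so the reader is told not to treat them specially.
        reader = csv.reader(fin, delimiter="|", quoting=csv.QUOTE_NONE)
        # The writer quotes any value that itself contains a comma.
        writer = csv.writer(fout)
        for row in reader:
            if not row:
                # Skip completely blank lines (e.g. a trailing newline).
                continue
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

For example, convert_asc_to_csv(Path("AICP_OREQ_CAPPLN_PJCP.asc"), Path("AICP_OREQ_CAPPLN_PJCP.csv")) would write the .csv next to the downloaded file. Compare a few rows against an Excel-produced file before relying on it.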

DDC Capital Projects

Email DDC asking them to provide the latest spatial data for their capital projects. They will send two shapefiles: one containing point-level data and one containing line data. Load both files into data library.

DOT Capital Projects

Email DOT asking them to provide their latest data for their capital projects on bridges. They will send over one shapefile. Load the file into data library.

EDC Capital Projects

Email EDC asking them to provide their latest data for their capital projects. They will send over one shapefile. Load the file into data library. Data from this file are appended to the existing data.

NOTE: The data library template for this dataset expects a .zip with 3 files in it (.shp, .shx, .dbf). The process of unzipping, renaming, deleting, and zipping these files sometimes corrupts them and causes a path not found error.
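
One way to reduce the manual unzip, rename, delete, and zip steps is to rebuild the archive in a single pass. The sketch below is an illustration under stated assumptions, not the team's documented method. It assumes the delivered .zip holds exactly one shapefile, and that the template accepts the three files at the root of the archive under a shared base name. The function name repack_shapefile and the new_stem parameter are hypothetical. The source does not state what causes the corruption, so this may not prevent the path not found error in every case.

```python
import zipfile
from pathlib import Path

# The three components the data library template is described as expecting.
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def repack_shapefile(src_zip: Path, dst_zip: Path, new_stem: str) -> list[str]:
    """Write a new .zip holding only the .shp, .shx and .dbf members of
    src_zip, renamed to new_stem and placed at the archive root.

    Returns the member names written. Raises ValueError if a required
    component is missing from the source archive.
    """
    written = []
    with zipfile.ZipFile(src_zip) as zin:
        # Index the required members by suffix, ignoring folders and extras
        # such as .prj or .cpg. Assumption: one shapefile per archive.
        members = {}
        for info in zin.infolist():
            suffix = Path(info.filename).suffix.lower()
            if not info.is_dir() and suffix in REQUIRED_SUFFIXES:
                members[suffix] = info
        missing = [s for s in REQUIRED_SUFFIXES if s not in members]
        if missing:
            raise ValueError(f"{src_zip} is missing: {', '.join(missing)}")
        # Copy the bytes straight from one archive to the other, so nothing
        # is extracted to disk or renamed by hand.
        with zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as zout:
            for suffix in REQUIRED_SUFFIXES:
                name = new_stem + suffix
                zout.writestr(name, zin.read(members[suffix]))
                written.append(name)
    return written
```

The original delivery is left untouched, so if the load still fails the repacked file can be discarded and rebuilt.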

Other inputs

Other input datasets that need to be updated but are not delivered via email include:

Spatial boundaries

These come from the Geosupport update:

  • dcp_stateassemblydistricts
  • dcp_ct2020
  • dcp_congressionaldistricts
  • dcp_cdboundaries
  • dcp_statesenatedistricts
  • dcp_municipalcourtdistricts
  • dcp_school_districts
  • dcp_trafficanalysiszones
  • dcp_councildistricts
  • nypd_policeprecincts
  • fdny_firecompanies

Building and lot-level info

  • dcp_mappluto
  • dcp_facilities
  • doitt_buildingfootprints

Projects

  • cpdb_capital_spending
  • dot_projects_intersections
  • dot_projects_streets
  • dpr_capitalprojects
  • dpr_parksproperties
  • dcp_cpdb_agencyverified (does not get updated)