Dataset: DCP Housing Database
The dcp_housingdb
dataset can be added to NYCDB by running:
> nycdb --download dcp_housingdb
> nycdb --load dcp_housingdb
The dataset comes from the NYC Department of City Planning:
-
DCP Housing Database
The NYC Department of City Planning’s (DCP) Housing Database contains all NYC Department of Buildings (DOB)-approved housing construction and demolition jobs filed or completed in NYC since January 1, 2010. It includes the three primary construction job types that add or remove residential units: new buildings, major alterations, and demolitions, and can be used to determine the change in legal housing units across time and space. Records in the Database are geocoded to the greatest level of precision possible, subject to numerous quality assurance and control checks, recoded for usability, and joined to other housing data sources relevant to city planners and analysts.
DOB Open Data are the primary source for the NYC Department of City Planning’s (DCP) Housing Database, and despite efforts DCP makes to correct errors, there are still unknown inaccuracies throughout the data. The data are administrative in nature, and most of the information associated with a record is provided by the applicant, and not necessarily verified by DCP.
You can download the data dictionary XLSX.
Click to Expand the dcp_housingdb
Data Dictionary
Column Name | Column Description | Expected/Allowed Values | Field Limitations | Additional Notes |
---|---|---|---|---|
Job_Number | The DOB job application number assigned when the applicant begins the application. This is the unique identifier for the application submitted to the Department of Buildings (DOB). It may contain several work types, and more work types may be added as the application review and the work continues. It is a 9-digit number where the first digit indicates the borough where the building is located. | These values are mapped from the dob_jobapplications field jobnumber | ||
Job_Type | DOB's type category for the job application. More information is available here. The following types are included in this database: New Building (NB): an application to build a new structure. “NB” cannot be selected if any existing building elements are to remain—for example a part of an old foundation, a portion of a façade that will be incorporated into the construction, etc. Alteration Type I (A1): a major alteration that will change the use, egress, or occupancy of the building. Demolition (DM): an application to fully or partially demolish an existing building. Note that many demolition permits are only for partial demolitions and for garages (these are also captured). |
Mapped from dob_jobapplications field jobtype as follows: A1 to Alteration DM to Demolition NB to New Building Administrative records are removed based on this field in combination with job_description. |
||
ResidFlag | This field is used to identify jobs in buildings containing residential uses. A value of "residential" indicates that the job affects residential units in some way through new construction, alteration, or demolition. Only those jobs with a value of "residential" are included in the housing database. Manual research was conducted at DCP to help ensure that all work on buildings with residences receive this flag, though some records may remain misclassified. | This field is dependent on hotel_init, hotel_prop, otherb_init, otherb_prop, classa_init, and classa_prop. If any of the dependent fields are not NULL, then the job is flagged as being residential. Manual research adds/removes this flag as necessary |
||
NonresidFlag | This field is used to identify jobs in buildings containing non-residential uses. A value of "non-residential" indicates that the job affects some use type beyond residential, including commercial, industrial, or community facility uses. Mixed-use buildings will have values in both the Resid_Flag and Nonresid_Flag, since they contain both residential and nonresidential uses, but having flags in both of those columns does not necessarily mean that it is a mixed-use development since the flags do not distinguish between the initial and proposed uses in the building. | This field is dependent on job_description, occ_initial, occ_proposed, and resid_flag. nonresid_flag is true if: job_description has one of these keywords OR occ_initial or occ_proposed are one of these categories. OR resid_flag IS NULL |
||
Job_Inactv* | This field is used to identify job applications that are likely inactive, either because the job is withdrawn, is a duplicate of another job, or the application has stalled for three or more years. Records flagged as inactive should be excluded in most analyses of incomplete jobs since they are extremely unlikely to ever reach completion. Note that this definition is probability based and therefore can not capture all permits that may eventually become inactive. | Inactive: Withdrawn: The job status is 9. Withdrawn Inactive: Stalled: A job has a status of Filed Application, Approved Application, OR Permitted for Construction AND date_lastupdt is 3 or more years before the vintage date. Inactive: Duplicate: A job with a status that is Filed Application, Approved Application, OR Permitted for Construction MATCHES with a job that has a status of either 4. Partially Completed Construction OR 5. Completed Construction ON the following fields: -classa_init (where units are NOT NULL) AND -classa_prop (where units are NOT NULL) AND -address AND -job_type AND -the date_lastupdt of the incomplete job is before the date_lastupdt of the complete job. note: job_inactive can also get set to Inactive: Duplicate through manual research NULL: Not inactive |
*This field only appears in the HousingDB_post2010_inactive_included dataset | |
Job_Status | DCP recode of DOB's status label. This describes the status of the job at the date of the data vintage. For example, a job marked as "3. Permitted" was at that status as of June 30, 2020 if using version 20Q2 of the DCP Housing Database. More details on each DOB status is available here: https://www1.nyc.gov/assets/buildings/pdf/bisjobstatus.pdf Jobs typically move through status A through X over time as they reach certain approval milestones: 1. Filed: job application is at status A - G at the time of publication. Application submitted, but review is not yet in progress. 2. Plan Examination: application is at status H - P. Plan examination is in progress, but not yet approved. 3. Permitted: application is at status Q and R and may begin construction. 4. Partial Complete: application at status U and X, and CO issued for NB or A1 job type, and the CO is a Temporary CO AND less than 80% of the units are completed for a building with 20 or more units. 5. Complete: For new buildings and alterations, application is at status U and X, or a CO has been issued. For demolitions, the application is at status X. DCP has decided to mark demolitions as complete when they reach status X, but list the completion date as equal to status Q because this is likely when the building must be vacated, and it appears that many buildings are physically demolished some time before receiving sign off (status X). 9. Withdrawn: application is at status 3. The application has been withdrawn by the applicant. |
This field is dependent on on job_type, co_latest_certtype, classa_complt_pct, classa_complt_diff, classa_net, x_withdrawal, and the date fields. | ||
CompltYear | Year the job was completed. For new buildings and alterations, this is defined as the year of the first certificate of occupancy issuance. For demolitions, this is the year that the demolition was permitted (reached status Q). | Dependent on date_complete, this field is the year of date_complete. | ||
PermitYear | Year the job was permitted. For all job types, this is defined as the status Q year. | Dependent on date_permittd, this field is the year of date_permittd | ||
ClassAInit | Number of units that initially existed in the building at the time of the job application, as reported by the applicant. This field is edited by DCP to only count Class A units, which are units in houses or apartment buildings intended for long-term residential use (greater than 30 days), and typically do not require the use of shared kitchens and bathrooms. The definition of Class A and other unit types is available here: https://www1.nyc.gov/assets/buildings/pdf/MultipleDwellingLaw.pdf |
Mapped from the dob_jobapplications field existingdwellingunits Set to 0 for New Buildings |
||
ClassAProp | Number of units proposed in the job application after the proposed work has been completed, as reported by the applicant and edited by DCP to count only Class A units. | Mapped from the dob_jobapplications field proposeddwellingunits Set to 0 for Demolitions |
||
ClassANet | Net change in Class A unit count between the number of units existing at the time of application and the number of units proposed. | Dependent on classa_init and classa_prop The difference between classa_init and classa_prop |
||
HotelInit | Number of hotel units that initially existed in the building at the time of the job application, as determined through DCP research. Note that hotel units are a subtype of Class B units which are typically occupied as for-profit businesses for short-term (less than 30 days), full-service lodging. | This field is not populated programatically. All values are directly from the manual research table. | ||
Hotel_Prop | Number of hotel units proposed in the job application after the proposed work has been completed, as determined through DCP research. | This field is not populated programatically. All values are directly from the manual research table. | ||
OtherBInit | Number of Class B units (excluding hotel units) that initially existed in building at the time of the job application, as determined through DCP research. Other Class B units include all dwellings that are not Class A units or hotels, and may include single room occupancy units, dormitories, certain kinds of supportive housing and assisted living, homeless shelters, convents and monasteries, among many other forms of temporary lodging or lodging with communal kitchens or bathrooms. The definition of Class B and other unit types is available here: https://www1.nyc.gov/assets/buildings/pdf/MultipleDwellingLaw.pdf |
This field is not populated programatically. All values are directly from the manual research table. | ||
OtherBProp | Number of Class B units (excluding hotel units) proposed in the job application after the proposed work has been completed, as determined through DCP research. | This field is not populated programatically. All values are directly from the manual research table. | ||
Units_CO | Number of dwelling units provided on the temporary or final certificates of occupancy. This unit count may include Class A units in addition to Class B units and hotel units. | Mapped from dob_cofos field numofdwellingunits | ||
Boro | The NYC borough code where the proposed work will take place. | 1 = Manhattan 2 = Bronx 3 = Brooklyn 4 = Queens 5 = Staten Island |
||
BIN | Building Identification Number (BIN), supplied by Geosupport, that identifies each unique building in the city. | A 7 digit string starting with the borough code and 6 digit unique building number | ||
BBL | Borough-Block-Lot tax ID number of the parcel where the proposed work will take place. | |||
AddressNum | The house number for the building where the proposed work will take place. | This is the house number returned by Geosupport during geocoding. If, however, Geosupport did not return address information, then this is the housenumber from dob_jobapplications. | ||
AddressSt | The street name for the building where the proposed work will take place. | This is the street name returned by Geosupport during geocoding. If, however, Geosupport did not return address information, then this is the streetname from dob_jobapplications. | ||
Occ_Init | Description of the existing occupancy type at the time of the job application. This indicates what a site was used for before the proposed job. It is a more descriptive recode of the occupancy code that the applicant submitted to DOB as the initial occupancy type of the building. Note that the applicant may only choose one occupancy code, even in a building containing multiple uses, so this code is only able to describe one of many possible uses. | This field is mapped from the dob_jobapplications field existingoccupancy and translated using the lookup table: https://github.com/NYCPlanning/db-developments/blob/dev/developments_build/data/lookup_occ.csv |
||
Occ_Prop | Description of the proposed occupancy type at the time of the job application. This indicates what a site will be used for after the proposed job is complete. It is a more descriptive recode of the occupancy code that the applicant submitted to DOB as the proposed occupancy type of the building. Note that the applicant may only choose one occupancy code, even in a building containing multiple uses, so this code is only able to describe one of many possible uses. | This field is mapped from the dob_jobapplications field proposedoccupancy and translated using the lookup table: https://github.com/NYCPlanning/db-developments/blob/dev/developments_build/data/lookup_occ.csv |
||
Bldg_Class | Building Class code. A code used by Department of Finance (DOF) and Department of Buildings (DOB) describing the major use of a structure. | For more information see the Building Classifiaction code list on DOF's website: https://www1.nyc.gov/assets/finance/jump/hlpbldgcode.html | ||
Job_Desc | The general description of the work being applied for. This field is free text, and is filled out by the applicant. | This field is mapped from the dob_jobapplications field jobdescription. Administrative records, or jobs with no actual construction work, are removed based on this field in combination with job_type. Test records are removed based on this field. |
||
DateFiled | Date of job status A (pre-filing application). This is the first step in the process for all job applications. The job application # is assigned at this status. This occurs when the applicant submits any part of the application (even a single form), in person or electronically. | This field is mapped from the dob_jobapplications field prefilingdate | ||
DatePermit | Date of job status Q (first partial permit issuance). This is when construction work may begin. This field should be used for identifying the number of permits approved in a given year. | This field is mapped from the earliest date in the dob_permitissuance field issuancedate | ||
DateLstUpd | The date of the last update to the DOB record for the job filing. | This field is mapped from the dob_jobapplications field latestactiondate | ||
DateComplt | DCP's best estimate of completion date for all jobs. For new buildings and alterations, date complete is equal to the date of the earliest certificate of occupancy. For demolitions, date complete is equal to status Q (permit issued), since demolitions do not receive certificates of occupancy. Blank indicates no certificate of occupancy has been issued. Typically, a building can be considered complete at this stage. Large buildings with many units may have units receiving certificates of occupancy over a longer period of time. | This field depends on job_type, date_permitted, and date_statusx For New Buildings and Alterations, this field is mapped from the earliest effectivedate from dob_cofos For Demolitions, this field is mapped from date_permitted for as long as date_statusx IS NOT NULL |
||
ZoningDist1 | The primary zoning district of the tax lot per the applicant at time of application. For more information see http://www1.nyc.gov/site/planning/zoning/about-zoning.page. | This field is the dob_jobapplications field zoningdist1 | ||
ZoningDist2 | The secondary zoning district of the tax lot per the applicant at time of application. | This field is the dob_jobapplications field zoningdist2 | ||
ZoningDist3 | The tertiary zoning district of the tax lot per the applicant at time of application. | This field is the dob_jobapplications field zoningdist3 | ||
SpecialDist1 | The primary special zoning district of the tax lot per the applicant at time of application. Other zoning designations may appear in this field, such as industrial business zones (IBZ), mandatory inclusionary housing (MIH) areas, or other zoning designations. This field is provided by the applicant, and is likely inconsistent. | This field is mapped from the dob_jobapplications field specialdistrict1 | ||
SpecialDist2 | The secondary special zoning district of the tax lot per the applicant at time of application. | This field is mapped from the dob_jobapplications field specialdistrict2 | ||
Landmark | Indicates that the building has been designated as a landmark building by the Landmarks Preservation Commission. | This field is mapped from the dob_jobapplications field landmarked | ||
FloorsInit | The existing number of stories/floors in the building, as reported by the applicant. | This field depends on job_type This field is mapped from the dob_jobapplications field existingnumstories if the record is an Alteration or Demolition Values of '0' are replaced with NULL |
||
FloorsProp | The number of stories/floors in the building after the work is done, as reported by the applicant. | This field depends on job_type This field is mapped from the dob_jobapplications field proposednumstories if the record is an Alteration or New Building Values of '0' are replaced with NULL |
||
Enlargemnt | This indicates if the work to be done under the application will result in a horizontal and/or vertical enlargement, as reported by the applicant. Values include: Horizontal, Vertical, Horizontal and Vertical, [blank]. | This field is mapped from the dob_jobapplications fields horizontalenlrgmt and verticalenlrgmt When horizontalenlrgmt = 'Y' AND verticalenlrgmt <> 'Y' then 'Horizontal' When horizontalenlrgmt <> 'Y' AND verticalenlrgmt = 'Y' then 'Vertical' When horizontalenlrgmt = 'Y' AND verticalenlrgmt = 'Y' then 'Horizontal and Vertical' |
||
Ownership | This indicates whether the property is government owned, the ownership structure, and non-profit status, as reported by the applicant and recoded by DCP. | This field is mapped from dob_jobapplications fields cityowned, ownertype, and nonprofit The three input values are translated into a single value using this lookup table: https://github.com/NYCPlanning/db-developments/blob/dev/developments_build/data/lookup_ownership.csv |
||
CenBlock20 | 2020 Census Block FIPS code. | This is a reformatted version of bctcb2020 that uses county FIPS instead of one-digit borough code. | ||
CenTract20 | 2020 Census Tract FIPS code. | This is a reformatted version of bct2020 that uses county FIPS instead of one-digit borough code. | ||
BCTCB2020 | The Borough - 2020 Census Tract - Census Block code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with Census Block boundaries. | ||
BCT2020 | The Borough - 2020 Census Tract code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with Census Tract boundaries. | ||
NTA2020 | 2020 Neighborhood Tabulation Area (NTA) code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with 2020 NTA boundaries. | ||
NTAName20 | 2020 Neighborhood Tabulation Area (NTA) descriptive name. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with 2020 NTA boundaries. | ||
CDTA2020 | 2020 Community District Tabulation Area (NTA) code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with 2020 CDTA boundaries. | ||
CDTAName20 | 2020 Community District Tabulation Area (NTA) descriptive name. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with 2020 CDTA boundaries. | ||
ComunityDist | NYC Community District code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with community district boundaries. | ||
CouncilDist | NYC City Council District code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with city council district boundaries. | ||
SchoolSubDist | NYC School Subdistrict code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with school subdistrict boundaries. | ||
SchoolCommnty | NYC Community School District code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with community school district boundaries. | ||
SchoolElmntry | NYC Elementary School Zone code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with elementary school zone boundaries. | ||
SchoolMiddle | NYC Middle School Zone code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with middle school zone boundaries. | ||
FireCmpany | NYC Fire Company code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with fire company boundaries. | ||
FireBattln | NYC Fire Battalion code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with fire battalion boundaries. | ||
FireDivsn | NYC Fire Division code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with fire division boundaries. | ||
PolicePcnt | NYC Police Precinct code. | This field comes from Geosupport, as long as the address returns a value and the record is not in TPAD. Otherwise, the value is generated from a spatial join with police precinct boundaries. | ||
PL_FIRM07 | A value of 1 means that some portion of the tax lot falls within the 1% annual chance floodplain as determined by FEMA’s 2007 Flood Insurance Rate Map. Note that buildings on the tax lot may or may not be in the portion of the tax lot that is within the 1% annual chance floodplain. | This field is mapped from dcp_mappluto which is the clipped version of MapPLUTO, by joining on the geo_bbl to the PLUTO bbl | PL is short for PLUTO | |
PL_PFIRM15 | A value of 1 means that some portion of the tax lot falls within the 1% annual chance floodplain as determined by FEMA’s 2015 Preliminary Flood Insurance Rate Map. Note that buildings on the tax lot may or may not be in the portion of the tax lot that is within the 1% annual chance floodplain. | This field is mapped from dcp_mappluto which is the clipped version of MapPLUTO, by joining on the geo_bbl to the PLUTO bbl | PL is short for PLUTO | |
Latitude | Latitude in WGS84 / SRID:4326 | The latitude and longitude value are assigned in the following order. The associated geomsource values are in parentheses: DoITT building footprints center point using dob_bin as a join_key (BIN DOB buildingfootprints or BIN DOB buildingfootprints (historical)) DoITT building footprints center point using geo_bin as a join_key Returned from GeoSupport 1B function (Lat/Lon DCP, Lat/Lon geosupport) BBL centroid from shoreline clipped MapPLUTO using dob_bbl as a join_key (BBL DOB MapPLUTO) latitude and longitude from dob_jobapplications (Lat/Lon DOB) |
||
Longitude | Longitude in WGS84 / SRID:4326 | See the description in latitude above | ||
GeomSource | Source of the geographic coordinates for the record. | This field is dependent on latitude and longitude It is assigned as follows: BIN DOB buildingfootprints: Geometry is the centroid of the DOITT building footprint that has the same BIN as the DOB job BIN BIN DCP geosupport: Geometry is the centroid of the DOITT building footprint that has the same BIN as the BIN returned by Geosupport Lat/Lon geosupport: Geometry is created from the latitude and longitude values returned by Geosupport BBL DOB MapPLUTO: Geometry is the centroid of the Mappluto lot that has the same BBL as the DOB job BBL BIN DOB buildingfootprints (historical): Geometry is the centroid of the historical DOITT building footprint that has the same BIN as the DOB job BIN Lat/Lon DOB: Geometry is created from the latitude and longitude values in the DOB source data Lat/Lon DCP: Geometry is created from latitude and longitude provided in the manual corrections file |
||
DCPEdited | List of fields that were edited by DCP. This only lists fields where original DOB source data was overwritten by DCP, and doesn't include fields where DCP recodes data as part of the standard methodology. | Whenever a field is altered by a manual correction (as opposed to the programatic logic described in this Data Dictionary), the field name gets added to dcpeditfields | ||
Version | Version of the Developments Database. |
This dataset has the following tables:
-
dcp_housingdb
is the only table in this dataset.
Following are some useful SQL queries related to the dataset.
The dcp_housingdb
table has information on all NYC Department of Buildings (DOB)-approved housing construction and demolition jobs filed or completed in NYC since January 1, 2010. dcp_housingdb
contains, in particular, information on type of occupancy, length of time, type of job, and the status of the job, which can tell you about how long certain renovations in residential buildings can go on. There are two types of date considerations, which are the columns compltyear
and permityear
as integers, and datefiled
and datecomplt
as datetime objects. For our purposes, we're using the integer columns as the datecomplt
column is not always filled in.
After creating a new column with the length of time, we filter the dataset by types and ownership in order to gain specific insight into types of buildings. Finally, we order the created column in descending order and limit it to 50 rows in order to see the most egregious lengths of time. This script can be modified to answer similar questions by changing the filters, order by, or limit.
select *, compltyear-permityear as constlen from dcp_housingdb
where occprop = 'Residential: 3 or More Units (R-2)'
and jobtype = 'Alteration'
and ownership = 'Private For-Profit: Corporation'
and jobstatus = '5. Completed Construction'
order by constlen desc
limit (50)
As a small preliminary note, the dcp_housing db
dataset is updated annually around June for the previous year. With the current data that we have, we can retrieve aggregated amounts of filings that indicate the project's status. In this particular example, we're also filtering by jobtype
, occprop
, and ownership
, which will tell us about a specific subset of projects. We then group the remaining rows by their job status and list them in descending order to understand which job statuses are more numerous.
select count(*), jobstatus from dcp_housingdb
where occprop = 'Residential: 3 or More Units (R-2)'
and jobtype = 'Alteration'
and ownership = 'Private For-Profit: Corporation'
group by jobstatus
order by count desc
The DCP Housing Database is based on DOB Open Data to identify housing construction. However, not all construction in NYC occurs legally by approval through the DOB. Illegal conversions of the housing stock may be prevalent in certain neighborhoods but are not represented in this data.
DCP uses information such as job description, occupancy codes, and unit change to differentiate hotels and class B units from class A housing units. However, there may be hotels and class B units that DCP could not identify using these means.