-
Notifications
You must be signed in to change notification settings - Fork 56
Dataset: Rent Stabilized Buildings
There are 3 kinds of rent stabilization datasets available:
rentstabrentstab_summaryrentstab_v2
The rentstab dataset can be added to NYCDB by running:
> nycdb --download rentstab
> nycdb --load rentstab
Replace rentstab with any of the above listed datasets to add to the database.
The dataset of registered rent stabilized units by property is compiled by scraping the number of registered units from NYC Department of Finance (DOF) tax bill PDF documents each year. There have been two different projects doing this work and you can read more about their methodologies:
-
talos/nyc-stabliziation-unit-counts (2007-2017)
Liberate NYC DOF tax documents to a machine readable format.
-
JustFixNYC/nyc-doffer (2018-2023)
A tool that scrapes the NYC Department of Finance (DOF) website for financial statements, and provides net operating income and rent stabilization units.
The data is split across three datasets, which are available in CSV format: (rentstab, rentstab_summary, rentstab_v2)
The DOF publishes a variety of fiscal information for a given property, including tax bills every quarter and statements of accounts. Tax bills hold information like property tax due, assessed property value, and number of units in a building.
The information of interests here are the number or rent stabilized units as well as relevant financial information like net operating income.
Note:
ucused in many of the column names stands for "unit count" and the columns likeuc2018are the count of rent stabilized units in that property in that year.ucbblis just the boro-block-lot and theucis just a holdover naming convention from a historic file.
Click here for `rentstab` data dictionary
| column_name | data_type | description |
|---|---|---|
| borough | text | 2-character borough name (MN, BX, BK, QN, SI) |
| ucbbl | char(10) | [Borough-Block-Lot](https://github.com/nycdb/nycdb/wiki/Glossary#borough-block-lot-bbl) (property ID) |
| uc2007 | integer | The rent stabilized unit count in 2007. This is based off of the rent stabilization surcharge dated "4/1/2007", which appears in tax bills starting 2008. The parser sums these counts when a single tax bill includes multiple buildings, but is careful not to double-count if previous years' surcharges reappear. |
| est2007 | boolean | Whether or not this is an estimated unit count. As registration is voluntary, it is common for a building to miss a year, or even several. See [Caveats](https://github.com/talos/nyc-stabilization-unit-counts#caveats) from the original Tax Bills GitHub for information about how estimates are derived. |
| dhcr2007 | boolean | Whether the building appeared on DHCR's list that year. Blank if DHCR did not publish a list for that year. |
| abat2007 | text[] | A list of all abatements and exemptions claimed on that year's tax bill. This includes 421a, J51, 420C (LIHTC), SCRIE, DRIE, and several others. |
| uc2008 | integer | |
| est2008 | boolean | |
| dhcr2008 | boolean | |
| abat2008 | text[] | |
| uc2009 | integer | |
| est2009 | boolean | |
| dhcr2009 | boolean | |
| abat2009 | text[] | |
| uc2010 | integer | |
| est2010 | boolean | |
| dhcr2010 | boolean | |
| abat2010 | text[] | |
| uc2011 | integer | |
| est2011 | boolean | |
| dhcr2011 | boolean | |
| abat2011 | text[] | |
| uc2012 | integer | |
| est2012 | boolean | |
| dhcr2012 | boolean | |
| abat2012 | text[] | |
| uc2013 | integer | |
| est2013 | boolean | |
| dhcr2013 | boolean | |
| abat2013 | text[] | |
| uc2014 | integer | |
| est2014 | boolean | |
| dhcr2014 | boolean | |
| abat2014 | text[] | |
| uc2015 | integer | |
| est2015 | boolean | |
| dhcr2015 | boolean | |
| abat2015 | text[] | |
| uc2016 | integer | |
| est2016 | boolean | |
| dhcr2016 | boolean | |
| abat2016 | text[] | |
| uc2017 | integer | |
| est2017 | boolean | |
| dhcr2017 | boolean | |
| abat2017 | text[] | |
| cd | smallint | The community district, from PLUTO. All remaining columns are from PLUTO. |
| ct2010 | text | Census tract in 2010 census. |
| cb2010 | text | Census block in 2010 census. |
| council | integer | The city council district. |
| zipcode | char(5) | The zip code. |
| address | text | An address for the lot, although it could have several. |
| ownername | text | The name of the lot's owner. Oftentimes just an LLC. |
| numbldgs | smallint | The number of buildings on the lot. |
| numfloors | numeric | The approximate number of floors on the lot's buildings. |
| unitsres | integer | An approximate number of residential units in the lot's buildings. |
| unitstotal | integer | An approximate number of residential & commercial units in the lot's buildings. |
| yearbuilt | smallint | An approximate year built, not particularly accurate. Especially poor quality in older buildings. |
| condono | smallint | The condo number, which links together different lots into a single condo development. |
| lon | numeric | The lot's centerpoint longitude. |
| lat | numeric | The lot's centerpoint latitude. |
Note: all the descriptions have not yet been added for
rentstab_summary, but you can find them in the original documentation.
Click here for `rentstab_summary` data dictionary
| column_name | data_type | description |
|---|---|---|
| ucbbl | char(10) | [Borough-Block-Lot](https://github.com/nycdb/nycdb/wiki/Glossary#borough-block-lot-bbl) (property ID) |
| unitstotal | integer | |
| unitsstab2007 | integer | |
| unitsstab2017 | integer | |
| diff | integer | |
| percentchange | numeric | |
| j51 | text | |
| a421 | text | |
| scrie | text | |
| drie | text | |
| c420 | text | |
| cd | smallint | |
| ct2010 | text | |
| cb2010 | text | |
| council | integer | |
| zipcode | char(5) | |
| address | text | |
| ownername | text | |
| numbldgs | smallint | |
| numfloors | numeric | |
| unitsres | integer | |
| unitstotalpluto | integer | |
| yearbuilt | smallint | |
| condono | smallint | |
| lon | numeric | |
| lat | numeric |
Click here for `rentstab_v2` data dictionary
| column_name | data_type | description |
|---|---|---|
| ucbbl | char(10) | [Borough-Block-Lot](https://github.com/nycdb/nycdb/wiki/Glossary#borough-block-lot-bbl) (property ID) |
| uc2018 | integer | Number of registered rent stabilized units found on tax bill PDF for this property |
| pdfsoa2018 | text | URL for an archived copy of the PDF tax bill used in the scraping process |
| uc2019 | integer | |
| pdfsoa2019 | text | |
| uc2020 | integer | |
| pdfsoa2020 | text | |
| uc2021 | integer | |
| pdfsoa2021 | text | |
| uc2022 | integer | |
| pdfsoa2022 | text | |
| uc2023 | integer | |
| pdfsoa2023 | text |
Each dataset contains 1 table, described below:
-
rentstabis a compilation of information from tax bills as well as relevant property information for rent stablized buildings, including BBLs and number of units. -
rentstab_summarysummarizes information ofrentstabincluding total number of rent stablized units and the change of units between 2007 to 2017. -
rentstab_v2has the count of rent stabilized units from 2018-2021 with corresponding BBLs.
Following are some useful SQL queries related to the dataset.
The rentstab_summary table has statistical information of the change of the count of rent stabilized units. You can pull that number if you know the BBL of the building:
SELECT percentchange FROM rentstab_summary WHERE ucbbl = '2026080032';Obtaining total residential units and total number of rent stabilized units in 2010 and 2020 for each borough
This query will join together building level data from three tables before aggregating the information at the borough level.
The subquery stabilized_units joins together the rentstab table and the rentstab_v2 table in order to get the number of rent stabilized units for each building in 2010 uc2010 and 2020 uc2020.
Then this rent data is joined to the pluto_latest dataset using bbl in order to get information on the total number of residential units unitsres in each building as well as the borough for each building.
The data is then grouped by borough to get total number of residential units sum(unitsres), and the total number of rent stabilized units in 2010 sum(uc2010) and 2020 sum(uc2020) for each borough.
SELECT borough, sum(unitsres) AS total_res_units,
sum(uc2010) AS stabilized_2010_tot, sum(uc2020) AS stabilized_2020_tot
FROM public.pluto_latest AS p
LEFT JOIN
(SELECT ucbbl, uc2010, uc2020
FROM public.rentstab
OUTSIDE JOIN public.rentstab_v2 using(ucbbl)) AS stabilized_units
ON p.bbl = stabilized_units.ucbbl
GROUP BY borough;
The count of rent stabilized units are self-reported by the landlord when they register them with DHCR, and so it's not uncommon for no units to be recorded in some years then reappear later. Given this, it shouldn't be assumed that every time a building goes from a large number of units in one year to none the next that this is a true loss of all those rent stabilized units.