Skip to content

Dataset: Rent Stabilized Buildings

Maxwell Austensen edited this page Mar 6, 2025 · 21 revisions

Rent Stablization

There are 3 kinds of rent stabilization datasets available:

  • rentstab
  • rentstab_summary
  • rentstab_v2

The rentstab dataset can be added to NYCDB by running:

> nycdb --download rentstab
> nycdb --load rentstab

Replace rentstab with any of the above listed datasets to add to the database.

Provenance

The dataset of registered rent stabilized units by property is compiled by scraping the number of registered units from NYC Department of Finance (DOF) tax bill PDF documents each year. There have been two different projects doing this work and you can read more about their methodologies:

  • talos/nyc-stabliziation-unit-counts (2007-2017)

    Liberate NYC DOF tax documents to a machine readable format.

  • JustFixNYC/nyc-doffer (2018-2023)

    A tool that scrapes the NYC Department of Finance (DOF) website for financial statements, and provides net operating income and rent stabilization units.

The data is split across three datasets, which are available in CSV format: (rentstab, rentstab_summary, rentstab_v2)

Context

The DOF publishes a variety of fiscal information for a given property, including tax bills every quarter and statements of accounts. Tax bills hold information like property tax due, assessed property value, and number of units in a building.

The information of interests here are the number or rent stabilized units as well as relevant financial information like net operating income.

Data dictionary

Note: uc used in many of the column names stands for "unit count" and the columns like uc2018 are the count of rent stabilized units in that property in that year. ucbbl is just the boro-block-lot and the uc is just a holdover naming convention from a historic file.

Click here for `rentstab` data dictionary
column_name data_type description
borough text 2-character borough name (MN, BX, BK, QN, SI)
ucbbl char(10) [Borough-Block-Lot](https://github.com/nycdb/nycdb/wiki/Glossary#borough-block-lot-bbl) (property ID)
uc2007 integer The rent stabilized unit count in 2007. This is based off of the rent stabilization surcharge dated "4/1/2007", which appears in tax bills starting 2008. The parser sums these counts when a single tax bill includes multiple buildings, but is careful not to double-count if previous years' surcharges reappear.
est2007 boolean Whether or not this is an estimated unit count. As registration is voluntary, it is common for a building to miss a year, or even several. See [Caveats](https://github.com/talos/nyc-stabilization-unit-counts#caveats) from the original Tax Bills GitHub for information about how estimates are derived.
dhcr2007 boolean Whether the building appeared on DHCR's list that year. Blank if DHCR did not publish a list for that year.
abat2007 text[] A list of all abatements and exemptions claimed on that year's tax bill. This includes 421a, J51, 420C (LIHTC), SCRIE, DRIE, and several others.
uc2008 integer
est2008 boolean
dhcr2008 boolean
abat2008 text[]
uc2009 integer
est2009 boolean
dhcr2009 boolean
abat2009 text[]
uc2010 integer
est2010 boolean
dhcr2010 boolean
abat2010 text[]
uc2011 integer
est2011 boolean
dhcr2011 boolean
abat2011 text[]
uc2012 integer
est2012 boolean
dhcr2012 boolean
abat2012 text[]
uc2013 integer
est2013 boolean
dhcr2013 boolean
abat2013 text[]
uc2014 integer
est2014 boolean
dhcr2014 boolean
abat2014 text[]
uc2015 integer
est2015 boolean
dhcr2015 boolean
abat2015 text[]
uc2016 integer
est2016 boolean
dhcr2016 boolean
abat2016 text[]
uc2017 integer
est2017 boolean
dhcr2017 boolean
abat2017 text[]
cd smallint The community district, from PLUTO. All remaining columns are from PLUTO.
ct2010 text Census tract in 2010 census.
cb2010 text Census block in 2010 census.
council integer The city council district.
zipcode char(5) The zip code.
address text An address for the lot, although it could have several.
ownername text The name of the lot's owner. Oftentimes just an LLC.
numbldgs smallint The number of buildings on the lot.
numfloors numeric The approximate number of floors on the lot's buildings.
unitsres integer An approximate number of residential units in the lot's buildings.
unitstotal integer An approximate number of residential & commercial units in the lot's buildings.
yearbuilt smallint An approximate year built, not particularly accurate. Especially poor quality in older buildings.
condono smallint The condo number, which links together different lots into a single condo development.
lon numeric The lot's centerpoint longitude.
lat numeric The lot's centerpoint latitude.

Note: all the descriptions have not yet been added for rentstab_summary, but you can find them in the original documentation.

Click here for `rentstab_summary` data dictionary
column_name data_type description
ucbbl char(10) [Borough-Block-Lot](https://github.com/nycdb/nycdb/wiki/Glossary#borough-block-lot-bbl) (property ID)
unitstotal integer
unitsstab2007 integer
unitsstab2017 integer
diff integer
percentchange numeric
j51 text
a421 text
scrie text
drie text
c420 text
cd smallint
ct2010 text
cb2010 text
council integer
zipcode char(5)
address text
ownername text
numbldgs smallint
numfloors numeric
unitsres integer
unitstotalpluto integer
yearbuilt smallint
condono smallint
lon numeric
lat numeric
Click here for `rentstab_v2` data dictionary
column_name data_type description
ucbbl char(10) [Borough-Block-Lot](https://github.com/nycdb/nycdb/wiki/Glossary#borough-block-lot-bbl) (property ID)
uc2018 integer Number of registered rent stabilized units found on tax bill PDF for this property
pdfsoa2018 text URL for an archived copy of the PDF tax bill used in the scraping process
uc2019 integer
pdfsoa2019 text
uc2020 integer
pdfsoa2020 text
uc2021 integer
pdfsoa2021 text
uc2022 integer
pdfsoa2022 text
uc2023 integer
pdfsoa2023 text

Tables

Each dataset contains 1 table, described below:

  • rentstab is a compilation of information from tax bills as well as relevant property information for rent stablized buildings, including BBLs and number of units.
  • rentstab_summary summarizes information of rentstab including total number of rent stablized units and the change of units between 2007 to 2017.
  • rentstab_v2 has the count of rent stabilized units from 2018-2021 with corresponding BBLs.

Example queries

Following are some useful SQL queries related to the dataset.

Obtaining change in count of units for specific building

The rentstab_summary table has statistical information of the change of the count of rent stabilized units. You can pull that number if you know the BBL of the building:

SELECT percentchange FROM rentstab_summary WHERE ucbbl = '2026080032';

Obtaining total residential units and total number of rent stabilized units in 2010 and 2020 for each borough

This query will join together building level data from three tables before aggregating the information at the borough level.

The subquery stabilized_units joins together the rentstab table and the rentstab_v2 table in order to get the number of rent stabilized units for each building in 2010 uc2010 and 2020 uc2020.

Then this rent data is joined to the pluto_latest dataset using bbl in order to get information on the total number of residential units unitsres in each building as well as the borough for each building.

The data is then grouped by borough to get total number of residential units sum(unitsres), and the total number of rent stabilized units in 2010 sum(uc2010) and 2020 sum(uc2020) for each borough.

SELECT borough, sum(unitsres) AS total_res_units, 
       sum(uc2010) AS stabilized_2010_tot, sum(uc2020) AS stabilized_2020_tot

     FROM public.pluto_latest AS p
     LEFT JOIN 
           (SELECT ucbbl, uc2010, uc2020
            FROM public.rentstab 
            OUTSIDE JOIN public.rentstab_v2 using(ucbbl)) AS stabilized_units 

     ON p.bbl = stabilized_units.ucbbl
     
     GROUP BY borough;

Known Issues and Data Pitfalls

The count of rent stabilized units are self-reported by the landlord when they register them with DHCR, and so it's not uncommon for no units to be recorded in some years then reappear later. Given this, it shouldn't be assumed that every time a building goes from a large number of units in one year to none the next that this is a true loss of all those rent stabilized units.

Clone this wiki locally