Housing New York workflow

Overview

Goal

Match records from hpd_hny_units_by_building with the DOB records in developments, in order to merge on columns related to the number of affordable units.

Challenge

The relationship between hpd_hny_units_by_project records and developments records are not one-to-one
There are multiple ways to match records
- BIN & BBL
- BBL
- Spatially

General outline of steps

Geocode hny data
Find matches using the three different methods, excluding confidential hny projects and demolitions. All match methods also have the constraint that the total units in the hny record cannot be more than 5 units different than the units proposed for the developments record.
Identify cases where the matches are not one-to-one
Resolve ambiguous matches
- Using logic:
  - Matches with residential new buildings take precedence over matches with non-residential or A1
  - There is a hierarchy of match type: BIN & BBL, then BBL, then spatial
  - An associative join is performed where in a cluster of HNY and DevDB records every single HNY records will be matched with the other records in the cluster
  - Identify records both HNY and DevDB that are also matched with multiple counterpart records and create flags.

Two output tables for HNY and DevDB join

DevDB_hny_lookup
- Even after applying this hierarchy, not all matches are one-to-one, and so we need to resolve this.
  - In cases where one development record matches with many hny records, affordable units get aggregated
  - In cases where one hny record matches with many developments records, the units get assigned to the DOB job that has the lowest job number
  - Resolving many-to-one and one-to-many matches using the above two techniques also resolves many-to-many matches
HNY_devdb_lookup
- this is the table where HNY attributes can be then linked back to DevDB records via job number. It relies on the matches and relationship tables created in step 4 but resolve the relationship with somewhat different logics. Please note the descriptor below is always DevDB first and HNY second e.g. one-to-many means one DevDB records is identified as associated with many HNY records.
  - one-to-one: no need for resolution
  - one-to-many: the HNY ids will be joined together as an array and their units would be summed up and earliest dates will be set to the whole group e.g. project_start_date.
  - many-to-one: the DevDB record with the least job number will be picked to be associated with HNY record
  - many-to-many: first combine the HNY ids by grouping by DevDB record to create the HNY ids array. Then since the same array of HNY ids would be associated with multiple different DevDB record, pick the DevDB with the least job number to be associated with the entire group of HNY ids.

What is the roll of manual research?

Add matches that didn't exist
Remove matches

Corrections process:

We output matches we find automatically, prior to the automatic treatment of one-to-many and many-to-one cases.

job_number	hny_id	hny_project_id	all_counted_units	total_units	...
1234567	abcdef	a	5	10	...

We also automatically populate a table with all HNY records that did not match to a DOB record.

Corrections would involve filling in the action column, and adding rows if necessary.

job_number	hny_id	hny_project_id	action
1234567	abcdef	a	remove
1234567	ghijklm	g	add
7654321	ghijklm	g	remove
1234567	nopqrst	n	add

We apply these changes to the matches file, then use the logic above to

Sum up all hny records that match with a single developments record
Find the developments record with the smallest job_number to assign to a hny record

About
Workflow
- Spatial
- Occupancy
- Units
- Status
- HNY
- Corrections
- QAQC
Previous Releases
19Q4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly