Linking Violations to Building
There are multiple ways to reference the same building. For example, these are all for the same single building:
- 1401 - 1403 West Huron St.
- 1401 W Huron Street
- 652 N Noble
- 17-08-113-015-0000
- 17-08-113-014
- 41.894199, -87.662487
For this project to succeed, there must be a way to match these records to the same building. This page will be used to discuss the method this project will use to link records by the building involved.
There is a GitHub Milestone to track progress.
An initial analysis can be found here.
So far, the algorithm will take these steps to link records with a building:
- Step 1: Try to match by the address. For example, 1401 W Huron St matches 1401 - 1403 W Huron St. Because odd and even numbers are on opposite sides of the street, 1402 W Huron St would not match. If there is no match, go on to Step 2.
- Step 2: Try to match by point-in-polygon or distance-to-polygon.
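As a rough illustration of the address step, here is a minimal Python sketch. It is not the project's actual code: the names `parse_address` and `address_matches` and the abbreviation tables are hypothetical, and it assumes addresses look like "number [- number] street". A real data set would need a fuller list of abbreviations.

```python
import re

# Hypothetical abbreviation tables (assumptions); real data needs a fuller list.
DIRECTIONS = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD", "ROAD": "RD"}

# A house number, an optional "- number" range end, then the street name.
ADDRESS_RE = re.compile(r"^\s*(\d+)\s*(?:-\s*(\d+))?\s+(.+?)\s*$")


def parse_address(text):
    """Split an address into (low, high, normalized street name)."""
    match = ADDRESS_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized address: {text!r}")
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    # Upper-case, drop periods, and abbreviate direction and suffix words.
    words = match.group(3).upper().replace(".", "").split()
    words = [DIRECTIONS.get(w, SUFFIXES.get(w, w)) for w in words]
    return min(low, high), max(low, high), " ".join(words)


def address_matches(record_address, building_address):
    """True if the record's number falls in the building's range,
    on the same street and the same (odd/even) side of it."""
    # Only the first number of the record's address is compared.
    number, _, record_street = parse_address(record_address)
    low, high, street = parse_address(building_address)
    same_side = number % 2 == low % 2
    return record_street == street and low <= number <= high and same_side
```

With this sketch, `address_matches("1401 W Huron St", "1401 - 1403 W Huron St")` is true and `address_matches("1402 W Huron St", "1401 - 1403 W Huron St")` is false, because 1402 is on the even side of the street. An address such as 652 N Noble cannot match on this step and would fall through to the geospatial step.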
I tried Step 1 on a few census tracts that have a typical mix of residential, commercial, and industrial buildings. In my analysis, I linked building violation records to their corresponding building in the Building Shapefiles dataset. Here are the results:
- 2076 total records
- 1834 matched (88%)
- 242 not matched (12%)
Step 1 matched more records than I expected, and the matches should be correct since the address is essentially an exact match. To see what kind of records are left, here is a map before matching through Step 1:
And here is a map after Step 1 matching:
It looks like there are three types of situations remaining. These can be seen in the following map:
- Corner Buildings: Point/polygon geospatial analysis can be used to match these
- Vacant Lots or New Construction: Some violations seem to correspond to a lot without a building shapefile. These could be buildings that were demolished after the violation occurred, or they could be violations for new construction. Point/polygon geospatial analysis would link these to the wrong building
- Missing Ranges: Some buildings are missing ranges. For example, there is a large building with an address range of 1822 - 1850 W Chicago Ave. The building shapefile only has 1850 W Chicago Ave as its address, so a record at 1832 W Chicago Ave is not matched through Step 1. Point/polygon geospatial analysis can be used to match these
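The point/polygon analysis mentioned for corner buildings and missing ranges could look roughly like the following pure-Python sketch. It is only an illustration under stated assumptions: footprints are simple polygons given as lists of (x, y) vertices, coordinates are already in a planar system (not raw latitude/longitude), and the names `nearest_building` and `max_distance` are hypothetical. A GIS library would normally do this work.

```python
import math


def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(point, a, b):
    """Shortest distance from point to the segment a-b."""
    px, py = point
    ax, ay = a
    dx, dy = b[0] - ax, b[1] - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polygon(point, polygon):
    """0.0 if the point is inside, else the distance to the nearest edge."""
    if point_in_polygon(point, polygon):
        return 0.0
    n = len(polygon)
    return min(
        distance_to_segment(point, polygon[i], polygon[(i + 1) % n])
        for i in range(n)
    )


def nearest_building(point, footprints, max_distance):
    """Return the id of the closest footprint within max_distance, or None."""
    best_id, best_dist = None, max_distance
    for building_id, polygon in footprints.items():
        dist = distance_to_polygon(point, polygon)
        if dist <= best_dist:
            best_id, best_dist = building_id, dist
    return best_id
```

The `max_distance` cutoff keeps far-away points unmatched, but it does not solve the vacant lot case: a violation on an empty lot can still sit close to a neighboring building and be linked to it.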
Possible solution for vacant lots and new construction: There will still be a tax parcel shapefile even if the building shapefile is missing. We could use geospatial analysis to link these records to the tax parcel, which could then be easily matched to building shapefiles.
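A minimal sketch of that two-hop link (record to parcel, parcel to building) is shown below. Everything here is an assumption for illustration: `record_to_pin` stands for the output of a point-in-parcel step, `pin_to_buildings` for a parcel-to-building lookup, and `normalize_pin` assumes that the first ten digits of a PIN identify the parcel, so that a 14-digit and a 10-digit PIN can be compared.

```python
import re


def normalize_pin(pin):
    """Reduce a PIN such as '17-08-113-015-0000' to its first ten digits.

    Assumption: the first ten digits identify the parcel and any
    remaining four digits identify a unit within it.
    """
    digits = re.sub(r"\D", "", pin)
    if len(digits) not in (10, 14):
        raise ValueError(f"unexpected PIN length: {pin!r}")
    return digits[:10]


def link_via_parcel(record_id, record_to_pin, pin_to_buildings):
    """Link a record to buildings through its tax parcel.

    record_to_pin: hypothetical result of the geospatial step,
        mapping record id -> PIN of the parcel containing the record.
    pin_to_buildings: hypothetical lookup, normalized PIN -> building ids.
    Returns a (status, building_ids) pair.
    """
    pin = record_to_pin.get(record_id)
    if pin is None:
        return ("no_parcel", [])
    buildings = pin_to_buildings.get(normalize_pin(pin), [])
    if not buildings:
        # The parcel exists but has no footprint: likely a vacant lot,
        # a demolished building, or new construction.
        return ("parcel_without_building", [])
    return ("matched", list(buildings))
```

Returning an explicit `parcel_without_building` status keeps these records from being linked to a neighboring building, which is the failure mode described for vacant lots and new construction.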