# Baltimore Tax Credit Project

## Requirements

`pandas` and `geopandas` are needed to load the data and modify it.

If you are working on a Windows machine, `geopandas` can be difficult to install. I recommend following [this guide](http://geoffboeing.com/2014/09/using-geopandas-windows/) by Geoff Boeing. The guide assumes you already have [Anaconda](http://continuum.io/downloads) installed, which is recommended anyway.
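
Before running the notebook, it can be useful to confirm that both packages are importable in the current environment. The snippet below is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name, not part of either library.

```python
import importlib.util

def missing_packages(names):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# An empty list means both requirements are available
print(missing_packages(["pandas", "geopandas"]))
```

If the list is not empty, install the named packages before continuing.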

## Overview

### Working with the data

### Visualization

Visualization will be done with **Mapbox** through **TileMill**.

## Imports and filenames

In [18]:
%matplotlib inline

import pandas as pd
import geopandas as gpd

pd.set_option('display.max_columns', None) # This ensures we can view all the columns
pd.set_option('display.max_rows', None) # Force display of all requested rows

data_dir = "./data/"
property_zip = "Real_Property.zip"
property_shp = "Real_Property.shp"
output_file = "owner_occ_property.geojson"

## Load Data

The data is currently stored in `./data/Real_Property.zip`.

### Unzip data

In [7]:
import zipfile

# Extract the shapefile and its companion files into the data directory
with zipfile.ZipFile(data_dir + property_zip) as zfile:
    zfile.extractall(data_dir)
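
A shapefile is a set of companion files (at minimum `.shp`, `.shx`, and `.dbf`) that must sit side by side, so it can help to confirm the archive contains all of them before extracting. The following is a sketch under that assumption; `missing_shapefile_parts` is a hypothetical helper, and the demo builds a throwaway in-memory archive rather than reading `Real_Property.zip`.

```python
import io
import zipfile

REQUIRED_EXTS = (".shp", ".shx", ".dbf")  # mandatory parts of a shapefile

def missing_shapefile_parts(zip_source, stem):
    """Return the required extensions for `stem` that are absent from the archive."""
    with zipfile.ZipFile(zip_source) as zf:
        # Compare case-insensitively on the file name only, ignoring folders
        names = {name.rsplit("/", 1)[-1].lower() for name in zf.namelist()}
    return [ext for ext in REQUIRED_EXTS if (stem + ext).lower() not in names]

# Demo on an in-memory archive (placeholder contents, not real parcel data)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Real_Property.shp", b"")
    zf.writestr("Real_Property.dbf", b"")

demo_missing = missing_shapefile_parts(buf, "Real_Property")
print(demo_missing)  # ['.shx']
```

With the real archive, the call would presumably be `missing_shapefile_parts(data_dir + property_zip, "Real_Property")`.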

### Load shapefile into memory

In [2]:
parcels = gpd.GeoDataFrame.from_file(data_dir + property_shp)
parcels.head()



Unnamed: 0,ARTAXBAS,AR_OWNER,ASSESGRP,ASSESSOR,BFCVIMPR,BFCVLAND,BLDGSQFT,BLDG_NO,BLOCK,BLOCKLOT,CCREDAMT,CITYCRED,CITY_TAX,CURRIMPR,CURRLAND,DEEDBOOK,DEEDPAGE,DHCDUSE1,DHCDUSE2,DHCDUSE3,DHCDUSE4,DISTSWCH,DIST_ID,DWELUNIT,EFF_UNIT,EXMPCODE,EXMPIMPR,EXMPLAND,EXMPTYPE,EXTD_ZIP,FRACTION,FULLADDR,FULLCASH,GRNDRENT,IMPREXMP,LANDEXMP,LDATE,LOT,LOT_SIZE,NEIGHBOR,NO_IMPRV,OBJECTID,OWNER_1,OWNER_2,OWNER_3,OWNER_ABBR,OWNMDE,PERMHOME,PIN,PROPDESC,RESPAGCY,ROOMUNIT,RPDELTAG,SALEDATE,SALEPRIC,SCREDAMT,SDATCODE,SPANFRAC,SPAN_NUM,SRVCCNTR,STATCRED,STATETAX,STDIRPRE,ST_NAME,ST_TYPE,SUBTYPE_GE,ShapeSTAre,ShapeSTLen,TAXBASE,UNIT_NUM,USEGROUP,WARD,YEAR_BUILD,ZIP_CODE,ZONECODE,created_da,created_us,geometry,last_edi_1,last_edite
0,70200,N,2,310,60000,30000,1920,1923,3941A,3941A012,0.0,0,1578.1,56200,14000,FMC12906,181,1123,0,0,0,,,2,0,0,0,0,,3106,,1923 E 32ND ST,70200,0,0,0,7022015,12,24-3X131,COLDSTREAM HOMESTEAD MONTEBELLO,,1132760,"DOMINION RENTALS, LLC",,,,F,N,3941A012,,,0,,9032010,26000,0,11130,,0,4,0,78.62,E,32ND,ST,1,3186.729126,310.665602,70200,,R,9,1924,21218,R-6,2015-07-06T17:31:31.000Z,EGISDATA,POLYGON ((-76.58816450001287 39.32685429146552...,2015-07-06T17:31:31.000Z,EGISDATA
1,10200,N,2,384,20000,5000,0,745,1628,1628 023,0.0,0,229.3,8200,2000,FMC13310,379,1123,6814,0,0,,,2,0,0,0,0,,2630,,745 N KENWOOD AVE,10200,84,0,0,7022015,23,15X70,MADISON-EASTEND,,1132642,"THOMAS, JOHN",,,,L,N,1628023,,,0,,2222011,10000,0,11440,,0,3,0,11.42,N,KENWOOD,AVE,1,996.647705,168.483739,10200,,R,7,1915,21205,R-8,2015-07-06T17:31:31.000Z,EGISDATA,POLYGON ((-76.57775070669615 39.29995842106153...,2015-07-06T17:31:31.000Z,EGISDATA
2,147700,H,1,320,99700,58300,1287,6003,5695E,5695E039,0.0,0,3320.3,87700,60000,FMC15496,164,1111,0,0,0,,,1,0,0,0,0,,2525,,6003 GLENFALLS AVE,147700,0,0,0,7022015,39,54X90-1,GLENHAM-BELHAR,,1132761,"SPENCER, CAROLYN M",,,,F,H,5695E039,,,0,,8062013,92000,0,11110,,0,4,0,165.42,,GLENFALLS,AVE,1,4856.770325,287.900128,147700,,R,27,1951,21206,R-5,2015-07-06T17:31:31.000Z,EGISDATA,POLYGON ((-76.54108672842762 39.34814590666414...,2015-07-06T17:31:31.000Z,EGISDATA
3,228933,N,3,310,120600,80000,2652,908,1880,1880 005,0.0,0,5146.41,185600,100000,FMC12281,224,1123,0,0,0,,,2,0,0,0,0,,4947,,908 S ELLWOOD AVE,285600,0,0,0,7022015,5,18X100,CANTON,,1132643,"KNOTT, BRANNAN H",,,,F,N,1880005,,,0,,12182009,258000,0,11230,,0,2,0,256.4,S,ELLWOOD,AVE,1,1791.297729,235.567251,228933,,R,1,1910,21224,R-8,2015-07-06T17:31:31.000Z,EGISDATA,POLYGON ((-76.57277386208826 39.28202638475474...,2015-07-06T17:31:31.000Z,EGISDATA
4,142900,H,1,305,95200,77000,1448,3004,4455,4455 038,876.72,0,3212.39,80900,62000,FMC05537,292,1111,0,0,0,,,1,0,0,0,0,,4012,,3004 ROCKWOOD AVE,142900,0,0,0,7022015,38,49X144,GLEN,,1132762,"WILLIAMS, CURTIS",,,,F,H,4455038,,,0,,6032004,87000,0,11110,,0,6,0,160.05,,ROCKWOOD,AVE,1,7034.895325,385.7908,142900,,R,27,1929,21215,R-4,2015-07-06T17:31:31.000Z,EGISDATA,POLYGON ((-76.68004942223392 39.35893934895754...,2015-07-06T17:31:31.000Z,EGISDATA


## Filter Data

To make the data easier to work with, I am going to create a filtered version of the dataset.

It seems like the columns are as follows (this is my best guess, verified when possible):

| Col Name | Desc |
|----------|------|
| **geometry** | Physical location of parcel boundaries (important) |
| | |
| **AR_OWNER** | Owner occupied? `H` = Yes, `N` = No |
| | |
| **BLOCK** | Block number |
| **LOT** | Lot number |
| **WARD** | Ward number |
| **BLOCKLOT** | Combined block and lot fields |
| | |
| **FULLADDR** | Full street address of property |
| **NEIGHBOR** | Neighborhood |
| | |
| **CITY_TAX** | City tax amount |
| **STATETAX** | State tax amount |
| **TAXBASE** | Total taxable value of property |
| | |
| **CCREDAMT** | City homestead tax credit amount |
| **SCREDAMT** | State homestead tax credit amount |
| **CITYCRED** | Not sure |
| **STATCRED** | State homeowner's tax credit amount |
| | |
| **OWNMDE** | Not sure; has values `F`, `L`, or `None` |
| **PERMHOME** | Owner occupied? `H` = Yes, `N` = No |
| **SALEDATE** | Date of most recent sale of property |
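
The sample rows suggest that `SALEDATE` is stored as digits in month-day-year order with the leading zero dropped (for example, `9032010`). That reading is my assumption, not something the data source confirms. Under that assumption, the values could be converted to real dates along these lines; `parse_saledate` is a hypothetical helper name.

```python
import pandas as pd

def parse_saledate(values):
    """Parse MMDDYYYY-style values (leading zero possibly dropped) into datetimes.

    Assumes the column holds integers or digit strings; anything that does not
    form a valid date becomes NaT.
    """
    text = pd.Series(values).astype(str).str.strip().str.zfill(8)
    return pd.to_datetime(text, format="%m%d%Y", errors="coerce")

# Values copied from the sample rows, plus 0 as an example of a missing date
sample = pd.Series([9032010, 12182009, 2222011, 0])
parsed = parse_saledate(sample)
print(parsed)
```

Invalid or placeholder values are coerced to `NaT` rather than raising, which makes them easy to count with `parsed.isna().sum()`.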


In [14]:
col_list = ["AR_OWNER", "BLOCK", "LOT", "WARD", "BLOCKLOT", 
            "FULLADDR", "NEIGHBOR", 
            "CITY_TAX", "STATETAX", "TAXBASE",
            "CCREDAMT", "SCREDAMT", "CITYCRED", "STATCRED", 
            "OWNMDE", "PERMHOME", "SALEDATE"]

# Select the geometry plus the columns of interest; copy so `parcels` stays untouched
filt_parcels = parcels[["geometry"] + col_list].copy()

filt_parcels.head(n=200)


Unnamed: 0,geometry,AR_OWNER,BLOCK,LOT,WARD,BLOCKLOT,CITY_TAX,STATETAX,CCREDAMT,SCREDAMT,CITYCRED,STATCRED,NEIGHBOR,FULLADDR,OWNMDE,PERMHOME,SALEDATE
0,POLYGON ((-76.58816450001287 39.32685429146552...,N,3941A,012,9,3941A012,1578.1,78.62,0.0,0.0,0,0.0,COLDSTREAM HOMESTEAD MONTEBELLO,1923 E 32ND ST,F,N,9032010
1,POLYGON ((-76.57775070669615 39.29995842106153...,N,1628,023,7,1628 023,229.3,11.42,0.0,0.0,0,0.0,MADISON-EASTEND,745 N KENWOOD AVE,L,N,2222011
2,POLYGON ((-76.54108672842762 39.34814590666414...,H,5695E,039,27,5695E039,3320.3,165.42,0.0,0.0,0,0.0,GLENHAM-BELHAR,6003 GLENFALLS AVE,F,H,8062013
3,POLYGON ((-76.57277386208826 39.28202638475474...,N,1880,005,1,1880 005,5146.41,256.4,0.0,0.0,0,0.0,CANTON,908 S ELLWOOD AVE,F,N,12182009
4,POLYGON ((-76.68004942223392 39.35893934895754...,H,4455,038,27,4455 038,3212.39,160.05,876.72,0.0,0,0.0,GLEN,3004 ROCKWOOD AVE,F,H,6032004
5,POLYGON ((-76.58703107337121 39.37025410730454...,H,5210F,019,27,5210F019,2675.12,133.28,0.0,0.0,0,0.0,IDLEWOOD,1336 GITTINGS AVE,L,H,7232008
6,POLYGON ((-76.60312792601935 39.32580693932493...,N,4096,029,9,4096 029,0.0,0.0,0.0,0.0,0,0.0,BETTER WAVERLY,943 GORSUCH AVE,F,N,8021999
7,POLYGON ((-76.59525309668561 39.31528933945137...,H,4165,030,8,4165 030,330.46,16.46,0.0,0.0,0,0.0,DARLEY PARK,1612 CLIFTVIEW AVE,F,H,6172011
8,POLYGON ((-76.61786605467449 39.31942610479524...,N,3637,053,12,3637 053,3166.69,157.77,0.0,0.0,0,0.0,CHARLES VILLAGE,4 W 26TH ST,L,N,7301991
9,"POLYGON ((-76.6340391486756 39.31211664855822,...",N,3427,039,13,3427 039,4142.32,206.38,0.0,0.0,0,0.0,RESERVOIR HILL,2064 LINDEN AVE,F,N,5211996


### Check Data

Here I'm going to verify that we can identify properties that are owner occupied.


In [16]:
owned_parcels = filt_parcels[filt_parcels["AR_OWNER"] == "H"]
owned_parcels.head(n = 100)

Unnamed: 0,geometry,AR_OWNER,BLOCK,LOT,WARD,BLOCKLOT,CITY_TAX,STATETAX,CCREDAMT,SCREDAMT,CITYCRED,STATCRED,NEIGHBOR,FULLADDR,OWNMDE,PERMHOME,SALEDATE
2,POLYGON ((-76.54108672842762 39.34814590666414...,H,5695E,039,27,5695E039,3320.3,165.42,0.0,0.0,0,0.0,GLENHAM-BELHAR,6003 GLENFALLS AVE,F,H,8062013
4,POLYGON ((-76.68004942223392 39.35893934895754...,H,4455,038,27,4455 038,3212.39,160.05,876.72,0.0,0,0.0,GLEN,3004 ROCKWOOD AVE,F,H,6032004
5,POLYGON ((-76.58703107337121 39.37025410730454...,H,5210F,019,27,5210F019,2675.12,133.28,0.0,0.0,0,0.0,IDLEWOOD,1336 GITTINGS AVE,L,H,7232008
7,POLYGON ((-76.59525309668561 39.31528933945137...,H,4165,030,8,4165 030,330.46,16.46,0.0,0.0,0,0.0,DARLEY PARK,1612 CLIFTVIEW AVE,F,H,6172011
10,POLYGON ((-76.56944852517457 39.35443981507301...,H,5370,032,27,5370 032,2852.71,142.13,232.26,0.0,0,0.0,HAMILTON HILLS,5311 GRINDON AVE,F,H,2282002
13,POLYGON ((-76.67174735387904 39.33170941740796...,H,3123C,024,15,3123C024,3025.81,150.75,656.3,0.0,0,0.0,ASHBURTON,3337 DOLFIELD AVE,,H,1011797
16,POLYGON ((-76.65752678090873 39.33004231122609...,H,3327F,079,15,3327F079,337.2,16.8,0.0,0.0,0,0.0,PARK CIRCLE,3617 COTTAGE AVE,L,H,1032011
17,POLYGON ((-76.66534261141808 39.34971001269672...,H,4747,073N,27,4747 073N,1787.16,89.04,0.0,0.0,0,0.0,CYLBURN,2727 W GARRISON AVE,F,H,6032010
21,POLYGON ((-76.70343057807651 39.36099886399613...,H,4218P,014,27,4218P014,2902.17,144.59,571.46,0.0,0,0.0,FALLSTAFF,3726 CLARINTH ROAD,F,H,5052000
22,POLYGON ((-76.69256738573311 39.27283998735819...,H,2530C,336,25,2530C336,2261.49,112.67,180.78,0.0,0,0.0,YALE HEIGHTS,627 S BEECHFIELD AVE,F,H,12222006
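
Because `AR_OWNER` and `PERMHOME` both appear to flag owner occupancy, a cross-tabulation is one way to see whether the two columns agree. The sketch below uses a small made-up frame in place of `filt_parcels`; the toy values are illustrative only and say nothing about the real data.

```python
import pandas as pd

# Made-up rows standing in for filt_parcels (illustrative values only)
toy = pd.DataFrame({
    "AR_OWNER": ["H", "N", "H", "N", "H"],
    "PERMHOME": ["H", "N", "H", "H", "H"],
})

# Count every combination of the two flags; off-diagonal cells are disagreements
agreement = pd.crosstab(toy["AR_OWNER"], toy["PERMHOME"])
print(agreement)

# Rows where the two flags differ, for manual inspection
mismatches = toy[toy["AR_OWNER"] != toy["PERMHOME"]]
print(len(mismatches))  # 1
```

If the real data shows no mismatches, either column could be used for the owner-occupied filter.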


## Export Data

Now that we have extracted the records we are interested in, along with all the relevant fields, we will export the data as a GeoJSON file.

In [19]:
# Serialize the owner-occupied parcels to GeoJSON and write them to disk
with open(data_dir + output_file, "w") as f:
    f.write(owned_parcels.to_json())
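
A quick sanity check after exporting is to read the file back and count its features. The sketch below relies only on the standard library; `count_features` is a hypothetical helper, and the demo writes a tiny hand-made FeatureCollection to a temporary file rather than reading `owner_occ_property.geojson`.

```python
import json
import os
import tempfile

def count_features(path):
    """Return the number of features in a GeoJSON FeatureCollection file."""
    with open(path) as f:
        collection = json.load(f)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    return len(collection["features"])

# Demo with a one-feature collection (made-up geometry and properties)
demo = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"AR_OWNER": "H"},
        "geometry": {"type": "Point", "coordinates": [-76.6, 39.3]},
    }],
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.geojson")
    with open(demo_path, "w") as f:
        json.dump(demo, f)
    n_features = count_features(demo_path)

print(n_features)  # 1
```

With the real export, `count_features(data_dir + output_file)` should equal `len(owned_parcels)`.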