### Tabular data exploration

- [Parking permits](https://data.somervillema.gov/City-Services/City-of-Somerville-Parking-Permits/xavb-4s9w) between January 1, 2017 and December 31, 2018 
- Registered vehicles - confidential file from Cortni

In [1]:
# import libraries
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
import geopandas
import warnings
warnings.filterwarnings("ignore")

#### 1. Parking Permits

Original file cleanup ->

In [2]:
# read in full dataset for parking permits 
parking_permits = pd.read_csv('../data/City_of_Somerville_Parking_Permits.csv')

In [3]:
parking_permits.head()

Unnamed: 0,type_code,type_name,issued,effective,expiration,st_addr,unit_num,city,state,zip_code
0,WD,Moving Van,02/23/2017 12:00:00 AM,03/01/2017 12:00:00 AM,03/01/2017 12:00:00 AM,69 ADAMS ST,1.0,SOMERVILLE,MA,2145.0
1,G,Visitor,05/22/2017 12:00:00 AM,04/01/2017 12:00:00 AM,04/30/2018 12:00:00 AM,37 SEWALL ST,,SOMERVILLE,MA,2145.0
2,G,Visitor,05/22/2017 12:00:00 AM,04/01/2017 12:00:00 AM,04/30/2018 12:00:00 AM,37 SEWALL ST,,SOMERVILLE,MA,2145.0
3,G,Visitor,07/07/2017 12:00:00 AM,07/06/2017 12:00:00 AM,06/30/2018 12:00:00 AM,25 BEACON ST,5.0,SOMERVILLE,MA,2143.0
4,G,Visitor,07/07/2017 12:00:00 AM,07/06/2017 12:00:00 AM,06/30/2018 12:00:00 AM,25 BEACON ST,5.0,SOMERVILLE,MA,2143.0


In [4]:
# strip leading/trailing whitespace
parking_permits['type_name'] = parking_permits['type_name'].str.strip()
# strip and also collapse runs of internal whitespace; .str methods pass NaN through unchanged
parking_permits['st_addr'] = parking_permits['st_addr'].str.split().str.join(' ')

# convert issued date to datetime
parking_permits['issued'] = pd.to_datetime(parking_permits['issued'])
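
A possible refinement of the date conversion, sketched on toy values rather than the real file: passing an explicit `format` avoids per-value format guessing, and `errors='coerce'` turns unparseable entries into `NaT` instead of raising. The format string is an assumption based on the `issued` values shown in the table above, and the `'n/a'` entry is made up for illustration.

```python
import pandas as pd

# Toy values in the same layout as the 'issued' column; 'n/a' is a made-up bad entry.
issued_raw = pd.Series(['02/23/2017 12:00:00 AM', '07/10/2018 12:00:00 AM', 'n/a'])

# Explicit format (assumed from the sample rows); bad values become NaT instead of raising.
issued = pd.to_datetime(issued_raw, format='%m/%d/%Y %I:%M:%S %p', errors='coerce')

# Count how many values failed to parse before filtering on the year.
num_unparsed = int(issued.isna().sum())
print('unparsed values:', num_unparsed)
```

Checking the `NaT` count first makes it visible if a later year filter would silently drop rows with malformed dates.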

The dataset contains several permit types, but we only care about residential permits. Of the types that mention residents:

- 'Residential' -- keep
- 'Resident - No charge replacement' -- these are replacements for permits that have been lost, so ignore them
- 'New Mass Resident' -- this is a temporary permit valid for only 1 month, so drop it

In [6]:
# parking_permits.type_name.unique()

In [6]:
print('Size of parking permit data \t\t\t {}\nSize of RESIDENTIAL parking permit data \t {}'.format(
    parking_permits.shape, 
    parking_permits[parking_permits.type_name.isin(
        ['Residential']
    )].shape))

Size of parking permit data 			 (172689, 10)
Size of RESIDENTIAL parking permit data 	 (66789, 10)


We only care about permits issued in 2018 - so filtering here:


In [7]:
# keep only permits whose issue date falls in 2018
parking_permits = parking_permits[parking_permits['issued'].dt.year == 2018]

Now make df for Residential permits only ->

In [8]:
# subset data to only residential parking permits
residential_permits = parking_permits[parking_permits.type_name.isin(
    ['Residential', ])]

In [9]:
residential_permits.head()

Unnamed: 0,type_code,type_name,issued,effective,expiration,st_addr,unit_num,city,state,zip_code
87800,A,Residential,2018-07-10,07/10/2018 12:00:00 AM,02/28/2019 12:00:00 AM,26 BOSTON AV,,SOMERVILLE,MA,2144.0
87806,A,Residential,2018-07-18,07/18/2018 12:00:00 AM,05/31/2019 12:00:00 AM,24 WALNUT ST,1.0,SOMERVILLE,MA,2143.0
87812,A,Residential,2018-08-02,08/02/2018 12:00:00 AM,02/28/2019 12:00:00 AM,16 LESLEY AV,,SOMERVILLE,MA,2144.0
87815,A,Residential,2018-02-15,02/28/2018 12:00:00 AM,02/28/2019 12:00:00 AM,19 TRULL ST,2.0,SOMERVILLE,MA,2145.0
87816,A,Residential,2018-06-07,06/07/2018 12:00:00 AM,07/31/2019 12:00:00 AM,34 ILLINOIS AV,1.0,SOMERVILLE,MA,2145.0


Now aggregate by st address ->

In [10]:
# count permits (rows) per street address
res_permits_by_st_addr = residential_permits.groupby('st_addr').size().reset_index(
    name='residential_permits_issued')
print('number of unique street addresses: {}'.format(res_permits_by_st_addr.shape[0]))

number of unique street addresses: 12617


In [11]:
res_permits_by_st_addr.head(10)

Unnamed: 0,st_addr,residential_permits_issued
0,1 ALDERSEY ST,1
1,1 AVON ST,3
2,1 BEACON ST,1
3,1 BELMONT SQ,3
4,1 BENTON RD,3
5,1 BRADLEY ST,5
6,1 CAPEN CAP,2
7,1 CAPEN ST,14
8,1 CARVER ST,4
9,1 CEDAR ST,2


**Noisy label option 1**  
Number of residential permits issued by street address.

Issues:
- Clear inconsistencies: for example, 1 Aldersey St shows only 1 permit issued, yet it has 3 garage doors according to [Google Street View](https://www.google.com/maps/place/1+Aldersey+St,+Somerville,+MA+02143/@42.382985,-71.0960374,3a,75y,21.92h,88.7t/data=!3m6!1e1!3m4!1suVgqBBiLUdBI5VRy9pYyYA!2e0!7i16384!8i8192!4m5!3m4!1s0x89e370cca2b22e2d:0x5dbed58b8d9c69f9!8m2!3d42.3830618!4d-71.0958082)
- Data is available for only 12617 addresses; this coverage needs to be cross-checked

> Can we get info on number of units or whether the house is designated as single or multi-family from Somerville?
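
One way the cross-check could be sketched is an outer merge with `indicator=True` against a second per-address table, which shows which addresses appear in only one source. The frames below are toy data: the second table and its `num_registered_vehicles` column are hypothetical stand-ins, and the merge assumes both sources spell addresses identically, which is unlikely to hold without further cleaning.

```python
import pandas as pd

# Toy per-address permit counts (illustrative rows only).
permits = pd.DataFrame({
    'st_addr': ['1 ALDERSEY ST', '1 AVON ST', '1 CAPEN ST'],
    'residential_permits_issued': [1, 3, 14],
})

# Hypothetical second source keyed by the same address strings.
other = pd.DataFrame({
    'st_addr': ['1 ALDERSEY ST', '1 AVON ST', '1 BANKS ST'],
    'num_registered_vehicles': [2, 3, 2],
})

# The '_merge' column records whether each address was found in both tables or only one.
check = permits.merge(other, on='st_addr', how='outer', indicator=True)
coverage = check['_merge'].value_counts()
print(coverage)

# Addresses present in both sources where the two counts disagree.
both = check[check['_merge'] == 'both']
mismatched = both[both['residential_permits_issued'] != both['num_registered_vehicles']]
print(mismatched[['st_addr', 'residential_permits_issued', 'num_registered_vehicles']])
```

The `left_only` and `right_only` counts give a rough sense of coverage gaps, while `mismatched` lists candidates for manual inspection.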

#### 2. Registered vehicles
From Cortni: 

> The spreadsheet contains one row per garaged vehicle in the City. Each unique license plate has an anonymized ID (e.g. COS_1). With vehicle registrations, you'll want to make sure you don't double count cars that share a license plate (e.g. Tom had car A for first half of the year, then traded it in for Car B and moved license plate to the new vehicle). In other words, organize data by # of unique license plates per property. Also, keep in mind the caveat we discussed that the addresses on this list are billing addresses, not the garaging address. So you will see some non-Somerville addresses or a car dealership that leases vehicles (excise is billed to dealer who charges lessee). There is also a PDF attached with a key for plate types. 
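
A minimal sketch of the "unique license plates per property" idea from the note above, on made-up rows (the addresses and the repeated `COS_1` plate are hypothetical). It uses `nunique` so that a plate moved from one car to another is counted once.

```python
import pandas as pd

# Hypothetical rows: plate COS_1 appears twice at the same address (car A, then car B).
toy = pd.DataFrame({
    'Address': ['10 MAIN ST', '10 MAIN ST', '10 MAIN ST', '22 ELM ST'],
    'PlateID': ['COS_1', 'COS_1', 'COS_2', 'COS_3'],
})

# Count distinct plates per address rather than rows per address.
plates_per_address = (
    toy.groupby('Address')['PlateID']
       .nunique()
       .reset_index(name='num_unique_plates')
)
print(plates_per_address)
```

The cells below reach the same kind of count in two steps (group by plate, then by address), which also reports how many plates were duplicated.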

In [24]:
registered_vehicles = pd.read_excel('../data/COPY_Registered_Vehicles_16_17.xlsx', sheet_name='raw')
print('full dataset size: ',registered_vehicles.shape)
# remove 2016 registrations - car registrations are valid for two years in MA
registered_vehicles = registered_vehicles[registered_vehicles.Year == 2017]
print('2017 dataset size: ',registered_vehicles.shape)

full dataset size:  (102132, 9)
2017 dataset size:  (51583, 9)


In [13]:
# strip leading/trailing whitespace
registered_vehicles['Plate.Type'] = registered_vehicles['Plate.Type'].str.strip()
# strip and also collapse runs of internal whitespace; .str methods pass NaN through unchanged
registered_vehicles['Address'] = registered_vehicles['Address'].str.split().str.join(' ')

In [14]:
# restrict to plate type categories
categories = ['PAN', 'PAR', 'PAS', 'PAV', 'PAY']
registered_vehicles = registered_vehicles[registered_vehicles['Plate.Type'].isin(categories)]
print('dataset size after restricting to passenger and student vehicles: ', registered_vehicles.shape)
# remove PO box registrations - can't match that up to an address
registered_vehicles = registered_vehicles[~registered_vehicles['Address'].str.contains("PO BOX")]
print('dataset size after removing PO boxes: ', registered_vehicles.shape)
# abbreviate street suffixes (STREET -> ST, ROAD -> RD, ...) to match forms such as 'AV' used in the permit data
registered_vehicles['Address'] = registered_vehicles['Address'].str.replace(' STREET',' ST')
registered_vehicles['Address'] = registered_vehicles['Address'].str.replace(' ROAD',' RD')
registered_vehicles['Address'] = registered_vehicles['Address'].str.replace(' LANE',' LN')
registered_vehicles['Address'] = registered_vehicles['Address'].str.replace(' DRIVE',' DR')
registered_vehicles['Address'] = registered_vehicles['Address'].str.replace(' AVE',' AV')
# not a typo: the ' AVE' -> ' AV' replacement above has already turned ' AVENUE' into ' AVNUE'
registered_vehicles['Address'] = registered_vehicles['Address'].str.replace(' AVNUE',' AV')
registered_vehicles['Address'] = registered_vehicles['Address'].str.replace(' BLVD',' BLV')

dataset size after restricting to passenger and student vehicles:  (46789, 9)
dataset size after removing PO boxes:  (45775, 9)


In [15]:
# ACCOUNT FOR DUPLICATE CARS: 
# group by address, city, unit, plate ID (in case of cars that share a license plate)
count_of_duplicates = registered_vehicles.groupby(['Address', 'City', 'Unit', 'PlateID']).aggregate(
    {'EV':len}).reset_index()
count_of_duplicates.rename(columns={'EV':'duplicate_count'}, inplace=True)
print('accounted for {} duplicates'.format(count_of_duplicates[count_of_duplicates.duplicate_count >1].shape[0]))

accounted for 29 duplicates


In [16]:
# now get registered vehicles by address
registered_vehicles_by_addr = count_of_duplicates.groupby(
    ['Address', 'City']).aggregate({'PlateID': len}).reset_index()
registered_vehicles_by_addr.rename(columns={'PlateID':'num_registered_vehicles'}, inplace=True)
print('There are {} unique addresses after preliminary data cleaning.\
 (more cleaning to come)'.format(registered_vehicles_by_addr.shape[0]))

There are 20281 unique addresses after preliminary data cleaning. (more cleaning to come)


>> need to clean this more if we decide 

In [20]:
registered_vehicles_by_addr.City.unique()

array(['               ', 'SOMERVILLE     ', 'W BOYLSTON     ',
       'E SOMERVILLE   ', 'METHUEN        ', 'BRIGHTON       ',
       'BOSTON         ', 'W SOMERVILLE   ', 'JAMESBURG      ',
       'NATICK         ', 'QUINCY         ', 'ARLINGTON      ',
       'LEXINGTON      ', 'MALDEN         ', 'WOBURN         ',
       'ORLEANS        ', 'MEDFIELD       ', 'FRANKLIN       ',
       'LYNNFIELD      ', 'SALEM          ', 'RANDOLPH       ',
       'ROCKPORT       ', 'TOWNSEND       ', 'WESTFORD       ',
       'CHARLESTOWN    ', 'CHELSEA        ', 'BILLERICA      ',
       'MAYNARD        ', 'WESTON         ', 'HUDSON         ',
       'MATTAPOISETT   ', 'DORCHESTER CENT', 'W ROXBURY      ',
       'MELROSE        ', 'ATKINSON       ', 'PITTSFORD      ',
       'NEWTON         ', 'CHELMSFORD     ', 'MEDFORD        ',
       'LYNN           ', 'SAN  FRANCISCO ', 'CAMBRIDGE      ',
       'REVERE         ', 'GEORGETOWN     ', 'DORCHESTER     ',
       'MARSTONS MILLS ', 'GOSHEN       

In [22]:
cities_to_keep = ['SOMERVILLE     ', 
                  'E SOMERVILLE   ',
                  'W SOMERVILLE   '
                 ]
registered_vehicles_by_addr[registered_vehicles_by_addr.City.isin(cities_to_keep)].head(10)


Unnamed: 0,Address,City,num_registered_vehicles
1,08 GEORGE ST,SOMERVILLE,1
2,1 ALDERSEY ST,SOMERVILLE,2
4,1 AVON ST,SOMERVILLE,3
5,1 BANKS ST,SOMERVILLE,2
6,1 BEACON ST,E SOMERVILLE,1
7,1 BEACON ST,SOMERVILLE,1
8,1 BELMONT SQ UNIT 1,SOMERVILLE,1
9,1 BELMONT SQUARE,SOMERVILLE,2
10,1 BENTON RD,SOMERVILLE,4
11,1 BENTON RD 2,SOMERVILLE,1
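
The table above still splits one property across several rows (`1 BEACON ST` under both `E SOMERVILLE` and `SOMERVILLE`, and `1 BELMONT SQ UNIT 1` next to `1 BELMONT SQUARE`). A rough sketch of the extra cleaning, on toy rows copied from that table: the helper names are hypothetical, the rules are illustrative rather than complete, and folding E/W SOMERVILLE into SOMERVILLE is an assumption.

```python
import re
import pandas as pd

# Toy rows mirroring variants visible in the table above (not the real file).
toy = pd.DataFrame({
    'Address': ['1 BEACON ST', '1 BEACON ST', '1 BELMONT SQ UNIT 1', '1 BELMONT SQUARE'],
    'City': ['E SOMERVILLE   ', 'SOMERVILLE     ', 'SOMERVILLE     ', 'SOMERVILLE     '],
    'num_registered_vehicles': [1, 1, 1, 2],
})

def normalize_city(city):
    """Strip padding and fold E/W SOMERVILLE into SOMERVILLE (assumed equivalent)."""
    city = city.strip()
    return 'SOMERVILLE' if city in {'E SOMERVILLE', 'W SOMERVILLE'} else city

def normalize_address(addr):
    """Drop a trailing 'UNIT x' or '#x' and abbreviate a final SQUARE (illustrative rules)."""
    addr = re.sub(r'\s+(UNIT\s+|#)\w+$', '', addr)
    return re.sub(r'\bSQUARE$', 'SQ', addr)

toy['City'] = toy['City'].map(normalize_city)
toy['Address'] = toy['Address'].map(normalize_address)

# Re-aggregate so that variants of the same property are summed together.
merged = toy.groupby(['Address', 'City'], as_index=False)['num_registered_vehicles'].sum()
print(merged)
```

Bare trailing unit numbers such as `1 BENTON RD 2` are deliberately left alone here, since a trailing number cannot be told apart from part of a street name without more context.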


Here are some addresses with a large number of registered vehicles. Most of the Somerville ones correspond to large apartment buildings, while some out-of-state ones correspond to insurance offices. Note that 29 cars have no address listed.

In [20]:
# some interesting addresses, where the number of registered vehicles is quite large
# most Somerville ones are apartment buildings, but others include a car insurance office
registered_vehicles_by_addr[registered_vehicles_by_addr.num_registered_vehicles > 20].head()

Unnamed: 0,Address,City,num_registered_vehicles
0,,,29
52,1 FITCHBURG ST,SOMERVILLE,89
1032,109 HIGHLAND AV,SOMERVILLE,32
1716,1165 SANCTUARY PKWY,ALPHARETTA,21
1773,1188 BROADWAY,SOMERVILLE,23


#### 3. Parcel FY19 text data


In [26]:
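# error_bad_lines was deprecated in pandas 1.3 and removed in 2.0;
# on_bad_lines='warn' (pandas >= 1.3) is the replacement: malformed rows are still skipped and reported
parcel_data = pd.read_csv('../data/Parcels_FY19/VisionExtract_FY19.txt', on_bad_lines='warn')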

b'Skipping line 606: expected 49 fields, saw 50\nSkipping line 778: expected 49 fields, saw 50\nSkipping line 1489: expected 49 fields, saw 50\nSkipping line 2018: expected 49 fields, saw 50\nSkipping line 2171: expected 49 fields, saw 50\nSkipping line 2662: expected 49 fields, saw 50\nSkipping line 3037: expected 49 fields, saw 51\nSkipping line 3162: expected 49 fields, saw 50\nSkipping line 3284: expected 49 fields, saw 50\nSkipping line 3343: expected 49 fields, saw 50\nSkipping line 3401: expected 49 fields, saw 50\nSkipping line 3500: expected 49 fields, saw 50\nSkipping line 3875: expected 49 fields, saw 50\nSkipping line 4010: expected 49 fields, saw 50\nSkipping line 4161: expected 49 fields, saw 50\nSkipping line 4217: expected 49 fields, saw 50\nSkipping line 4230: expected 49 fields, saw 50\nSkipping line 4239: expected 49 fields, saw 50\nSkipping line 4477: expected 49 fields, saw 50\nSkipping line 5797: expected 49 fields, saw 50\nSkipping line 5962: expected 49 fields, 
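
The messages above mean those rows are missing from `parcel_data`. Before deciding whether that matters, the offending lines could be inspected with the standard `csv` module. This is a sketch on an in-memory sample: `field_count_report` is a hypothetical helper, and the guess that stray unquoted commas cause the extra fields is an assumption, not something verified against the file.

```python
import csv
import io

def field_count_report(handle, delimiter=','):
    """Return the header's field count and (line number, field count) for each row that differs."""
    reader = csv.reader(handle, delimiter=delimiter)
    expected = len(next(reader))
    bad = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the physical line number of the row just read
            bad.append((reader.line_num, len(row)))
    return expected, bad

# Made-up sample: the third line has an unquoted comma inside the address field.
sample = io.StringIO(
    'ID,SITE_ADDR,STYLE\n'
    '1,67 BROADWAY,Office/Apts\n'
    '2,9 PENNSYLVANIA AVE, REAR,2-Decker\n'
)
expected, bad = field_count_report(sample)
print(expected, bad)  # 3 [(3, 4)]
```

On the real file, the same function could be called on `open(path, newline='')` to list which parcels were dropped and whether any of them are residential.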

In [27]:
parcel_data.head()

Unnamed: 0,ID,PROP_ID,BLDG_VAL,LAND_VAL,OTHER_VAL,TOTAL_VAL,FY,LOT_SIZE,LS_DATE,LS_PRICE,USE_CODE,SITE_ADDR,ADDR_NUM,FULL_STR,LOCATION,SITE_CITY,SITE_ZIP,OWNER1,OWNER2,OWN_ADDR1,OWN_ADDR2,OWN_CITY,OWN_STATE,OWN_ZIP,OWN_CO,LS_BOOK,LS_PAGE,REG_ID,ZONE,YEAR_BUILT,BLD_AREA,UNITS,RES_AREA,STYLE,STORIES,NUM_ROOMS,LOT_UNITS,CAMA_ID,LOC_ID,MAP,MAP_CUT,BLOCK,BLOCK_CUT,LOT,LOT_CUT,UNIT,UNIT_CUT,MBL,AV PID
0,1,102||D||18|||,493400.0,570400.0,38400.0,1102200.0,2020,0.27,4/30/2009 0:00:00,0.0,340,67 BROADWAY,67,BROADWAY,,SOMERVILLE,,DIGIROLAMO RICHARD G & RALPH TRSTEES,67 BROADWAY REALTY TRUST,P O BOX 281,,SOMERVILLE,MA,2143,,52679,444,,CCD45,1900.0,6842.0,0,4073.0,Office/Apts,2.3,0,,1676,102-D-18,102,,D,,18,,,,102-D-18,1676
1,2,102||D||19|||,327600.0,401000.0,0.0,728600.0,2020,0.1,10/20/2009 0:00:00,405000.0,1040,9 PENNSYLVANIA AVE,9,PENNSYLVANIA AVE,,SOMERVILLE,,LI BRIAN XIONG,,59 MT VERNON STREET,,SOMERVILLE,MA,2145,USA,53702,137,,RA,1900.0,4740.0,0,3002.0,2-Decker,2.8,13,,13963,102-D-19,102,,D,,19,,,,102-D-19,13963
2,3,102||D||1|||,647900.0,281300.0,0.0,929200.0,2020,0.06,9/9/2005 0:00:00,386000.0,1050,11 MAINE AVE,11,MAINE AVE,,SOMERVILLE,,MCGLASHING PAUL,,11 MAINE AVE,,SOMERVILLE,MA,2145,USA,46058,47,,RB,1900.0,4628.0,0,3120.0,3-Decker,3.0,15,,13948,102-D-1,102,,D,,1,,,,102-D-1,13948
3,4,102||D||20|||,459600.0,403800.0,0.0,863400.0,2020,0.1,11/4/1997 0:00:00,198000.0,1050,13 PENNSYLVANIA AVE,13,PENNSYLVANIA AVE,,SOMERVILLE,,DEOLIVEIRA VOLNEIURILS,DEOLIVEIRA VANDERLAN U,13 PENNSYLVANIA AVE,,SOMERVILLE,MA,2145,USA,27841,200,,RA,1900.0,4769.0,0,3206.0,3 fam Conv,2.8,14,,13964,102-D-20,102,,D,,20,,,,102-D-20,13964
4,5,102||D||21|||,485000.0,406000.0,0.0,891000.0,2020,0.1,9/18/1989 0:00:00,1.0,1050,17 PENNSYLVANIA AVE,17,PENNSYLVANIA AVE,,SOMERVILLE,,MIGLIORE VINCENT & CAROL,,17 PENNSYLVANIA AVE,,SOMERVILLE,MA,2145,USA,20080,57,,RA,1900.0,5389.0,0,3142.0,3 fam Conv,2.8,14,,13965,102-D-21,102,,D,,21,,,,102-D-21,13965


In [39]:
selected_cols = ['ID',
#  'PROP_ID',
#  'BLDG_VAL',
#  'LAND_VAL',
#  'OTHER_VAL',
#  'TOTAL_VAL',
# #  'FY',
#  'LOT_SIZE',
#  'LS_DATE',
#  'LS_PRICE',
#  'USE_CODE',
 'SITE_ADDR',
 'ADDR_NUM',
 'FULL_STR',
#  'LOCATION',
 'SITE_CITY',
 'SITE_ZIP',
#  'OWNER1',
#  'OWNER2',
#  'OWN_ADDR1',
#  'OWN_ADDR2',
#  'OWN_CITY',
#  'OWN_STATE',
#  'OWN_ZIP',
#  'OWN_CO',
#  'LS_BOOK',
#  'LS_PAGE',
#  'REG_ID',
#  'ZONE',
 'YEAR_BUILT',
 'BLD_AREA',
 'UNITS',
 'RES_AREA',
 'STYLE',
 'STORIES',
 'NUM_ROOMS',
 'LOT_UNITS',
#  'CAMA_ID',
#  'LOC_ID',
#  'MAP',
#  'MAP_CUT',
#  'BLOCK',
#  'BLOCK_CUT',
#  'LOT',
#  'LOT_CUT',
#  'UNIT',
#  'UNIT_CUT',
#  'MBL',
#  'AV PID'
                ]
parcel_data[selected_cols].head(10)

Unnamed: 0,ID,SITE_ADDR,ADDR_NUM,FULL_STR,SITE_CITY,SITE_ZIP,YEAR_BUILT,BLD_AREA,UNITS,RES_AREA,STYLE,STORIES,NUM_ROOMS,LOT_UNITS
0,1,67 BROADWAY,67,BROADWAY,SOMERVILLE,,1900.0,6842.0,0,4073.0,Office/Apts,2.3,0,
1,2,9 PENNSYLVANIA AVE,9,PENNSYLVANIA AVE,SOMERVILLE,,1900.0,4740.0,0,3002.0,2-Decker,2.8,13,
2,3,11 MAINE AVE,11,MAINE AVE,SOMERVILLE,,1900.0,4628.0,0,3120.0,3-Decker,3.0,15,
3,4,13 PENNSYLVANIA AVE,13,PENNSYLVANIA AVE,SOMERVILLE,,1900.0,4769.0,0,3206.0,3 fam Conv,2.8,14,
4,5,17 PENNSYLVANIA AVE,17,PENNSYLVANIA AVE,SOMERVILLE,,1900.0,5389.0,0,3142.0,3 fam Conv,2.8,14,
5,6,21 PENNSYLVANIA AVE,21,PENNSYLVANIA AVE,SOMERVILLE,,1900.0,5000.0,0,3151.0,Two Family,2.8,11,
6,7,25 PENNSYLVANIA AVE,25,PENNSYLVANIA AVE,SOMERVILLE,,1915.0,4991.0,0,3288.0,Two Family,2.8,15,
7,8,29 PENNSYLVANIA AVE,29,PENNSYLVANIA AVE,SOMERVILLE,,1900.0,4900.0,0,3204.0,3 fam Conv,2.8,15,
8,9,33 PENNSYLVANIA AVE,33,PENNSYLVANIA AVE,SOMERVILLE,,1900.0,4564.0,0,2912.0,Two Family,2.5,11,
9,10,37 PENNSYLVANIA AVE,37,PENNSYLVANIA AVE,SOMERVILLE,,1900.0,4600.0,0,2895.0,Two Family,2.8,12,


In [63]:
# parcel_data[parcel_data.STYLE =='Outbuildings']


In [47]:
keep = [
#     'Office/Apts',
 '2-Decker',
 '3-Decker',
 '3 fam Conv',
 'Two Family',
#  'Vacant Land',
 'Mansard',
#  'Store',
#  'School/College',
 'Two decker',
 'Condominium',
 'Conventional',
 'Family Duplex',
 'Mansard-Apts',
 '2 Fam Conv',
 'Stores/Apt Com',
 'Family Duplex-Apts',
#  'Outbuildings',
 'Mid rise',
 'Two Family-Apts',
#  'Restaurant',
#  'Warehouse',
 'Row Mid',
#  'Office Bldg',
#  'Service Shop',
#  'Research/Devel',
 '3-Decker-Apts',
 'Row End-Apts',
 'Garage/Office',
 'Row End',
 'Row Mid-Apts',
 'Duplex',
 'Fam Conv',
 'Apartments',
 'Victorian',
 'Cottage Bungalow',
 'Conventional-Apts',
 'Double 3D',
 'Three decker',
 'Townhouse end',
 'Townhouse middle',
#  'Retail/Offices',
 'High End Constr',
 '2-Decker-Apts',
 'Convert Warehs/Loft',
#  'Nightclub/Bar',
#  'Clubs/Lodges',
#  'Car Wash',
 'Office/Warehs',
#  'Profess. Bldg',
#  'Hotel',
#  'Truck Terminal',
#  'Pre-Eng Warehs',
#  'Colleges',
 'Dormitory',
#  'Churches',
#  'Telephone Bldg',
 'Indust Condo',
#  'Condo Office',
#  'Supermarkets',
#  'Coin-op CarWsh',
 'Retail Condo',
#  'Fire Station',
#  'Finan Inst.',
#  'Library',
#  'Funeral Home',
 'Low rise',
#  'Other Municip',
#  'Stores/Office',
#  'Bakery',
#  'Dry Cln/Laundr',
#  'Serv Sta 2-bay',
#  'Converted School',
#  'Other State',
#  'Branch Bank',
#  'Theaters Encl.',
 'Mid Rise Apartments',
#  'Light Indust',
 'Cottage',
 'Row Middle',
#  'Serv Sta 3-Bay',
#  'Auto Sales Rpr',
 'Townhouse',
#  'Home for Aged',
#  'Hospitals-Priv',
#  'Commercial Bld',
#  'Skating Arena',
#  'Day Care',
#  'Child Care',
#  'Health Club/Gym',
#  'Supermarket',
#  'City/Town Hall',
#  'Other Federal',
#  'Fast Food Rest',
 'Victorian-Apts',
#  'Converted Municipal',
#  'Comm Warehouse',
#  'Commercial',
#  'Department Str',
 'High Rise Apt',
#  'Shop Center RE',
#  'Food Process',
#  'Hospital',
#  'Schools-Public',
#  'Pkg Garage'
]

In [48]:
res_types = parcel_data[parcel_data.STYLE.isin(keep)]
res_types.shape

(18207, 49)

In [49]:
parcel_data.shape

(19448, 49)

In [60]:
parcel_data[parcel_data.ID == 10026]

Unnamed: 0,ID,PROP_ID,BLDG_VAL,LAND_VAL,OTHER_VAL,TOTAL_VAL,FY,LOT_SIZE,LS_DATE,LS_PRICE,USE_CODE,SITE_ADDR,ADDR_NUM,FULL_STR,LOCATION,SITE_CITY,SITE_ZIP,OWNER1,OWNER2,OWN_ADDR1,OWN_ADDR2,OWN_CITY,OWN_STATE,OWN_ZIP,OWN_CO,LS_BOOK,LS_PAGE,REG_ID,ZONE,YEAR_BUILT,BLD_AREA,UNITS,RES_AREA,STYLE,STORIES,NUM_ROOMS,LOT_UNITS,CAMA_ID,LOC_ID,MAP,MAP_CUT,BLOCK,BLOCK_CUT,LOT,LOT_CUT,UNIT,UNIT_CUT,MBL,AV PID
9988,10026,47||G||1||2|,515700.0,0.0,0.0,515700.0,2020,0.0,7/24/2008 0:00:00,245000.0,1020,108 HEATH ST #2,108,HEATH ST,2,SOMERVILLE,2145.0,LASSALETTA ANTONIO DAVID,,108 HEATH ST #2,,SOMERVILLE,MA,2145,USA,51481,530,,RB,1920.0,1539.0,0,1026.0,Three decker,1.0,5,,106566,47-G-1,47,,G,,1,,2,,47-G-1,106566


In [57]:
parcels = geopandas.read_file('../data/Parcels_FY19')
print('number of parcels: ',parcels.shape[0])

number of parcels:  14095


In [62]:
parcels[parcels.OBJECTID == 10912]

Unnamed: 0,OBJECTID,Map,Block,Lot,MBL,PolyType,AddNum,Street,AddNum2,Street2,AddNum3,Street3,SublotOf,TaxParMBL,Shape_Leng,Shape_Area,geometry
10892,10912,52,B,7,52-B-7,PARCEL,90,SUMMER ST,,,,,,52-B-7,311.855357,5342.749452,"POLYGON ((763867.919 2965446.250, 763822.790 2..."
