# Boulevard Gardens - Data Cleaning
### By Julia-Simone Rutgers

This notebook includes steps to clean two datasets:
- property assessment data for the Wolseley neighbourhood in Winnipeg
- and a shoeleather dataset of neighbourhood properties that have planted boulevard gardens

The data will be used to analyse whether these boulevard gardens are more prevalent in 'wealthier' parts of the neighbourhood. _(In this story, relative 'wealth' is signalled by higher average property values)_


Methodology for the garden data collection is explained later in the document.

In [1]:
# Document set up
import pandas as pd
import numpy as np

In [2]:
# Read in assessment parcels - Wolseley
df = pd.read_csv("Assessment_Parcels_Wolseley.csv")

#### What's the data?

Assessment parcels for properties in the Wolseley neighbourhood area, sourced from [Winnipeg Open Data](https://data.winnipeg.ca/Assessment-Taxation-Corporate/Assessment-Parcels/d4mq-wa44/explore/query/SELECT%0A%20%20%60roll_number%60%2C%0A%20%20%60street_number%60%2C%0A%20%20%60unit_number%60%2C%0A%20%20%60street_direction%60%2C%0A%20%20%60street_name%60%2C%0A%20%20%60street_type%60%2C%0A%20%20%60full_address%60%2C%0A%20%20%60neighbourhood_area%60%2C%0A%20%20%60property_use_code%60%2C%0A%20%20%60assessed_land_area%60%2C%0A%20%20%60zoning%60%2C%0A%20%20%60total_assessed_value%60%2C%0A%20%20%60assessment_date%60%2C%0A%20%20%60detail_url%60%2C%0A%20%20%60current_assessment_year%60%2C%0A%20%20%60property_class_1%60%2C%0A%20%20%60geometry%60%2C%0A%20%20%60centroid_lat%60%2C%0A%20%20%60centroid_lon%60%0AWHERE%20caseless_eq%28%60neighbourhood_area%60%2C%20%22WOLSELEY%22%29/page/column_manager). The data has been filtered to include only Wolseley properties. It has also been trimmed to return only columns for the address (street number, street name, street type, etc.), total assessed property values, zoning/property use identifiers (there are a few), and geometry. The roll number is used as a unique identifier.

In [3]:
df.head()

Unnamed: 0,Roll Number,Street Number,Unit Number,Street Direction,Street Name,Street Type,Full Address,Neighbourhood Area,Property Use Code,Assessed Land Area,Zoning,Total Assessed Value,Assessment Date,Detail URL,Current Assessment Year,Property Class 1,Geometry,Centroid Lat,Centroid Lon
0,12080201000,1430.0,,,PORTAGE,AVENUE,1430 PORTAGE AVENUE,WOLSELEY,PIRPK - PARK WITH BUILDING,488220.0,PR1 - PRKS&REC-PASSIVE,3130000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.19095901066672 49.87922629...,49.879843,-97.192867
1,12080240200,1420.0,,,PORTAGE,AVENUE,1420 PORTAGE AVENUE,WOLSELEY,PIICH - CHURCH,88446.0,R2 - RES - TWO FAMILY,3144000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.19158825669882 49.88158128...,49.881477,-97.191951
2,12080250500,,,,RAGLAN,ROAD,RAGLAN ROAD,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,141149.0,R2 - RES - TWO FAMILY,518000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19107262988514 49.88077998...,49.880161,-97.191705
3,12080260000,542.0,,,RAGLAN,ROAD,542 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,305000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881459,-97.191275
4,12080261000,538.0,,,RAGLAN,ROAD,538 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,364000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881322,-97.191286


#### Data Cleaning:

I am simultaneously looking at these parcels in GeoJSON, where some of the complexities of the data are a bit easier to understand.
I want to trim out parcels that are not homes or businesses. I also want to consolidate condos/apartments into one property each. Zoning is not the easiest column to use for categorization: some properties are zoned residential but are being used as businesses, and others are zoned R2 but have no building on them (they're greenspace, for example). The property use code is the most specific column for this kind of analysis.
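
To make the zoning-versus-use mismatch concrete, here is a minimal sketch that flags parcels zoned residential but carrying a commercial use code. The three rows are hand-copied from this notebook's tables rather than read from the CSV, the column names use the snake_case form adopted in the cleaning steps, and treating a `CM` prefix as "commercial" is an assumption inferred from the codes seen here, not a documented rule.

```python
import pandas as pd

# Hand-copied sample rows (not the full dataset).
sample = pd.DataFrame({
    "full_address": ["542 RAGLAN ROAD", "545 TELFER STREET S", "1308 PORTAGE AVENUE"],
    "zoning": ["R2 - RES - TWO FAMILY", "R2 - RES - TWO FAMILY", "C2 - COM - COMMUNITY"],
    "property_use_code": [
        "RESSD - DETACHED SINGLE DWELLING",
        "CMRST - STORE",
        "CMRST - STORE",
    ],
})

# Assumption: residential zoning labels contain "- RES -" and
# commercial use codes start with "CM".
zoned_residential = sample["zoning"].str.contains("- RES -", regex=False)
used_commercially = sample["property_use_code"].str.startswith("CM")

mismatched = sample[zoned_residential & used_commercially]
print(mismatched[["full_address", "zoning", "property_use_code"]])
```

In this sample only the Telfer Street store is flagged, which is the kind of parcel that zoning alone would misclassify.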


In [4]:
# First, clean up column names:

df.columns = df.columns.str.lower()
df.columns = df.columns.str.replace(' ', '_')
df.head(1)

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
0,12080201000,1430.0,,,PORTAGE,AVENUE,1430 PORTAGE AVENUE,WOLSELEY,PIRPK - PARK WITH BUILDING,488220.0,PR1 - PRKS&REC-PASSIVE,3130000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.19095901066672 49.87922629...,49.879843,-97.192867
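
If a future export of this file came with stray whitespace in its headers, the same clean-up could be written as a single chain. This is only a sketch on made-up column names; the padded headers below are hypothetical and do not occur in the Wolseley file.

```python
import pandas as pd

# Hypothetical headers with leading/trailing and doubled spaces.
toy = pd.DataFrame(columns=["Roll Number", " Street Number", "Total  Assessed Value "])

toy.columns = (
    toy.columns
    .str.strip()                            # remove padding at either end
    .str.lower()                            # lower-case everything
    .str.replace(r"\s+", "_", regex=True)   # collapse any run of whitespace to one underscore
)
print(list(toy.columns))
```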


In [5]:
# Grab the basic information about our table:

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2671 entries, 0 to 2670
Data columns (total 19 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   roll_number              2671 non-null   int64  
 1   street_number            2656 non-null   float64
 2   unit_number              148 non-null    object 
 3   street_direction         149 non-null    object 
 4   street_name              2664 non-null   object 
 5   street_type              2655 non-null   object 
 6   full_address             2664 non-null   object 
 7   neighbourhood_area       2671 non-null   object 
 8   property_use_code        2671 non-null   object 
 9   assessed_land_area       2516 non-null   float64
 10  zoning                   2523 non-null   object 
 11  total_assessed_value     2656 non-null   float64
 12  assessment_date          2671 non-null   object 
 13  detail_url               2671 non-null   object 
 14  current_assessment_year 

In [6]:
# From scanning the data, I know that not all properties have a street number (house number)
# From the map, I know many are parks, etc. 
# I want the number so I can keep track of my cleaning success. 
df.street_number.isnull().value_counts()

street_number
False    2656
True       15
Name: count, dtype: int64

In [7]:
# What do those 15 look like?

In [8]:
df[df.street_number.isnull()]

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
2,12080250500,,,,RAGLAN,ROAD,RAGLAN ROAD,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,141149.0,R2 - RES - TWO FAMILY,518000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19107262988514 49.88077998...,49.880161,-97.191705
7,12080264000,,,,RAGLAN,ROAD,RAGLAN ROAD,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,5997.0,R2 - RES - TWO FAMILY,283000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19106840577483 49.88083471...,49.880911,-97.191317
8,12080264500,,,,RAGLAN,ROAD,RAGLAN ROAD,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,2399.0,R2 - RES - TWO FAMILY,230000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19107262988514 49.88077998...,49.880816,-97.191325
551,12081002500,,,,,,,WOLSELEY,REFRL - REFERENCE ROLL,124.0,R2 - RES - TWO FAMILY,,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,,MULTIPOLYGON (((-97.18031804853057 49.88293823...,49.882972,-97.180305
563,12081013500,,,,,,,WOLSELEY,REFRL - REFERENCE ROLL,415.0,R2 - RES - TWO FAMILY,,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,,MULTIPOLYGON (((-97.18038676055855 49.88177416...,49.881762,-97.180398
570,12081020500,,,,,,,WOLSELEY,REFRL - REFERENCE ROLL,1259.0,R2 - RES - TWO FAMILY,,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,,MULTIPOLYGON (((-97.18046084840775 49.88109053...,49.881021,-97.180455
1311,12081918000,,,,RUBY,STREET,RUBY STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,2272.0,R2 - RES - TWO FAMILY,226000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17258277507284 49.88346963...,49.883498,-97.172386
1453,12082098000,,,,RUBY,STREET,RUBY STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,2954.0,R2 - RES - TWO FAMILY,220000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17281033929424 49.88417443...,49.884137,-97.173009
1479,12082127000,,,,EVANSON,STREET,EVANSON STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,24.0,R2 - RES - TWO FAMILY,50.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17045868146052 49.87806295...,49.878068,-97.170628
1503,12082156200,,,,,,,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,1874.0,R2 - RES - TWO FAMILY,3117.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17045887411923 49.87821965...,49.878291,-97.170413


In [9]:
# Let's look at the unique property_use_code
df.property_use_code.unique()

array(['PIRPK - PARK WITH BUILDING', 'PIICH - CHURCH',
       'VRES1 - VACANT RESIDENTIAL 1', 'RESSD - DETACHED SINGLE DWELLING',
       'CMOFF - OFFICE', 'CMRST - STORE',
       'RESMC - MULTIFAMILY CONVERSION',
       'RESSU - RESIDENTIAL SECONDARY UNIT',
       'RESMA - MULTIPLE ATTACHED UNITS',
       'CMVSR - VEHICLE SERVICE RELATED', 'RESDU - DUPLEX',
       'CMFBK - BANK', 'PIISC - SCHOOL', 'CMRCV - CONVENIENCE STORE',
       'CMRRE - RESTAURANT', 'REFRL - REFERENCE ROLL',
       'RESAP - APARTMENTS', 'CMSTP - STRIP MALL',
       'RESGC - RESIDENTIAL GROUP CARE', 'CMPSP - SURFACE PARKING',
       'RESMU - RESIDENTIAL MULTIPLE USE',
       'CMCMU - COMMERCIAL MULTIPLE USE', 'PIRCC - COMMUNITY CENTRE',
       'RESTR - TRIPLEX', 'CNCMP - CONDO COMPLEX',
       'CNRES - CONDO RESIDENTIAL', 'CMOMC - MEDICAL OFFICE CLINIC',
       'RESAM - APARTMENTS MULTIPLE USE',
       'PIIGC - NON-RESIDENTIAL GROUP CARE',
       'CMMRH - COMMERCIAL ROW HOUSE',
       'RESMB - RESIDENTIAL MULTIPLE 

In [10]:
# From looking at the mapped version, I know vacant residentials are mostly parks or non-buildings. 
# Does the count of vacants match the null street numbers count?

df.property_use_code.str.contains("VRES").value_counts()

property_use_code
False    2648
True       23
Name: count, dtype: int64
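
Rather than comparing two separate counts, the two flags can be cross-tabulated to see how much they overlap. The sketch below uses a five-row toy table, not the real data, and the label strings are my own.

```python
import pandas as pd

# Toy rows only: two parcels without a street number, two vacant parcels,
# and one parcel that is both.
toy = pd.DataFrame({
    "street_number": [None, None, 534.0, 542.0, 538.0],
    "property_use_code": [
        "VRES1 - VACANT RESIDENTIAL 1",
        "REFRL - REFERENCE ROLL",
        "VRES1 - VACANT RESIDENTIAL 1",
        "RESSD - DETACHED SINGLE DWELLING",
        "RESSD - DETACHED SINGLE DWELLING",
    ],
})

number_flag = toy["street_number"].isna().map(
    {True: "no street number", False: "has street number"}
)
vacant_flag = toy["property_use_code"].str.contains("VRES").map(
    {True: "vacant", False: "not vacant"}
)

# Rows: street-number status. Columns: vacancy status.
overlap = pd.crosstab(number_flag, vacant_flag)
print(overlap)
```

Each cell of the table is a count of parcels, so vacant parcels that do have a street number show up in their own cell instead of being hidden inside a total.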

In [11]:
# There are more! Okay. Let's see them too.
df[df.property_use_code.str.contains("VRES")]

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
2,12080250500,,,,RAGLAN,ROAD,RAGLAN ROAD,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,141149.0,R2 - RES - TWO FAMILY,518000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19107262988514 49.88077998...,49.880161,-97.191705
7,12080264000,,,,RAGLAN,ROAD,RAGLAN ROAD,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,5997.0,R2 - RES - TWO FAMILY,283000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19106840577483 49.88083471...,49.880911,-97.191317
8,12080264500,,,,RAGLAN,ROAD,RAGLAN ROAD,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,2399.0,R2 - RES - TWO FAMILY,230000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19107262988514 49.88077998...,49.880816,-97.191325
34,12080305500,534.0,,,CRAIG,STREET,534 CRAIG STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,3475.0,R2 - RES - TWO FAMILY,214000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18977953520537 49.88170755...,49.88178,-97.189913
236,12080530000,1270.0,,,WOLSELEY,AVENUE,1270 WOLSELEY AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,18092.0,R2 - RES - TWO FAMILY,522000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18702788621498 49.87923470...,49.879481,-97.187268
701,12081184000,1064.0,,,PALMERSTON,AVENUE,1064 PALMERSTON AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,5250.0,R2 - RES - TWO FAMILY,442000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17784448030837 49.87792767...,49.878106,-97.177408
704,12081189100,104.0,,S,GARFIELD,STREET,104 GARFIELD STREET S,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,7125.0,R2 - RES - TWO FAMILY,438000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17770600326995 49.87890874...,49.878673,-97.178267
919,12081435000,1020.0,,,PALMERSTON,AVENUE,1020 PALMERSTON AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,4255.0,R2 - RES - TWO FAMILY,430000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17589973184508 49.87693016...,49.877107,-97.17571
989,12081510000,35.0,,,AUBREY,STREET,35 AUBREY STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,25799.0,R2 - RES - TWO FAMILY,400000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17520487593019 49.87856533...,49.878172,-97.175041
990,12081519200,139.0,,,AUBREY,STREET,139 AUBREY STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,16324.0,R2 - RES - TWO FAMILY,369000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17516781601665 49.87905534...,49.878804,-97.174993


In [12]:
# The map shows a handful of outliers among the vacant parcels that I don't want to get rid of.
# Instead, I'll start by clearing out the rows with no street number.

df_clean = df[df.street_number.notnull()]

In [13]:
df_clean.shape

(2656, 19)
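
Since the row count is how I'm tracking the cleaning, a small helper can report it after every step. `report_drop` is a hypothetical name, and the sketch runs on toy data rather than the assessment file.

```python
import pandas as pd

def report_drop(label, before, after):
    """Print and return how many rows a cleaning step removed (hypothetical helper)."""
    removed = len(before) - len(after)
    print(f"{label}: {len(before)} -> {len(after)} rows ({removed} removed)")
    return removed

# Toy data: two of the four rows have no street number.
toy = pd.DataFrame({"street_number": [1430.0, None, 542.0, None]})
toy_clean = toy[toy["street_number"].notnull()]

removed = report_drop("drop missing street numbers", toy, toy_clean)
```

On the real table the equivalent step removes 15 rows (2,671 down to 2,656), matching the null count found earlier.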

In [14]:
# I also want to remove schools, churches and government buildings. The idea is to look at residences in the analysis.
# I am still debating the inclusion of businesses, so I won't abandon them YET.
# To get rid of these institutional buildings, I want to look at property_class_1.

df.property_class_1.unique()

array(['OTHER', 'INSTITUTIONAL', 'RESIDENTIAL 1', nan, 'RESIDENTIAL 2',
       'RESIDENTIAL 3'], dtype=object)

In [15]:
# Look at all 'Institutional' class properties
df[df['property_class_1'] == 'INSTITUTIONAL']

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
1,12080240200,1420.0,,,PORTAGE,AVENUE,1420 PORTAGE AVENUE,WOLSELEY,PIICH - CHURCH,88446.0,R2 - RES - TWO FAMILY,3144000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.19158825669882 49.88158128...,49.881477,-97.191951
293,12080611100,511.0,,,CLIFTON,STREET,511 CLIFTON STREET,WOLSELEY,PIISC - SCHOOL,142947.0,R2 - RES - TWO FAMILY,4922000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.18622195092647 49.88026701...,49.881248,-97.185715
622,12081074500,533.0,,,GREENWOOD,PLACE,533 GREENWOOD PLACE,WOLSELEY,RESAP - APARTMENTS,25625.0,RMFL - RES - MULTI-FAMILY,6197000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.17897832370008 49.88296342...,49.883264,-97.179204
974,12081492000,292.0,,,AUBREY,STREET,292 AUBREY STREET,WOLSELEY,RESGC - RESIDENTIAL GROUP CARE,2916.0,R2 - RES - TWO FAMILY,329000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.17513829718177 49.88313103...,49.883181,-97.175328
1286,12081881000,930.0,,,PORTAGE,AVENUE,930 PORTAGE AVENUE,WOLSELEY,CMOFF - OFFICE,12001.0,C2 - COM - COMMUNITY,2607000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.17206154954306 49.88480405...,49.884667,-97.171859
1384,12081995000,960.0,,,WOLSELEY,AVENUE,960 WOLSELEY AVENUE,WOLSELEY,PIISC - SCHOOL,145582.0,R2 - RES - TWO FAMILY,10953000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.17269621703373 49.87740719...,49.878391,-97.172534
1448,12082093000,236.0,,,RUBY,STREET,236 RUBY STREET,WOLSELEY,RESGC - RESIDENTIAL GROUP CARE,3026.0,R2 - RES - TWO FAMILY,394000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.17284390469798 49.88372575...,49.883686,-97.173042
1746,12082460000,141.0,,,ARLINGTON,STREET,141 ARLINGTON STREET,WOLSELEY,PIIDC - DAY CARE,4026.0,C1 - COM - NEIGHBOURHOOD,312000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16914362760043 49.88138213...,49.881444,-97.169352
1862,12082596000,240.0,,,HOME,STREET,240 HOME STREET,WOLSELEY,PIICH - CHURCH,8233.0,R2 - RES - TWO FAMILY,699000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16885735853265 49.88410122...,49.883982,-97.168653
1865,12082608100,870.0,,,PORTAGE,AVENUE,870 PORTAGE AVENUE,WOLSELEY,CMOFF - OFFICE,19218.0,C2 - COM - COMMUNITY,2716000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16879112658728 49.88556918...,49.885371,-97.168553


In [16]:
# How many are there?
len(df[df['property_class_1'] == 'INSTITUTIONAL'])

24

In [17]:
# And what are the use codes?
df[df['property_class_1'] == 'INSTITUTIONAL'].property_use_code.unique()

array(['PIICH - CHURCH', 'PIISC - SCHOOL', 'RESAP - APARTMENTS',
       'RESGC - RESIDENTIAL GROUP CARE', 'CMOFF - OFFICE',
       'PIIDC - DAY CARE'], dtype=object)

In [18]:
# The residential apartment is a care home. I want to keep that for now. (I know it has a garden)
# Instead of cutting all the institutions, I'll go by use codes. Churches and schools can go.

df[(df['property_use_code'] == 'PIICH - CHURCH') | (df['property_use_code'] == 'PIISC - SCHOOL')]

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
1,12080240200,1420.0,,,PORTAGE,AVENUE,1420 PORTAGE AVENUE,WOLSELEY,PIICH - CHURCH,88446.0,R2 - RES - TWO FAMILY,3144000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.19158825669882 49.88158128...,49.881477,-97.191951
293,12080611100,511.0,,,CLIFTON,STREET,511 CLIFTON STREET,WOLSELEY,PIISC - SCHOOL,142947.0,R2 - RES - TWO FAMILY,4922000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.18622195092647 49.88026701...,49.881248,-97.185715
1384,12081995000,960.0,,,WOLSELEY,AVENUE,960 WOLSELEY AVENUE,WOLSELEY,PIISC - SCHOOL,145582.0,R2 - RES - TWO FAMILY,10953000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.17269621703373 49.87740719...,49.878391,-97.172534
1862,12082596000,240.0,,,HOME,STREET,240 HOME STREET,WOLSELEY,PIICH - CHURCH,8233.0,R2 - RES - TWO FAMILY,699000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16885735853265 49.88410122...,49.883982,-97.168653
1891,12082683000,160.0,,,ETHELBERT,STREET,160 ETHELBERT STREET,WOLSELEY,PIICH - CHURCH,14440.0,R2 - RES - TWO FAMILY,1186000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16742358180947 49.88205189...,49.881843,-97.167636
2057,12082874000,61.0,,,PICARDY,PLACE,61 PICARDY PLACE,WOLSELEY,PIICH - CHURCH,17724.0,R2 - RES - TWO FAMILY,1458000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.1660317879726 49.885701349...,49.885917,-97.165698
2065,12082900100,784.0,,,WOLSELEY,AVENUE,784 WOLSELEY AVENUE,WOLSELEY,PIISC - SCHOOL,25961.0,R2 - RES - TWO FAMILY,2229000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16389012833476 49.87895676...,49.879396,-97.163576
2137,12082978500,65.0,,,WALNUT,STREET,65 WALNUT STREET,WOLSELEY,PIICH - CHURCH,21933.0,R2 - RES - TWO FAMILY,1571000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16343834640622 49.88072380...,49.880433,-97.163685
2250,12083128100,790.0,,,HONEYMAN,AVENUE,790 HONEYMAN AVENUE,WOLSELEY,PIICH - CHURCH,12176.0,R2 - RES - TWO FAMILY,1432000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16406904965707 49.88531208...,49.885107,-97.163963
2311,12090018000,750.0,,,WOLSELEY,AVENUE,750 WOLSELEY AVENUE,WOLSELEY,PIISC - SCHOOL,85251.0,R2 - RES - TWO FAMILY,3879000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,INSTITUTIONAL,MULTIPOLYGON (((-97.16121195057586 49.88023677...,49.879952,-97.162081


In [19]:
len(df[(df['property_use_code'] == 'PIICH - CHURCH') | (df['property_use_code'] == 'PIISC - SCHOOL')])

12
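
The count above is taken from `df`, while the drop that follows is applied to `df_clean`. The two agree only if every church and school has a street number. The sketch below shows how they could diverge; the church with no street number is a hypothetical row, not one from the data, and `is_church_or_school` is my own helper name.

```python
import pandas as pd

def is_church_or_school(frame):
    """Boolean mask for church and school use codes (hypothetical helper)."""
    codes = frame["property_use_code"]
    return (codes == "PIICH - CHURCH") | (codes == "PIISC - SCHOOL")

toy = pd.DataFrame({
    "street_number": [1420.0, None, 511.0, 542.0],
    "property_use_code": [
        "PIICH - CHURCH",
        "PIICH - CHURCH",   # hypothetical church parcel with no street number
        "PIISC - SCHOOL",
        "RESSD - DETACHED SINGLE DWELLING",
    ],
})
toy_clean = toy[toy["street_number"].notnull()]

in_full = int(is_church_or_school(toy).sum())
in_clean = int(is_church_or_school(toy_clean).sum())
print(f"full table: {in_full}, cleaned table: {in_clean}")
```

Counting on the same table that will be filtered avoids having to reason about whether the earlier step already removed some of the rows.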

In [20]:
# First, gather the indexes
institutions = df_clean[(df_clean['property_use_code'] == 'PIICH - CHURCH') | (df_clean['property_use_code'] == 'PIISC - SCHOOL')].index

In [21]:
# And drop them
df_clean = df_clean.drop(institutions, axis=0)

In [29]:
# Let's look at what's left...
df_clean.property_use_code.unique()

array(['RESSD - DETACHED SINGLE DWELLING', 'CMOFF - OFFICE',
       'CMRST - STORE', 'VRES1 - VACANT RESIDENTIAL 1',
       'RESMC - MULTIFAMILY CONVERSION',
       'RESSU - RESIDENTIAL SECONDARY UNIT',
       'RESMA - MULTIPLE ATTACHED UNITS', 'RESDU - DUPLEX',
       'CMFBK - BANK', 'CMRCV - CONVENIENCE STORE', 'CMRRE - RESTAURANT',
       'RESAP - APARTMENTS', 'CMSTP - STRIP MALL',
       'RESGC - RESIDENTIAL GROUP CARE',
       'RESMU - RESIDENTIAL MULTIPLE USE',
       'CMCMU - COMMERCIAL MULTIPLE USE', 'PIRCC - COMMUNITY CENTRE',
       'RESTR - TRIPLEX', 'CNCMP - CONDO COMPLEX',
       'CNRES - CONDO RESIDENTIAL', 'CMOMC - MEDICAL OFFICE CLINIC',
       'RESAM - APARTMENTS MULTIPLE USE',
       'PIIGC - NON-RESIDENTIAL GROUP CARE',
       'CMMRH - COMMERCIAL ROW HOUSE',
       'RESMB - RESIDENTIAL MULTIPLE BUILDINGS', 'PIIDC - DAY CARE',
       'RESSS - SIDE BY SIDE', 'RESRM - ROOMING HOUSE',
       'CNAPT - CONDO APARTMENT', 'CNCOM - CONDO COMMERCIAL'],
      dtype=object)

In [30]:
# Cool. I also don't need parks with buildings, garages (vehicle service), parking lots, government buildings...

df[df['property_use_code'].isin(['PIRPK - PARK WITH BUILDING','CMOGV - GOVERNMENT OFFICE','CMPSP - SURFACE PARKING','CMVSR - VEHICLE SERVICE RELATED'])]

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
0,12080201000,1430.0,,,PORTAGE,AVENUE,1430 PORTAGE AVENUE,WOLSELEY,PIRPK - PARK WITH BUILDING,488220.0,PR1 - PRKS&REC-PASSIVE,3130000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.19095901066672 49.87922629...,49.879843,-97.192867
205,12080494000,1284.0,,,PORTAGE,AVENUE,1284 PORTAGE AVENUE,WOLSELEY,CMVSR - VEHICLE SERVICE RELATED,22077.0,C2 - COM - COMMUNITY,1304000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18643989008773 49.88286027...,49.882627,-97.186904
475,12080900500,1150.0,,,PORTAGE,AVENUE,1150 PORTAGE AVENUE,WOLSELEY,CMVSR - VEHICLE SERVICE RELATED,13854.0,C2 - COM - COMMUNITY,1041000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18165194695997 49.88328777...,49.883404,-97.18191
1045,12081585000,329.0,,,AUBREY,STREET,329 AUBREY STREET,WOLSELEY,CMPSP - SURFACE PARKING,2252.0,C2 - COM - COMMUNITY,101000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17478857869735 49.88404541...,49.884073,-97.174593
1046,12081586000,970.0,,,PORTAGE,AVENUE,970 PORTAGE AVENUE,WOLSELEY,CMVSR - VEHICLE SERVICE RELATED,14710.0,C2 - COM - COMMUNITY,1306000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17424499864477 49.88420159...,49.884335,-97.174497
1868,12082620100,821.0,,,PRESTON,AVENUE,821 PRESTON AVENUE,WOLSELEY,PIRPK - PARK WITH BUILDING,260483.0,R2 - RES - TWO FAMILY,7596000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.16636899776381 49.88615911...,49.885067,-97.167231
2301,12083215100,800.0,,,PORTAGE,AVENUE,800 PORTAGE AVENUE,WOLSELEY,CMOGV - GOVERNMENT OFFICE,34646.0,C2 - COM - COMMUNITY,9226000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.16513644970286 49.88653281...,49.886307,-97.165389
2308,12090012100,20.0,,,MARYLAND,STREET,20 MARYLAND STREET,WOLSELEY,CMVSR - VEHICLE SERVICE RELATED,12669.0,C2 - COM - COMMUNITY,874000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.16126663952822 49.87946390...,49.879294,-97.1615
2509,12090263100,712.0,,,PORTAGE,AVENUE,712 PORTAGE AVENUE,WOLSELEY,CMVSR - VEHICLE SERVICE RELATED,13821.0,C2 - COM - COMMUNITY,1193000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.16161712605434 49.88718639...,49.887276,-97.161863


In [31]:
# This is a very fun learning process in terms of figuring out the most efficient ways to clean.
# `isin` might have been useful from the start!

len(df[df['property_use_code'].isin(['PIRPK - PARK WITH BUILDING','CMOGV - GOVERNMENT OFFICE','CMPSP - SURFACE PARKING','CMVSR - VEHICLE SERVICE RELATED'])])

9

In [32]:
# Let's drop these 9

extras = df_clean[df_clean['property_use_code'].isin(['PIRPK - PARK WITH BUILDING','CMOGV - GOVERNMENT OFFICE','CMPSP - SURFACE PARKING','CMVSR - VEHICLE SERVICE RELATED'])].index
df_clean = df_clean.drop(extras, axis=0)

In [33]:
df_clean.shape

(2635, 19)
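
The counts reconcile: 2,656 rows with a street number, minus 12 churches and schools, minus 9 parks, garages, parking lots and government offices, leaves 2,635. With `isin`, the three filtering steps so far could also be folded into one mask. This is a sketch on toy rows, not a replacement for the cells above; `unwanted_codes` is my own name for the combined list.

```python
import pandas as pd

# Toy rows covering each case: dropped for use code, dropped for
# missing street number, or kept.
toy = pd.DataFrame({
    "street_number": [1430.0, 1420.0, None, 542.0, 1284.0, 1308.0],
    "property_use_code": [
        "PIRPK - PARK WITH BUILDING",
        "PIICH - CHURCH",
        "VRES1 - VACANT RESIDENTIAL 1",
        "RESSD - DETACHED SINGLE DWELLING",
        "CMVSR - VEHICLE SERVICE RELATED",
        "CMRST - STORE",
    ],
})

unwanted_codes = [
    "PIICH - CHURCH",
    "PIISC - SCHOOL",
    "PIRPK - PARK WITH BUILDING",
    "CMOGV - GOVERNMENT OFFICE",
    "CMPSP - SURFACE PARKING",
    "CMVSR - VEHICLE SERVICE RELATED",
]

# Keep rows that have a street number AND whose use code is not unwanted.
keep = toy["street_number"].notnull() & ~toy["property_use_code"].isin(unwanted_codes)
toy_clean = toy[keep]
print(toy_clean)
```

The step-by-step version is easier to audit while exploring; the single mask is easier to rerun once the list of codes is settled.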

In [34]:
# We should now be left with business and residential properties. Let's look!
df_clean.property_use_code.unique()

array(['RESSD - DETACHED SINGLE DWELLING', 'CMOFF - OFFICE',
       'CMRST - STORE', 'VRES1 - VACANT RESIDENTIAL 1',
       'RESMC - MULTIFAMILY CONVERSION',
       'RESSU - RESIDENTIAL SECONDARY UNIT',
       'RESMA - MULTIPLE ATTACHED UNITS', 'RESDU - DUPLEX',
       'CMFBK - BANK', 'CMRCV - CONVENIENCE STORE', 'CMRRE - RESTAURANT',
       'RESAP - APARTMENTS', 'CMSTP - STRIP MALL',
       'RESGC - RESIDENTIAL GROUP CARE',
       'RESMU - RESIDENTIAL MULTIPLE USE',
       'CMCMU - COMMERCIAL MULTIPLE USE', 'PIRCC - COMMUNITY CENTRE',
       'RESTR - TRIPLEX', 'CNCMP - CONDO COMPLEX',
       'CNRES - CONDO RESIDENTIAL', 'CMOMC - MEDICAL OFFICE CLINIC',
       'RESAM - APARTMENTS MULTIPLE USE',
       'PIIGC - NON-RESIDENTIAL GROUP CARE',
       'CMMRH - COMMERCIAL ROW HOUSE',
       'RESMB - RESIDENTIAL MULTIPLE BUILDINGS', 'PIIDC - DAY CARE',
       'RESSS - SIDE BY SIDE', 'RESRM - ROOMING HOUSE',
       'CNAPT - CONDO APARTMENT', 'CNCOM - CONDO COMMERCIAL'],
      dtype=object)

In [35]:
# First I'm going to split off the businesses and other non-residential properties (i.e. the community centre and daycare).

non_residential = df_clean[df_clean['property_use_code'].isin([
    'CMOFF - OFFICE', 'CMRST - STORE', 'CMFBK - BANK', 'CMRCV - CONVENIENCE STORE',
    'CMRRE - RESTAURANT', 'CMSTP - STRIP MALL', 'CMCMU - COMMERCIAL MULTIPLE USE',
    'PIRCC - COMMUNITY CENTRE', 'CMOMC - MEDICAL OFFICE CLINIC', 'PIIGC - NON-RESIDENTIAL GROUP CARE',
    'CMMRH - COMMERCIAL ROW HOUSE', 'PIIDC - DAY CARE', 'CNCOM - CONDO COMMERCIAL',
])]
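
Keeping the mask in its own variable makes it easy to get the residential side of the split as the complement, and to check that no row is lost or counted twice. A minimal sketch on toy data, with a shortened code list and table names of my own choosing:

```python
import pandas as pd

toy_clean = pd.DataFrame({
    "property_use_code": [
        "RESSD - DETACHED SINGLE DWELLING",
        "CMRST - STORE",
        "RESDU - DUPLEX",
        "PIIDC - DAY CARE",
        "CNRES - CONDO RESIDENTIAL",
    ],
})

# Shortened list for the toy data; the real list is longer.
non_residential_codes = ["CMRST - STORE", "PIIDC - DAY CARE"]

is_non_residential = toy_clean["property_use_code"].isin(non_residential_codes)
toy_non_residential = toy_clean[is_non_residential]
toy_residential = toy_clean[~is_non_residential]

# Every row should land in exactly one of the two tables.
assert len(toy_non_residential) + len(toy_residential) == len(toy_clean)
print(len(toy_non_residential), "non-residential,", len(toy_residential), "residential")
```

A prefix test would not work as a shortcut here, because the `CN` condo codes cover both residential and commercial units.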

In [36]:
non_residential.head(20)

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
11,12080268100,1412.0,,,PORTAGE,AVENUE,1412 PORTAGE AVENUE,WOLSELEY,CMOFF - OFFICE,13115.0,C2 - COM - COMMUNITY,2046000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.19032087073921 49.88211032...,49.881879,-97.190528
28,12080293000,1308.0,,,PORTAGE,AVENUE,1308 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,6114.0,C2 - COM - COMMUNITY,759000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18896224280488 49.88207636...,49.88225,-97.188831
29,12080295000,1314.0,,,PORTAGE,AVENUE,1314 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,2902.0,C2 - COM - COMMUNITY,225000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18893901019646 49.88237736...,49.882216,-97.189008
30,12080296100,1318.0,,,PORTAGE,AVENUE,1318 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,2318.0,C2 - COM - COMMUNITY,278000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18905277488889 49.88235538...,49.882196,-97.18911
31,12080297100,1320.0,,,PORTAGE,AVENUE,1320 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,3692.0,C2 - COM - COMMUNITY,300000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18914366106259 49.88233782...,49.882173,-97.189228
32,12080298000,1324.0,,,PORTAGE,AVENUE,1324 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,6011.0,C2 - COM - COMMUNITY,581000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18928843325554 49.88230984...,49.882137,-97.189418
33,12080300000,1330.0,,,PORTAGE,AVENUE,1330 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,17473.0,C2 - COM - COMMUNITY,1994000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.19032087073955 49.88211032...,49.881976,-97.190072
172,12080456200,1300.0,,,PORTAGE,AVENUE,1300 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,23696.0,C2 - COM - COMMUNITY,2917000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.1875547118509 49.882644865...,49.882405,-97.188031
206,12080500000,545.0,,S,TELFER,STREET,545 TELFER STREET S,WOLSELEY,CMRST - STORE,4055.0,R2 - RES - TWO FAMILY,327000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18737125272823 49.88227052...,49.882324,-97.187135
280,12080595000,1250.0,,,PORTAGE,AVENUE,1250 PORTAGE AVENUE,WOLSELEY,CMFBK - BANK,12274.0,C2 - COM - COMMUNITY,996000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18577961661313 49.88260929...,49.882785,-97.185968


In [37]:
# I'm going to list all rows because I want to double check any that I don't feel confident about

pd.set_option('display.max_rows', None)
non_residential

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
11,12080268100,1412.0,,,PORTAGE,AVENUE,1412 PORTAGE AVENUE,WOLSELEY,CMOFF - OFFICE,13115.0,C2 - COM - COMMUNITY,2046000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.19032087073921 49.88211032...,49.881879,-97.190528
28,12080293000,1308.0,,,PORTAGE,AVENUE,1308 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,6114.0,C2 - COM - COMMUNITY,759000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18896224280488 49.88207636...,49.88225,-97.188831
29,12080295000,1314.0,,,PORTAGE,AVENUE,1314 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,2902.0,C2 - COM - COMMUNITY,225000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18893901019646 49.88237736...,49.882216,-97.189008
30,12080296100,1318.0,,,PORTAGE,AVENUE,1318 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,2318.0,C2 - COM - COMMUNITY,278000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18905277488889 49.88235538...,49.882196,-97.18911
31,12080297100,1320.0,,,PORTAGE,AVENUE,1320 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,3692.0,C2 - COM - COMMUNITY,300000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18914366106259 49.88233782...,49.882173,-97.189228
32,12080298000,1324.0,,,PORTAGE,AVENUE,1324 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,6011.0,C2 - COM - COMMUNITY,581000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18928843325554 49.88230984...,49.882137,-97.189418
33,12080300000,1330.0,,,PORTAGE,AVENUE,1330 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,17473.0,C2 - COM - COMMUNITY,1994000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.19032087073955 49.88211032...,49.881976,-97.190072
172,12080456200,1300.0,,,PORTAGE,AVENUE,1300 PORTAGE AVENUE,WOLSELEY,CMRST - STORE,23696.0,C2 - COM - COMMUNITY,2917000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.1875547118509 49.882644865...,49.882405,-97.188031
206,12080500000,545.0,,S,TELFER,STREET,545 TELFER STREET S,WOLSELEY,CMRST - STORE,4055.0,R2 - RES - TWO FAMILY,327000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18737125272823 49.88227052...,49.882324,-97.187135
280,12080595000,1250.0,,,PORTAGE,AVENUE,1250 PORTAGE AVENUE,WOLSELEY,CMFBK - BANK,12274.0,C2 - COM - COMMUNITY,996000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.18577961661313 49.88260929...,49.882785,-97.185968


In [40]:
pd.reset_option('display.max_rows')

In [41]:
# After reviewing these addresses on the map, I'm going to keep the rowhouses, and the rest can be trimmed from the residential data
non_residential = df_clean[df_clean['property_use_code'].isin([
    'CMOFF - OFFICE','CMRST - STORE','CMFBK - BANK','CMRCV - CONVENIENCE STORE', 
    'CMRRE - RESTAURANT','CMSTP - STRIP MALL','CMCMU - COMMERCIAL MULTIPLE USE', 
    'PIRCC - COMMUNITY CENTRE','CMOMC - MEDICAL OFFICE CLINIC', 'PIIGC - NON-RESIDENTIAL GROUP CARE',
    'PIIDC - DAY CARE','CNCOM - CONDO COMMERCIAL'])]

In [42]:
non_residential.index

Index([  11,   28,   29,   30,   31,   32,   33,  172,  206,  280,  294,  352,
        403,  510,  540,  541,  542,  543,  544,  664,  694,  889,  890,  891,
        892,  984, 1093, 1094, 1095, 1096, 1097, 1195, 1279, 1285, 1286, 1337,
       1359, 1360, 1361, 1456, 1604, 1606, 1618, 1642, 1643, 1644, 1746, 1864,
       1865, 2163, 2164, 2165, 2318, 2398, 2418, 2506, 2508, 2519, 2520, 2524,
       2659],
      dtype='int64')

In [43]:
residential = df_clean.drop(non_residential.index, axis=0)
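A note on the step above: dropping by `non_residential.index` works because `non_residential` is a slice of `df_clean` and shares its index labels. A minimal sketch of an equivalent approach uses a boolean mask directly. The toy frame and the short `non_res_codes` list here are stand-ins for illustration, not the notebook's real data.

```python
import pandas as pd

# Toy stand-in for df_clean; the real frame has many more rows and columns.
toy = pd.DataFrame(
    {
        "property_use_code": [
            "RESSD - DETACHED SINGLE DWELLING",
            "CMRST - STORE",
            "CMMRH - COMMERCIAL ROW HOUSE",
            "CMFBK - BANK",
        ]
    },
    index=[3, 28, 40, 280],  # illustrative index labels
)

# Hypothetical short list; the notebook's list of codes is longer.
non_res_codes = ["CMRST - STORE", "CMFBK - BANK"]

# True for every row whose use code is in the non-residential list.
is_non_res = toy["property_use_code"].isin(non_res_codes)

# Keep everything that is NOT flagged, in one step.
residential_toy = toy[~is_non_res]

# For comparison: the two-step approach used in the notebook.
dropped = toy.drop(toy[is_non_res].index, axis=0)
same = residential_toy.equals(dropped)
```

Both routes give the same rows. The mask version just skips the intermediate dataframe, which may be handy if the list of codes changes later.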

In [44]:
residential

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
3,12080260000,542.0,,,RAGLAN,ROAD,542 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,305000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881459,-97.191275
4,12080261000,538.0,,,RAGLAN,ROAD,538 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,364000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881322,-97.191286
5,12080262000,528.0,,,RAGLAN,ROAD,528 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,386000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19104728492096 49.88110837...,49.881185,-97.191296
6,12080263000,522.0,,,RAGLAN,ROAD,522 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,376000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19105784468728 49.88097154...,49.881048,-97.191307
9,12080265000,1338.0,,,WOLSELEY,AVENUE,1338 WOLSELEY AVENUE,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,18720.0,R2 - RES - TWO FAMILY,1385000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19051852774297 49.87921209...,49.878845,-97.190766
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2666,12097805100,54.0,401,,MARYLAND,STREET,401-54 MARYLAND STREET,WOLSELEY,CNAPT - CONDO APARTMENT,,,237000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2667,12097805105,54.0,402,,MARYLAND,STREET,402-54 MARYLAND STREET,WOLSELEY,CNAPT - CONDO APARTMENT,,,225000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2668,12097806720,205.0,,,ARLINGTON,STREET,205 ARLINGTON STREET,WOLSELEY,RESAP - APARTMENTS,10643.0,R2 - RES - TWO FAMILY,1445000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 2,MULTIPOLYGON (((-97.16939763322243 49.88371190...,49.883560,-97.169196
2669,12097809455,510.0,,,NEWMAN,STREET,510 NEWMAN STREET,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,2863.0,R2 - RES - TWO FAMILY,469000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18414267002349 49.88215925...,49.882111,-97.183947


In [45]:
# Nice! I now have a dataframe with only the residential buildings in Wolseley! 
residential.property_use_code.unique()

array(['RESSD - DETACHED SINGLE DWELLING', 'VRES1 - VACANT RESIDENTIAL 1',
       'RESMC - MULTIFAMILY CONVERSION',
       'RESSU - RESIDENTIAL SECONDARY UNIT',
       'RESMA - MULTIPLE ATTACHED UNITS', 'RESDU - DUPLEX',
       'RESAP - APARTMENTS', 'RESGC - RESIDENTIAL GROUP CARE',
       'RESMU - RESIDENTIAL MULTIPLE USE', 'RESTR - TRIPLEX',
       'CNCMP - CONDO COMPLEX', 'CNRES - CONDO RESIDENTIAL',
       'RESAM - APARTMENTS MULTIPLE USE', 'CMMRH - COMMERCIAL ROW HOUSE',
       'RESMB - RESIDENTIAL MULTIPLE BUILDINGS', 'RESSS - SIDE BY SIDE',
       'RESRM - ROOMING HOUSE', 'CNAPT - CONDO APARTMENT'], dtype=object)

In [46]:
# What's left in the vacant properties?

residential[residential['property_use_code']=='VRES1 - VACANT RESIDENTIAL 1']

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
34,12080305500,534.0,,,CRAIG,STREET,534 CRAIG STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,3475.0,R2 - RES - TWO FAMILY,214000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18977953520537 49.88170755...,49.88178,-97.189913
236,12080530000,1270.0,,,WOLSELEY,AVENUE,1270 WOLSELEY AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,18092.0,R2 - RES - TWO FAMILY,522000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18702788621498 49.87923470...,49.879481,-97.187268
701,12081184000,1064.0,,,PALMERSTON,AVENUE,1064 PALMERSTON AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,5250.0,R2 - RES - TWO FAMILY,442000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17784448030837 49.87792767...,49.878106,-97.177408
704,12081189100,104.0,,S,GARFIELD,STREET,104 GARFIELD STREET S,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,7125.0,R2 - RES - TWO FAMILY,438000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17770600326995 49.87890874...,49.878673,-97.178267
919,12081435000,1020.0,,,PALMERSTON,AVENUE,1020 PALMERSTON AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,4255.0,R2 - RES - TWO FAMILY,430000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17589973184508 49.87693016...,49.877107,-97.17571
989,12081510000,35.0,,,AUBREY,STREET,35 AUBREY STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,25799.0,R2 - RES - TWO FAMILY,400000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17520487593019 49.87856533...,49.878172,-97.175041
990,12081519200,139.0,,,AUBREY,STREET,139 AUBREY STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,16324.0,R2 - RES - TWO FAMILY,369000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17516781601665 49.87905534...,49.878804,-97.174993
1222,12081806000,49.0,,,LENORE,STREET,49 LENORE STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,3051.0,R2 - RES - TWO FAMILY,222000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17137844360599 49.87916789...,49.879129,-97.171578
1223,12081807000,55.0,,,LENORE,STREET,55 LENORE STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,6101.0,R2 - RES - TWO FAMILY,250000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17177019162443 49.87917997...,49.879264,-97.171568
1596,12082275000,165.0,,,EVANSON,STREET,165 EVANSON STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,2444.0,R2 - RES - TWO FAMILY,231000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17014995135631 49.88237649...,49.882424,-97.170313


In [47]:
# Here's something I don't understand about the city data...
# SEVERAL on this list are parks! And yet they have the use code 'VRES', they're zoned R2, and they have a class of residential 1. 
# Maybe a question for the data people...

# But in the meantime, there aren't very many so I'm doing this manually. 
# I am going to drop any rows that refer to a genuinely vacant property, based on cross referencing to maps.

vacants_index = [34, 236, 919, 989, 990, 1222, 1223, 1596, 1915, 1968, 2462]
residential = residential.drop(vacants_index, axis=0)
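One caveat with a hard-coded list of index labels: it only stays correct as long as the dataframe keeps the same index. A hedged alternative is to key the manual list on `roll_number`, which is the unique identifier in this data. The sketch below uses a three-row toy frame; the roll numbers are copied from the rows displayed above for index labels 34, 236 and 701, and the `vacant_rolls` name is made up for illustration.

```python
import pandas as pd

# Three of the vacant-coded rows shown above, trimmed to two columns.
toy = pd.DataFrame(
    {
        "roll_number": [12080305500, 12080530000, 12081184000],
        "full_address": [
            "534 CRAIG STREET",
            "1270 WOLSELEY AVENUE",
            "1064 PALMERSTON AVENUE",
        ],
    },
    index=[34, 236, 701],
)

# Roll numbers for index labels 34 and 236, both in the notebook's drop list.
vacant_rolls = [12080305500, 12080530000]

# Keep rows whose roll number is not in the manual list.
kept = toy[~toy["roll_number"].isin(vacant_rolls)]
```

Here only index 701 (1064 Palmerston Avenue) survives, matching the notebook's list, which leaves 701 out of the drop.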

In [48]:
# Amazing! Residential is real! 

residential

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
3,12080260000,542.0,,,RAGLAN,ROAD,542 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,305000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881459,-97.191275
4,12080261000,538.0,,,RAGLAN,ROAD,538 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,364000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881322,-97.191286
5,12080262000,528.0,,,RAGLAN,ROAD,528 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,386000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19104728492096 49.88110837...,49.881185,-97.191296
6,12080263000,522.0,,,RAGLAN,ROAD,522 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,376000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19105784468728 49.88097154...,49.881048,-97.191307
9,12080265000,1338.0,,,WOLSELEY,AVENUE,1338 WOLSELEY AVENUE,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,18720.0,R2 - RES - TWO FAMILY,1385000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19051852774297 49.87921209...,49.878845,-97.190766
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2666,12097805100,54.0,401,,MARYLAND,STREET,401-54 MARYLAND STREET,WOLSELEY,CNAPT - CONDO APARTMENT,,,237000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2667,12097805105,54.0,402,,MARYLAND,STREET,402-54 MARYLAND STREET,WOLSELEY,CNAPT - CONDO APARTMENT,,,225000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2668,12097806720,205.0,,,ARLINGTON,STREET,205 ARLINGTON STREET,WOLSELEY,RESAP - APARTMENTS,10643.0,R2 - RES - TWO FAMILY,1445000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 2,MULTIPOLYGON (((-97.16939763322243 49.88371190...,49.883560,-97.169196
2669,12097809455,510.0,,,NEWMAN,STREET,510 NEWMAN STREET,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,2863.0,R2 - RES - TWO FAMILY,469000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18414267002349 49.88215925...,49.882111,-97.183947


In [49]:
# Dealing with condos/apartments feels more complex. I will maybe have to come back to that.

In [50]:
# First... going cycling to collect more garden data! Bye!

In [51]:
# Cleaning cont...
# I want to make this even smaller by getting rid of a few more columns

residential.columns

Index(['roll_number', 'street_number', 'unit_number', 'street_direction',
       'street_name', 'street_type', 'full_address', 'neighbourhood_area',
       'property_use_code', 'assessed_land_area', 'zoning',
       'total_assessed_value', 'assessment_date', 'detail_url',
       'current_assessment_year', 'property_class_1', 'geometry',
       'centroid_lat', 'centroid_lon'],
      dtype='object')

In [52]:
residential = residential[['roll_number', 'street_number', 'unit_number','street_name', 'street_type', 'full_address','property_use_code', 'assessed_land_area', 'zoning', 'total_assessed_value','property_class_1', 'geometry', 'centroid_lat', 'centroid_lon']]

In [53]:
residential

Unnamed: 0,roll_number,street_number,unit_number,street_name,street_type,full_address,property_use_code,assessed_land_area,zoning,total_assessed_value,property_class_1,geometry,centroid_lat,centroid_lon
3,12080260000,542.0,,RAGLAN,ROAD,542 RAGLAN ROAD,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,305000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881459,-97.191275
4,12080261000,538.0,,RAGLAN,ROAD,538 RAGLAN ROAD,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,364000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881322,-97.191286
5,12080262000,528.0,,RAGLAN,ROAD,528 RAGLAN ROAD,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,386000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19104728492096 49.88110837...,49.881185,-97.191296
6,12080263000,522.0,,RAGLAN,ROAD,522 RAGLAN ROAD,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,376000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19105784468728 49.88097154...,49.881048,-97.191307
9,12080265000,1338.0,,WOLSELEY,AVENUE,1338 WOLSELEY AVENUE,RESSD - DETACHED SINGLE DWELLING,18720.0,R2 - RES - TWO FAMILY,1385000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19051852774297 49.87921209...,49.878845,-97.190766
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2666,12097805100,54.0,401,MARYLAND,STREET,401-54 MARYLAND STREET,CNAPT - CONDO APARTMENT,,,237000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2667,12097805105,54.0,402,MARYLAND,STREET,402-54 MARYLAND STREET,CNAPT - CONDO APARTMENT,,,225000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2668,12097806720,205.0,,ARLINGTON,STREET,205 ARLINGTON STREET,RESAP - APARTMENTS,10643.0,R2 - RES - TWO FAMILY,1445000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.16939763322243 49.88371190...,49.883560,-97.169196
2669,12097809455,510.0,,NEWMAN,STREET,510 NEWMAN STREET,RESSD - DETACHED SINGLE DWELLING,2863.0,R2 - RES - TWO FAMILY,469000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.18414267002349 49.88215925...,49.882111,-97.183947


In [54]:
residential.to_csv("wolseley_residential.csv",encoding='utf-8')

### CORRECTION:

I'm realizing I don't want to strip all the commercial properties. I'm going back to the df_clean dataframe (which still includes both businesses and residential properties) to clean up the vacant properties and work from there, ignoring the residential dataframe.

In [55]:
df_clean

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
3,12080260000,542.0,,,RAGLAN,ROAD,542 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,305000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881459,-97.191275
4,12080261000,538.0,,,RAGLAN,ROAD,538 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,364000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881322,-97.191286
5,12080262000,528.0,,,RAGLAN,ROAD,528 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,386000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19104728492096 49.88110837...,49.881185,-97.191296
6,12080263000,522.0,,,RAGLAN,ROAD,522 RAGLAN ROAD,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,376000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19105784468728 49.88097154...,49.881048,-97.191307
9,12080265000,1338.0,,,WOLSELEY,AVENUE,1338 WOLSELEY AVENUE,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,18720.0,R2 - RES - TWO FAMILY,1385000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.19051852774297 49.87921209...,49.878845,-97.190766
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2666,12097805100,54.0,401,,MARYLAND,STREET,401-54 MARYLAND STREET,WOLSELEY,CNAPT - CONDO APARTMENT,,,237000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2667,12097805105,54.0,402,,MARYLAND,STREET,402-54 MARYLAND STREET,WOLSELEY,CNAPT - CONDO APARTMENT,,,225000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2668,12097806720,205.0,,,ARLINGTON,STREET,205 ARLINGTON STREET,WOLSELEY,RESAP - APARTMENTS,10643.0,R2 - RES - TWO FAMILY,1445000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 2,MULTIPOLYGON (((-97.16939763322243 49.88371190...,49.883560,-97.169196
2669,12097809455,510.0,,,NEWMAN,STREET,510 NEWMAN STREET,WOLSELEY,RESSD - DETACHED SINGLE DWELLING,2863.0,R2 - RES - TWO FAMILY,469000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/AsmtPub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18414267002349 49.88215925...,49.882111,-97.183947


In [56]:
# Filtering to look at vacant residential and ensure the index remains the same
df_clean[df_clean['property_use_code']=='VRES1 - VACANT RESIDENTIAL 1']

Unnamed: 0,roll_number,street_number,unit_number,street_direction,street_name,street_type,full_address,neighbourhood_area,property_use_code,assessed_land_area,zoning,total_assessed_value,assessment_date,detail_url,current_assessment_year,property_class_1,geometry,centroid_lat,centroid_lon
34,12080305500,534.0,,,CRAIG,STREET,534 CRAIG STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,3475.0,R2 - RES - TWO FAMILY,214000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18977953520537 49.88170755...,49.88178,-97.189913
236,12080530000,1270.0,,,WOLSELEY,AVENUE,1270 WOLSELEY AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,18092.0,R2 - RES - TWO FAMILY,522000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.18702788621498 49.87923470...,49.879481,-97.187268
701,12081184000,1064.0,,,PALMERSTON,AVENUE,1064 PALMERSTON AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,5250.0,R2 - RES - TWO FAMILY,442000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17784448030837 49.87792767...,49.878106,-97.177408
704,12081189100,104.0,,S,GARFIELD,STREET,104 GARFIELD STREET S,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,7125.0,R2 - RES - TWO FAMILY,438000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17770600326995 49.87890874...,49.878673,-97.178267
919,12081435000,1020.0,,,PALMERSTON,AVENUE,1020 PALMERSTON AVENUE,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,4255.0,R2 - RES - TWO FAMILY,430000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17589973184508 49.87693016...,49.877107,-97.17571
989,12081510000,35.0,,,AUBREY,STREET,35 AUBREY STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,25799.0,R2 - RES - TWO FAMILY,400000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17520487593019 49.87856533...,49.878172,-97.175041
990,12081519200,139.0,,,AUBREY,STREET,139 AUBREY STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,16324.0,R2 - RES - TWO FAMILY,369000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17516781601665 49.87905534...,49.878804,-97.174993
1222,12081806000,49.0,,,LENORE,STREET,49 LENORE STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,3051.0,R2 - RES - TWO FAMILY,222000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17137844360599 49.87916789...,49.879129,-97.171578
1223,12081807000,55.0,,,LENORE,STREET,55 LENORE STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,6101.0,R2 - RES - TWO FAMILY,250000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,OTHER,MULTIPOLYGON (((-97.17177019162443 49.87917997...,49.879264,-97.171568
1596,12082275000,165.0,,,EVANSON,STREET,165 EVANSON STREET,WOLSELEY,VRES1 - VACANT RESIDENTIAL 1,2444.0,R2 - RES - TWO FAMILY,231000.0,04/01/2023 12:00:00 AM,http://www.winnipegassessment.com/asmtpub/engl...,2026,RESIDENTIAL 1,MULTIPOLYGON (((-97.17014995135631 49.88237649...,49.882424,-97.170313


In [57]:
# Which it is! Drop vacants in new clean data frame 'wolseley'
wolseley = df_clean.drop(vacants_index, axis=0)
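The check above was done by eye. A small sketch of the same check in code could confirm that every label in the drop list exists in the frame and carries the vacant use code before anything is dropped. The toy frame and the two-item `labels` list are stand-ins for `df_clean` and `vacants_index`.

```python
import pandas as pd

# Toy stand-in for df_clean with two vacant rows and one house.
toy = pd.DataFrame(
    {
        "property_use_code": [
            "VRES1 - VACANT RESIDENTIAL 1",
            "VRES1 - VACANT RESIDENTIAL 1",
            "RESSD - DETACHED SINGLE DWELLING",
        ]
    },
    index=[34, 236, 3],
)

# Stand-in for vacants_index.
labels = [34, 236]

# Any label that is not in the index would make .drop() raise a KeyError.
missing = [label for label in labels if label not in toy.index]

# Every row about to be dropped should carry the vacant use code.
codes = set(toy.loc[labels, "property_use_code"])
all_vacant = codes == {"VRES1 - VACANT RESIDENTIAL 1"}

if not missing and all_vacant:
    trimmed = toy.drop(labels, axis=0)
```

If `missing` is non-empty or `all_vacant` is False, the list no longer lines up with the frame and is worth re-checking against the map.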

In [58]:
# And trim off excess columns
wolseley = wolseley[['roll_number', 'street_number', 'unit_number','street_name', 'street_type', 'full_address','property_use_code', 'assessed_land_area', 'zoning', 'total_assessed_value','property_class_1', 'geometry', 'centroid_lat', 'centroid_lon']]

In [59]:
wolseley

Unnamed: 0,roll_number,street_number,unit_number,street_name,street_type,full_address,property_use_code,assessed_land_area,zoning,total_assessed_value,property_class_1,geometry,centroid_lat,centroid_lon
3,12080260000,542.0,,RAGLAN,ROAD,542 RAGLAN ROAD,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,305000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881459,-97.191275
4,12080261000,538.0,,RAGLAN,ROAD,538 RAGLAN ROAD,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,364000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19102616382078 49.88138204...,49.881322,-97.191286
5,12080262000,528.0,,RAGLAN,ROAD,528 RAGLAN ROAD,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,386000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19104728492096 49.88110837...,49.881185,-97.191296
6,12080263000,522.0,,RAGLAN,ROAD,522 RAGLAN ROAD,RESSD - DETACHED SINGLE DWELLING,5997.0,R2 - RES - TWO FAMILY,376000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19105784468728 49.88097154...,49.881048,-97.191307
9,12080265000,1338.0,,WOLSELEY,AVENUE,1338 WOLSELEY AVENUE,RESSD - DETACHED SINGLE DWELLING,18720.0,R2 - RES - TWO FAMILY,1385000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.19051852774297 49.87921209...,49.878845,-97.190766
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2666,12097805100,54.0,401,MARYLAND,STREET,401-54 MARYLAND STREET,CNAPT - CONDO APARTMENT,,,237000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2667,12097805105,54.0,402,MARYLAND,STREET,402-54 MARYLAND STREET,CNAPT - CONDO APARTMENT,,,225000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2668,12097806720,205.0,,ARLINGTON,STREET,205 ARLINGTON STREET,RESAP - APARTMENTS,10643.0,R2 - RES - TWO FAMILY,1445000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.16939763322243 49.88371190...,49.883560,-97.169196
2669,12097809455,510.0,,NEWMAN,STREET,510 NEWMAN STREET,RESSD - DETACHED SINGLE DWELLING,2863.0,R2 - RES - TWO FAMILY,469000.0,RESIDENTIAL 1,MULTIPOLYGON (((-97.18414267002349 49.88215925...,49.882111,-97.183947


In [60]:
wolseley.to_csv("wolseley_all.csv", encoding='utf-8')

## CONDO AND APARTMENTS

Here's the situation: 

Some of these addresses are condo buildings or apartment buildings, while others are individual units. 

I ultimately want to analyze both average property values and concentration of gardens on each street relative to the number of dwellings. 

Which means I need to sort out the most accurate way to deal with these multi-unit buildings.

In [61]:
# Let's see what we're working with first

wolseley.property_use_code.value_counts()

property_use_code
RESSD - DETACHED SINGLE DWELLING          1942
RESMC - MULTIFAMILY CONVERSION             351
CNAPT - CONDO APARTMENT                    118
RESDU - DUPLEX                              45
RESAP - APARTMENTS                          35
CMRST - STORE                               28
CNRES - CONDO RESIDENTIAL                   26
CMOFF - OFFICE                              16
CNCMP - CONDO COMPLEX                        9
RESSS - SIDE BY SIDE                         8
RESSU - RESIDENTIAL SECONDARY UNIT           4
RESMA - MULTIPLE ATTACHED UNITS              4
RESTR - TRIPLEX                              4
RESGC - RESIDENTIAL GROUP CARE               4
CMCMU - COMMERCIAL MULTIPLE USE              3
RESMU - RESIDENTIAL MULTIPLE USE             3
VRES1 - VACANT RESIDENTIAL 1                 3
CMSTP - STRIP MALL                           3
CMRRE - RESTAURANT                           3
RESAM - APARTMENTS MULTIPLE USE              2
CMMRH - COMMERCIAL ROW HOUSE              

In [62]:
# Does everything listed as a 'Condo Apartment' have a unit number?

wolseley[wolseley['property_use_code'] == 'CNAPT - CONDO APARTMENT'].unit_number.isna().value_counts()

unit_number
False    118
Name: count, dtype: int64

In [63]:
# What are the 'Condo Complex' and 'Condo Residential' categories? 

wolseley[wolseley['property_use_code'] == 'CNCMP - CONDO COMPLEX']

Unnamed: 0,roll_number,street_number,unit_number,street_name,street_type,full_address,property_use_code,assessed_land_area,zoning,total_assessed_value,property_class_1,geometry,centroid_lat,centroid_lon
1227,12081813100,81.0,,LENORE,STREET,81 LENORE STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
2319,12090045000,70.0,,MARYLAND,STREET,70 MARYLAND STREET,CNCMP - CONDO COMPLEX,,RMFM - RES - MULTI-FAMILY,,,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386
2527,12097530900,272.0,,HOME,STREET,272 HOME STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.16879391698427 49.88495781...,49.884951,-97.168581
2530,12097627600,828.0,,PRESTON,AVENUE,828 PRESTON AVENUE,CNCMP - CONDO COMPLEX,,,,,MULTIPOLYGON (((-97.16815998526683 49.88397379...,49.883809,-97.167931
2567,12097644000,246.0,,HOME,STREET,246 HOME STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.16885735853232 49.88410122...,49.884208,-97.168636
2608,12097696700,52.0,,FAWCETT,AVENUE,52 FAWCETT AVENUE,CNCMP - CONDO COMPLEX,,,,,MULTIPOLYGON (((-97.16318119679728 49.88423481...,49.884095,-97.163083
2623,12097705900,504.0,,DOMINION,STREET,504 DOMINION STREET,CNCMP - CONDO COMPLEX,,,,,MULTIPOLYGON (((-97.17899854702819 49.88181744...,49.881743,-97.178793
2632,12097775000,28.0,,WOODROW,PLACE,28 WOODROW PLACE,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.16196857341524 49.87948783...,49.879221,-97.161889
2658,12097805060,54.0,,MARYLAND,STREET,54 MARYLAND STREET,CNCMP - CONDO COMPLEX,,RMFM - RES - MULTI-FAMILY,,,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407


In [64]:
wolseley[wolseley['property_use_code'] == 'CNRES - CONDO RESIDENTIAL']

Unnamed: 0,roll_number,street_number,unit_number,street_name,street_type,full_address,property_use_code,assessed_land_area,zoning,total_assessed_value,property_class_1,geometry,centroid_lat,centroid_lon
1228,12081813200,81.0,1.0,LENORE,STREET,1-81 LENORE STREET,CNRES - CONDO RESIDENTIAL,,,265000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
1229,12081813300,83.0,3.0,LENORE,STREET,3-83 LENORE STREET,CNRES - CONDO RESIDENTIAL,,,282000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
1230,12081813400,81.0,2.0,LENORE,STREET,2-81 LENORE STREET,CNRES - CONDO RESIDENTIAL,,,291000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
1231,12081813500,83.0,4.0,LENORE,STREET,4-83 LENORE STREET,CNRES - CONDO RESIDENTIAL,,,287000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
2320,12090045100,70.0,101.0,MARYLAND,STREET,101-70 MARYLAND STREET,CNRES - CONDO RESIDENTIAL,,,198000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386
2321,12090045200,70.0,102.0,MARYLAND,STREET,102-70 MARYLAND STREET,CNRES - CONDO RESIDENTIAL,,,194000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386
2322,12090045300,70.0,103.0,MARYLAND,STREET,103-70 MARYLAND STREET,CNRES - CONDO RESIDENTIAL,,,197000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386
2323,12090045400,70.0,104.0,MARYLAND,STREET,104-70 MARYLAND STREET,CNRES - CONDO RESIDENTIAL,,,194000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386
2324,12090045500,70.0,105.0,MARYLAND,STREET,105-70 MARYLAND STREET,CNRES - CONDO RESIDENTIAL,,,196000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386
2325,12090045600,70.0,106.0,MARYLAND,STREET,106-70 MARYLAND STREET,CNRES - CONDO RESIDENTIAL,,,197000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386


Here's what we've got:

- "Condo complex" refers to the larger condo building itself. It has no assessed value of its own and represents any number of units.
- "Condo residential" and "condo apartment" refer to individual units, which do have values, but would count as multiple dwellings on a street, for example.

The way I am collecting garden data is by address number. There are a few cases where this _does_ refer to a multi-unit building. For example, across the street from me is a seniors facility where the ladies love gardening and have planted on the boulevard on all three sides of the building. There are several units, and I can't guess who is responsible for each garden (in some cases I know, but only because they are my neighbours), so the gardens are all attributed to the one building.

In this situation, I want to have one row for that condo building that I can attribute the garden to, but still average the property values of all the units. I also need to decide whether I am counting the number of _buildings_ on a street for the relative garden concentration, or the number of _dwellings_ (which I would define as each individual unit). 

So I need to make an editorial decision.

At this point in time, my idea is this:

Wolseley is a mixed density neighbourhood. There are commercial properties in residential buildings, condos, apartments, duplexes, triplexes, multi-use houses, etc. It would be futile to try and untangle ALL of these use classes and try to attribute a garden to one individual in an apartment, a condo, an assisted living facility, etc. The most important data in this story is the number of gardens on the block. Therefore, I am going to treat each BUILDING as a distinct object, regardless of how many different units it contains. It is, after all, the building — **the LOT** — which has access to a boulevard. 

This means the aggregation I need is to assign each multi-unit building the average property value of its units, so that data is still captured.

From what I can see in the geojson of this data, the issue is specific to condos. Condo units are owned and sold individually, so each can have its own property value, whereas an apartment building has a single value for the whole plot the building is on. (Note to self: this will skew the property values on blocks with apartments, but that's a problem for later.) That means I only need to do this aggregation for condos.
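A minimal sketch of that aggregation, run on a made-up toy frame rather than the real roll data. The names `toy`, `complex_number` and `building_number` are hypothetical, and the lookup for units listed under a different street number than their complex is an assumption about how such cases could be handled, not what this notebook does.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the assessment data (values are illustrative, not from the roll)
toy = pd.DataFrame({
    "street_name": ["LENORE", "LENORE", "LENORE", "MARYLAND", "MARYLAND"],
    "street_number": [81.0, 81.0, 83.0, 70.0, 70.0],
    "property_use_code": [
        "CNCMP - CONDO COMPLEX",
        "CNRES - CONDO RESIDENTIAL",
        "CNRES - CONDO RESIDENTIAL",
        "CNCMP - CONDO COMPLEX",
        "CNAPT - CONDO APARTMENT",
    ],
    "total_assessed_value": [np.nan, 200000.0, 300000.0, np.nan, 150000.0],
})

COMPLEX = "CNCMP - CONDO COMPLEX"
UNITS = ["CNRES - CONDO RESIDENTIAL", "CNAPT - CONDO APARTMENT"]

# Hypothetical lookup: unit addresses whose complex is listed under another number
complex_number = {("LENORE", 83.0): 81.0}

def building_number(row):
    """Return the street number of the complex this row belongs to."""
    key = (row["street_name"], row["street_number"])
    return complex_number.get(key, row["street_number"])

toy["building_number"] = toy.apply(building_number, axis=1)

# Mean assessed value of the units in each building
is_unit = toy["property_use_code"].isin(UNITS)
unit_means = (
    toy[is_unit]
    .groupby(["street_name", "building_number"])["total_assessed_value"]
    .mean()
)

# Write each mean onto the matching complex row
is_complex = toy["property_use_code"] == COMPLEX
keys = pd.MultiIndex.from_frame(toy.loc[is_complex, ["street_name", "building_number"]])
toy.loc[is_complex, "total_assessed_value"] = unit_means.reindex(keys).to_numpy()

# Keep one row per building by dropping the unit rows
buildings = toy[~is_unit].drop(columns="building_number")
```

Units that have no complex row of their own would need to be left out of the final drop, otherwise their values disappear with them.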

In [65]:
# Steps:
# Find all the condo complexes and the associated apartments
# For each of the complexes, find the mean property value for associated apartments
# Save that average value to the property value column for the complex
# Drop apartment rows

In [66]:
wolseley[wolseley['property_use_code'].isin(['CNRES - CONDO RESIDENTIAL','CNAPT - CONDO APARTMENT','CNCMP - CONDO COMPLEX'])]

Unnamed: 0,roll_number,street_number,unit_number,street_name,street_type,full_address,property_use_code,assessed_land_area,zoning,total_assessed_value,property_class_1,geometry,centroid_lat,centroid_lon
1227,12081813100,81.0,,LENORE,STREET,81 LENORE STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
1228,12081813200,81.0,1,LENORE,STREET,1-81 LENORE STREET,CNRES - CONDO RESIDENTIAL,,,265000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
1229,12081813300,83.0,3,LENORE,STREET,3-83 LENORE STREET,CNRES - CONDO RESIDENTIAL,,,282000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
1230,12081813400,81.0,2,LENORE,STREET,2-81 LENORE STREET,CNRES - CONDO RESIDENTIAL,,,291000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
1231,12081813500,83.0,4,LENORE,STREET,4-83 LENORE STREET,CNRES - CONDO RESIDENTIAL,,,287000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2663,12097805085,54.0,301,MARYLAND,STREET,301-54 MARYLAND STREET,CNAPT - CONDO APARTMENT,,,222000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2664,12097805090,54.0,302,MARYLAND,STREET,302-54 MARYLAND STREET,CNAPT - CONDO APARTMENT,,,167000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2665,12097805095,54.0,303,MARYLAND,STREET,303-54 MARYLAND STREET,CNAPT - CONDO APARTMENT,,,192000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407
2666,12097805100,54.0,401,MARYLAND,STREET,401-54 MARYLAND STREET,CNAPT - CONDO APARTMENT,,,237000.0,RESIDENTIAL 3,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407


In [67]:
condos = wolseley[wolseley['property_use_code'].isin(['CNRES - CONDO RESIDENTIAL','CNAPT - CONDO APARTMENT','CNCMP - CONDO COMPLEX'])]

In [68]:
condos.groupby(['street_name', 'street_number']).total_assessed_value.mean()

street_name  street_number
DOMINION     504.0            214500.000000
             506.0            219000.000000
FAWCETT      52.0             124071.428571
HOME         246.0            107428.571429
             272.0            182466.666667
LENORE       81.0             278000.000000
             83.0             284500.000000
MARYLAND     54.0             205125.000000
             70.0             196000.000000
PRESTON      828.0            142527.777778
PURCELL      3.0              238000.000000
             5.0              238000.000000
WOODROW      28.0              65350.000000
Name: total_assessed_value, dtype: float64

In [69]:
condos[condos['unit_number'].isna()]

Unnamed: 0,roll_number,street_number,unit_number,street_name,street_type,full_address,property_use_code,assessed_land_area,zoning,total_assessed_value,property_class_1,geometry,centroid_lat,centroid_lon
1227,12081813100,81.0,,LENORE,STREET,81 LENORE STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
2319,12090045000,70.0,,MARYLAND,STREET,70 MARYLAND STREET,CNCMP - CONDO COMPLEX,,RMFM - RES - MULTI-FAMILY,,,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386
2527,12097530900,272.0,,HOME,STREET,272 HOME STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.16879391698427 49.88495781...,49.884951,-97.168581
2530,12097627600,828.0,,PRESTON,AVENUE,828 PRESTON AVENUE,CNCMP - CONDO COMPLEX,,,,,MULTIPOLYGON (((-97.16815998526683 49.88397379...,49.883809,-97.167931
2567,12097644000,246.0,,HOME,STREET,246 HOME STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.16885735853232 49.88410122...,49.884208,-97.168636
2608,12097696700,52.0,,FAWCETT,AVENUE,52 FAWCETT AVENUE,CNCMP - CONDO COMPLEX,,,,,MULTIPOLYGON (((-97.16318119679728 49.88423481...,49.884095,-97.163083
2623,12097705900,504.0,,DOMINION,STREET,504 DOMINION STREET,CNCMP - CONDO COMPLEX,,,,,MULTIPOLYGON (((-97.17899854702819 49.88181744...,49.881743,-97.178793
2630,12097751400,3.0,,PURCELL,AVENUE,3 PURCELL AVENUE,CNRES - CONDO RESIDENTIAL,,,238000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.16144826624206 49.88290017...,49.88284,-97.161247
2631,12097751500,5.0,,PURCELL,AVENUE,5 PURCELL AVENUE,CNRES - CONDO RESIDENTIAL,,,238000.0,RESIDENTIAL 2,MULTIPOLYGON (((-97.16144826624206 49.88290017...,49.88284,-97.161247
2632,12097775000,28.0,,WOODROW,PLACE,28 WOODROW PLACE,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,,,MULTIPOLYGON (((-97.16196857341524 49.87948783...,49.879221,-97.161889


In [70]:
condos.groupby(['street_name']).total_assessed_value.mean()

street_name
DOMINION    216750.000000
FAWCETT     124071.428571
HOME        138694.444444
LENORE      281250.000000
MARYLAND    198607.142857
PRESTON     142527.777778
PURCELL     238000.000000
WOODROW      65350.000000
Name: total_assessed_value, dtype: float64

In [71]:
# Couple scenarios at play here: 
# The condos on Dominion and Lenore are one complex but the units have two street addresses.
# Whereas Home and Maryland have two separate complexes
# I feel the best approach here will be manual because I have no clue what else to do!

In [72]:
# street_name  number         total_assessed_value    index
# WOODROW      28.0              65350.000000          2632
# PRESTON      828.0            142527.777778          2530
# FAWCETT      52.0             124071.428571          2608
# HOME         246.0            107428.571429          2567
# HOME         272.0            182466.666667          2527
# MARYLAND     54.0             205125.000000          2658
# MARYLAND     70.0             196000.000000          2319
# DOMINION     504.0            216750.000000          2623
# LENORE       81.0             281250.000000          1227

In [73]:
# Write the averages from the table above onto each condo complex row, using the index labels listed there
wolseley.loc[2632, 'total_assessed_value'] = 65350.000000
wolseley.loc[2530, 'total_assessed_value'] = 142527.777778
wolseley.loc[2608, 'total_assessed_value'] = 124071.428571
wolseley.loc[2567, 'total_assessed_value'] = 107428.571429
wolseley.loc[2527, 'total_assessed_value'] = 182466.666667
wolseley.loc[2658, 'total_assessed_value'] = 205125.000000
wolseley.loc[2319, 'total_assessed_value'] = 196000.000000
wolseley.loc[2623, 'total_assessed_value'] = 216750.000000
wolseley.loc[1227, 'total_assessed_value'] = 281250.000000

In [74]:
wolseley[wolseley['property_use_code']=='CNCMP - CONDO COMPLEX']

Unnamed: 0,roll_number,street_number,unit_number,street_name,street_type,full_address,property_use_code,assessed_land_area,zoning,total_assessed_value,property_class_1,geometry,centroid_lat,centroid_lon
1227,12081813100,81.0,,LENORE,STREET,81 LENORE STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,281250.0,,MULTIPOLYGON (((-97.17170754959602 49.88003182...,49.879933,-97.171493
2319,12090045000,70.0,,MARYLAND,STREET,70 MARYLAND STREET,CNCMP - CONDO COMPLEX,,RMFM - RES - MULTI-FAMILY,196000.0,,MULTIPOLYGON (((-97.1615777722684 49.881076916...,49.88087,-97.161386
2527,12097530900,272.0,,HOME,STREET,272 HOME STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,182466.666667,,MULTIPOLYGON (((-97.16879391698427 49.88495781...,49.884951,-97.168581
2530,12097627600,828.0,,PRESTON,AVENUE,828 PRESTON AVENUE,CNCMP - CONDO COMPLEX,,,142527.777778,,MULTIPOLYGON (((-97.16815998526683 49.88397379...,49.883809,-97.167931
2567,12097644000,246.0,,HOME,STREET,246 HOME STREET,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,107428.571429,,MULTIPOLYGON (((-97.16885735853232 49.88410122...,49.884208,-97.168636
2608,12097696700,52.0,,FAWCETT,AVENUE,52 FAWCETT AVENUE,CNCMP - CONDO COMPLEX,,,124071.428571,,MULTIPOLYGON (((-97.16318119679728 49.88423481...,49.884095,-97.163083
2623,12097705900,504.0,,DOMINION,STREET,504 DOMINION STREET,CNCMP - CONDO COMPLEX,,,216750.0,,MULTIPOLYGON (((-97.17899854702819 49.88181744...,49.881743,-97.178793
2632,12097775000,28.0,,WOODROW,PLACE,28 WOODROW PLACE,CNCMP - CONDO COMPLEX,,R2 - RES - TWO FAMILY,65350.0,,MULTIPOLYGON (((-97.16196857341524 49.87948783...,49.879221,-97.161889
2658,12097805060,54.0,,MARYLAND,STREET,54 MARYLAND STREET,CNCMP - CONDO COMPLEX,,RMFM - RES - MULTI-FAMILY,205125.0,,MULTIPOLYGON (((-97.16161542279819 49.88054499...,49.880481,-97.161407


In [75]:
# Now to drop the condo apartments

condo_apartments = wolseley[wolseley['property_use_code'] == 'CNAPT - CONDO APARTMENT'].index
wolseley = wolseley.drop(condo_apartments, axis=0)

In [76]:
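# Drop the condo residential units too, except on Purcell,
# where the units have no complex row and keep their own assessed values
condo_apartments_2 = wolseley[(wolseley['property_use_code'] == 'CNRES - CONDO RESIDENTIAL') & (wolseley['street_name'] != 'PURCELL')].index
wolseley = wolseley.drop(condo_apartments_2, axis=0)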

In [77]:
wolseley.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2482 entries, 3 to 2670
Data columns (total 14 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   roll_number           2482 non-null   int64  
 1   street_number         2482 non-null   float64
 2   unit_number           6 non-null      object 
 3   street_name           2482 non-null   object 
 4   street_type           2473 non-null   object 
 5   full_address          2482 non-null   object 
 6   property_use_code     2482 non-null   object 
 7   assessed_land_area    2470 non-null   float64
 8   zoning                2476 non-null   object 
 9   total_assessed_value  2482 non-null   float64
 10  property_class_1      2473 non-null   object 
 11  geometry              2482 non-null   object 
 12  centroid_lat          2482 non-null   float64
 13  centroid_lon          2482 non-null   float64
dtypes: float64(5), int64(1), object(8)
memory usage: 290.9+ KB


In [78]:
# I think this is close enough. At least for a first project.

## Merging garden data:

The time has come to read in the data that really matters - the garden data!

### **Garden methodology:**

Over the course of three days, I walked or cycled the length of every street in the Wolseley neighbourhood, using a paper street map as a guide, and recorded the street number and street name of each boulevard garden I saw. 

There are some important methodology notes to keep in mind:

- A boulevard garden is defined as a planted garden on the city-owned portion of the roadway. In most cases, this is the strip of vegetation between the road and the sidewalk. In the case of a street with no sidewalk, this is defined as vegetation that reaches all the way to the curb with no visible grass between the property line and the roadway.
- A boulevard garden is defined as vegetation and other decoration where more than one type of vegetation has been intentionally planted. This is to exclude, most notably, patches of tiger lilies around the base of a tree, which are prolific in the neighbourhood, but may or may not have been planted by the resident. This is, of course, a subjective assessment. There is a chance some gardens were missed.
- Corner properties with gardens on both sides of the street have been recorded as one garden, but the location of the garden(s) is included in the notes.

This data includes a notes field. Throughout the collection process, any uncertainty about a garden, any street number that was difficult to see and any other comment was recorded in the notes. Any questions about a particular point were later cross-referenced with satellite imagery, the latest available street view data or a second check when necessary.



In [106]:
# Read in gardens
gardens = pd.read_csv("wolseley_has_garden.csv")

In [107]:
gardens

Unnamed: 0,street_number,street_name,street_type,full_address,has_garden,notes
0,38,ALLOWAY,AVENUE,38 ALLOWAY AVENUE,True,
1,51,ALLOWAY,AVENUE,51 ALLOWAY AVENUE,True,
2,200,ARLINGTON,STREET,200 ARLINGTON STREET,True,old grace
3,14,ARLINGTON,STREET,14 ARLINGTON STREET,True,
4,16,ARLINGTON,STREET,16 ARLINGTON STREET,True,
5,21,ARLINGTON,STREET,21 ARLINGTON STREET,True,
6,25,ARLINGTON,STREET,25 ARLINGTON STREET,True,
7,29,ARLINGTON,STREET,29 ARLINGTON STREET,True,
8,31,ARLINGTON,STREET,31 ARLINGTON STREET,True,
9,34,ARLINGTON,STREET,34 ARLINGTON STREET,True,overgrown but appears intentional


In [108]:
# Need to make sure all the address columns are aligned for a merge
gardens.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 358 entries, 0 to 357
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   street_number  358 non-null    object
 1   street_name    358 non-null    object
 2   street_type    358 non-null    object
 3   full_address   358 non-null    object
 4   has_garden     358 non-null    bool  
 5   notes          42 non-null     object
dtypes: bool(1), object(5)
memory usage: 14.5+ KB


In [109]:
wolseley.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2482 entries, 3 to 2670
Data columns (total 14 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   roll_number           2482 non-null   int64  
 1   street_number         2482 non-null   object 
 2   unit_number           6 non-null      object 
 3   street_name           2482 non-null   object 
 4   street_type           2473 non-null   object 
 5   full_address          2482 non-null   object 
 6   property_use_code     2482 non-null   object 
 7   assessed_land_area    2470 non-null   float64
 8   zoning                2476 non-null   object 
 9   total_assessed_value  2482 non-null   float64
 10  property_class_1      2473 non-null   object 
 11  geometry              2482 non-null   object 
 12  centroid_lat          2482 non-null   float64
 13  centroid_lon          2482 non-null   float64
dtypes: float64(4), int64(1), object(9)
memory usage: 290.9+ KB


In [110]:
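# Cast Wolseley street_number to object dtype so it lines up with the gardens column
# (this only changes the dtype, the values are not turned into strings; the merge below matches on full_address)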
wolseley = wolseley.astype({'street_number': 'object'})

In [142]:
# I am hoping to merge this sheet with the wolseley sheet, with all rows included and using 'full_address' to match

wolseley_gardens = pd.merge(wolseley, gardens, on=['full_address'], how='outer')

In [143]:
wolseley_gardens.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2484 entries, 0 to 2483
Data columns (total 19 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   roll_number           2482 non-null   float64
 1   street_number_x       2482 non-null   object 
 2   unit_number           6 non-null      object 
 3   street_name_x         2482 non-null   object 
 4   street_type_x         2473 non-null   object 
 5   full_address          2484 non-null   object 
 6   property_use_code     2482 non-null   object 
 7   assessed_land_area    2470 non-null   float64
 8   zoning                2476 non-null   object 
 9   total_assessed_value  2482 non-null   float64
 10  property_class_1      2473 non-null   object 
 11  geometry              2482 non-null   object 
 12  centroid_lat          2482 non-null   float64
 13  centroid_lon          2482 non-null   float64
 14  street_number_y       358 non-null    object 
 15  street_name_y        

In [131]:
# We have a few extra rows (7 in total) and I want to see them!

# wolseley_gardens[wolseley_gardens.roll_number.isna()]

#### Notes on extra rows:

- There is no 146 Ethelbert in the roll data, it's actually 144 Ethelbert (a duplex).
- Same is true for 82 (actually 80, a duplex)
- Same for 169 Canora (actually 167 per the roll)
- 91 Walnut is a typo (it's 81, thought I had fixed that already...)
- 927 Wolseley is also a duplex (should be 925)


I am going to fix these in the original spreadsheet and try again!

The "old grace" rows can stay, as that's an effort to reflect that this building has gardens on three sides.
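One way to surface unmatched rows like these is the `indicator` argument of `pd.merge`, which adds a `_merge` column saying which side each row came from. This is a sketch on made-up frames (`roll` and `walked` are hypothetical stand-ins for `wolseley` and `gardens`), not the check that was actually run here.

```python
import pandas as pd

# Toy frames standing in for the roll data and the garden sheet (addresses are made up)
roll = pd.DataFrame({
    "full_address": ["1 EXAMPLE STREET", "3 EXAMPLE STREET"],
    "total_assessed_value": [250000.0, 300000.0],
})
walked = pd.DataFrame({
    "full_address": ["3 EXAMPLE STREET", "5 EXAMPLE STREET"],
    "has_garden": [True, True],
})

# indicator=True adds a '_merge' column: 'left_only', 'right_only' or 'both'
merged = pd.merge(roll, walked, on="full_address", how="outer", indicator=True)

# Garden rows with no matching roll entry: candidates for typos or duplex numbering
unmatched = merged.loc[merged["_merge"] == "right_only", "full_address"].tolist()
print(unmatched)
```

Filtering on `right_only` lists only the garden addresses that failed to match, which avoids scanning for null roll numbers by hand.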

In [132]:
# YAY! It's finally merged!

### Cleaning merged data for analysis:

In [144]:
# All I really need is: 
# roll_number, street_number, street_name, full_address, has_garden, total_assessed_value, property_use_code, geometry, centroid_lat, centroid_lon

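# .copy() gives an independent frame, so later column assignments don't act on a slice of the merged data
wolseley_gardens = wolseley_gardens[['roll_number','full_address', 'has_garden', 'total_assessed_value', 'street_number_x', 'street_name_x', 'property_use_code', 'geometry', 'centroid_lat', 'centroid_lon']].copy()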

In [145]:
wolseley_gardens

Unnamed: 0,roll_number,full_address,has_garden,total_assessed_value,street_number_x,street_name_x,property_use_code,geometry,centroid_lat,centroid_lon
0,12081780000.0,1 LENORE STREET,,232000.0,1.0,LENORE,RESSD - DETACHED SINGLE DWELLING,MULTIPOLYGON (((-97.17150633999906 49.87741594...,49.87738,-97.1717
1,12082690000.0,1-130 ETHELBERT STREET,,452000.0,130.0,ETHELBERT,RESMC - MULTIFAMILY CONVERSION,MULTIPOLYGON (((-97.1679152061011 49.880671003...,49.880723,-97.167717
2,12082320000.0,1-156 EVANSON STREET,,643000.0,156.0,EVANSON,RESTR - TRIPLEX,MULTIPOLYGON (((-97.17075433751634 49.88213778...,49.8821,-97.170927
3,12082580000.0,1-230 HOME STREET,,504000.0,230.0,HOME,RESMC - MULTIFAMILY CONVERSION,MULTIPOLYGON (((-97.16846992448545 49.88357347...,49.883535,-97.168686
4,12082340000.0,1-273 EVANSON STREET,,486000.0,273.0,EVANSON,RESDU - DUPLEX,MULTIPOLYGON (((-97.17033900169139 49.88433314...,49.884369,-97.170166
5,12097510000.0,1-547 NEWMAN STREET,,232000.0,547.0,NEWMAN,VRES1 - VACANT RESIDENTIAL 1,MULTIPOLYGON (((-97.1830534062627 49.883064951...,49.882989,-97.183207
6,12082180000.0,10 EVANSON STREET,,328000.0,10.0,EVANSON,RESSD - DETACHED SINGLE DWELLING,MULTIPOLYGON (((-97.17109391375934 49.87749338...,49.877571,-97.171258
7,12097750000.0,10 PICARDY PLACE,,741000.0,10.0,PICARDY,RESAP - APARTMENTS,MULTIPOLYGON (((-97.16369307990463 49.88615200...,49.886021,-97.163606
8,12082020000.0,10 RUBY STREET,,351000.0,10.0,RUBY,RESSD - DETACHED SINGLE DWELLING,MULTIPOLYGON (((-97.17368135140494 49.87779150...,49.877731,-97.173489
9,12082240000.0,100 ARLINGTON STREET,,274000.0,100.0,ARLINGTON,RESSD - DETACHED SINGLE DWELLING,MULTIPOLYGON (((-97.17023563025946 49.88033884...,49.880377,-97.170069


In [163]:
# Lovely! Three more steps:
# First, set has_garden null values to false

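# Assign the result back rather than using inplace=True on a column, which may not update the frame
wolseley_gardens['has_garden'] = wolseley_gardens['has_garden'].fillna(False).astype(bool)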

In [159]:
# Next, fix the annoying format for both property values and roll numbers

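# I found this on the internet (shrug)
# Note: this only changes how floats are displayed, and it applies to every float column,
# so centroid_lat and centroid_lon also show up rounded in the output below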

pd.set_option('display.float_format', '{:.0f}'.format)

# To turn it off use: pd.reset_option('display.float_format')
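An alternative that leaves the display settings alone is to store the roll number as a nullable integer, so it prints without a decimal point while the coordinate columns keep their precision. This is a sketch on a made-up frame (`toy` is hypothetical), and it assumes every roll number is a whole number, which is what the data above suggests.

```python
import numpy as np
import pandas as pd

# Toy frame: roll_number became float because unmatched rows introduced NaN
toy = pd.DataFrame({
    "roll_number": [12081776000.0, np.nan],
    "centroid_lat": [49.87738, 49.880723],
})

# "Int64" (capital I) is pandas' nullable integer dtype; NaN becomes <NA>
toy["roll_number"] = toy["roll_number"].astype("Int64")

print(toy.dtypes)
```

The cast changes the stored values rather than just the display, so the integer format also carries through to a CSV written with `to_csv`.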

In [169]:
# Finally, rename the street name and number columns

wolseley_gardens.rename(columns={'street_name_x':'street_name','street_number_x':'street_number'}, inplace=True)

In [170]:
wolseley_gardens

Unnamed: 0,roll_number,full_address,has_garden,total_assessed_value,street_number,street_name,property_use_code,geometry,centroid_lat,centroid_lon
0,12081776000.0,1 LENORE STREET,False,232000.0,1.0,LENORE,RESSD - DETACHED SINGLE DWELLING,MULTIPOLYGON (((-97.17150633999906 49.87741594...,50.0,-97.0
1,12082691000.0,1-130 ETHELBERT STREET,False,452000.0,130.0,ETHELBERT,RESMC - MULTIFAMILY CONVERSION,MULTIPOLYGON (((-97.1679152061011 49.880671003...,50.0,-97.0
2,12082318000.0,1-156 EVANSON STREET,False,643000.0,156.0,EVANSON,RESTR - TRIPLEX,MULTIPOLYGON (((-97.17075433751634 49.88213778...,50.0,-97.0
3,12082580000.0,1-230 HOME STREET,False,504000.0,230.0,HOME,RESMC - MULTIFAMILY CONVERSION,MULTIPOLYGON (((-97.16846992448545 49.88357347...,50.0,-97.0
4,12082345000.0,1-273 EVANSON STREET,False,486000.0,273.0,EVANSON,RESDU - DUPLEX,MULTIPOLYGON (((-97.17033900169139 49.88433314...,50.0,-97.0
5,12097511500.0,1-547 NEWMAN STREET,False,232000.0,547.0,NEWMAN,VRES1 - VACANT RESIDENTIAL 1,MULTIPOLYGON (((-97.1830534062627 49.883064951...,50.0,-97.0
6,12082176000.0,10 EVANSON STREET,False,328000.0,10.0,EVANSON,RESSD - DETACHED SINGLE DWELLING,MULTIPOLYGON (((-97.17109391375934 49.87749338...,50.0,-97.0
7,12097750000.0,10 PICARDY PLACE,False,741000.0,10.0,PICARDY,RESAP - APARTMENTS,MULTIPOLYGON (((-97.16369307990463 49.88615200...,50.0,-97.0
8,12082022000.0,10 RUBY STREET,False,351000.0,10.0,RUBY,RESSD - DETACHED SINGLE DWELLING,MULTIPOLYGON (((-97.17368135140494 49.87779150...,50.0,-97.0
9,12082235000.0,100 ARLINGTON STREET,False,274000.0,100.0,ARLINGTON,RESSD - DETACHED SINGLE DWELLING,MULTIPOLYGON (((-97.17023563025946 49.88033884...,50.0,-97.0


In [171]:
# YES YES YES!! Days of work!! VINDICATED!!
# Let's make this a csv

wolseley_gardens.to_csv("wolseley_gardens_final.csv", index=False)

### For analysis, see second notebook