# Whose names are on our cities?
*June 27, 2022*

StatsCan maintains open databases that pull together many kinds of city infrastructure (schools, health care facilities, arts and cultural buildings, etc.). Because these are open databases, they may not be 100% complete, but they give us a good opportunity to see whose names are stamped all over our cities and their infrastructure.

Let's start by importing pandas.

In [114]:
import pandas as pd

Now we'll import the [schools dataset](https://www.statcan.gc.ca/en/lode/databases/odef).

In [115]:
schools = pd.read_csv("data/data-canada-schools.csv", encoding="latin-1")
schools = schools[["Facility_Name", "Facility_Type", "Street_No", "Street_Name", "City", "Prov_Terr", "Latitude", "Longitude"]]
schools["type"] = "SCHOOL"

schools.head(3)

Unnamed: 0,Facility_Name,Facility_Type,Street_No,Street_Name,City,Prov_Terr,Latitude,Longitude,type
0,Special Centre,Provincial,10044.0,108 street,Edmonton,AB,..,..,SCHOOL
1,Prairiehome Colony School,Public,6302.0,56 street,Taber,AB,49.80021032,-112.1398926,SCHOOL
2,Spring Valley Colony School,Public,,,Cardston,AB,..,..,SCHOOL
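
The preview above shows `..` where a coordinate is missing. As a minimal sketch (not what this notebook does), `read_csv` can be told to treat that placeholder as a missing value so the coordinate columns parse as numbers. The inline sample below is a hypothetical stand-in for the real file, reusing two rows from the preview.

```python
import io

import pandas as pd

# Hypothetical two-row sample that mimics the layout of the schools file.
sample_csv = io.StringIO(
    "Facility_Name,City,Latitude,Longitude\n"
    "Special Centre,Edmonton,..,..\n"
    "Prairiehome Colony School,Taber,49.80021032,-112.1398926\n"
)

# na_values makes read_csv treat ".." as missing, so Latitude and
# Longitude are parsed as floats instead of strings.
sample = pd.read_csv(sample_csv, na_values=[".."])

print(sample.dtypes)
print(sample["Latitude"].isna().sum())
```

The same `na_values` argument could be passed to each of the `read_csv` calls in this notebook, assuming `..` is the only placeholder the files use.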


Then [the dataset for arts and cultural spaces](https://www.statcan.gc.ca/en/lode/databases/odcaf).

In [116]:
arts = pd.read_csv("data/data-canada-arts.csv", encoding="latin-1")
arts["facility_type"] = arts["ODCAF_Facility_Type"]
arts["type"] = "ARTS"
arts = arts[["Facility_Name", "facility_type", "type", "Street_No", "Street_Name", "City", "Prov_Terr", "Latitude", "Longitude"]]

arts.head()

Unnamed: 0,Facility_Name,facility_type,type,Street_No,Street_Name,City,Prov_Terr,Latitude,Longitude
0,#Hashtag Gallery,gallery,ARTS,801,dundas st w,toronto,on,43.65169472,-79.40803272
1,'Ksan Historical Village & Museum,museum,ARTS,1500,62 hwy,hazelton,bc,55.2645508,-127.6428124
2,'School Days' Museum,museum,ARTS,427,queen st,fredericton,nb,45.963283,-66.6419017
3,10 Austin Street,heritage or historic site,ARTS,10,austin st,moncton,nb,46.09247776,-64.78022946
4,10 Gates Dancing Inc.,miscellaneous,ARTS,..,..,ottawa,on,45.40856224,-75.71536766


Now [health care facilities](https://www.statcan.gc.ca/en/lode/databases/odhf).

In [117]:
health = pd.read_csv("data/data-canada-health.csv", encoding="latin-1")
health["facility_type"] = health["odhf_facility_type"]
health = health[["facility_name", "facility_type", "street_no", "street_name", "city", "province", "latitude", "longitude"]]
health["type"] = "HEALTH"

health.head(3)

Unnamed: 0,facility_name,facility_type,street_no,street_name,city,province,latitude,longitude,type
0,Advanced Facial & Nasal Surgery Centre,Ambulatory health care services,,,edmonton,ab,,,HEALTH
1,Alberta Children's Hospital,Hospitals,28.0,oki dr nw,calgary,ab,51.074582,-114.148426,HEALTH
2,Devon General Hospital,Hospitals,101.0,erie st s,devon,ab,53.351493,-113.730785,HEALTH
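
Note that the health file uses lowercase column names (`facility_name`, `city`, `province`), while the other files use `Facility_Name`, `City` and `Prov_Terr`. `pd.concat` aligns frames on exact column labels, so stacking them as-is keeps the two spellings as separate columns. Here is a small sketch of renaming before stacking. The `column_map` is my assumption about which columns correspond, and the two one-row frames are stand-ins built from the previews above.

```python
import pandas as pd

# One-row stand-ins for two of the frames, using values from the previews.
arts_demo = pd.DataFrame(
    {"Facility_Name": ["#Hashtag Gallery"], "City": ["toronto"], "Prov_Terr": ["on"]}
)
health_demo = pd.DataFrame(
    {"facility_name": ["Devon General Hospital"], "city": ["devon"], "province": ["ab"]}
)

# Assumed correspondence between the health columns and the others.
column_map = {
    "facility_name": "Facility_Name",
    "city": "City",
    "province": "Prov_Terr",
}

# Without renaming, the result has the union of both sets of labels.
mismatched = pd.concat([arts_demo, health_demo], ignore_index=True)

# With renaming, the health row lands in the shared columns.
aligned = pd.concat(
    [arts_demo, health_demo.rename(columns=column_map)], ignore_index=True
)

print(mismatched.shape)
print(aligned.shape)
```

With the labels aligned, a later search on `Facility_Name` would also cover the health facilities.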


Now [sports and recreational spaces](https://www.statcan.gc.ca/en/lode/databases/odrsf), which I believe includes parks.

In [118]:
sports = pd.read_csv("data/data-canada-sports.csv", encoding="latin-1")
sports["facility_type"] = sports["ODRSF_facility_type"]
sports["type"] = "SPORTS"
sports = sports[["Facility_Name", "facility_type", "type", "Street_No", "Street_Name", "City", "Prov_Terr", "Latitude", "Longitude"]]


sports.head()


Unnamed: 0,Facility_Name,facility_type,type,Street_No,Street_Name,City,Prov_Terr,Latitude,Longitude
0,11Bhe,playground,SPORTS,..,..,brantford,on,43.16758482,-80.24294547
1,221 Queen,pool,SPORTS,221,queen,kitchener,on,43.4471723,-80.49142712
2,3 Bondaries Trails,trail,SPORTS,32,des-pins,rivière bleue,qc,47.43583,-69.042971
3,A.C.R. Trail Park,park,SPORTS,5120,36,..,ab,52.256321,-113.816851
4,A.C.T. Aquatics And Recreation Centre Pool,pool,SPORTS,..,..,..,ab,53.5561459,-113.3861247


Now we'll bring all of these datasets together into one, and convert every value to uppercase so that differences in case don't cause us to miss matches. We'll also strip periods and any leading or trailing spaces.

In [119]:
data = pd.concat([arts, schools, sports, health])

for col in data.columns:
    data[col] = data[col].astype(str).str.upper().str.replace(r"\.", "", regex=True).str.strip()

data.shape

(215950, 17)
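
One side effect of looping over every column is that the coordinates lose their decimal points and missing values become the string `NAN`, which is visible in the search results further down. A hedged alternative, not what this notebook does, is to normalize only the text columns and skip `astype(str)`. The `text_cols` list and the three-row frame are illustrative assumptions; the names and coordinates are taken from the previews above.

```python
import pandas as pd

demo = pd.DataFrame(
    {
        "Facility_Name": ["  A.C.R. Trail Park ", "10 Gates Dancing Inc.", None],
        "Latitude": [52.256321, 45.40856224, 43.65169472],
    }
)

# Assumption: only these columns hold free text that needs normalizing.
text_cols = ["Facility_Name"]

for col in text_cols:
    demo[col] = (
        demo[col]
        .str.upper()
        .str.replace(".", "", regex=False)  # literal period, no regex needed
        .str.strip()
    )

print(demo)
```

Because the numeric columns are left alone, latitude and longitude stay usable for mapping, and a missing facility name stays missing rather than turning into text that a search could match.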

Now we define a list of names we want to search for. These include some well-known architects of residential schools, as well as some lesser-known names that turned up after a brief bit of research (the last eight below come from The Canadian Encyclopedia's article on Black enslavement). Here are some of the lesser-known names found:
* [Frank Oliver](https://en.wikipedia.org/wiki/Frank_Oliver_(politician))
* [Vital-Justin Grandin](https://en.wikipedia.org/wiki/Vital-Justin_Grandin)
* [Nicholas Flood Davin](https://en.wikipedia.org/wiki/Nicholas_Flood_Davin)
* [Charles Bagot](https://en.wikipedia.org/wiki/Charles_Bagot)
* [James Girty](https://www.thecanadianencyclopedia.ca/en/article/black-enslavement)
* [James Hayt](https://www.thecanadianencyclopedia.ca/en/article/black-enslavement)
* [Joshua Mauger](https://www.thecanadianencyclopedia.ca/en/article/black-enslavement)
* [Walter Patterson](https://www.thecanadianencyclopedia.ca/en/article/black-enslavement)
* [William Jarvis](https://www.thecanadianencyclopedia.ca/en/article/black-enslavement)
* [Joseph Brant](https://www.thecanadianencyclopedia.ca/en/article/black-enslavement)
* [John McDonnell](https://www.thecanadianencyclopedia.ca/en/article/black-enslavement)
* [Peter Van Alstine](https://www.thecanadianencyclopedia.ca/en/article/black-enslavement)


In [120]:
names = [
    "DUNDAS",
    "RUSSELL",
    "MACDONALD",
    "RYERSON",
    "CORNWALLIS",
    "MCGILL",
    "EGERTON",
    "BOWELL",
    "REED",
    "BAGOT",
    "DAVIN",
    "LANGEVIN",
    "GRANDIN",
    "OLIVER",
    "GIRTY",
    "HAYT",
    "MAUGER",
    "JARVIS",
    "BRANT",
    "VAN ALSTINE",
    "MCDONNELL",
    "PATTERSON"
    ]

Now we'll construct a regex search string from this list of names, and filter for facility names that include these names.

In [122]:
regex_string = r"\b|\b".join(names)
regex_string = r"\b" + regex_string + r"\b"

subset = data[data["Facility_Name"].str.contains(regex_string, regex=True)]

subset.head()

Unnamed: 0,Facility_Name,facility_type,type,Street_No,Street_Name,City,Prov_Terr,Latitude,Longitude,Facility_Type,facility_name,street_no,street_name,city,province,latitude,longitude
253,ANNAPOLIS VALLEY MACDONALD MUSEUM,MUSEUM,ARTS,21,SCHOOL ST,MIDDLETON,NS,44943776.0,-650711594.0,NAN,NAN,NAN,NAN,NAN,NAN,NAN,NAN
1374,BIBLIOTHÈQUE MUNICIPALE DE SAINTE-HÉLÈNE-DE-BAGOT,LIBRARY OR ARCHIVES,ARTS,384,6E AVENUE,LAVAL,QC,457301698.0,-727362407.0,NAN,NAN,NAN,NAN,NAN,NAN,NAN,NAN
1599,BRANT COUNTY,LIBRARY OR ARCHIVES,ARTS,12,WILLIAM STREET,PARIS,ON,,,NAN,NAN,NAN,NAN,NAN,NAN,NAN,NAN
1600,BRANT MUSEUM AND ARCHIVES/MARTLEVILLE HOUSE MU...,MUSEUM,ARTS,57,CHARLOTTE ST,BRANTFORD,ON,431418448.0,-8026080928.0,NAN,NAN,NAN,NAN,NAN,NAN,NAN,NAN
1678,"BURK'S FALLS, ARMOUR & RYERSON UNION",LIBRARY OR ARCHIVES,ARTS,39,COPELAND STREET,BURKS FALLS,ON,456189589.0,-794066687.0,NAN,NAN,NAN,NAN,NAN,NAN,NAN,NAN
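
To show what the `\b` word boundaries do, here is a small standalone sketch using Python's `re` module. It builds an equivalent pattern with a single non-capturing group and adds `re.escape` as a precaution, which is my addition and not part of the cell above. `names_demo` is a shortened list, and the "BRANTFORD PUBLIC LIBRARY" and "PETER VAN ALSTINE PARK" strings are hypothetical.

```python
import re

# Shortened, illustrative list of names.
names_demo = ["BRANT", "VAN ALSTINE", "BAGOT"]

# \b(?:A|B|C)\b is equivalent to \bA\b|\bB\b|\bC\b.
# re.escape guards against names that contain regex metacharacters.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(name) for name in names_demo) + r")\b"
)

candidates = [
    "BRANT COUNTY",
    "BRANTFORD PUBLIC LIBRARY",  # hypothetical name
    "PETER VAN ALSTINE PARK",  # hypothetical name
    "BIBLIOTHÈQUE MUNICIPALE DE SAINTE-HÉLÈNE-DE-BAGOT",
]

for candidate in candidates:
    print(candidate, "->", bool(pattern.search(candidate)))
```

"BRANT COUNTY" matches while "BRANTFORD" does not, because `\b` requires the name to end at a word boundary. A hyphen counts as a boundary, so the place name ending in "-BAGOT" still matches.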


It's important to note here that not all of these hits are valid. Some may be named for other MacDonalds, for instance, and others for places such as Brant County. This really just gives us a first look, and doesn't give us anything conclusive!
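
One way to make that manual checking easier is to record which name triggered each hit, so the results can be reviewed one name at a time. The sketch below does this with the standard library on four facility names copied from the output above. The capturing group and the `matched` and `counts` names are my own illustration, not part of the notebook.

```python
import re
from collections import Counter

names_demo = ["MACDONALD", "BAGOT", "BRANT", "RYERSON"]

# A capturing group lets us read back which name matched.
pattern = re.compile(r"\b(" + "|".join(map(re.escape, names_demo)) + r")\b")

hits = [
    "ANNAPOLIS VALLEY MACDONALD MUSEUM",
    "BIBLIOTHÈQUE MUNICIPALE DE SAINTE-HÉLÈNE-DE-BAGOT",
    "BRANT COUNTY",
    "BURK'S FALLS, ARMOUR & RYERSON UNION",
]

# Map each facility to the first name found in it.
matched = {facility: pattern.search(facility).group(1) for facility in hits}
counts = Counter(matched.values())

for facility, name in matched.items():
    print(f"{name}: {facility}")
print(counts)
```

Grouping the hits this way would make it quicker to spot names that mostly match places or unrelated people.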

\-30\-