# BC COVID-19 Data by Health Authority

1. [Feature Server JSON](https://services1.arcgis.com/xeMpV7tU1t4KD3Ei/arcgis/rest/services/COVID19_Cases_by_BC_Health_Authority/FeatureServer/0?f=pjson) (object IDs `1,2,3,4,5` are queried below)

2. [Case Details](http://www.bccdc.ca/health-info/diseases-conditions/covid-19/data)

---

In [3]:
import pandas as pd
import geopandas as gpd
import requests
from shapely.geometry import shape

In [3]:
r = requests.get("https://services1.arcgis.com/xeMpV7tU1t4KD3Ei/arcgis/rest/services/COVID19_Cases_by_BC_Health_Authority/FeatureServer/0/query?where=&objectIds=1%2C2%2C3%2C4%2C5&time=&geometry=&geometryType=esriGeometryPolygon&inSR=&spatialRel=esriSpatialRelIntersects&resultType=none&distance=0.0&units=esriSRUnit_Meter&returnGeodetic=false&outFields=&returnGeometry=true&returnCentroid=false&featureEncoding=esriDefault&multipatchOption=xyFootprint&maxAllowableOffset=&geometryPrecision=&outSR=&datumTransformation=&applyVCSProjection=false&returnIdsOnly=false&returnUniqueIdsOnly=false&returnCountOnly=false&returnExtentOnly=false&returnQueryGeometry=true&returnDistinctValues=false&cacheHint=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&having=&resultOffset=&resultRecordCount=&returnZ=false&returnM=false&returnExceededLimitFeatures=true&quantizationParameters=&sqlFormat=none&f=pgeojson&token=")
r.raise_for_status()

data = r.json()
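
The query string above is hard to read because every optional parameter is spelled out, most of them empty. As a rough sketch, the same kind of request can be assembled from a small dictionary. The names `BASE_URL`, `params` and `query_url` are illustrative, and the subset of parameters kept here is an assumption: it presumes the service applies its defaults for everything that is left out, which has not been verified against the live endpoint.

```python
from urllib.parse import urlencode

BASE_URL = (
    "https://services1.arcgis.com/xeMpV7tU1t4KD3Ei/arcgis/rest/services/"
    "COVID19_Cases_by_BC_Health_Authority/FeatureServer/0/query"
)

# A subset of the parameters from the request above.
# Assumption: the service falls back to its defaults for the omitted ones.
params = {
    "objectIds": "1,2,3,4,5",
    "geometryType": "esriGeometryPolygon",
    "spatialRel": "esriSpatialRelIntersects",
    "returnGeometry": "true",
    "f": "pgeojson",  # same output format as the original request
}

# urlencode percent-encodes the commas, matching the "%2C" in the long URL
query_url = BASE_URL + "?" + urlencode(params)
print(query_url)
```

Passing the dictionary directly, as in `requests.get(BASE_URL, params=params)`, should produce an equivalent query string without building it by hand.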

In [4]:
gdf = gpd.GeoDataFrame.from_features(data["features"])
print(gdf.head())

                                            geometry            HA_Name
0  POLYGON ((-120.45314 52.99326, -120.42102 52.9...           Interior
1  MULTIPOLYGON (((-122.93676 49.31128, -122.9365...             Fraser
2  MULTIPOLYGON (((-123.10903 49.28623, -123.0231...  Vancouver Coastal
3  MULTIPOLYGON (((-126.66725 51.19230, -126.6705...   Vancouver Island
4  MULTIPOLYGON (((-129.76908 55.45115, -129.7657...           Northern


In [11]:
with open('../data/bc-health-authorities.geojson', 'w') as f:
    f.write(gdf.to_json())

---

2. **Case Details**

In [9]:
pd.read_csv('../data/BCCDC_COVID19_Dashboard_Case_Details.csv')

Unnamed: 0,Reported_Date,HA,Sex,Age_Group,Classification_Reported
0,2020-01-26,Out of Canada,M,40-49,Lab-diagnosed
1,2020-02-02,Vancouver Coastal,F,50-59,Lab-diagnosed
2,2020-02-05,Out of Canada,F,20-29,Lab-diagnosed
3,2020-02-05,Out of Canada,M,30-39,Lab-diagnosed
4,2020-02-11,Interior,F,30-39,Lab-diagnosed
...,...,...,...,...,...
3295,2020-07-19,Vancouver Island,M,80-89,Lab-diagnosed
3296,2020-07-19,Vancouver Island,F,60-69,Lab-diagnosed
3297,2020-07-19,Vancouver Island,F,50-59,Lab-diagnosed
3298,2020-07-19,Out of Canada,F,30-39,Lab-diagnosed
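
As a quick sanity check on the case file, the rows can be tallied per health authority. The sketch below uses only the standard library and inlines the first five rows shown above so that it runs without the CSV file. The names `sample`, `rows` and `cases_by_ha` are illustrative.

```python
import csv
import io
from collections import Counter

# The first five rows of the output above, inlined for a self-contained example.
sample = """Reported_Date,HA,Sex,Age_Group,Classification_Reported
2020-01-26,Out of Canada,M,40-49,Lab-diagnosed
2020-02-02,Vancouver Coastal,F,50-59,Lab-diagnosed
2020-02-05,Out of Canada,F,20-29,Lab-diagnosed
2020-02-05,Out of Canada,M,30-39,Lab-diagnosed
2020-02-11,Interior,F,30-39,Lab-diagnosed
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# count how many cases were reported under each health authority
cases_by_ha = Counter(row["HA"] for row in rows)

for ha, n in cases_by_ha.most_common():
    print(f"{ha}: {n}")
```

On the full DataFrame, `df['HA'].value_counts()` gives the same kind of tally.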


---
# BC COVID Data for LTCs

1. [Long-term care facilities](https://www.seniorsadvocatebc.ca/quickfacts/search/%20)
2. [Outbreaks](http://www.phsa.ca/current-outbreaks)

---

1. **Long-term care facilities**

In [4]:
from bs4 import BeautifulSoup

In [18]:
# user-defined exceptions used to skip <div> containers that are not care homes
class Error(Exception):
    """Base class for other exceptions"""
    pass

class InvalidHome(Error):
    """Raised when return is not a home name or there is a duplicate"""
    pass

class InvalidAddress(Error):
    """Raised when return is not a long-term care home address"""
    pass

In [253]:
address = []
home = []
links = []

# anchor texts that belong to site navigation rather than a care home
NAV_LINKS = ['Home', '\n\t\t\t\t\t\t\t«\n\t\t\t\t\t\t',
             'Data Sources', 'Visit us on Facebook',
             'Email the OSA', '']

for i in range(1, 31):
    # named resp rather than re so the regex module is not shadowed
    resp = requests.get('https://www.seniorsadvocatebc.ca/quickfacts/search/%20/' + str(i))
    s = BeautifulSoup(resp.text, 'html5lib')

    containers = s.find_all("div")

    for each in containers:
        try:
            # find() returns None when a tag is missing, so these lines
            # raise AttributeError for containers without a <p> or <a>
            addr = each.find('p').getText()
            name = each.find('a').getText()
            url = each.find('a').get('href')

            # home address
            if 'Address:' not in addr:
                raise InvalidAddress

            # home name: skip navigation links and duplicates
            if name in NAV_LINKS or name in home:
                raise InvalidHome

            # home url: skip duplicates
            if url in links:
                raise InvalidHome

            # append only after every check passes so the three lists stay aligned
            address.append(addr)
            home.append(name)
            links.append(url)

        except (AttributeError, InvalidHome, InvalidAddress):
            pass

In [24]:
ltc = pd.DataFrame({'Home':home, 'Link':links, 'Address':address})

In [25]:
# clean and separate address info
ltc.Address = ltc.Address.str.replace('\n\t\t\t\t',' ')
ltc['city/postal'] = ltc.Address.str.extract(r'City/postal:(.*) Phone:')
ltc['phone'] = ltc.Address.str.extract(r'Phone:(.*)\t\t\t')
ltc.Address = ltc.Address.str.extract(r'Address:(.*) City/postal:')
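
The three `str.extract` calls above depend on the exact whitespace left after the `replace`. As an alternative sketch, a single pattern with named groups can pull out all three fields and trim the surrounding whitespace in one pass. The `raw` string is hypothetical: it is built from the first row of the table below, and the real whitespace layout on the site may differ.

```python
import re

# Hypothetical raw <p> text; the actual tabs and newlines are an assumption.
raw = ("Address: 1325 Summit Avenue\n\t\t\t\t"
       "City/postal: Prince Rupert V8J4C1\n\t\t\t\t"
       "Phone: (250) 622-6400\t\t\t")

# \s* on either side of each group absorbs spaces, tabs and newlines,
# so no separate clean-up step is needed before matching
pattern = re.compile(
    r"Address:\s*(?P<address>.*?)\s*"
    r"City/postal:\s*(?P<city_postal>.*?)\s*"
    r"Phone:\s*(?P<phone>.*?)\s*$",
    re.DOTALL,
)

match = pattern.search(raw)
fields = match.groupdict() if match else {}
print(fields)
```

`Series.str.extract` also accepts named groups and uses the group names as column names, so a pattern like this could be applied to the whole `Address` column at once.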

In [26]:
ltc

Unnamed: 0,Home,Link,Address,city/postal,phone
0,Acropolis Manor,https://www.seniorsadvocatebc.ca/quickfacts/lo...,1325 Summit Avenue,Prince Rupert V8J4C1,(250) 622-6400
1,Aberdeen Hospital,https://www.seniorsadvocatebc.ca/quickfacts/lo...,1450 Hillside Ave.,Victoria V8T2B7,(250) 370-5648
2,Adanac Park Lodge,https://www.seniorsadvocatebc.ca/quickfacts/lo...,851 Boundary Road,Vancouver V5K4T2,(604) 299-7567
3,Acacia Ty Mawr,https://www.seniorsadvocatebc.ca/quickfacts/lo...,2655 E Shawnigan Lake Rd,Shawnigan Lake V0R2W0,(250) 743-2124
4,Ayre Manor,https://www.seniorsadvocatebc.ca/quickfacts/lo...,6764 Ayre Rd,Sooke V9Z1K1,(250) 642-1750
...,...,...,...,...,...
292,Yucalta Lodge,https://www.seniorsadvocatebc.ca/quickfacts/lo...,555 - 2nd Avenue,Campbell River V9W3V1,(250) 850-2900
293,Youville Residence,https://www.seniorsadvocatebc.ca/quickfacts/lo...,4950 Heather Street,Vancouver V5Z3L9,(604) 261-9371
294,Yaletown House,https://www.seniorsadvocatebc.ca/quickfacts/lo...,1099 Cambie Street,Vancouver V6B5A8,(604) 689-0022
295,Wrinch Memorial Hospital,https://www.seniorsadvocatebc.ca/quickfacts/lo...,2510 West Hwy 62,Hazelton V0J1Y0,(250) 842-5211


In [27]:
ltc.to_csv('../data/bc_ltc.csv')

2. **Outbreaks**

---

***Northern***

In [247]:
# named resp rather than re so the regex module imported later is not shadowed
resp = requests.get('https://www.northernhealth.ca/health-topics/current-outbreaks')
s = BeautifulSoup(resp.text, 'html5lib')
table = s.find_all('td')

In [248]:
d = ['city', 'facility', 'outbreaktype', 'datedeclared']
outbreak = {}
for each,i in zip(table,d):
    outbreak[i] = each.text.strip()

In [250]:
northern = pd.DataFrame(outbreak, index = [0])
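
Zipping the cells against four column names keeps only the first table row, which is fine while a single outbreak is listed. If the page ever lists more than one, the flat list of cells has to be cut into groups of four. The sketch below shows one way to do that; `rows_from_cells` is a hypothetical helper, and the sample cells are illustrative (the first row is a simplified version of the Northern Health entry in the merged table at the end of the notebook, the second row is made up).

```python
COLUMNS = ['city', 'facility', 'outbreaktype', 'datedeclared']

def rows_from_cells(cells, columns=COLUMNS):
    """Group a flat list of cell texts into one dict per table row."""
    width = len(columns)
    rows = []
    # step through the cells one full row at a time;
    # an incomplete trailing group is skipped
    for start in range(0, len(cells) - width + 1, width):
        rows.append(dict(zip(columns, cells[start:start + width])))
    return rows

# Illustrative cell texts, not scraped data.
cells = [
    'Terrace', 'Terraceview Lodge - Lakelse Unit', 'Respiratory Illness', 'April 9, 2020',
    'Example City', 'Example Lodge', 'COVID-19', 'July 1, 2020',
]

rows = rows_from_cells(cells)
for row in rows:
    print(row)
```

With the scraped page, `cells` would be `[td.text.strip() for td in table]`, and `pd.DataFrame(rows)` would then yield one row per outbreak. This assumes every table on the page has exactly four columns.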

***Interior Health***

In [103]:
ih = requests.get('https://www.interiorhealth.ca/YourEnvironment/CommunicableDiseaseControl/Outbreaks/Pages/default.aspx')
ih = BeautifulSoup(ih.text,'html5lib')
outbreaks = ih.find_all('table')[7]

In [137]:
outbreaks.find('td').text

'There are no items to show in this view of the "Outbreaks" list.'

***Vancouver Island***

In [149]:
vi = requests.get('https://www.healthspace.ca/Clients/VIHA/VIHA_Website.nsf/Outbreak')
vi = BeautifulSoup(vi.text,'html5lib')
outbreaks = vi.find('td')

In [150]:
outbreaks.getText()

' Currently, there are no reported outbreaks in VIHA hospitals or long term care facilities.'

***Vancouver Coastal Health***

In [266]:
from PyPDF2 import PdfFileReader
import io
import re

In [183]:
vch = requests.get('http://www.vch.ca/Documents/facility-outbreak-bulletin.pdf')

In [187]:
with io.BytesIO(vch.content) as f:
    pdf = PdfFileReader(f)
    information = pdf.getDocumentInfo()
    pages = pdf.getNumPages()
    for i in range(0, pages):
        page = pdf.getPage(i)
        page_content = page.extractText()
        print(page_content)

Facility Outbreak Bulletin
This bulletin lists ongoing and recently ended outbreaks in licensed long-term and acute care
facilities throughout Vancouver Coastal Health, as of 03:28 PM, 16-Jul-2020
DISEASE
LOCATION
RESTRICTIONS
IMPOSED
RESTRICTIONS
LIFTED
FACILITY
COVID-19
1081 Burrard Street, Vancouver
16-Jul-20
St. Paul's Hospital, NICU
COVID-19
7801 Argyle St, Vancouver
09-Jun-20
Holy Family Hospital, LTCF (Rehabilitation Unit 
declared over)
COVID-19
1645 W 14th Ave, Vancouver
12-Apr-20
13-Jun-20
South Granville Park Lodge
COVID-19
2444 Burr Pl, North Vancouver
30-Mar-20
05-Jun-20
Berkley Care Centre (formerly Kiwanis Care Centre)
Page 1 of 1
Red text denotes updates from previously issued bulletin
Grey text indicates that restrictions have been lifted
Restriction Imposed: 
Restrictions Lifted: 
               Date which outbreak measures were introduced
            Date which outbreak measures were discontinued



In [564]:
# replace line breaks with spaces
# vch_info = re.sub('\n', ' ', page_content)

In [563]:
# # extract outbreak info only
# match = re.findall(r'COVID-19 (.*?)Page [0-9] of [0-9]', vch_info)

# # add appropriate linebreaks between info and facilities
# def partition(x):
#     for row in [match2]:
#         m = re.sub('Vancouver', 'Vancouver\n', row)
#         m = m.replace('COVID-19', '\nCOVID-19\n')
#         m = re.sub('-20', '-20\n', m)
#     print(m)
    
# partition(match)

# # split data
# info = m.splitlines()

In [531]:
# create a dictionary
# keys = ['address', 'city', 'restrictionsimposed', 'restrictionslifted', 'outbreaktype']
# vch_dict = {}

# for i, k in zip(info, keys):
#     vch_dict[k] = i

# vch_dict

----

In [536]:
spl = page_content.split('\n')

In [520]:
spldf = pd.Series(spl)

In [540]:
spldf.iloc[10:29]

10                                             COVID-19
11                       1081 Burrard Street, Vancouver
12                                            16-Jul-20
13                            St. Paul's Hospital, NICU
14                                             COVID-19
15                            7801 Argyle St, Vancouver
16                                            09-Jun-20
17     Holy Family Hospital, LTCF (Rehabilitation Unit 
18                                       declared over)
19                                             COVID-19
20                           1645 W 14th Ave, Vancouver
21                                            12-Apr-20
22                                            13-Jun-20
23                           South Granville Park Lodge
24                                             COVID-19
25                        2444 Burr Pl, North Vancouver
26                                            30-Mar-20
27                                            05

In [560]:
vch_outbreaks = pd.DataFrame({'info' : spldf.iloc[10:29]})

In [561]:
vch_outbreaks['details'] = ['outbreaktype', 'address', 'restrictionsimposed', 'facility',
                           'outbreaktype', 'address', 'restrictionsimposed', 'facility', 'status',
                           'outbreaktype', 'address', 'restrictionsimposed', 'restrictionslifted', 'facility',
                           'outbreaktype', 'address','restrictionsimposed', 'restrictionslifted', 'facility']

In [562]:
vch_outbreaks

Unnamed: 0,info,details
10,COVID-19,outbreaktype
11,"1081 Burrard Street, Vancouver",address
12,16-Jul-20,restrictionsimposed
13,"St. Paul's Hospital, NICU",facility
14,COVID-19,outbreaktype
15,"7801 Argyle St, Vancouver",address
16,09-Jun-20,restrictionsimposed
17,"Holy Family Hospital, LTCF (Rehabilitation Unit",facility
18,declared over),status
19,COVID-19,outbreaktype


In [573]:
vch_pivot = vch_outbreaks.pivot_table(values='info',
                                     columns='details', 
                                     aggfunc=lambda x: ','.join(x))

In [574]:
vch_pivot

details,address,facility,outbreaktype,restrictionsimposed,restrictionslifted,status
info,"1081 Burrard Street, Vancouver,7801 Argyle St,...","St. Paul's Hospital, NICU,Holy Family Hospital...","COVID-19,COVID-19,COVID-19,COVID-19","16-Jul-20,09-Jun-20,12-Apr-20,30-Mar-20","13-Jun-20,05-Jun-20",declared over)
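
The pivot above joins every outbreak into one comma-separated row, and the `details` labels have to be typed by hand for each bulletin. As a rough alternative, the extracted lines can be grouped into one record per outbreak by using each disease line as a record boundary and a date pattern to tell dates from text. `parse_bulletin`, `DATE` and `DISEASES` are hypothetical names, and the rules are assumptions drawn from this one bulletin: the first non-date line after the disease is the address, the first date is "restrictions imposed", the second is "restrictions lifted", and any remaining lines form the facility name.

```python
import re

DATE = re.compile(r"^\d{2}-[A-Za-z]{3}-\d{2}$")   # e.g. 16-Jul-20
DISEASES = {"COVID-19"}  # assumption: extend for other outbreak types

def parse_bulletin(lines):
    """Group bulletin lines into one dict per outbreak."""
    records = []
    for line in (raw_line.strip() for raw_line in lines):
        if not line:
            continue
        if line in DISEASES:
            # a disease line starts a new record
            records.append({"outbreaktype": line, "address": None,
                            "restrictionsimposed": None,
                            "restrictionslifted": None, "facility": ""})
        elif not records:
            continue  # header text before the first outbreak
        elif DATE.match(line):
            rec = records[-1]
            key = ("restrictionsimposed" if rec["restrictionsimposed"] is None
                   else "restrictionslifted")
            rec[key] = line
        elif records[-1]["address"] is None:
            records[-1]["address"] = line
        else:
            # facility names can wrap over several lines
            rec = records[-1]
            rec["facility"] = (rec["facility"] + " " + line).strip()
    return records

# Outbreak lines copied from the bulletin text printed above.
lines = [
    "COVID-19", "1081 Burrard Street, Vancouver", "16-Jul-20",
    "St. Paul's Hospital, NICU",
    "COVID-19", "7801 Argyle St, Vancouver", "09-Jun-20",
    "Holy Family Hospital, LTCF (Rehabilitation Unit ", "declared over)",
    "COVID-19", "1645 W 14th Ave, Vancouver", "12-Apr-20", "13-Jun-20",
    "South Granville Park Lodge",
    "COVID-19", "2444 Burr Pl, North Vancouver", "30-Mar-20", "05-Jun-20",
    "Berkley Care Centre (formerly Kiwanis Care Centre)",
]

records = parse_bulletin(lines)
for rec in records:
    print(rec)
```

Unlike the hand-labelled table, this sketch treats "declared over)" as the wrapped end of the facility name rather than a separate status. Footer lines such as "Page 1 of 1" would be appended to the last facility, so the input should be sliced to the outbreak lines first, as the notebook does with `iloc`. `pd.DataFrame(records)` then gives one row per outbreak.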


In [575]:
# vanccoastal = pd.DataFrame(vch_dict, index=[0])

***Fraser***

In [232]:
f = requests.get('https://www.fraserhealth.ca/patients-and-visitors/current-outbreaks#.XxiKNy0ZNQI')
fraser = BeautifulSoup(f.text,'html5lib')
table = fraser.find_all('td')

In [243]:
d = ['datedeclared', 'facility', 'facilitytype', 'unit', 'outbreaktype']
outbreaks = {}
for td,k in zip(table,d):
    try:
        outbreaks[k] = td.find('h5').text.strip()
    except AttributeError:
        pass

In [244]:
outbreaks

{'datedeclared': 'June 16, 2020',
 'facility': 'Mission Memorial Hospital',
 'facilitytype': 'Hospital',
 'unit': '',
 'outbreaktype': 'COVID-19'}

In [251]:
fraser = pd.DataFrame(outbreaks, index = [0])

### Merge facilities

In [576]:
merge1 = pd.merge(northern, fraser, how = 'outer')

In [577]:
merge2 = pd.merge(merge1, vch_pivot, how = 'outer')

In [578]:
merge2

Unnamed: 0,city,facility,outbreaktype,datedeclared,facilitytype,unit,address,restrictionsimposed,restrictionslifted,status
0,Terrace,Terraceview Lodge - Lakelse Unit,Respiratory Illness,"April 9, 2020\n\n\t\t\tDeclared over: April 13...",,,,,,
1,,Mission Memorial Hospital,COVID-19,"June 16, 2020",Hospital,,,,,
2,,"St. Paul's Hospital, NICU,Holy Family Hospital...","COVID-19,COVID-19,COVID-19,COVID-19",,,,"1081 Burrard Street, Vancouver,7801 Argyle St,...","16-Jul-20,09-Jun-20,12-Apr-20,30-Mar-20","13-Jun-20,05-Jun-20",declared over)


# Caveats:

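1. The VCH bulletin marks outbreak status by text colour (red for updates, grey for lifted restrictions). The extracted text does not keep the colour, so the row labels in `vch_outbreaks['details']` are assigned by hand and must be rewritten whenever the bulletin changes.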