# Scraping LTC proposals
*May 10, 2023*

The goal is to pull all the proposals for long-term care (LTC) homes from [this page](https://www.ontario.ca/page/ontarios-long-term-care-licensing-public-consultation-registry#section-18). Start by importing the modules we'll need.

In [3]:
import pandas as pd
from bs4 import BeautifulSoup
import requests

Now we request the page's HTML and parse it with BeautifulSoup.

In [46]:
r = requests.get("https://www.ontario.ca/page/ontarios-long-term-care-licensing-public-consultation-registry#section-18").content

soup = BeautifulSoup(r, 'html.parser')
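As a slightly more defensive variant of the request above, the sketch below adds a timeout and an HTTP status check before parsing. The helper names `fetch_html` and `parse_html` and the constant `REGISTRY_URL` are illustrative and not part of the original notebook. The `#section-18` fragment is dropped because fragments are resolved by the browser and are never sent to the server.

```python
import requests
from bs4 import BeautifulSoup

# Same page as above, without the "#section-18" fragment (hypothetical constant name).
REGISTRY_URL = "https://www.ontario.ca/page/ontarios-long-term-care-licensing-public-consultation-registry"


def fetch_html(url: str, timeout: float = 30.0) -> bytes:
    """Download a page and raise an exception on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    # Raise on 4xx/5xx responses instead of silently parsing an error page.
    response.raise_for_status()
    return response.content


def parse_html(html: bytes) -> BeautifulSoup:
    """Parse raw HTML with Python's built-in parser."""
    return BeautifulSoup(html, "html.parser")
```

Calling `parse_html(fetch_html(REGISTRY_URL))` should give the same kind of `soup` object used in the rest of this notebook.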

Because the proposals are not each wrapped in their own container element, we'll need to find the header for each one, and then work our way through the siblings that come after it for the information we need.

We'll search for all link tags that sit directly inside an h2 tag, have a `name` attribute, and have an `id` other than "archive". Then we take the parent of each of those, so we're left with all the h2 tags that head up an active proposal.

In [56]:
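# Anchors directly inside an h2 that have a name attribute and whose id is not "archive".
# Note: [id!=archive] is a Soup Sieve extension, not standard CSS.
all = soup.select("h2 > a[name][id!=archive]")
# Step up from each anchor to its parent h2.
all = [item.parent for item in all]
all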

[<h2><a id="24-013" name="24-013">Neighbourhood Better Living — Project #24-013</a></h2>,
 <h2><a id="24-015" name="24-015">Project Iris — Project #24-015</a></h2>,
 <h2><a id="24-021" name="24-021">Delhi Long Term Care Centre — Project #24-021</a></h2>,
 <h2><a id="24-020" name="24-020">Finlandia Hoivakoti Nursing Home — Project #24-020</a></h2>,
 <h2><a id="23-064" name="23-064">Mohawks Bay of Quinte — Project #23-064</a></h2>,
 <h2><a id="23-067" name="23-067">Chateau Park Long Term Care Home — Project #23-067</a></h2>,
 <h2><a id="24-018" name="24-018">Groves Park Lodge — Project #24-018</a></h2>,
 <h2><a id="24-017" name="24-017">IOOF Seniors Home — Project #24-017</a></h2>,
 <h2><a id="24-016" name="24-016">peopleCare - Tillsonburg — Project #24-016</a></h2>,
 <h2><a id="24-011" name="24-011">Trillium Villa Nursing Home — Project #24-011</a></h2>,
 <h2><a id="24-003" name="24-003">Extendicare Ottawa #2 — Project #24-003</a></h2>,
 <h2><a id="24-005" name="24-005">Southbridge Otta
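To see what the selector does in isolation, here is a small sketch run against made-up markup. The sample HTML, project number, and variable names are invented for illustration and only assume the page follows the `h2 > a` pattern shown in the output above. It uses the standard `:not()` form, which Soup Sieve treats as equivalent to the `[id!=archive]` shorthand.

```python
from bs4 import BeautifulSoup

# Made-up sample markup: one active proposal, the archive header, and a plain h2.
SAMPLE_HTML = """
<h2><a id="00-001" name="00-001">Example Home - Project #00-001</a></h2>
<p>Example description.</p>
<h2><a id="archive" name="archive">Archive</a></h2>
<h2>Heading without an anchor</h2>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Direct-child anchors of an h2 that carry a name attribute,
# excluding the one whose id is "archive".
anchors = soup.select("h2 > a[name]:not([id='archive'])")

# Step up from each anchor to the h2 that contains it.
headers = [anchor.parent for anchor in anchors]

header_ids = [anchor["id"] for anchor in anchors]
print(header_ids)  # ['00-001']
```

The archive header and the h2 without an anchor are both filtered out, which is the behaviour the notebook relies on to keep only active proposals.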

Now we loop through each header and read the siblings that follow it: the first holds the description, the second the closing date, and the fourth a second description.

In [64]:
rows = []

for item in all:

    # Walk the siblings that follow the h2 one step at a time.
    next_sib = item.next_sibling
    second_sib = next_sib.next_sibling
    third_sib = second_sib.next_sibling  # not read itself, only stepped over
    fourth_sib = third_sib.next_sibling

    data = {
        "name": [item.text],
        "description": [next_sib.text],
        "closing_date": [second_sib.text],
        "description2": [fourth_sib.text]
    }

    rows.append(pd.DataFrame(data))

# Each one-row frame keeps its own index of 0, so every row of the result is labelled 0.
df = pd.concat(rows)

df

| | name | description | closing_date | description2 |
|---|---|---|---|---|
| 0 | Neighbourhood Better Living — Project #24-013 | The development of a 160-bed long-term care ho... | Closing date: June 17, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | Project Iris — Project #24-015 | The licence transfer of 16 homes from an exist... | Closing date: June 8, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | Delhi Long Term Care Centre — Project #24-021 | The redevelopment of a 60-bed long-term care h... | Closing date: June 7, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | Finlandia Hoivakoti Nursing Home — Project #24... | The redevelopment of a 112-bed long-term care ... | Closing date: May 27, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | Mohawks Bay of Quinte — Project #23-064 | The development of a new 128-bed long-term car... | Closing date: May 27, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | Chateau Park Long Term Care Home — Project #23... | The redevelopment of a 59-bed long-term care h... | Closing date: May 25, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | Groves Park Lodge — Project #24-018 | The redevelopment of a 100-bed long-term care ... | Closing date: May 22, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | IOOF Seniors Home — Project #24-017 | The redevelopment of a 162-bed long-term care ... | Closing date: May 17, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | peopleCare - Tillsonburg — Project #24-016 | The development of a 160-bed long-term care ho... | Closing date: May 17, 2023 | The Ministry of Long-Term Care is reviewing a ... |
| 0 | Trillium Villa Nursing Home — Project #24-011 | The development of a 160-bed long-term care ho... | Closing date: May 17, 2023 | The Ministry of Long-Term Care is reviewing a ... |
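The sibling walk in the loop relies on `.next_sibling`, which also returns whitespace text nodes when the markup has line breaks between elements. As a rough sketch of a more tolerant variant, the code below keeps only tag siblings and builds the frame from a list of dicts in one step. The sample HTML, the helper `element_siblings`, and the assumption that the skipped third sibling is some other element (an `h3` here) are all illustrative, not taken from the real page.

```python
from itertools import islice

import pandas as pd
from bs4 import BeautifulSoup, Tag

# Made-up markup that mimics the assumed layout: header, description,
# closing date, an element we skip, then a second description.
SAMPLE_HTML = """
<h2><a id="00-001" name="00-001">Example Home - Project #00-001</a></h2>
<p>First description.</p>
<p>Closing date: January 1, 2000</p>
<h3>Placeholder element</h3>
<p>Second description.</p>
<h2><a id="00-002" name="00-002">Incomplete entry</a></h2>
<p>Only one sibling follows this header.</p>
"""


def element_siblings(tag: Tag, count: int) -> list:
    """Return up to `count` following siblings that are tags, skipping text nodes."""
    tags_only = (sib for sib in tag.next_siblings if isinstance(sib, Tag))
    return list(islice(tags_only, count))


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

records = []
for header in soup.select("h2"):
    sibs = element_siblings(header, 4)
    if len(sibs) < 4:
        continue  # layout differs from the assumed one; skip instead of crashing
    records.append({
        "name": header.get_text(strip=True),
        "description": sibs[0].get_text(strip=True),
        "closing_date": sibs[1].get_text(strip=True),
        "description2": sibs[3].get_text(strip=True),  # third sibling is skipped
    })

# One DataFrame built from all records, so rows get a running 0..n-1 index.
proposals = pd.DataFrame(records)
```

Building the frame once from a list of dicts avoids creating a one-row DataFrame per proposal, and gives each row a distinct index instead of repeating 0.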
