# Nursing Home Data - Scraping from PDF
A list of nursing homes in the Dudley area has been published by [Dudley council](https://www.dudley.gov.uk/residents/care-and-health/adult-health-social-care/housing-with-care-and-support/care-homes-residential-and-nursing/) as a PDF, available at this [link](https://www.dudley.gov.uk/media/ktclgusy/2023-24_approved_care_home_providers_within_the_dudley_borough_available_to_the_public.pdf).

In [2]:
import pandas as pd
import camelot

# Show up to 100 rows when displaying a DataFrame (applies for the rest of the session)
pd.set_option('display.max_rows', 100)

# Read all tables from all pages
tables = camelot.read_pdf('nhomes.pdf', pages='all')

# Treat the first row of each table as a header row: drop it,
# then stack the remaining rows into a single DataFrame with a fresh index
all_tables = pd.concat(
    [table.df.iloc[1:] for table in tables],
    ignore_index=True,
)

# Label the columns using the header row of the second table (tables[1])
headers = tables[1].df.iloc[0]
all_tables.columns = headers

In [3]:
# Take the last line of 'Name & Address' as 'Post Code' and the first line as 'Care Home Name'
# (a name that wraps onto a second line in the PDF keeps only its first line)
address_lines = all_tables['Name & Address'].str.split('\n')
all_tables['Post Code'] = address_lines.str[-1].str.strip()
all_tables['Care Home Name'] = address_lines.str[0].str.strip()

# Extract 'Min Age' and 'Max Age' from 'Age Range' (raw strings avoid invalid-escape warnings)
all_tables['Min Age'] = all_tables['Age Range'].str.extract(r'(\d{2})', expand=False)
all_tables['Max Age'] = all_tables['Age Range'].str.extract(r'-(\d{2})', expand=False)
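
The age extraction above can be tried in isolation on a few sample strings. The sketch below is illustrative rather than part of the original pipeline: the first four values are copied from the `Age Range` column shown in this notebook, while `"Ages 18-65"` is a hypothetical value added only to exercise the `Max Age` pattern. It also converts the matches to a nullable integer dtype, which is a design choice assumed here, not something the notebook does.

```python
import pandas as pd

# First four values appear in the notebook output; the last one is hypothetical
ages = pd.Series(["Age 65+", "Ages 55+", "18+", "Not Stated", "Ages 18-65"])

# First run of digits is taken as the minimum age
min_raw = ages.str.extract(r"(\d+)", expand=False)
# Digits following a hyphen are taken as the maximum age
max_raw = ages.str.extract(r"-\s*(\d+)", expand=False)

# Convert to nullable integers so missing values stay missing
min_age = pd.to_numeric(min_raw, errors="coerce").astype("Int64")
max_age = pd.to_numeric(max_raw, errors="coerce").astype("Int64")

print(pd.DataFrame({"Age Range": ages, "Min Age": min_age, "Max Age": max_age}))
```

Rows with no digits, such as `Not Stated`, end up with missing values in both columns, and open-ended ranges such as `65+` have a missing `Max Age`.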

In [4]:
all_tables.head(100)

Unnamed: 0,Name & Address,Email,Telephone No,Age Range,CQC Registered For,Provide \nNursing,Capacity,Post Code,Care Home Name,Min Age,Max Age
0,Abbeygate Care \nCentre \n2 Leys Road \nBrockm...,abbeygatecare1@gmail.com,01384 571295,Age 65+,Dementia \nMental Health Condition \nOld Age \...,No,17.0,DY5 3UR,Abbeygate Care,65.0,
1,Abbeymere \n12 Eggington Road \nWollaston \nSt...,abbeymere@karelink.co.uk,01384 395195,Ages 65+,Dementia \nMental Health Condition \nOld Age \...,No,18.0,DY8 2QJ,Abbeymere,65.0,
2,Allenbrook Nursing \nHome \n209 Spies Lane \nH...,manager@allenbrooknursing\nhome.co.uk,0121 422 5844,Ages 55+,Dementia \nMental Health Condition \nNo Medica...,Yes,36.0,B62 9SJ,Allenbrook Nursing,55.0,
3,Amberley Care Home \n481-483 Stourbridge \nRoa...,amberleycarehome@hotmail\n.co.uk,01384 482365,Ages 65+,Dementia \nOld Age,No,25.0,DY5 1LB,Amberley Care Home,65.0,
4,Arcare For Forte \n440 Birmingham New \nRoad \...,ksharma@arcarehomes.co.u\nk,01902 880108,18+,Learning Disability \nMental Health Condition ...,No,9.0,WV14 9QB,Arcare For Forte,18.0,
5,Ashbourne Care Ltd \nLightwood Road \nDudley \...,ashbourne.m@fshc.co.uk,01384 242200,Ages 65+,Dementia \nOld Age,No,38.0,DY1 2RS,Ashbourne Care Ltd,65.0,
6,Ashgrove Nursing \nHome \n9 Dudley Wood Road \...,cea@ashgrovecare.com,01384 413913,Ages 65+,Dementia \nOld Age \nSensory Impairment,Yes,57.0,DY2 0DA,Ashgrove Nursing,65.0,
7,Avondale \n45 Norton Road \nNorton \nStourbri...,avondaleresthome@hotmail.\ncom,01384 442731,Ages 65+,Old Age,No,15.0,DY8 2AH,Avondale,65.0,
8,Beatrice House \n25 Bell Street \nPensnett \nB...,beatricehouse@alphonsusse\nrvices.co.uk,01384 482963,Not Stated,Learning Disability,No,3.0,DY5 4HG,Beatrice House,,
9,Belvidere \n41-43 Stourbridge \nRoad \nHolly H...,belvidere@gmail.com,01384 211850,Ages 55+,Dementia \nDetention Under Mental \nHealth Act...,No,28.0,DY1 2DH,Belvidere,55.0,
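
Several values in the `Email` column above are broken by a line wrap carried over from the PDF (for example `manager@allenbrooknursing\nhome.co.uk`). A minimal sketch of one way to rejoin them follows, assuming the `\n` shown in the output is a real newline character in the data and that a valid email address never contains whitespace. The sample values are copied from the output above.

```python
import pandas as pd

# Sample values taken from the 'Email' column in the output above
emails = pd.Series([
    "abbeygatecare1@gmail.com",
    "manager@allenbrooknursing\nhome.co.uk",
    "amberleycarehome@hotmail\n.co.uk",
    "ksharma@arcarehomes.co.u\nk",
])

# Remove every whitespace character, including the embedded newlines
clean_emails = emails.str.replace(r"\s+", "", regex=True)

print(clean_emails.tolist())
```

Applied to the notebook's DataFrame, the same call would be `all_tables['Email'].str.replace(r'\s+', '', regex=True)`.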
