This is a dataset of Assisted Living, Nursing and Residential Care facilities in Oregon that were open as of January 2017. For each facility, we have:
1. <i>facility_id:</i> Unique ID used to join to complaints
2. <i>fac_ccmunumber:</i> Unique ID used to join to ownership history
3. <i>facility_type:</i> NF - Nursing Facility; RCF - Residential Care Facility; ALF - Assisted Living Facility
4. <i>fac_capacity:</i> Number of beds the facility is licensed to have, which is not necessarily the number of beds it actually has.
5. <i>offline:</i> Created in the munging notebook. A count of complaints that DO NOT appear when the facility is searched on the state's complaint search website (https://apps.state.or.us/cf2/spd/facility_complaints/).
6. <i>online:</i> Created in the munging notebook. A count of complaints that DO appear when the facility is searched on the state's complaint search website (https://apps.state.or.us/cf2/spd/facility_complaints/).
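The queries in this notebook test <i>offline</i> and <i>online</i> with isnull() and notnull(), which suggests that a missing value stands for "no complaints of that kind" rather than a stored zero. The sketch below illustrates that reading on toy data. The four rows, the classify helper and the status labels are all hypothetical and are not taken from the real CSV.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values, not from the real CSV).
# NaN is assumed to mean "no complaints of this kind".
toy = pd.DataFrame({
    'facility_id': ['A', 'B', 'C', 'D'],
    'offline': [np.nan, 3.0, 5.0, np.nan],
    'online': [2.0, 1.0, np.nan, np.nan],
})

def classify(row):
    """Label one facility by which complaint counts are present."""
    has_offline = pd.notnull(row['offline'])
    has_online = pd.notnull(row['online'])
    if has_offline and has_online:
        return 'partly online'
    if has_offline:
        return 'nothing online'
    if has_online:
        return 'all online'
    return 'no complaints'

toy['status'] = toy.apply(classify, axis=1)
print(toy)
```

Under this reading, only the 'all online' and 'no complaints' rows would count as accurate online records.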

In [2]:
import pandas as pd
import numpy as np
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

In [3]:
df = pd.read_csv('/Users/fzarkhin/OneDrive - Advance Central Services, Inc/fproj/github/database-story/data/processed/facilities-before-state-updates.csv')

<h3>How many facilities have accurate records online?</h3>

These are the facilities that have no offline records.

In [4]:
# Count facilities whose offline complaint count is missing (nothing kept offline)
df['offline'].isnull().sum()

57

<h3>How many facilities have inaccurate records online?</h3>

These are the facilities that have offline records.

In [5]:
# Count facilities with at least one complaint that is not shown online
df['offline'].notnull().sum()

585

<h3>How many facilities had more than double the number of complaints shown online?</h3>

In [6]:
# offline > online means the total (offline + online) is more than double the online count
((df['offline'] > df['online']) & df['online'].notnull()).sum()

357
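The comparison offline > online answers the "more than double" question because a facility's total is offline + online, so the total exceeds twice the online count exactly when offline exceeds online. A minimal sketch of that equivalence follows, using the offline and online counts of the three Springfield facilities listed in a later cell of this notebook. The names toy, by_offline and by_total are hypothetical, and the sketch assumes the two counts never overlap, which follows from how the columns are defined.

```python
import pandas as pd

# Offline/online counts of the three Springfield rows shown later in the notebook.
toy = pd.DataFrame({
    'offline': [30.0, 43.0, 24.0],
    'online': [7.0, 47.0, 7.0],
})

# Total complaints, assuming offline and online counts are disjoint.
toy['total'] = toy['offline'] + toy['online']

# Test used in the notebook: more complaints offline than online.
by_offline = toy['offline'] > toy['online']

# Literal reading of the heading: total is more than double what is shown online.
by_total = toy['total'] > 2 * toy['online']

print(by_offline.tolist())
print(by_offline.equals(by_total))
```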

<h3>How many facilities show zero complaints online but have complaints offline?</h3>

In [7]:
# Nothing shown online, but at least one complaint offline
(df['online'].isnull() & df['offline'].notnull()).sum()

60

<h3>How many facilities have complaints and are accurate online?</h3>

In [8]:
# Complaints shown online and none missing offline
(df['online'].notnull() & df['offline'].isnull()).sum()

14

<h3>How many facilities have complaints?</h3>

In [9]:
# At least one complaint, whether shown online or not
(df['online'].notnull() | df['offline'].notnull()).sum()

599

<h3>What percent of facilities have accurate records online?</h3>

In [10]:
# Share of all facilities with no offline complaints, as a percentage
df['offline'].isnull().sum() / len(df) * 100

8.8785046728971952
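The counts reported above can be cross-checked against one another with plain arithmetic. This is only a sanity-check sketch: the input numbers are copied from the cell outputs in this notebook, the variable names are hypothetical, and the two derived figures (facilities with both kinds of complaints, facilities with none) are computed here rather than reported by the notebook.

```python
# Counts copied from the cell outputs above.
no_offline = 57        # accurate online: no offline records
has_offline = 585      # inaccurate online: some offline records
online_none = 60       # nothing online, but complaints offline
online_only = 14       # complaints online, none offline
any_complaints = 599   # complaints online or offline

# Every facility either has offline records or does not.
total_facilities = no_offline + has_offline

# Facilities with complaints = those with offline records + those with online only.
assert has_offline + online_only == any_complaints

# Derived figures (not reported in the notebook).
both_kinds = has_offline - online_none   # some complaints online, some offline
no_complaints = no_offline - online_only # no complaints of either kind

percent_accurate = no_offline / total_facilities * 100
print(total_facilities, both_kinds, no_complaints)
print(round(percent_accurate, 2))  # 8.88
```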

<h3>What is the total capacity of all facilities with inaccurate records?</h3>

In [11]:
# Sum licensed capacity only, for facilities with offline complaints
df.loc[df['offline'].notnull(), 'fac_capacity'].sum()

35238.0

In [14]:
df[df['facility_name'].str.contains('Springfield')]

Unnamed: 0,facility_id,fac_ccmunumber,facility_type,fac_capacity,facility_name,offline,online
16,385077,385077,NF,136.0,Marquis Springfield,30.0,7.0
473,70A299,70A299,ALF,150.0,Brookdale Springfield Briarwood,43.0,47.0
611,70M226,70M226,ALF,62.0,Brookdale Springfield Woodside,24.0,7.0


In [12]:
#df#['fac_capacity'].sum()

In [17]:
df[(df['online'].isnull())].sort_values('offline',ascending=False)

Unnamed: 0,facility_id,fac_ccmunumber,facility_type,fac_capacity,facility_name,offline,online
117,385275,385275,NF,51.0,Sheridan Care Center,32.0,
359,50R408,50R408,RCF,15.0,Bee Hive Homes of Baker City,15.0,
337,50R385,50R385,RCF,24.0,Bonaventure of Salem Memory Care,14.0,
334,50R382,50R382,RCF,48.0,Washington Gardens Memory Care,13.0,
493,70A320,70A320,ALF,65.0,Bonaventure of Salem Assisted Living,12.0,
483,70A309,70A309,ALF,65.0,Hawthorne Gardens Senior Living Community,12.0,
495,70A322,70A322,ALF,19.0,Wallowa Valley Senior Living,11.0,
56,385181,385181,NF,20.0,"East Cascade Retirement Community, LLC",10.0,
507,70M010,70M010,ALF,45.0,Brookside Place,10.0,
335,50R383,50R383,RCF,28.0,Royalton Place Memory Care,9.0,
