# Facilities

District-level information was obtained from the Ministry of Finance's online "Fiscal Transfers" database <cite data-cite="otims_2018"><a href="https://github.com/alexgoodell/uganda-model/blob/master/refs/cite-md/otims_2018.md">(otims_2018)</a></cite>. Health facility information was obtained from the 2017 Health Facility Master List, which we found on the document-sharing website Scribd.com <cite data-cite="mohdhi2017"><a href="https://github.com/alexgoodell/uganda-model/blob/master/refs/cite-md/mohdhi2017.md">(mohdhi2017)</a></cite>. Data were extracted with the Tabula software <cite data-cite="tabula_2018"><a href="https://github.com/alexgoodell/uganda-model/blob/master/refs/cite-md/tabula_2018.md">(tabula_2018)</a></cite>.

### Master facility list pre-processing
After extraction, data from the master facility list were compared against summary tables from the same document to identify any missing data. Changes were needed in four districts:
1. The Health Facility Master List claimed that there were 32 operating health facilities in the __Amuru__ district, but did not list any. For this district, 26 facilities were identified from the budgeting office data <cite data-cite="otims_2016"><a href="https://github.com/alexgoodell/uganda-model/blob/master/refs/cite-md/otims_2016.md">(otims_2016)</a></cite>. An additional three facilities (for a total of 29) were identified from 2012 health infrastructure data: ST. AUGUSTINE Health Center II, Otwee Health Center III, and Pabbo Health Centre III <cite data-cite="hid2012"><a href="https://github.com/alexgoodell/uganda-model/blob/master/refs/cite-md/hid2012.md">(hid2012)</a></cite>. These were assigned a sub-county NHPI of SC8ZISZD6 (Amuru town council, Amoyokoma Parish).
2. There were also no facilities listed for the __Kibaale__ district, though the document listed 16 facilities (7 of which are HC III, 1 HC IV, 1 hospital) in a summary table (Table 8). This is a new district, created in July 2016, when the previous Kibaale District was split into Kagadi, Kakumiro and Kibaale. Eight facilities were identified in the budgeting data <cite data-cite="otims_2016"><a href="https://github.com/alexgoodell/uganda-model/blob/master/refs/cite-md/otims_2016.md">(otims_2016)</a></cite>, and two additional facilities were identified with the infrastructure data <cite data-cite="hid2012"><a href="https://github.com/alexgoodell/uganda-model/blob/master/refs/cite-md/hid2012.md">(hid2012)</a></cite>. This process thus identified a total of 5 HC IIIs, 1 HC IV, and 1 hospital. These were assigned a sub-county NHPI of SCLMS8IG1 (Kibaale town council, Ruguuza Parish).
3. The __Mityana__ district was missing its 24th facility, reporting 65 of 66 facilities. According to Table 5, this district has a hospital. After reviewing the budgeting data, Mityana Hospital was identified and added to our database. It was assigned a random sub-county within Mityana: SC96Q5BX1 (Bulera Subcounty, Kakonde Parish).
4. Three facilities in the __Mukono__ district (68: Royal Family Clinic Clinic, 88: Trinity Clinic Clinic, 91: Vine Medical centre Clinic) did not have a sub-county NHPI. These facilities were assigned a sub-county NHPI of SCTMPU7S6 (for the Mukono Municipality, Nantaburirwa Parish).
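
The comparison against the summary tables described above can be sketched as a per-district count check. This is a minimal illustration rather than the code actually used: the function name `find_count_mismatches` and the sample rows are hypothetical, while the expected totals (32 for Amuru, 16 for Kibaale, 66 for Mityana) come from the text.

```python
from collections import Counter

def find_count_mismatches(extracted_rows, summary_counts):
    """Return {district: (expected, found)} for every district whose
    extracted facility count differs from the summary-table count."""
    found = Counter(row["district"] for row in extracted_rows)
    mismatches = {}
    for district, expected in summary_counts.items():
        if found[district] != expected:
            mismatches[district] = (expected, found[district])
    return mismatches

# Expected totals as reported in the summary tables (from the text above).
summary_counts = {"Amuru": 32, "Kibaale": 16, "Mityana": 66}

# Hypothetical extracted rows: nothing for Amuru or Kibaale, 65 for Mityana.
extracted_rows = [
    {"district": "Mityana", "name": f"Facility {i}"} for i in range(65)
]

mismatches = find_count_mismatches(extracted_rows, summary_counts)
print(mismatches)
```

Districts that appear in the result are the ones that need manual follow-up against the budgeting or infrastructure data.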

In [12]:
# Import dependencies
import sys, os
sys.path.append(os.path.join(os.path.dirname(''), '..'))
from lib.dependencies import *
import config

In [13]:
# Let's start a dataframe with district information. 
# I've gotten most of this data from http://www.budget.go.ug/fiscal_transfers
# The following is the code that was used to create the file -- no longer in use, just grab the CSV

'''
# get the population data as skeleton - create dataframe
url = 'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/MjA3'
page = urllib2.urlopen(url).read()
soup = BeautifulSoup(page, "html5lib")
table = soup.select_one("table.dataset")
# headers = [th.text.encode("utf-8") for th in table.select("tr th")]
districts = pd.read_html(str(table))[0]
districts = districts.set_index('Vote code')

# urls for the remaining columns we want
urls = [ 
# hard to reach
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/OTA3',
# pop per health fac    
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/Mjc3',
# rural pop
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/NjA3',
# public hospitals
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/MDQ4',
# is district?
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/MzU2',
# poverty headcount
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/NDMy',
# is municipality
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/NDc2',
# distance from kampala (cities)
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/Njc2',
# distance from kampala (districts)
'http://www.budget.go.ug/fiscal_transfers/variable/show_dataset/var/OTU2'
]

for url in urls:
    page = urllib2.urlopen(url).read()
    soup = BeautifulSoup(page, "html5lib")
    table = soup.select_one("table.dataset")
    nd = pd.read_html(str(table))[0]
    nd = nd.set_index('Vote code')
    districts = pd.concat([districts, nd], axis=1)

# Remove duplicated columns
districts = districts.loc[:,~districts.columns.duplicated()]

# Export
districts.to_csv("districts_20-March-2018.csv")

'''

districts = pd.read_csv('../../data/budget-office/districts_20-March-2018.csv')

rename = {
'Population (2015/16)': 'pop',
'Hard to Reach, Hard to Stay (2015/16)': 'is_hard_to_reach',
'Population per health facility (2015/16)': 'pop_per_health_fac',
'Rural Population (2015/16)': 'rural_pop',
'Constant (Public or PNFP Hospitals) (2015/16)': 'public_hosp',
'Constant (District) (2015/16)': 'is_district', 
'Poverty Headcount (2015/16)': 'poverty_pop',
'Constant (Municipality) (2015/16)': 'is_city',
'Distance from Kampala (Municipality) (2015/16)': 'distance_from_kampala_city',
'Distance from Kampala (District) (2015/16)': 'distance_from_kampala_district' 
}

# Rename columns to shorter, more useful names
districts = districts.rename(columns=rename)

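# Combine distance from Kampala (two columns hold the same variable: one for districts, one for municipalities).
# fill_value=0 keeps the recorded distance when the other column is missing (NaN).
districts['distance_from_kampala'] = districts['distance_from_kampala_district'].add(
    districts['distance_from_kampala_city'], fill_value=0)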
districts = districts.drop(columns=['distance_from_kampala_district','distance_from_kampala_city'])
districts = districts.set_index('Vote code')


districts.head(5).T

IOError: File ../../data/budget-office/districts_20-March-2018.csv does not exist
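
A note on combining the two distance columns: if each row has a value in only one of them and the other is missing (an assumption, since the CSV contents are not shown here), plain addition propagates the missing value. The sketch below shows the coalescing logic in plain Python; `combine_distance` and the sample distances are hypothetical. In pandas, `Series.add(other, fill_value=0)` applies the same rule.

```python
import math

def combine_distance(district_km, city_km):
    """Add two distance values, treating a missing (NaN) value as 0.
    Returns NaN only when both inputs are missing."""
    if math.isnan(district_km) and math.isnan(city_km):
        return math.nan
    d = 0.0 if math.isnan(district_km) else district_km
    c = 0.0 if math.isnan(city_km) else city_km
    return d + c

nan = math.nan

# Plain addition: the missing value propagates to the result.
naive = 120.0 + nan

# Coalescing: the recorded distance survives.
combined = combine_distance(120.0, nan)

print(math.isnan(naive), combined)
```

If the source columns instead hold 0 for the non-applicable case, plain addition and the coalescing version give the same result.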

In [113]:
# Import the data from the mohdhi2017 facilities list - using excel because CSV caused 
# line-splitting problems (some cells have returns)
facilities = pd.read_excel("../../data/moh-div-health-info/facilities.xlsx", sheet_name="facilities", header=0)

# make splitter function
splitter = lambda x: pd.Series([i for i in reversed(x.split('/'))])

# split HSDT Code (see page 7 of mohdhi2017 for details of HSDT codes)
# note: after data cleaning, some facilities required manual setting of subcounty NHPI.
# This was accomplished by setting the facility HSDT code to SCXXXXXX/NA/NA, where X's = SC NHPI
facilities[['parish_level_fac_id', 'parish_nhpi', 'subcounty_nhpi']] = facilities['HSDT Code'].apply(splitter)

# drop unneeded columns
facilities = facilities.drop(columns=['District_name'])

# Load the subcounty list - used to link facilities to their district
subcounties = pd.read_csv("../../data/moh-div-health-info/subcounties.csv")
rename = {'NHPI Code': 'subcounty_nhpi', 'Name': 'subcounty_name', 
          'HSDT Code': 'subcounty_full_hsdt_code', 'County': 'county_name', 
          'District': 'district_name',
          'Subregion': 'subregion_name'}
subcounties = subcounties.rename(columns=rename)
subcounties = subcounties.drop(columns=['#'])
subcounties.subcounty_nhpi = subcounties.subcounty_nhpi.str.strip()

# Add the subcounty data to the facilities list
facilities = facilities.join(subcounties.set_index('subcounty_nhpi'), on="subcounty_nhpi", how="left")

facilities.head(2).T

| Unnamed: 0 | 0 | 1 |
|---|---|---|
| db_id | 0 | 1 |
| id_within_district | 1 | 2 |
| HSD | Labwor HSD | Labwor HSD |
| Name | Abim General Hospital | Adea Health Centre II |
| Level | Hospital | HC II |
| Authority | MOH | MOH |
| Ownership | Govt | Govt |
| NHPI Code | HFA6Q7GB2 | HFZW8MEX9 |
| HSDT Code | SCL79ULU0/PA95VLAL8/8001 | SC5AFV6Y8/PALG8WW26/8001 |
| Source | mohdhi2017 | mohdhi2017 |
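
To make the HSDT split concrete, here is a small stand-alone sketch of the same reversed split in plain Python, applied to the first HSDT code in the output above and to a manually assigned code in the `SCXXXXXX/NA/NA` form. The helper name `split_hsdt` is hypothetical; the notebook itself uses the `splitter` lambda with `apply`.

```python
def split_hsdt(hsdt_code):
    """Split an HSDT code of the form subcounty/parish/facility into its
    parts, reversing the order as the notebook's splitter does."""
    parish_level_fac_id, parish_nhpi, subcounty_nhpi = reversed(
        hsdt_code.split("/")
    )
    return {
        "parish_level_fac_id": parish_level_fac_id,
        "parish_nhpi": parish_nhpi,
        "subcounty_nhpi": subcounty_nhpi,
    }

# HSDT code of Abim General Hospital, taken from the output above.
regular = split_hsdt("SCL79ULU0/PA95VLAL8/8001")

# Manually assigned sub-county (the Mukono code from the pre-processing notes).
manual = split_hsdt("SCTMPU7S6/NA/NA")

print(regular)
print(manual)
```

Because the sub-county NHPI is always the first segment, manually coded facilities still link to their district through the sub-county join even though their parish and facility segments are placeholders.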
