## [Issue #7](https://github.com/yeemey/haackwell/issues/7)

### Map patterns of polluted, damaged, stolen, etc. wells

1. Split the free-text data in the `STATUS` column into more specific columns.
2. Plot the resulting trends (polluted, damaged, stolen, etc.) on a map.
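As a rough sketch of step 2, the snippet below plots a handful of wells by longitude and latitude and colors them by whether the status text mentions pollution. The six rows are copied from the first rows of the WPDx output in this notebook. The `'pollut'` keyword, the `polluted` column name, and the colors are assumptions made for illustration, not an agreed classification.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Six rows copied from the WPDx output in this notebook (illustrative sample only)
sample = pd.DataFrame({
    'LAT_DD': [5.982436, 5.899207, 5.802157, 5.716055, 5.231378, 5.225134],
    'LONG_DD': [-8.180609, -8.173315, -9.645714, -9.618187, -9.141873, -8.121493],
    'STATUS': [
        'Working but with problems. Well polluted|Under construction',
        'Working but with problems. Well polluted|Under construction',
        'Working but with problems. Not priming',
        'Broken Down System. low water table',
        'Working but with problems. Well polluted|Under construction',
        'Working but with problems. insufficient water',
    ],
})

# Assumption: any status containing 'pollut' counts as a polluted well
sample['polluted'] = sample['STATUS'].str.contains('pollut', case=False, na=False)

fig, ax = plt.subplots()
for flag, label, color in [(True, 'polluted', 'tab:red'), (False, 'other', 'tab:blue')]:
    subset = sample[sample['polluted'] == flag]
    ax.scatter(subset['LONG_DD'], subset['LAT_DD'], color=color, label=label)
ax.set_xlabel('Longitude (LONG_DD)')
ax.set_ylabel('Latitude (LAT_DD)')
ax.legend()
```

A plain scatter plot has no basemap; it only shows whether flagged wells cluster geographically.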

#### Questions

1. Are mechanical pump failures or pollution related to pump type or make?
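One possible way to start on this question is a cross-tabulation of a pollution flag against the `WATERTECH` column. The sketch below uses six rows copied from the WPDx output in this notebook, so the counts say nothing about the full dataset. The `'pollut'` keyword match and the `polluted` / `not polluted` labels are assumptions.

```python
import pandas as pd

# Six rows copied from the WPDx output in this notebook (illustrative sample only)
sample = pd.DataFrame({
    'WATERTECH': ['Vergnet', 'Vergnet', 'Afridev', 'Afridev', 'Afridev', 'Afridev'],
    'STATUS': [
        'Working but with problems. Well polluted|Under construction',
        'Working but with problems. Well polluted|Under construction',
        'Working but with problems. Not priming',
        'Broken Down System. low water table',
        'Working but with problems. Well polluted|Under construction',
        'Working but with problems. insufficient water',
    ],
})

# Assumption: a status mentioning 'pollut' marks the well as polluted
is_polluted = sample['STATUS'].str.contains('pollut', case=False, na=False)
pollution_label = is_polluted.map({True: 'polluted', False: 'not polluted'})
pollution_label.name = 'pollution'

# Count wells per pump make and pollution label
polluted_by_tech = pd.crosstab(sample['WATERTECH'], pollution_label)
print(polluted_by_tech)
```

On the full data, passing `normalize='index'` to `pd.crosstab` would give proportions per pump make, which are easier to compare when the makes have very different counts.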

In [1]:
import pandas as pd
import re
import matplotlib
import matplotlib.pyplot as plt

In [2]:
# Water Point Data Exchange
wpdx_data = 'https://www.hydroshare.org/django_irods/download/d3659dcf575d4090801a74d1ce096d7c/data/contents/WPDx_Well_Function_Upd_151224_xy161117.csv'
wpdx_df = pd.read_csv(wpdx_data)
well_bkdown_df = pd.read_csv('/Users/ymseah/Google Drive/GeoHackWeek/haackwell/dat/well-data-2001-2015-no-rainwater.csv')

In [4]:
wpdx_df

,WELL_ID,LAT_DD,LONG_DD,FUNC,STATUS,COD_FCN,COD_QTY,COD_RESRCE,ADM1,ADM2,ACTIVITY,COUNTRY,WATERSRC,WATERTECH,INSTALLED,MGMT,PAY,SOURCE,RPT_DATE
0,362092,5.982436,-8.180609,Yes,Working but with problems. Well polluted|Under...,2,1,0,Grand Gedeh,Tchien,zmbpw,LR,Manual pump on hand-dug well,Vergnet,0,,No water committee,WASH Liberia,21/01/2011
1,362100,5.899207,-8.173315,Yes,Working but with problems. Well polluted|Under...,2,1,0,Grand Gedeh,Tchien,z4ja5,LR,Manual pump on hand-dug well,Vergnet,1986,,No water committee,WASH Liberia,21/01/2011
2,357349,5.802157,-9.645714,Yes,Working but with problems. Not priming,2,1,0,Rivercess,Norwein,yjryl,LR,Manual pump on hand-dug well,Afridev,2008,,No water committee,WASH Liberia,02/02/2011
3,489514,-0.541100,34.375820,No,Drought|No operation in the dry season,999,0,1,Homa Bay,Mbita,Yao Oinga,KE,,Surface water,0,0,No payment system,Engineering Sciences & Global Development,24/01/2011
4,357595,5.716055,-9.618187,No,Broken Down System. low water table,0,1,1,Rivercess,Norwein,y1ccy,LR,Manual pump on hand-dug well,Afridev,2005,,No water committee,WASH Liberia,03/02/2011
5,489266,-0.730000,34.366000,No,No fuel|No operation at least once a week,2,1,0,Homa Bay,Ndhiwa,Water Kiosk,KE,,Gravity-fed communal standpipe,1995,Private Operator/Delegated Management,Per Bucket,Engineering Sciences & Global Development,18/02/2011
6,489625,-0.450333,34.009880,No,Low yield|No operation in the dry season,2,1,1,Homa Bay,Mbita,Wakula Dispensary BH,KE,,Borehole with hand pump,0,Institutional Management,No payment system,Engineering Sciences & Global Development,02/10/2011
7,364570,5.231378,-9.141873,Yes,Working but with problems. Well polluted|Under...,2,1,0,Sinoe,Sanquin Dist#2,v5dph,LR,Manual pump on hand-dug well,Afridev,2010,Community Management,Yes but only in case of breakdown,WASH Liberia,27/01/2011
8,361779,5.225134,-8.121493,Yes,Working but with problems. insufficient water,2,1,0,River Gee,Karforh,v42mh,LR,Manual pump on hand-dug well,Afridev,2009,,No water committee,WASH Liberia,18/02/2011
9,361780,5.225422,-8.119787,Yes,Working but with problems. insufficient water,2,1,0,River Gee,Karforh,v42mf,LR,Manual pump on hand-dug well,Afridev,2009,,No water committee,WASH Liberia,18/02/2011


In [20]:
well_status_df = wpdx_df[['WELL_ID', 'LAT_DD', 'LONG_DD', 'FUNC', 'STATUS', 'WATERSRC', 'WATERTECH']].copy()
well_status_df.set_index('WELL_ID', inplace=True)
well_status_df

WELL_ID,LAT_DD,LONG_DD,FUNC,STATUS,WATERSRC,WATERTECH
362092,5.982436,-8.180609,Yes,Working but with problems. Well polluted|Under...,Manual pump on hand-dug well,Vergnet
362100,5.899207,-8.173315,Yes,Working but with problems. Well polluted|Under...,Manual pump on hand-dug well,Vergnet
357349,5.802157,-9.645714,Yes,Working but with problems. Not priming,Manual pump on hand-dug well,Afridev
489514,-0.541100,34.375820,No,Drought|No operation in the dry season,,Surface water
357595,5.716055,-9.618187,No,Broken Down System. low water table,Manual pump on hand-dug well,Afridev
489266,-0.730000,34.366000,No,No fuel|No operation at least once a week,,Gravity-fed communal standpipe
489625,-0.450333,34.009880,No,Low yield|No operation in the dry season,,Borehole with hand pump
364570,5.231378,-9.141873,Yes,Working but with problems. Well polluted|Under...,Manual pump on hand-dug well,Afridev
361779,5.225134,-8.121493,Yes,Working but with problems. insufficient water,Manual pump on hand-dug well,Afridev
361780,5.225422,-8.119787,Yes,Working but with problems. insufficient water,Manual pump on hand-dug well,Afridev


In [6]:
well_status_df['STATUS'].value_counts()

DRY                                                                                                                                    3708
Status:Functional|Quantity:Insufficient|Quality:Soft                                                                                   3323
Functional ( in use)                                                                                                                   2971
Status:Not functional|Quantity:Insufficient|Quality:Soft                                                                               1572
Status:Functional|Quantity:Seasonal|Quality:Soft                                                                                        394
Status:Not functional|Quantity:Dry|Quality:Soft                                                                                         305
Status:Functional|Quantity:Insufficient|Quality:Salty                                                                                   262
Status:Not functiona

In [46]:
# Parse 'Key:Value|Key:Value' status strings into one dict per well
well_status_dict = {}
for well_id, status in well_status_df['STATUS'].items():
    # Skip missing values (NaN) and free-text statuses without a 'Status:' field
    if not isinstance(status, str) or not re.search(r'Status:(Not functional|Functional)', status):
        continue
    # Split each field on its first colon only, so values containing ':' stay intact
    status_dict = dict(field.split(':', 1) for field in status.split('|') if ':' in field)
    well_status_dict[well_id] = status_dict
print(well_status_dict)

{267515: {'Status': 'Functional', 'Breakdown Year': '1978', 'Reason Not Functioning': 'Abandoned pumping scheme', 'Quantity': 'Dry', 'Quality': 'Soft'}, 276237: {'Status': 'Functional', 'Breakdown Year': '1978', 'Reason Not Functioning': 'Replacement of pump machine', 'Quantity': 'Insufficient', 'Quality': 'Soft'}, 276235: {'Status': 'Functional', 'Breakdown Year': '1978', 'Reason Not Functioning': 'Replacement of pump machine', 'Quantity': 'Insufficient', 'Quality': 'Soft'}, 276241: {'Status': 'Functional', 'Breakdown Year': '1978', 'Reason Not Functioning': 'Replacement of pump machine', 'Quantity': 'Insufficient', 'Quality': 'Soft'}, 276286: {'Status': 'Functional', 'Breakdown Year': '1978', 'Reason Not Functioning': 'Replacement of pump machine', 'Quantity': 'Insufficient', 'Quality': 'Soft'}, 276283: {'Status': 'Functional', 'Breakdown Year': '1978', 'Reason Not Functioning': 'Replacement of pump machine', 'Quantity': 'Insufficient', 'Quality': 'Soft'}, 276285: {'Status': 'Functio
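To turn the parsed dictionary into the "more specific columns" from step 1, one option is `pd.DataFrame.from_dict` with `orient='index'`. The sketch below uses the first two entries from the printed output above; the name `status_columns_df` is made up for illustration.

```python
import pandas as pd

# First two entries copied from the printed well_status_dict output
well_status_dict = {
    267515: {'Status': 'Functional', 'Breakdown Year': '1978',
             'Reason Not Functioning': 'Abandoned pumping scheme',
             'Quantity': 'Dry', 'Quality': 'Soft'},
    276237: {'Status': 'Functional', 'Breakdown Year': '1978',
             'Reason Not Functioning': 'Replacement of pump machine',
             'Quantity': 'Insufficient', 'Quality': 'Soft'},
}

# Outer keys become the row index; inner keys become columns
status_columns_df = pd.DataFrame.from_dict(well_status_dict, orient='index')
status_columns_df.index.name = 'WELL_ID'
print(status_columns_df)
```

Because the index is `WELL_ID`, the result could then be joined back onto `well_status_df` with `well_status_df.join(status_columns_df)`. Wells whose status was not parsed would get `NaN` in the new columns.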

In [26]:
# zip() returns a one-shot iterator; wrap it in list() to inspect the (WELL_ID, STATUS) pairs
well_status = zip(well_status_df.index.tolist(), well_status_df['STATUS'].tolist())
print(list(well_status))

[(362092, 'Working but with problems. Well polluted|Under construction'), (362100, 'Working but with problems. Well polluted|Under construction'), (357349, 'Working but with problems. Not priming'), (489514, 'Drought|No operation in the dry season'), (357595, 'Broken Down System. low water table'), (489266, 'No fuel|No operation at least once a week'), (489625, 'Low yield|No operation in the dry season'), (364570, 'Working but with problems. Well polluted|Under construction'), (361779, 'Working but with problems. insufficient water'), (361780, 'Working but with problems. insufficient water'), (365177, 'Working but with problems. low pressure'), (489624, 'Equipment not-function|No operation in the dry season'), (359723, 'Broken Down System. low water'), (365567, 'Working but with problems. Require Redigging|Require Redigging'), (359873, 'Working but with problems. low pressire'), (359934, 'Working but with problems. low pressure'), (363124, 'Working but with problems. water polluted'), 
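The free-text statuses above do not follow the `Key:Value` format, so they need a different approach. A minimal sketch, assuming simple keyword matching is good enough for a first pass: the category names and patterns below are hypothetical and would need to be checked against the full list of status strings.

```python
import re

# Hypothetical issue categories; the patterns are guesses based on the statuses printed above
ISSUE_PATTERNS = {
    'polluted': re.compile(r'pollut', re.IGNORECASE),
    # 'low press' also catches the misspelling 'low pressire' seen in the data
    'low_pressure': re.compile(r'low press', re.IGNORECASE),
    'low_water': re.compile(r'low water|insufficient water|low yield|drought', re.IGNORECASE),
    'seasonal': re.compile(r'dry season', re.IGNORECASE),
}

def flag_issues(status):
    """Return a dict mapping each issue category to True/False for one status string."""
    return {name: bool(pattern.search(status)) for name, pattern in ISSUE_PATTERNS.items()}

# Status strings copied from the printed output above
examples = [
    'Working but with problems. Well polluted|Under construction',
    'Working but with problems. low pressire',
    'Drought|No operation in the dry season',
]
for status in examples:
    print(status, '->', flag_issues(status))
```

Each flag could become its own boolean column, which keeps wells with several problems (for example drought and seasonal operation) from being forced into a single category.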