Create a list of keywords for STATUS messages to better organize well data #3

Open
chase-dwelle opened this issue Sep 12, 2017 · 3 comments

Comments

@chase-dwelle
Collaborator

chase-dwelle commented Sep 12, 2017

In addition to the binary well-functionality flag (YES/NO), we also have free-text status messages, e.g.,

Status:Not functional|Quantity:Dry|Quality:Soft
Low yield|Normally operational
Dry pan|No operation in the dry season
No- broken down. Well polluted
No- broken down. WATER TABLE HAS DROPED

So we need to identify the recurring keywords in these messages in order to define better categories of well failure conditions.
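
As a possible first step, the messages above could be normalized before any keyword search. This is a minimal sketch, assuming that `|` separates segments and that a `Key:Value` segment carries a labelled field, as the examples suggest; `split_status` and `parse_fields` are hypothetical helper names, not existing project code.

```python
def split_status(message):
    """Split a raw STATUS string on '|' into lowercase, trimmed segments."""
    segments = []
    for part in message.split("|"):
        part = part.strip().lower()
        if part:
            segments.append(part)
    return segments


def parse_fields(message):
    """Collect 'Key:Value' segments into a dict; anything else is free text.

    Assumption: a colon inside a segment marks a labelled field.
    """
    fields = {"free_text": []}
    for seg in split_status(message):
        key, sep, value = seg.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
        else:
            fields["free_text"].append(seg)
    return fields


print(parse_fields("Status:Not functional|Quantity:Dry|Quality:Soft"))
print(parse_fields("Dry pan|No operation in the dry season"))
```

Messages without a colon, such as `No- broken down. Well polluted`, end up entirely under `free_text`, so the keyword search would still have to handle them as plain text.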

@chase-dwelle
Collaborator Author

chase-dwelle commented Sep 14, 2017

Based on Jimmy's work with NLTK on the status messages, we have a list of keywords that correspond to different failure modes: https://www.lucidchart.com/documents/edit/26a13991-a3a9-4fb2-8572-16b497b7e191?shared=true&

Environmental drivers: {'Reduced water table', 'lowered water table','drought', 'dry', 'dried', 'low yield', 'low flow', 'poor retention','water shortage','source', 'lack','dry season','jerican','jerry can', 'shallow','climatic','insufficient', 'quantity:insufficient'}
Pollution: {'Salty', 'poorly sited', 'millky', 'coloured', 'contaminated', 'odour', 'smell', 'muddy', 'black', 'poor', 'dirty', 'silt', 'soil'}
Potential human causes: {'Committee', 'WSC', 'fuel', 'theft', 'vandalised', 'stolen', 'beneficiaries', 'pay',' paid', 'funds', 'bill', 'people', 'personnel'}
Mechanical causes: {'Pump', 'handle', 'pipes', 'tank', 'construction', 'cylinder', 'apron', 'repair', 'parts', 'installation', 'broken', 'blocked', 'technical'}
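
One way these lists could be applied is sketched below. It is not the NLTK workflow mentioned above, just a plain regular-expression match; the category labels, `KEYWORDS`, and `tag_status` are hypothetical names, and lowercasing and trimming the keywords is an assumption. Whole-word matching is used so that 'lack' does not fire on 'black'.

```python
import re

# Keywords copied from the lists above, lowercased and trimmed (an assumption).
KEYWORDS = {
    "environmental": {
        "reduced water table", "lowered water table", "drought", "dry", "dried",
        "low yield", "low flow", "poor retention", "water shortage", "source",
        "lack", "dry season", "jerican", "jerry can", "shallow", "climatic",
        "insufficient", "quantity:insufficient",
    },
    "pollution": {
        "salty", "poorly sited", "millky", "coloured", "contaminated", "odour",
        "smell", "muddy", "black", "poor", "dirty", "silt", "soil",
    },
    "human": {
        "committee", "wsc", "fuel", "theft", "vandalised", "stolen",
        "beneficiaries", "pay", "paid", "funds", "bill", "people", "personnel",
    },
    "mechanical": {
        "pump", "handle", "pipes", "tank", "construction", "cylinder", "apron",
        "repair", "parts", "installation", "broken", "blocked", "technical",
    },
}


def tag_status(message):
    """Return the set of categories with at least one whole-word keyword match."""
    text = message.lower()
    tags = set()
    for category, words in KEYWORDS.items():
        for kw in words:
            if re.search(r"\b" + re.escape(kw) + r"\b", text):
                tags.add(category)
                break
    return tags


print(tag_status("Dry pan|No operation in the dry season"))
print(tag_status("No- broken down. WATER TABLE HAS DROPED"))
```

With this sketch the second message is tagged only as mechanical (via 'broken'), because 'WATER TABLE HAS DROPED' matches neither 'reduced water table' nor 'lowered water table'. Whole-word matching also misses variants such as 'pipe' for 'pipes', so stemming or fuzzy matching may be worth considering.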

@yeemey
Owner

yeemey commented Sep 14, 2017

Some words used to tag mechanical failures (e.g. 'construction') also appear in the status of wells that are in fact working (e.g. 'STATUS' = 'Functional ( in use)|New Under construction').

Should we consider using bigrams? Or exclude 'FUNC' = 'Yes' entries when tagging mechanical failures?
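
Both options could look roughly like the sketch below. It assumes each record is a dict with 'FUNC' and 'STATUS' keys (the column names quoted above); `mechanical_failure` and `bigrams` are hypothetical helpers, and the bigram function is a plain-Python stand-in rather than NLTK's.

```python
import re

# Mechanical keywords from the list earlier in this thread, lowercased.
MECHANICAL = {
    "pump", "handle", "pipes", "tank", "construction", "cylinder", "apron",
    "repair", "parts", "installation", "broken", "blocked", "technical",
}


def mechanical_failure(record):
    """Tag a mechanical failure only when the well is not marked functional."""
    if record.get("FUNC", "").strip().lower() == "yes":
        return False  # option 2: skip wells with 'FUNC' = 'Yes'
    text = record.get("STATUS", "").lower()
    return any(re.search(r"\b" + re.escape(kw) + r"\b", text) for kw in MECHANICAL)


def bigrams(text):
    """Option 1: adjacent word pairs, so 'under construction' can be told apart."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return list(zip(tokens, tokens[1:]))


working = {"FUNC": "Yes", "STATUS": "Functional ( in use)|New Under construction"}
broken = {"FUNC": "No", "STATUS": "No- broken down. Well polluted"}
print(mechanical_failure(working), mechanical_failure(broken))
print(bigrams(working["STATUS"]))
```

Filtering on 'FUNC' is the simpler change, while bigrams such as ('under', 'construction') would let the keyword lists separate a well still being built from one with a construction defect.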

@chase-dwelle
Collaborator Author

chase-dwelle commented Sep 15, 2017 via email

yeemey added a commit that referenced this issue Sep 15, 2017
Related to comments on issue #3