
# Data Science / Data Analytics challenge

Consider a dataset that describes the functionality of infrastructure resources.
For each water point, it includes the name of the village the point is in and its functional state.
Implement a data processing module in Python that takes a dataset URL as input and
returns:

● The number of water points that are functional,

● The number of water points per community,

● The rank for each community by the percentage of broken water points.
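
The three results listed above can be computed without any third-party library once the records are in memory. The sketch below is a minimal illustration, not the notebook's implementation: `summarize_water_points` and the sample records are hypothetical, the field names `communities_villages`, `water_functioning` and `water_point_condition` are taken from the dataset, and "percentage of broken water points" is assumed to mean broken points divided by all points within each community.

```python
from collections import Counter


def summarize_water_points(records):
    """Return the three requested results for a list of water point dicts."""
    # Count functional water points (water_functioning == "yes").
    functional = sum(1 for r in records if r.get("water_functioning") == "yes")

    # Count all water points and broken water points per community.
    per_community = Counter(r["communities_villages"] for r in records)
    broken = Counter(
        r["communities_villages"]
        for r in records
        if r.get("water_point_condition") == "broken"
    )

    # Assumed definition: broken points as a percentage of the community's points.
    percent_broken = {
        name: 100.0 * broken[name] / total for name, total in per_community.items()
    }
    # Rank 1 is the community with the highest percentage; ties are ordered by name.
    ordered = sorted(percent_broken, key=lambda name: (-percent_broken[name], name))
    ranking = {name: position for position, name in enumerate(ordered, start=1)}

    return {
        "number_functional": functional,
        "number_water_points": dict(per_community),
        "community_ranking": ranking,
    }


# Hypothetical sample records, not taken from the real dataset.
sample = [
    {"communities_villages": "A", "water_functioning": "yes", "water_point_condition": "functioning"},
    {"communities_villages": "A", "water_functioning": "no", "water_point_condition": "broken"},
    {"communities_villages": "B", "water_functioning": "yes", "water_point_condition": "functioning"},
    {"communities_villages": "C", "water_functioning": "no", "water_point_condition": "broken"},
]
result = summarize_water_points(sample)
print(result)
```

Returning a dictionary rather than printing keeps the results usable by other code. For the real dataset, the records could be obtained by parsing the JSON response with `json.loads`, as the rough-work cell further down does.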

In [8]:
# Challenge solution
import pandas as pd                         # DataFrame loading and grouping
import json                                 # JSON parsing (used in the rough work below)
import urllib.request                       # Opening URLs
from urllib.request import urlopen          # Shortcut used in the rough work below


class WaterResources:
  def __init__(self, dataset_url):
    self.dataset_url = str(dataset_url)                                                   # Keep the dataset URL as a string.
    self.df = pd.read_json(self.dataset_url)                                              # Load the JSON data from the URL into a pandas DataFrame.
    self.functional = self.df['water_functioning'][self.df['water_functioning'] == "yes"] # Keep only the water points marked as functional.

  def functional_water_points(self):
    print("The number of water points that are functional = ", self.functional.size)      # Count of functional water points.
    print('********************************************')

  def water_points_per_community(self):
    water_points = self.df.groupby("communities_villages").size()                         # Water points per community village.
    print("The number of water points per community: ")
    print(water_points)
    print('********************************************')

  def community_ranking(self):
    broken_water_points = self.df['water_point_condition'] == "broken"                    # Boolean mask for broken water points.
    broken_water_points_df = self.df[broken_water_points]                                 # Rows for broken water points only.
    # Number of broken water points per community, stored under the final column name.
    comm_broken_points_df = broken_water_points_df.groupby("communities_villages").size().to_frame('% of broken water points')
    number_broken = comm_broken_points_df['% of broken water points'].sum()               # Total number of broken water points.
    # Convert counts to each community's share of all broken water points.
    # Communities with no broken water points do not appear in the result.
    comm_broken_points_df['% of broken water points'] = comm_broken_points_df['% of broken water points'] / number_broken * 100
    print()
    print("The rank for each community by percentage of broken water points: ")
    print("* Communities with least broken water points first")
    print(comm_broken_points_df.sort_values(ascending=True, by='% of broken water points'))

# Dataset link: https://raw.githubusercontent.com/onaio/ona-tech/master/data/water_points.json

dataset_url = input("Input the dataset URL and press ENTER : ")                            # Input the dataset link

output = WaterResources(dataset_url)                                                       # Passing the dataset url into the WaterResources class.
output.functional_water_points()                                                           # Output the number of functional water points.
output.water_points_per_community()                                                        # Output the number of water points per community.
output.community_ranking()                                                                 # Output the communities ranked by share of broken water points.


Input the dataset URL and press ENTER : https://raw.githubusercontent.com/onaio/ona-tech/master/data/water_points.json
The number of water points that are functional =  623
********************************************
The number of water points per community: 
communities_villages
Abanyeri        4
Akpari-yeri     3
Alavanyo        3
Arigu          12
Badomsa        27
               ..
Zogsa           6
Zua            28
Zuedema        18
Zukpeni         6
Zundem         30
Length: 65, dtype: int64
********************************************

The rank for each community by percentage of broken water points: 
* Communities with least broken water points first
                      % of broken water points
communities_villages                          
Zukpeni                                    2.5
Banyangsa                                  2.5
Guuta                                      2.5
Jagsa                                      2.5
Kaasa                                      2.5
Ka
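
The percentages printed above are each community's share of all broken water points in the dataset, so a community without a broken water point does not appear at all. Another reading of the task is the share of a community's own water points that are broken. The sketch below puts both figures side by side on a small hypothetical frame; the function name `broken_point_rates`, the output column names and the sample rows are assumptions, and only the input column names come from the dataset.

```python
import pandas as pd


def broken_point_rates(df):
    """Compare two ways of expressing broken water points per community."""
    is_broken = df["water_point_condition"] == "broken"
    grouped = is_broken.groupby(df["communities_villages"])

    rates = pd.DataFrame({
        "water_points": grouped.size(),  # all water points in the community
        "broken": grouped.sum(),         # broken water points in the community
    })
    # Share of the dataset's broken points that are located in this community.
    rates["% of all broken"] = 100 * rates["broken"] / rates["broken"].sum()
    # Share of this community's own water points that are broken.
    rates["% broken in community"] = 100 * rates["broken"] / rates["water_points"]
    # Rank 1 = highest within-community percentage; tied communities share a rank.
    rates["rank"] = (
        rates["% broken in community"].rank(method="min", ascending=False).astype(int)
    )
    return rates.sort_values("rank")


# Hypothetical rows, not taken from the real dataset.
sample = pd.DataFrame({
    "communities_villages": ["A", "A", "A", "A", "B", "B", "C"],
    "water_point_condition": ["broken", "broken", "functioning", "functioning",
                              "broken", "functioning", "functioning"],
})
rates = broken_point_rates(sample)
print(rates)
```

In this sample, community A holds two thirds of all broken points but has the same within-community rate as B (50%), which shows how the choice of denominator can change the ranking.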

In [2]:
# Rough work 
data = urlopen(dataset_url)
water = json.loads(data.read())
water = pd.DataFrame(water)
water.head()

Unnamed: 0,water_pay,respondent,research_asst_name,water_used_season,_bamboo_dataset_id,_deleted_at,water_point_condition,_xform_id_string,other_point_1km,_attachments,communities_villages,end,animal_number,water_point_id,start,water_connected,water_manager_name,_status,enum_id_1,water_lift_mechanism,districts_divisions,_uuid,grid,date,formhub/uuid,road_available,water_functioning,_submission_time,signal,water_source_type,_geolocation,water_point_image,water_point_geocode,deviceid,locations_wards,water_manager,water_developer,_id,animal_point,water_mechanism_plate,water_lift_mechanism_type,road_type,water_mechanism_plate_units,water_mechanism_plate_no,water_not_functioning,water_source_type_other,simserial,subscriberid
0,no,community,Haruna Mohammed,year_round,,,functioning,_08_Water_points_CV,no,[north_ghana/attachments/1351696546452.jpg],Gumaryili,2012-11-12T11:46:32.454Z,more_500,xxx,2012-10-31T15:11:04.618Z,no,community members,submitted_via_web,5,no,northern,f8bcee72d7a0400fb99ae11bbf804010,grid_further_500_m,2012-10-31,4d41d54d134c4bfa9078571addd819b9,no,yes,2012-11-13T07:13:57,low,dam_dugout,"[10.1892764, -0.66410362]",1351696546452.jpg,10.1892764 -0.66410362 155.10000610351563 5.0,355047040123780,west_mamprusi,community,community,381705,yes,,,,,,,,,
1,no,community,Haruna Mohmmed,year_round,,,functioning,_08_Water_points_CV,yes,[north_ghana/attachments/1351701849971.jpg],Selinvoya,2012-11-12T11:49:36.619Z,50_to_500,xxx,2012-10-31T16:41:49.738Z,no,Amadu Salifu,submitted_via_web,5,yes,northern,c2f6b298955f47ab9f177bee1214141d,grid_further_500_m,2012-10-31,4d41d54d134c4bfa9078571addd819b9,yes,yes,2012-11-13T07:14:04,high,unprotected_well,"[10.28173052, -0.56901122]",1351701849971.jpg,10.28173052 -0.56901122 201.89999389648438 5.0,355047040123780,west_mamprusi,individual,community,381706,yes,no,manual_power,gravel,,,,,,
2,no,community,Haruna Mohammed,year_round,,,functioning,_08_Water_points_CV,yes,[north_ghana/attachments/1351702462336.jpg],Selinvoya,2012-10-31T16:57:37.864Z,50_to_500,xxx,2012-10-31T16:52:02.601Z,no,Sulemana Abdulai,submitted_via_web,5,yes,northern,6bc6d188611d47f6a666cfd1eaa33998,grid_further_500_m,2012-10-31,4d41d54d134c4bfa9078571addd819b9,yes,yes,2012-11-13T07:14:07,high,borehole,"[10.28169238, -0.56962993]",1351702462336.jpg,10.28169238 -0.56962993 202.60000610351563 5.0,355047040123780,west_mamprusi,community,individual,381707,yes,no,manual_power,paved,,,,,,
3,no,community,Haruna Mohammed,year_round,,,functioning,_08_Water_points_CV,yes,[north_ghana/attachments/1351702971561.jpg],Selinvoya,2012-10-31T17:06:55.047Z,50_to_500,xxx,2012-10-31T16:58:46.672Z,no,Haruna Mohammed,submitted_via_web,5,yes,northern,4b28ac4cbba744d79ba4257f772f94d6,grid_further_500_m,2012-10-31,4d41d54d134c4bfa9078571addd819b9,yes,yes,2012-11-13T07:14:14,high,borehole,"[10.28115661, -0.56918339]",1351702971561.jpg,10.28115661 -0.56918339 199.6999969482422 5.0,355047040123780,west_mamprusi,individual,community,381708,yes,no,manual_power,paved,,,,,,
4,no,community,Haruna Mohammed,year_round,,,functioning,_08_Water_points_CV,yes,[north_ghana/attachments/1351703622326.jpg],Selinvoya,2012-10-31T17:15:57.847Z,50_to_500,xxx,2012-10-31T17:08:27.160Z,no,Sulemana,submitted_via_web,5,yes,northern,7893ce5321804f229e533f36e90c9c6f,grid_further_500_m,2012-10-31,4d41d54d134c4bfa9078571addd819b9,yes,yes,2012-11-13T07:14:22,high,borehole,"[10.28044635, -0.56723556]",1351703622326.jpg,10.28044635 -0.56723556 208.6999969482422 5.0,355047040123780,west_mamprusi,community,community,381709,yes,no,manual_power,paved,,,,,,


In [3]:
water.water_functioning.unique()

array(['yes', 'no', 'na_dn'], dtype=object)

In [4]:
water.water_point_condition.unique()

array(['functioning', 'newly_constructed', 'under_construction',
       'abandoned', 'broken', 'na_dn'], dtype=object)
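
Both columns hold values beyond the ones the class filters on (`na_dn`, `abandoned` and so on), so it can help to check how the two columns relate before counting. The following is a sketch using `pd.crosstab` on hypothetical rows; only the column names and category labels come from the outputs above.

```python
import pandas as pd

# Hypothetical rows that reuse the category labels shown above.
sample = pd.DataFrame({
    "water_functioning": ["yes", "yes", "no", "no", "na_dn"],
    "water_point_condition": ["functioning", "newly_constructed", "broken",
                              "abandoned", "na_dn"],
})

# Count rows for every (water_functioning, water_point_condition) pair.
table = pd.crosstab(sample["water_functioning"], sample["water_point_condition"])
print(table)

# Rows that are neither "yes" nor "no" are dropped by an == "yes" filter
# without any warning, so it is worth knowing how many there are.
unknown = (~sample["water_functioning"].isin(["yes", "no"])).sum()
print("rows with unknown functional state:", unknown)
```

Running the same two steps on the `water` frame from the rough work would show whether, for example, any point marked `broken` is also reported as functioning.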