# Integrated Project 2
Part 2: Clustering data to unveil Maji Ndogo's water crisis

At the end of this project, we will be able to:

1. Gear up for a data analysis of Maji Ndogo's water scenario.

2. Harness the power of SQL functions, including intricate window functions, to draw insights from the data.

3. Aggregate data to unravel the scale of the problem, and start to form some actionable insights.

First, we have to load our database.

In [1]:
%load_ext sql

There's a new jupysql version available (0.10.4), you're running 0.10.2. To upgrade: pip install jupysql --upgrade


In [3]:
pip install jupysql

Note: you may need to restart the kernel to use updated packages.


In [2]:
%sql mysql+pymysql://root:<password>@localhost:3306/md_water_services

# 1. Cleaning our data

Let's bring up the employee table. It has info on all of our workers, but note that the email addresses have not been added. We will have to send them reports and figures, so let's update it. Luckily, the emails for our department follow a simple pattern: first_name.last_name@ndogowater.gov.

In [5]:
%%sql
SELECT
    *
FROM
    employee;

assigned_employee_id,employee_name,phone_number,email,address,province_name,town_name,position
0,Amara Jengo,99637993287,,36 Pwani Mchangani Road,Sokoto,Ilanga,Field Surveyor
1,Bello Azibo,99643864786,,129 Ziwa La Kioo Road,Kilimani,Rural,Field Surveyor
2,Bakari Iniko,99222599041,,18 Mlima Tazama Avenue,Hawassa,Rural,Field Surveyor
3,Malachi Mavuso,99945849900,,100 Mogadishu Road,Akatsi,Lusaka,Field Surveyor
4,Cheche Buhle,99381679640,,1 Savanna Street,Akatsi,Rural,Field Surveyor
5,Zuriel Matembo,99034075111,,26 Bahari Ya Faraja Road,Kilimani,Rural,Field Surveyor
6,Deka Osumare,99379364631,,104 Kenyatta Street,Akatsi,Rural,Field Surveyor
7,Lalitha Kaburi,99681623240,,145 Sungura Amanpour Road,Kilimani,Rural,Field Surveyor
8,Enitan Zuri,99248509202,,117 Kampala Road,Hawassa,Zanzibar,Field Surveyor
10,Farai Nia,99570082739,,33 Angélique Kidjo Avenue,Amanzi,Dahabu,Field Surveyor


We can determine the email address for each employee by:
- selecting the employee_name column
- replacing the space with a full stop
- making it lowercase
- stitching it all together with the @ndogowater.gov domain
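
Before running the update, the steps above can be tried on a single name in plain Python. This is only a sketch: `make_email` is a hypothetical helper written for illustration, not part of the database.

```python
def make_email(employee_name: str, domain: str = "ndogowater.gov") -> str:
    """Build first_name.last_name@domain from a full name (hypothetical helper)."""
    # Replace spaces with full stops, lowercase the result, then append the domain.
    local_part = employee_name.replace(" ", ".").lower()
    return f"{local_part}@{domain}"


print(make_email("Amara Jengo"))  # amara.jengo@ndogowater.gov
```

The SQL version nests the same operations: REPLACE first, then LOWER, then CONCAT.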

In [6]:
%%sql
UPDATE
    employee
SET
    email=
    CONCAT(
    LOWER(REPLACE(employee_name, ' ', '.')), '@ndogowater.gov');

In [7]:
%%sql
SELECT
    *
FROM
    employee;

assigned_employee_id,employee_name,phone_number,email,address,province_name,town_name,position
0,Amara Jengo,99637993287,amara.jengo@ndogowater.gov,36 Pwani Mchangani Road,Sokoto,Ilanga,Field Surveyor
1,Bello Azibo,99643864786,bello.azibo@ndogowater.gov,129 Ziwa La Kioo Road,Kilimani,Rural,Field Surveyor
2,Bakari Iniko,99222599041,bakari.iniko@ndogowater.gov,18 Mlima Tazama Avenue,Hawassa,Rural,Field Surveyor
3,Malachi Mavuso,99945849900,malachi.mavuso@ndogowater.gov,100 Mogadishu Road,Akatsi,Lusaka,Field Surveyor
4,Cheche Buhle,99381679640,cheche.buhle@ndogowater.gov,1 Savanna Street,Akatsi,Rural,Field Surveyor
5,Zuriel Matembo,99034075111,zuriel.matembo@ndogowater.gov,26 Bahari Ya Faraja Road,Kilimani,Rural,Field Surveyor
6,Deka Osumare,99379364631,deka.osumare@ndogowater.gov,104 Kenyatta Street,Akatsi,Rural,Field Surveyor
7,Lalitha Kaburi,99681623240,lalitha.kaburi@ndogowater.gov,145 Sungura Amanpour Road,Kilimani,Rural,Field Surveyor
8,Enitan Zuri,99248509202,enitan.zuri@ndogowater.gov,117 Kampala Road,Hawassa,Zanzibar,Field Surveyor
10,Farai Nia,99570082739,farai.nia@ndogowater.gov,33 Angélique Kidjo Avenue,Amanzi,Dahabu,Field Surveyor


I picked up another bit we have to clean up. Often when databases are created and updated, or information is collected from different sources, errors creep in. For example, if you look at the phone numbers in the phone_number column, the values are stored as strings. The phone numbers should be 12 characters long, consisting of the plus sign, area code (99), and the phone number digits. However, when we use the LENGTH(column) function, it returns 13 characters, indicating there's an extra character.

In [8]:
%%sql
SELECT
    LENGTH(phone_number)
FROM
    employee;

LENGTH(phone_number)
13
13
13
13
13
13
13
13
13
13


That's because there is a space at the end of the number! If you try to send an automated SMS to that number, it will fail. This happens so often that SQL has a function specifically for trimming off spaces, called TRIM(column). It removes any leading or trailing spaces from a string.
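
To see how a single trailing space changes the length, here is a small Python sketch. The phone number is an assumed example in the format described above (plus sign, area code 99, then the digits), and `strip(" ")` stands in for TRIM.

```python
# Assumed example value: a 12-character number followed by one stray space.
raw_phone = "+99637993287 "

# strip(" ") removes leading and trailing spaces only, much like TRIM().
trimmed_phone = raw_phone.strip(" ")

print(len(raw_phone), len(trimmed_phone))  # 13 12
```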

In [9]:
%%sql

UPDATE
    employee
SET
    phone_number=
    (TRIM(phone_number));

In [10]:
%%sql
SELECT
    LENGTH(phone_number)
FROM
    employee;

LENGTH(phone_number)
12
12
12
12
12
12
12
12
12
12


# 2. Honouring the workers
Let's have a look at where our employees live.
Use the employee table to count how many of our employees live in each town.

In [12]:
%%sql

SELECT
    Town_name,
    COUNT(employee_name) AS num_employees
FROM
    employee
GROUP BY
    town_name;

Town_name,num_employees
Ilanga,3
Rural,29
Lusaka,4
Zanzibar,4
Dahabu,6
Kintampo,1
Harare,5
Yaounde,1
Serowe,3


Note how many of our workers are living in smaller communities in the rural parts of Maji Ndogo. Pres. Naledi congratulated the team for completing the survey, but we would not have this data were it not for our field workers. So let's gather some data on their performance in this process, so we can thank those who really put all their effort in. Pres. Naledi has asked that we send out an email or message congratulating the top 3 field surveyors. So let's use the database to get the employee_ids and use those to get the names, emails and phone numbers of the three field surveyors with the most location visits.

Let's first look at the number of records each employee collected. So find the correct table, figure out what function to use and how to group, order and limit the results to only see the top 3 employee_ids with the highest number of locations visited.

In [14]:
%%sql

SELECT
    assigned_employee_id,
    COUNT(visit_count) AS number_of_visits
FROM
    visits
GROUP BY
    assigned_employee_id
ORDER BY
    COUNT(visit_count) DESC
LIMIT 3;

assigned_employee_id,number_of_visits
1,3708
30,3676
34,3539


Make a note of the top 3 assigned_employee_id and use them to create a query that looks up the employee's info.
You should have a column of names, email addresses and phone numbers.

In [15]:
%%sql

SELECT
    *
FROM
    employee
WHERE
    assigned_employee_id IN (1, 30, 34);

assigned_employee_id,employee_name,phone_number,email,address,province_name,town_name,position
1,Bello Azibo,99643864786,bello.azibo@ndogowater.gov,129 Ziwa La Kioo Road,Kilimani,Rural,Field Surveyor
30,Pili Zola,99822478933,pili.zola@ndogowater.gov,46 Stone Town Street,Hawassa,Zanzibar,Field Surveyor
34,Rudo Imani,99046972648,rudo.imani@ndogowater.gov,15 Lake Malawi Avenue,Kilimani,Ilanga,Field Surveyor


# 3. Analysing locations
Looking at the location table, let’s focus on the province_name, town_name and location_type to understand where the water sources are in Maji Ndogo.
Create a query that counts the number of records per town.

In [16]:
%%sql

SELECT
    Town_name,
    COUNT(town_name) AS records_per_town
FROM 
    Location
GROUP BY
    Town_name
ORDER BY
    COUNT(town_name) DESC;

Town_name,records_per_town
Rural,23740
Harare,1650
Amina,1090
Lusaka,1070
Mrembo,990
Asmara,930
Dahabu,930
Kintampo,780
Ilanga,780
Isiqalo,770


Now count the records per province.

In [17]:
%%sql

SELECT
    Province_name,
    COUNT(Province_name) AS records_per_province
FROM 
    Location
GROUP BY
    province_name
ORDER BY
    COUNT(Province_name) DESC;

Province_name,records_per_province
Kilimani,9510
Akatsi,8940
Sokoto,8220
Amanzi,6950
Hawassa,6030


From this table, it's pretty clear that most of the water sources in the survey are situated in small rural communities, scattered across Maji Ndogo. If we count the records for each province, most of them have a similar number of sources, so every province is well-represented in the survey.

1. Create a result set showing:
• province_name
• town_name
• An aggregated count of records for each town (consider naming this records_per_town).
• Ensure your data is grouped by both province_name and town_name.
2. Order your results primarily by province_name. Within each province, further sort the towns by their record counts in descending order.

In [18]:
%%sql

SELECT
    province_name,
    Town_name,
    COUNT(Town_name) AS records_per_town
FROM 
    Location
GROUP BY
    Province_name, town_name
ORDER BY
    province_name,  COUNT(Town_name) DESC;

province_name,Town_name,records_per_town
Akatsi,Rural,6290
Akatsi,Lusaka,1070
Akatsi,Harare,800
Akatsi,Kintampo,780
Amanzi,Rural,3100
Amanzi,Asmara,930
Amanzi,Dahabu,930
Amanzi,Amina,670
Amanzi,Pwani,520
Amanzi,Abidjan,400


These results show us that our field surveyors did an excellent job of documenting the status of our country's water crisis. Every province and town has many documented sources. This makes me confident that the data we have is reliable enough to base our decisions on. This is an insight we can use to communicate data integrity, so let's make a note of that.

Finally, look at the number of records for each location type.

In [19]:
%%sql

SELECT
    Location_type,
    COUNT(location_type) AS num_sources
FROM
    Location
GROUP BY
    location_type;

Location_type,num_sources
Urban,15910
Rural,23740


We can see that there are more rural sources than urban, but it's really hard to understand those numbers. Percentages are more relatable. If we use SQL as a very overpowered calculator:

In [23]:
%%sql

SELECT 23740 / (15910 + 23740) * 100 AS pct_rural_sources

pct_rural_sources
59.8739


We can see that 60% of all water sources in the data set are in rural communities.
So again, what are some of the insights we gained from the location table?
1. Our entire country was properly canvassed, and our dataset represents the situation on the ground.
2. 60% of our water sources are in rural communities across Maji Ndogo. We need to keep this in mind when we make decisions.
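
The percentage can be double-checked outside SQL. This sketch reuses the two counts from the location_type result above; the variable names are just for illustration.

```python
# Counts taken from the location_type query result.
urban_sources = 15910
rural_sources = 23740

# Share of all surveyed sources that are rural, as a percentage.
pct_rural = rural_sources / (urban_sources + rural_sources) * 100

print(round(pct_rural, 4))  # 59.8739
print(round(pct_rural))     # 60
```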

# 4. Diving into the sources
Ok, water_source is a big table, with lots of stories to tell, so strap in!

These are the questions that I am curious about.
1. How many people did we survey in total?
2. How many wells, taps and rivers are there?
3. How many people share particular types of water sources on average?
4. How many people are getting water from each type of source?

In [24]:
%%sql

SELECT
    sum(number_of_people_served)
FROM
    water_source;

sum(number_of_people_served)
27628140


In [25]:
%%sql

SELECT
    Type_of_water_source,
    COUNT(source_ID) AS number_of_sources
FROM
    water_source
GROUP BY
    type_of_water_source
ORDER BY
    COUNT(source_ID);

Type_of_water_source,number_of_sources
river,3379
shared_tap,5767
tap_in_home_broken,5856
tap_in_home,7265
well,17383


In [28]:
%%sql

SELECT
    Type_of_water_source,
    ROUND(AVG((number_of_people_served))) AS average_people_per_source
FROM
    water_source
GROUP BY
    type_of_water_source;

Type_of_water_source,average_people_per_source
tap_in_home,644
tap_in_home_broken,649
well,279
shared_tap,2071
river,699


An average of 644 people sharing a single tap_in_home cannot be right. The surveyors combined the data of many households together and added this as a single tap record, but each household actually has its own tap. In addition to this, there is an average of 6 people living in a home. So 6 people actually share 1 tap (not 644). It is always important to think about the data. At the start of our careers we tend to just analyse and calculate, but the value we bring as data practitioners is in understanding and interpreting what the results mean.

This means that 1 tap_in_home record actually represents 644 ÷ 6 ≈ 107, or roughly 100, taps. Calculating the average number of people served by a single instance of each water source type helps us understand the typical capacity or load on a single water source. This can help us decide which sources should be repaired or upgraded, based on the average impact of each upgrade. For example, wells don't seem to be a problem, as fewer people are sharing them. On the other hand, about 2000 people share a single public tap on average! We saw some of the queue times last time, and now we can see why. So looking at these results, we probably should focus on improving shared taps first.
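
As a quick check of that division, here is the arithmetic in Python, using the 644 average from the query result and the household size of 6 stated above. It comes out slightly above 100.

```python
avg_people_per_record = 644  # average for tap_in_home from the query result
people_per_home = 6          # average household size stated in the text

# Each household has its own tap, so one record stands for this many taps.
taps_per_record = avg_people_per_record / people_per_home

print(round(taps_per_record))  # 107
```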

Now let's calculate the total number of people served by each type of water source. To make the result easier to interpret, order the rows so that the source type serving the most people is at the top.

In [32]:
%%sql

SELECT
    Type_of_water_source,
    ROUND((SUM(number_of_people_served))) AS population_served
FROM
    water_source
GROUP BY
    type_of_water_source
ORDER BY
    ROUND((SUM(number_of_people_served))) DESC;

Type_of_water_source,population_served
shared_tap,11945272
well,4841724
tap_in_home,4678880
tap_in_home_broken,3799720
river,2362544


Next, calculate the percentages using the total we just got.

In [36]:
%%sql

SELECT
    Type_of_water_source,
    ROUND((SUM(number_of_people_served)/27628140) * 100) AS pct_people_served
FROM
    water_source
GROUP BY
    type_of_water_source
ORDER BY
    ROUND((SUM(number_of_people_served)/27628140) * 100) DESC;

Type_of_water_source,pct_people_served
shared_tap,43
well,18
tap_in_home,17
tap_in_home_broken,14
river,9


43% of our people are using shared taps in their communities, and as we saw earlier, about 2000 people share one shared_tap on average. By adding tap_in_home and tap_in_home_broken together, we see that 31% of people have water infrastructure installed in their homes, but 45% (14/31) of these taps are not working! It isn't the tap itself that is broken, but rather the infrastructure that serves these homes, like treatment plants, reservoirs, pipes, and pumps. 18% of people are using wells, but only 4916 out of 17383 wells (28%) are clean.
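
The derived figures in this paragraph follow from simple arithmetic on the numbers already shown. A short Python sketch, with illustrative variable names:

```python
# Percentages from the pct_people_served result.
pct_tap_in_home = 17
pct_tap_in_home_broken = 14

# People with water infrastructure installed at home, working or not.
pct_home_infrastructure = pct_tap_in_home + pct_tap_in_home_broken

# Share of that group whose infrastructure is not working.
pct_broken_of_home = pct_tap_in_home_broken / pct_home_infrastructure * 100

# Clean wells out of all wells, using the counts quoted in the text.
pct_clean_wells = 4916 / 17383 * 100

print(pct_home_infrastructure, round(pct_broken_of_home), round(pct_clean_wells))  # 31 45 28
```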

# 5. Start of a solution
At some point, we will have to fix or improve all of the infrastructure, so we should start thinking about how we can make a data-driven decision about how to do it. I think a simple approach is to fix the things that affect most people first. So let's write a query that ranks each type of source based on how many people in total use it. RANK() should tell you we are going to need a window function to do this, so let's think through the problem.

We will need the following columns:
- Type of sources 
- Total people served grouped by the types
- A rank based on the total people served, grouped by the types 
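
To make the ranking step concrete, here is a minimal Python sketch of what RANK() with ORDER BY ... DESC does: rows are sorted from largest to smallest, tied values share a rank, and the rank after a tie is skipped. The totals are the population_served figures from the earlier result; `rank_desc` is a hypothetical helper, not a library function.

```python
# Totals per source type, copied from the population_served result.
totals = {
    "shared_tap": 11945272,
    "well": 4841724,
    "tap_in_home": 4678880,
    "tap_in_home_broken": 3799720,
    "river": 2362544,
}


def rank_desc(values):
    """Return RANK()-style ranks, largest value first (hypothetical helper)."""
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    for position, (name, value) in enumerate(ordered, start=1):
        previous = ordered[position - 2] if position > 1 else None
        if previous is not None and previous[1] == value:
            ranks[name] = ranks[previous[0]]  # tie: reuse the previous rank
        else:
            ranks[name] = position            # no tie: rank equals row position
    return ranks


print(rank_desc(totals))
```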

In [37]:
%%sql

SELECT
    Type_of_water_source,
    SUM(number_of_people_served) AS total_people_served,
    RANK() OVER (ORDER BY SUM(number_of_people_served) DESC) AS Rank_by_population
FROM
    water_source
GROUP BY
    type_of_water_source;

Type_of_water_source,total_people_served,Rank_by_population
shared_tap,11945272,1
well,4841724,2
tap_in_home,4678880,3
tap_in_home_broken,3799720,4
river,2362544,5


But think about this: If someone has a tap in their home, they already have the best source available. Since we can’t do anything more to improve this, we should remove tap_in_home from the ranking before we continue. So use a window function on the total people served column, converting it into a rank.

In [38]:
%%sql

SELECT
    Type_of_water_source,
    SUM(number_of_people_served) AS total_people_served,
    RANK() OVER (ORDER BY SUM(number_of_people_served) DESC) AS Rank_by_population
FROM
    water_source
WHERE
    type_of_water_source !='tap_in_home'
GROUP BY
    type_of_water_source;

Type_of_water_source,total_people_served,Rank_by_population
shared_tap,11945272,1
well,4841724,2
tap_in_home_broken,3799720,3
river,2362544,4


Ok, so we should fix shared taps first, then wells, and so on. But the next question is, which shared taps or wells should be fixed first? We can use the same logic; the most used sources should really be fixed first.

So create a query to do this, and keep these requirements in mind:
1. The sources within each type should be assigned a rank.
2. Limit the results to only improvable sources.
3. Think about how to partition, filter and order the results set.
4. Order the results to see the top of the list.
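
The requirements above can be sketched in Python to show what PARTITION BY adds: the ranking restarts for every source type. The rows and IDs below are made up for illustration, and `priority_ranks` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows: (source_id, type_of_water_source, number_of_people_served)
rows = [
    ("A1", "river", 998),
    ("A2", "river", 998),
    ("A3", "river", 750),
    ("B1", "well", 400),
    ("B2", "tap_in_home", 600),
    ("B3", "well", 380),
]


def priority_ranks(rows, excluded_type="tap_in_home"):
    """Rank sources within each type by people served, largest first."""
    by_type = defaultdict(list)
    for source_id, source_type, served in rows:
        if source_type != excluded_type:  # keep only improvable sources
            by_type[source_type].append((source_id, served))

    ranked = []
    for source_type in sorted(by_type):   # one partition per source type
        members = sorted(by_type[source_type], key=lambda m: m[1], reverse=True)
        previous_served, previous_rank = None, 0
        for position, (source_id, served) in enumerate(members, start=1):
            # Tied values share a rank; otherwise the rank is the row position.
            rank = previous_rank if served == previous_served else position
            ranked.append((source_id, source_type, served, rank))
            previous_served, previous_rank = served, rank
    return ranked


for row in priority_ranks(rows):
    print(row)
```

Because tied sources share a rank, many sources serving the same number of people will all appear with rank 1.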

In [41]:
%%sql

SELECT
    Source_ID,
    type_of_water_source,
    number_of_people_served,
    RANK() OVER(PARTITION BY Type_of_water_source ORDER BY (number_of_people_served) DESC) AS priority_rank
FROM
    Water_source
WHERE
    type_of_water_source !='tap_in_home'
GROUP BY
    Source_ID;

Source_ID,type_of_water_source,number_of_people_served,priority_rank
SoRu34798224,river,998,1
SoRu35837224,river,998,1
SoRu36238224,river,998,1
SoRu36791224,river,998,1
SoRu36880224,river,998,1
SoRu38142224,river,998,1
SoRu37756224,river,998,1
SoMa33775224,river,998,1
KiRu30353224,river,998,1
SoIl32972224,river,998,1


# 6. Analysing queues
A recap from last time:
The visits table documented all of the visits our field surveyors made to each location. For most sources, one visit was enough, but if there were queues, they visited the location a couple of times to get a good idea of the time it took for people to queue for water. So we have the time that they collected the data, how many times the site was visited, and how long people had to queue for water. So, look at the information we have available, and think of what we could learn from it. Remember we can use some DateTime functions here to get some deeper insight into the water queueing situation in Maji Ndogo, like which day of the week it was, and what time.

Ok, these are some of the things I think are worth looking at:
1. How long did the survey take?
2. What is the average total queue time for water?
3. What is the average queue time on different days?
4. How can we communicate this information efficiently?

Question 1:
To calculate how long the survey took, we need to get the first and last dates (which functions can find the largest and smallest values?) and find the difference between them. Remember that with DateTime data, we can't just subtract the values. We have to use a function to get the difference in days.
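
For comparison, here is the same calculation sketched in Python, using the first and last timestamps that the query below returns. DATEDIFF() only looks at the date parts of its arguments, so the sketch drops the times before subtracting.

```python
from datetime import datetime

# First and last time_of_record values reported by the query.
first_record = datetime(2021, 1, 1, 9, 10)
last_record = datetime(2023, 7, 14, 13, 53)

# Compare calendar dates only, ignoring the time of day.
study_duration = (last_record.date() - first_record.date()).days

print(study_duration)  # 924
```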

In [43]:
%%sql

SELECT
    MAX(time_of_record),
    MIN(time_of_record),
    DATEDIFF(MAX(time_of_record),MIN(time_of_record)) AS study_duration
FROM
    visits;

MAX(time_of_record),MIN(time_of_record),study_duration
2023-07-14 13:53:00,2021-01-01 09:10:00,924


Question 2:
Let's see how long people have to queue on average in Maji Ndogo. Keep in mind that many sources, like tap_in_home, have no queues. These are just recorded as 0 in the time_in_queue column, so when we calculate averages, we need to exclude those rows.
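
A tiny example shows how much the zero rows would pull the average down. The queue times here are invented purely for illustration.

```python
from statistics import mean

# Hypothetical time_in_queue values in minutes; 0 means there was no queue.
queue_times = [0, 0, 120, 0, 60, 180, 0]

avg_all = mean(queue_times)                            # zeros included
avg_queued = mean([t for t in queue_times if t != 0])  # zeros excluded

print(round(avg_all, 2), avg_queued)  # 51.43 120
```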

In [44]:
%%sql

SELECT
    AVG(time_in_queue)
FROM
    Visits
WHERE 
    Time_in_queue != 0;

AVG(time_in_queue)
123.2574


Question 3:
So let's look at the queue times aggregated across the different days of the week.

In [45]:
%%sql

SELECT
    ROUND(AVG(time_in_queue)) AS avg_queue_time,
    dayname(time_of_record) AS day_of_week
FROM
    Visits
WHERE 
    Time_in_queue != 0
GROUP BY
    dayname(time_of_record);

avg_queue_time,day_of_week
120,Friday
246,Saturday
82,Sunday
137,Monday
108,Tuesday
97,Wednesday
105,Thursday


Question 4:
We can also look at what time during the day people collect water. Try to order the results in a meaningful way.

In [50]:
%%sql
SELECT
    ROUND(AVG(time_in_queue)) AS avg_queue_time,
    TIME_FORMAT(TIME(time_of_record), '%H:00') AS hour_of_day
FROM
    Visits
WHERE 
    Time_in_queue != 0
GROUP BY
    TIME_FORMAT(TIME(time_of_record), '%H:00')
ORDER BY
    TIME_FORMAT(TIME(time_of_record), '%H:00');

avg_queue_time,hour_of_day
149,06:00
149,07:00
149,08:00
118,09:00
114,10:00
111,11:00
112,12:00
115,13:00
114,14:00
114,15:00


Can you see that mornings and evenings are the busiest? It looks like people collect water before and after work. Wouldn't it be nice to break down the queue times for each hour of each day? In a spreadsheet, we can just create a pivot table. Pivot tables are not widely used in SQL, despite being useful for interpreting results, so there is no built-in function to do this for us. But sometimes the dataset is so massive that building the pivot in SQL is the only option. For rows, we will use the hour of the day in that nice format, and then make each column a different day!

To filter rows we use WHERE, but a CASE expression in SELECT lets us control which values end up in each column. We can use one CASE expression per day to separate the queue time column into a column for each day. So, when a row's DAYNAME(time_of_record) is Sunday, the Sunday column takes that row's time_in_queue, and it is NULL for any other day. AVG() ignores NULLs, so each column averages only its own day.
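
The same idea can be sketched in Python: bucket every non-zero queue time by (hour, day name) and average each bucket. The visit rows are made up, `pivot_queue_times` is a hypothetical helper, and `strftime("%A")` is assumed to return English day names (the default locale).

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (time_of_record, time_in_queue) rows.
visits = [
    (datetime(2021, 1, 2, 6, 15), 240),  # Saturday
    (datetime(2021, 1, 2, 6, 45), 260),  # Saturday
    (datetime(2021, 1, 3, 6, 30), 80),   # Sunday
    (datetime(2021, 1, 4, 7, 5), 190),   # Monday
    (datetime(2021, 1, 4, 7, 50), 0),    # Monday, no queue: filtered out
]


def pivot_queue_times(visits):
    """Average queue time per (hour_of_day, day name) cell."""
    cells = defaultdict(list)
    for recorded_at, minutes in visits:
        if minutes != 0:                          # skip rows with no queue
            hour = recorded_at.strftime("%H:00")  # hour bucket, e.g. '06:00'
            day = recorded_at.strftime("%A")      # day name, e.g. 'Saturday'
            cells[(hour, day)].append(minutes)
    return {key: round(sum(v) / len(v)) for key, v in cells.items()}


print(pivot_queue_times(visits))
```

Cells with no visits are simply absent from the result, which corresponds to a NULL in the SQL pivot.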

In [51]:
%%sql

SELECT
    TIME_FORMAT(TIME(time_of_record), '%H:00') AS hour_of_day,
    ROUND(AVG(
        CASE
            WHEN DAYNAME(time_of_record) = 'Sunday' THEN time_in_queue
            ELSE NULL
        END
    ), 0) AS Sunday,
    ROUND(AVG(
        CASE
            WHEN DAYNAME(time_of_record) = 'Monday' THEN time_in_queue
            ELSE NULL
        END
    ), 0) AS Monday,
    ROUND(AVG(
        CASE
            WHEN DAYNAME(time_of_record) = 'Tuesday' THEN time_in_queue
            ELSE NULL
        END
    ), 0) AS Tuesday,
    ROUND(AVG(
        CASE
            WHEN DAYNAME(time_of_record) = 'Wednesday' THEN time_in_queue
            ELSE NULL
        END
    ), 0) AS Wednesday,
    ROUND(AVG(
        CASE
            WHEN DAYNAME(time_of_record) = 'Thursday' THEN time_in_queue
            ELSE NULL
        END
    ), 0) AS Thursday,
    ROUND(AVG(
        CASE
            WHEN DAYNAME(time_of_record) = 'Friday' THEN time_in_queue
            ELSE NULL
        END
    ), 0) AS Friday,
    ROUND(AVG(
        CASE
            WHEN DAYNAME(time_of_record) = 'Saturday' THEN time_in_queue
            ELSE NULL
        END
    ), 0) AS Saturday
FROM
    visits
WHERE
    time_in_queue != 0 
GROUP BY
    hour_of_day
ORDER BY
    hour_of_day;

hour_of_day,Sunday,Monday,Tuesday,Wednesday,Thursday,Friday,Saturday
06:00,79,190,134,112,134,153,247
07:00,82,186,128,111,139,156,247
08:00,86,183,130,119,129,153,247
09:00,84,127,105,94,99,107,252
10:00,83,119,99,89,95,112,259
11:00,78,115,102,86,99,104,236
12:00,78,115,97,88,96,109,239
13:00,81,122,97,98,101,115,242
14:00,83,127,104,92,96,110,244
15:00,83,126,104,88,92,110,248


1. Queues are very long on Monday mornings and Monday evenings as people rush to get water.
2. Wednesday has the lowest queue times of the weekdays, but there are long queues on Wednesday evenings.
3. People have to queue pretty much twice as long on Saturdays compared to the weekdays. It looks like people spend their Saturdays queueing for water, perhaps for the week's supply?
4. The shortest queues are on Sundays, and this is a cultural thing. The people of Maji Ndogo prioritise family and religion, so Sundays are spent with family and friends.

By understanding CASE, we could build a complex query that aggregates our data in a format that is very easy to understand.


# 7. Reporting insights

Water Accessibility and infrastructure summary report. This survey aimed to identify the water sources people use and determine both the total and average number of users for each source. Additionally, it examined the duration citizens typically spend in queues to access water.

Insights
1. Most water sources are rural.
2. 43% of our people are using shared taps. 2000 people often share one tap.
3. 31% of our population has water infrastructure in their homes, but within that group, 45% face non-functional systems due to issues with pipes, pumps, and reservoirs.
4. 18% of our people are using wells, but only 28% of those wells are clean.
5. Our citizens often face long wait times for water, averaging more than 120 minutes.
6. In terms of queues:
- Queues are very long on Saturdays.
- Queues are longer in the mornings and evenings.
- Wednesdays and Sundays have the shortest queues.

Start of our plan
We have started thinking about a plan:
1. We want to focus our efforts on improving the water sources that affect the most people.
- Most people will benefit if we improve the shared taps first.
- Wells are a good source of water, but many are contaminated. Fixing this will benefit a lot of people.
- Fixing existing infrastructure will help many people. If they have running water again, they won't have to queue, thereby shortening queue times for others. So we can solve two problems at once.
- Installing taps in homes will stretch our resources too thin, so for now, if the queue times are low, we won't improve that source.
2. Most water sources are in rural areas. We need to ensure our teams know this, as it means they will have to make these repairs/upgrades in rural areas where road conditions, supplies, and labour are harder challenges to overcome.

Practical solutions
1. If communities are using rivers, we can dispatch trucks to those regions to provide water in the short term, while we send out crews to drill wells, providing a more permanent solution.
2. If communities are using wells, we can install filters to purify the water. For wells with biological contamination, we can install UV filters that kill microorganisms, and for *polluted wells*, we can install reverse osmosis filters. In the long term, we need to figure out why these sources are polluted.
3. For shared taps, in the short term, we can send additional water tankers to the busiest taps, on the busiest days. We can use the queue time pivot table we made to send tankers at the busiest times. Meanwhile, we can start the work on installing extra taps where they are needed. According to UN standards, the maximum acceptable wait time for water is 30 minutes. With this in mind, our aim is to install taps to get queue times below 30 min.
4. Shared taps with short queue times (< 30 min) represent a logistical challenge to further reduce waiting times. The most effective solution, installing taps in homes, is resource-intensive and better suited as a long-term goal.
5. Addressing broken infrastructure offers a significant impact even with just a single intervention. It is expensive to fix, but so many people can benefit from repairing one facility. For example, fixing a reservoir or pipe that multiple taps are connected to. We will have to find the commonly affected areas though to see where the problem actually is.