#  Maji Ndogo Water Accessibility Project — Final Queries and Answers

This section contains the key SQL queries and insights generated during the Maji Ndogo Water Accessibility Analysis.  
Each query is documented for reproducibility and reference.





### Question 1 — Format the Date
We want to display the `time_of_record` column of the `visits` table in the format **"D Month YYYY"** (for example, 1 January 2021).  
We combine the `DAY()`, `MONTHNAME()`, and `YEAR()` functions with `CONCAT()`; because `DAY()` returns a plain number, single-digit days are not zero-padded.


In [10]:
%%sql

SELECT 
    CONCAT(DAY(time_of_record), ' ', MONTHNAME(time_of_record), ' ', YEAR(time_of_record)) 
    AS formatted_date
FROM 
    visits
LIMIT 5;


 * mysql+pymysql://root:***@localhost:3306/md_water_services
5 rows affected.


formatted_date
1 January 2021
1 January 2021
1 January 2021
1 January 2021
1 January 2021
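
As a cross-check outside the database, the same concatenation can be sketched in Python. The helper name `format_record_date`, the `pad_day` switch, and the sample timestamp are illustrative assumptions, not part of the project. The sketch also shows the zero-padded variant for comparison.

```python
from datetime import datetime

# English month names, hard-coded so the result does not depend on the locale.
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def format_record_date(ts: datetime, pad_day: bool = False) -> str:
    """Mimic CONCAT(DAY(ts), ' ', MONTHNAME(ts), ' ', YEAR(ts)).

    pad_day is a hypothetical switch that zero-pads the day instead.
    """
    day = f"{ts.day:02d}" if pad_day else str(ts.day)
    return f"{day} {MONTH_NAMES[ts.month - 1]} {ts.year}"


sample = datetime(2021, 1, 1, 9, 10)  # hypothetical time_of_record value
print(format_record_date(sample))                # 1 January 2021
print(format_record_date(sample, pad_day=True))  # 01 January 2021
```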


### Question 2 — Annual Rate of Change (ARC)
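We calculate the **Annual Rate of Change (ARC)** for basic rural water services (`wat_bas_r`).  
The window function partitions the rows by `name` (country or region) and orders them by `year`, so `LAG()` returns the previous record for the same `name`. The first record in each partition has no predecessor and yields NULL, which appears as a blank cell in the output.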


In [11]:
%%sql

SELECT
    name,
    wat_bas_r - LAG(wat_bas_r) OVER (PARTITION BY name ORDER BY year) AS annual_rate_change
FROM 
    global_water_access
ORDER BY 
    name, year;


464 rows affected.


name,annual_rate_change
Afghanistan,
Afghanistan,13.339405059814451
Albania,
Albania,3.4641036987304688
Algeria,
Algeria,1.6847991943359375
American Samoa,
American Samoa,
Andorra,
Andorra,0.0
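
To see how `LAG()` behaves on a small scale, here is a minimal sketch that runs the same window expression against an in-memory SQLite database from Python. SQLite stands in for MySQL here (window functions require SQLite 3.25 or newer), and the region names, years, and percentages are made-up values for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE global_water_access (name TEXT, year INTEGER, wat_bas_r REAL)"
)
# Hypothetical rows: two records per region, as in the output above.
conn.executemany(
    "INSERT INTO global_water_access VALUES (?, ?, ?)",
    [
        ("Region A", 2015, 40.0),
        ("Region A", 2020, 55.5),
        ("Region B", 2015, 90.0),
        ("Region B", 2020, 90.0),
    ],
)

arc_rows = conn.execute(
    """
    SELECT name,
           year,
           wat_bas_r - LAG(wat_bas_r) OVER (PARTITION BY name ORDER BY year)
               AS annual_rate_change
    FROM global_water_access
    ORDER BY name, year
    """
).fetchall()
conn.close()

for row in arc_rows:
    print(row)  # the first row of each region has None in the last column
```

Selecting `year` alongside the difference makes it easier to tell which of a region's rows is the earlier one.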


### Question 3 — Worst Performing Employees
Find the two employees who visited the fewest distinct sites (the worst performers by this measure),  
and show how many sites the lowest-ranked employee visited.


In [26]:
%%sql
SELECT 
    e.employee_name AS employee_name,
    COUNT(DISTINCT v.location_id) AS sites_visited
FROM 
    visits AS v
JOIN 
    employee AS e
  ON v.assigned_employee_id = e.assigned_employee_id
GROUP BY 
    e.employee_name
ORDER BY 
    sites_visited ASC
LIMIT 2;


2 rows affected.


employee_name,sites_visited
Kunto Asha,8
Lesedi Kofi,94


### Question 4 — Average Queue Time for Shared Taps
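This query computes a **running average queue time for each shared tap** (one partition per `location_id`),  
updated with every visit in `visit_count` order. The filter `visit_count > 1` keeps only repeat visits, so sources that were visited just once do not appear.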


In [28]:
%%sql

SELECT 
    location_id,
    time_in_queue,
    AVG(time_in_queue) OVER (PARTITION BY location_id ORDER BY visit_count) AS total_avg_queue_time
FROM 
    visits
WHERE 
    visit_count > 1
ORDER BY 
    location_id, time_of_record
LIMIT 10;


10 rows affected.


location_id,time_in_queue,total_avg_queue_time
AkHa00036,113,113.0
AkHa00036,229,171.0
AkHa00036,400,247.3333
AkHa00036,120,215.5
AkHa00036,185,209.4
AkHa00036,158,200.8333
AkHa00036,226,204.4286
AkHa00090,50,50.0
AkHa00090,63,56.5
AkHa00090,12,41.6667
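
The running average comes from the window's `ORDER BY`: when no frame is given, the frame runs from the start of the partition to the current row. The sketch below reproduces this on a tiny in-memory SQLite table from Python. SQLite is a stand-in for MySQL, and the location names, `visit_count` values, and first-visit queue times are assumptions for illustration. Only the queue times 50, 63, and 12 are borrowed from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (location_id TEXT, visit_count INTEGER, time_in_queue INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("LocA", 1, 30),  # first visit: removed by visit_count > 1
        ("LocA", 2, 50),
        ("LocA", 3, 63),
        ("LocA", 4, 12),
        ("LocB", 1, 15),  # visited once: never appears in the result
    ],
)

running = conn.execute(
    """
    SELECT location_id,
           time_in_queue,
           AVG(time_in_queue) OVER (
               PARTITION BY location_id ORDER BY visit_count
           ) AS total_avg_queue_time
    FROM visits
    WHERE visit_count > 1
    ORDER BY location_id, visit_count
    """
).fetchall()
conn.close()

for location_id, queue, avg in running:
    print(location_id, queue, round(avg, 4))
```

Because `WHERE` is applied before the window function, the first visit never contributes to the average.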


### Question 5 — Using TRIM() Function
We use `TRIM()` to remove leading and trailing spaces from text values.  
Example: employee Farai Nia lives at `'33 Angelique Kidjo Avenue  '`.  
We’ll see what the result looks like after trimming.


In [29]:
%%sql

SELECT TRIM('33 Angelique Kidjo Avenue  ') AS trimmed_address;


1 rows affected.


trimmed_address
33 Angelique Kidjo Avenue
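
Trailing spaces are invisible in printed output, so comparing string lengths before and after trimming is a more reliable check. This is a minimal sketch using an in-memory SQLite database from Python as a stand-in for MySQL. The variable names are illustrative.

```python
import sqlite3

address = "33 Angelique Kidjo Avenue  "  # two trailing spaces, as in the example

conn = sqlite3.connect(":memory:")
trimmed, length_before, length_after = conn.execute(
    "SELECT TRIM(?), LENGTH(?), LENGTH(TRIM(?))",
    (address, address, address),
).fetchone()
conn.close()

# repr() makes the string boundaries visible.
print(repr(trimmed), length_before, length_after)
```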


### Question 6 — Employees in Dahabu
Find how many employees live in **Dahabu**.


In [32]:
%%sql

SELECT 
    COUNT(*) AS employees_in_dahabu
FROM 
    employee
WHERE 
    town_name = 'Dahabu';


1 rows affected.


employees_in_dahabu
6


### Question 7 — Employees in Harare, Kilimani
We want to find the number of employees living in **Harare, Kilimani**.


In [35]:
%%sql

SELECT 
    COUNT(*) AS employees_in_harare_kilimani
FROM 
    employee
WHERE 
    town_name = 'Harare' AND province_name = 'Kilimani';


1 rows affected.


employees_in_harare_kilimani
2


### Question 8 — Average People per Well
Calculate how many people share a well on average.  
We’ll use the `water_source` table and round the result to 0 decimals.


In [36]:
%%sql

SELECT 
    ROUND(AVG(number_of_people_served), 0) AS avg_people_per_well
FROM 
    water_source
WHERE 
    type_of_water_source = 'well';


1 rows affected.


avg_people_per_well
279


### Question 9 — People Using Taps
We’ll calculate the **total population served by taps** (shared or home taps).  
To filter, we’ll use the `LIKE` operator.


In [37]:
%%sql

SELECT 
    SUM(number_of_people_served) AS total_people_using_taps
FROM 
    water_source
WHERE 
    type_of_water_source LIKE '%tap%';


1 rows affected.


total_people_using_taps
20423872
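
The pattern `'%tap%'` matches every source type whose name contains `tap` anywhere, so it is worth checking which categories it actually picks up. The sketch below does this on an in-memory SQLite table from Python. SQLite is a stand-in for MySQL, and the category names and head counts are assumed values for illustration rather than the project's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE water_source "
    "(type_of_water_source TEXT, number_of_people_served INTEGER)"
)
# Hypothetical categories and counts.
conn.executemany(
    "INSERT INTO water_source VALUES (?, ?)",
    [
        ("shared_tap", 1000),
        ("tap_in_home", 600),
        ("tap_in_home_broken", 400),
        ("well", 300),
        ("river", 200),
    ],
)

matched_types = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT type_of_water_source FROM water_source "
        "WHERE type_of_water_source LIKE '%tap%' "
        "ORDER BY type_of_water_source"
    )
]
total_people = conn.execute(
    "SELECT SUM(number_of_people_served) FROM water_source "
    "WHERE type_of_water_source LIKE '%tap%'"
).fetchone()[0]
conn.close()

print(matched_types)  # every category containing 'tap', broken taps included
print(total_people)
```

If some tap categories should be excluded from the total, an explicit `IN (...)` list would be a safer filter than a wildcard.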


### Question 10 — Average Queue Times by Day and Hour
Using the pivot-table query, we compare average queue times for specific days and hours. The output below shows only the first 10 of the 14 hourly rows:
- Saturday, 12:00–13:00: 239  
- Tuesday, 18:00–19:00: not among the rows shown  
- Sunday, 09:00–10:00: 84


In [38]:
%%sql

-- Pivot table query snippet for reference
SELECT
    TIME_FORMAT(TIME(time_of_record), '%H:00') AS hour_of_day,
    ROUND(AVG(CASE WHEN DAYNAME(time_of_record) = 'Sunday' THEN time_in_queue END),0) AS Sunday,
    ROUND(AVG(CASE WHEN DAYNAME(time_of_record) = 'Monday' THEN time_in_queue END),0) AS Monday,
    ROUND(AVG(CASE WHEN DAYNAME(time_of_record) = 'Tuesday' THEN time_in_queue END),0) AS Tuesday,
    ROUND(AVG(CASE WHEN DAYNAME(time_of_record) = 'Wednesday' THEN time_in_queue END),0) AS Wednesday,
    ROUND(AVG(CASE WHEN DAYNAME(time_of_record) = 'Thursday' THEN time_in_queue END),0) AS Thursday,
    ROUND(AVG(CASE WHEN DAYNAME(time_of_record) = 'Friday' THEN time_in_queue END),0) AS Friday,
    ROUND(AVG(CASE WHEN DAYNAME(time_of_record) = 'Saturday' THEN time_in_queue END),0) AS Saturday
FROM 
    visits
WHERE 
    time_in_queue != 0
GROUP BY 
    hour_of_day
ORDER BY 
    hour_of_day;


14 rows affected.


hour_of_day,Sunday,Monday,Tuesday,Wednesday,Thursday,Friday,Saturday
06:00,79,190,134,112,134,153,247
07:00,82,186,128,111,139,156,247
08:00,86,183,130,119,129,153,247
09:00,84,127,105,94,99,107,252
10:00,83,119,99,89,95,112,259
11:00,78,115,102,86,99,104,236
12:00,78,115,97,88,96,109,239
13:00,81,122,97,98,101,115,242
14:00,83,127,104,92,96,110,244
15:00,83,126,104,88,92,110,248
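
The pivot relies on conditional aggregation: each `CASE` without an `ELSE` yields NULL for rows from other days, and `AVG()` ignores NULLs. Below is a minimal sketch of the same idea on an in-memory SQLite database from Python. SQLite has no `DAYNAME()` or `TIME_FORMAT()`, so `strftime('%w', ...)` (0 = Sunday through 6 = Saturday) and `strftime('%H:00', ...)` are swapped in. The timestamps and queue times are made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (time_of_record TEXT, time_in_queue INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("2021-01-02 12:05:00", 240),  # Saturday
        ("2021-01-02 12:40:00", 260),  # Saturday
        ("2021-01-03 09:15:00", 80),   # Sunday
        ("2021-01-03 09:50:00", 0),    # Sunday, zero queue: filtered out
        ("2021-01-05 18:30:00", 100),  # Tuesday
    ],
)

pivot = conn.execute(
    """
    SELECT strftime('%H:00', time_of_record) AS hour_of_day,
           AVG(CASE WHEN strftime('%w', time_of_record) = '0'
                    THEN time_in_queue END) AS Sunday,
           AVG(CASE WHEN strftime('%w', time_of_record) = '2'
                    THEN time_in_queue END) AS Tuesday,
           AVG(CASE WHEN strftime('%w', time_of_record) = '6'
                    THEN time_in_queue END) AS Saturday
    FROM visits
    WHERE time_in_queue != 0
    GROUP BY hour_of_day
    ORDER BY hour_of_day
    """
).fetchall()
conn.close()

for row in pivot:
    print(row)  # None marks a day/hour combination with no visits
```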
