# Investigating the Chemical Properties of Drinking Water: A Statistical Analysis of pH and Conductivity in Potable and Non-Potable Water Sources
### Group 34: Roberto Mulliadi, Brian Suharianto, Yuxin Chen, Angelina Hsu

## Introduction
Access to potable water is crucial for various human activities, including domestic use, agriculture, energy production, industrial development, and poverty reduction. These help society achieve economic prosperity (World Health Organization, 2022).

Our project analyzes how the pH levels and conductivity of water affect its potability. We will carry out hypothesis testing and construct confidence intervals based on the Water Potability Dataset from __[Kaggle](https://www.kaggle.com/datasets/adityakadiwal/water-potability)__, which includes data collected from 3276 distinct water bodies across the world. 

For each of pH and conductivity, the point estimate is the difference in sample means between potable and non-potable water. Because the population standard deviations are unknown, we use the sample standard deviations to measure variability. The population of interest is all potentially potable water bodies worldwide.

##### Definitions
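- Conductivity (numerical): Electrical conductivity of the water in microsiemens per centimeter; it increases with the amount of dissolved ions and salts.
- pH (numerical): Measure of how acidic or basic the water is, determined by the concentration of hydrogen ions on a logarithmic scale.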
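
As a brief reminder of the standard textbook definition (not something derived from the dataset), pH is the negative base-10 logarithm of the hydrogen ion concentration $[\mathrm{H^+}]$ in mol/L, so neutral water with $[\mathrm{H^+}] = 10^{-7}$ mol/L has a pH of 7:
$$
\mathrm{pH} = -\log_{10}[\mathrm{H^+}], \qquad -\log_{10}\left(10^{-7}\right) = 7
$$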

## Research Question
How do pH levels and conductivity affect the potability of all potential sources of drinking water worldwide?

Hypotheses for pH levels ($\mu_1$ is the mean pH level of potable water and $\mu_2$ is the mean pH level of non-potable water)

$H_0: \mu_1 - \mu_2 = 0$ 

$H_A: \mu_1 - \mu_2 \neq 0$

Hypotheses for Conductivity ($\mu_1$ is the mean Conductivity of potable water and $\mu_2$ is the mean Conductivity of non-potable water)

$H_0: \mu_1 - \mu_2 = 0$ 

$H_A: \mu_1 - \mu_2 \neq 0$

## Preliminary Data Analysis
Load the dataset, standardize all columns except Potability, and check the linear correlation between the 9 variables. The correlation matrix shows no significant linear correlations.
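
To make the standardization step concrete, here is a minimal sketch on a toy data frame (the values are made up and are not taken from the water dataset). It checks that `scale()` reproduces hand-computed z-scores, and that Pearson correlations are the same before and after standardizing, so the correlation matrix can be read on either scale.

```r
# Toy data (hypothetical values, not from the water dataset)
toy <- data.frame(a = c(1, 2, 3, 4, NA), b = c(10, 8, 7, 3, 1))

# scale() subtracts each column mean and divides by the column standard deviation,
# skipping missing values when computing both
toy_std <- scale(toy, center = TRUE, scale = TRUE)

# The same z-scores computed by hand for column a
z_a <- (toy$a - mean(toy$a, na.rm = TRUE)) / sd(toy$a, na.rm = TRUE)
same_z <- isTRUE(all.equal(as.numeric(toy_std[, "a"]), z_a))

# Pearson correlation does not change under standardization
cor_raw <- cor(toy, use = "pairwise.complete.obs")
cor_std <- cor(toy_std, use = "pairwise.complete.obs")
same_cor <- isTRUE(all.equal(cor_raw, cor_std))

c(same_z = same_z, same_cor = same_cor)
```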

In [4]:
# The following libraries are required to run and visualize the dataset
# infer and cowplot are installed only when they are not already available
library(tidyverse, quietly = TRUE)
library(dplyr, quietly = TRUE)
if (!requireNamespace("infer", quietly = TRUE)) install.packages("infer", quiet = TRUE)
library(infer, quietly = TRUE)
library(ggplot2, quietly = TRUE)
library(tidyr, quietly = TRUE)
if (!requireNamespace("cowplot", quietly = TRUE)) install.packages("cowplot", quiet = TRUE)
library(cowplot, quietly = TRUE)
library(gridExtra, quietly = TRUE)

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.4.1 
✔ readr   2.1.2      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

Attaching package: ‘gridExtra’


The following object is masked from ‘package:dplyr’:

    combine




In [5]:
water_data <- read_csv(url("https://raw.githubusercontent.com/robertomulliadi/STAT201-Project/main/water_potability%20(1).csv"))
head(water_data, 6)

[1mRows: [22m[34m3276[39m [1mColumns: [22m[34m10[39m
[36m──[39m [1mColumn specification[22m [36m────────────────────────────────────────────────────────[39m
[1mDelimiter:[22m ","
[32mdbl[39m (10): ph, Hardness, Solids, Chloramines, Sulfate, Conductivity, Organic_...

[36mℹ[39m Use `spec()` to retrieve the full column specification for this data.
[36mℹ[39m Specify the column types or set `show_col_types = FALSE` to quiet this message.


| ph | Hardness | Solids | Chloramines | Sulfate | Conductivity | Organic_carbon | Trihalomethanes | Turbidity | Potability |
|---|---|---|---|---|---|---|---|---|---|
| NA | 204.8905 | 20791.32 | 7.300212 | 368.5164 | 564.3087 | 10.379783 | 86.99097 | 2.963135 | 0 |
| 3.71608 | 129.4229 | 18630.06 | 6.635246 | NA | 592.8854 | 15.180013 | 56.32908 | 4.500656 | 0 |
| 8.099124 | 224.2363 | 19909.54 | 9.275884 | NA | 418.6062 | 16.868637 | 66.42009 | 3.055934 | 0 |
| 8.316766 | 214.3734 | 22018.42 | 8.059332 | 356.8861 | 363.2665 | 18.436524 | 100.34167 | 4.628771 | 0 |
| 9.092223 | 181.1015 | 17978.99 | 6.5466 | 310.1357 | 398.4108 | 11.558279 | 31.99799 | 4.075075 | 0 |
| 5.584087 | 188.3133 | 28748.69 | 7.544869 | 326.6784 | 280.4679 | 8.399735 | 54.91786 | 2.559708 | 0 |


In [None]:
# Resizing the width and height of plots for better/clearer visualization
options(repr.plot.width = 14, repr.plot.height = 9)

# Standardizing all columns except for Potability
water_data_standardized <- scale(water_data[1:9], center = TRUE) |>
    cbind(water_data[10])
head(water_data_standardized, 6)

# Creating a scatterplot matrix of all variables in the dataset.
# pairs() is a base R graphics function: it draws the plot directly and returns NULL,
# so ggplot2 layers such as geom_point() cannot be added to it.
# Point transparency is set through the col argument instead.
pairs(water_data_standardized[, 1:9],
      labels = colnames(water_data_standardized[, 1:9]),
      main = "Scatterplot Matrix of All Variables",
      col = rgb(0, 0, 0, alpha = 0.1),
      cex = 0.8,
      cex.labels = 1.5,
      cex.main = 3)

| | ph | Hardness | Solids | Chloramines | Sulfate | Conductivity | Organic_carbon | Trihalomethanes | Turbidity | Potability |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | NA | 0.2591551 | -0.139449583 | 0.1123977 | 0.8388053 | 1.70869338 | -1.1804704 | 1.273240605 | -1.2861012 | 0 |
| 2 | -2.1104392 | -2.0361028 | -0.3859277338 | -0.3076467 | NA | 2.06226017 | 0.2705559 | -0.622393287 | 0.6841135 | 0 |
| 3 | 0.6387237 | 0.8475354 | -0.2400106967 | 1.3603862 | NA | -0.09401776 | 0.7809976 | 0.001471379 | -1.1671873 | 0 |
| 4 | 0.7752344 | 0.5475678 | 0.0004932291 | 0.5919175 | 0.5579943 | -0.77871108 | 1.2549429 | 2.098631452 | 0.848282 | 0 |
| 5 | 1.2616222 | -0.4643582 | -0.4601783194 | -0.3636424 | -0.5707833 | -0.34388641 | -0.8242313 | -2.1266326 | 0.1387643 | 0 |
| 6 | -0.9387754 | -0.2450192 | 0.7680379557 | 0.2669421 | -0.1713654 | -1.80314114 | -1.7790047 | -0.709639884 | -1.8030621 | 0 |



In [None]:
# Keep the nine standardized predictor columns (drop Potability, the 10th column)
water_data_corr <- water_data_standardized[, 1:9]

# Compute the correlation matrix; each pairwise correlation uses only the rows
# where both variables are observed
cor_matrix <- cor(water_data_corr, use = "pairwise.complete.obs")

# Convert the correlation matrix to a long format
cor_matrix_long <- reshape2::melt(cor_matrix)

# Create a correlation matrix plot using ggplot2
ggplot(cor_matrix_long, aes(x = Var1, y = Var2, fill = value)) +
    geom_tile() +
    scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0) +
    geom_text(aes(label = round(value, 2)), size = 5) +
    theme(text = element_text(size = 18)) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 14)) +
    theme(axis.text.y = element_text(size = 14)) +
    
    ggtitle("Correlation Matrix of All Variables in the Dataset") +
    labs(x = "", y = "", fill = "Correlation")

In [None]:
# Illustrative simulation with made-up parameters (not the water data)
mean1 <- 10
sd1 <- 2
mean2 <- 12
sd2 <- 2

# Set the sample sizes for two populations
n1 <- 20
n2 <- 20
df_pooled <- n1 + n2 - 2

# Draw one sample from each population
set.seed(123) # for reproducibility
x <- rnorm(n1, mean = mean1, sd = sd1)
y <- rnorm(n2, mean = mean2, sd = sd2)
obs_diff <- mean(x) - mean(y)

# Standardize the observed difference with the pooled standard error,
# so that it is on the same scale as the t distributions plotted below
pooled_sd <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / df_pooled)
obs_t <- obs_diff / (pooled_sd * sqrt(1 / n1 + 1 / n2))

# Calculate the two-sided p-value
p_value <- 2 * pt(-abs(obs_t), df = df_pooled)

# Null model: central t distribution of the test statistic
t_values <- seq(-8, 5, length = 1000)
null_dist <- data.frame(t_stat = t_values,
                        density = dt(t_values, df = df_pooled))

# True sampling distribution: noncentral t, shifted by the true difference in means
ncp <- (mean1 - mean2) / sqrt(sd1^2 / n1 + sd2^2 / n2)
true_dist <- data.frame(t_stat = t_values,
                        density = dt(t_values, df = df_pooled, ncp = ncp))

# Critical value for a two-sided test at the 5% level
t_crit <- qt(0.975, df = df_pooled)

# Create a ggplot object
ggplot() +
  # Plot the true sampling distribution (blue) and the null model distribution (red)
  geom_line(data = true_dist, aes(x = t_stat, y = density), color = "blue", size = 1) +
  geom_line(data = null_dist, aes(x = t_stat, y = density), color = "red", size = 1) +
  # Shade the two rejection regions under the null distribution
  geom_area(data = subset(null_dist, t_stat <= -t_crit), aes(x = t_stat, y = density),
            fill = "green", alpha = 0.5) +
  geom_area(data = subset(null_dist, t_stat >= t_crit), aes(x = t_stat, y = density),
            fill = "green", alpha = 0.5) +
  # Add a vertical line for the observed test statistic
  geom_vline(xintercept = obs_t, color = "black", size = 1, linetype = "dashed") +
  # Add vertical lines for the critical values
  geom_vline(xintercept = c(-t_crit, t_crit), color = "green", size = 1, linetype = "dashed") +
  # Label the regions where the true distribution gives a correct rejection or a wrong non-rejection
  annotate("text", x = -t_crit - 0.1, y = 0.4, label = "Correct Rejection of Null Hypothesis",
           hjust = 1, color = "black") +
  annotate("text", x = -t_crit + 0.1, y = 0.4, label = "Wrong Non-Rejection of Null Hypothesis",
           hjust = 0, color = "black") +
  xlab("t Statistic") +
  ylab("Density") +
  ggtitle("True Sampling Distribution vs. Null Model Distribution")

Split the dataset into two categories based on Potability values: 1 indicates drinkable water, and 0 indicates non-drinkable. Then, select only pH and Conductivity variables while removing missing values.

In [None]:
# Splitting the dataset into potable and non-potable water
# Potable water data
water_data_potable <- water_data |>
    filter(Potability == 1, !is.na(ph), !is.na(Conductivity)) |>
    select(ph, Conductivity, Potability)
head(water_data_potable, 5)
nrow(water_data_potable)

# Non potable water data
water_data_non_potable <- water_data |>
    filter(Potability == 0, !is.na(ph), !is.na(Conductivity)) |>
    select(ph, Conductivity, Potability)
head(water_data_non_potable, 5)
nrow(water_data_non_potable)


Plot the distributions of both variables in each sample. Note that pH and conductivity are approximately normal.

In [None]:
# Plotting the distribution of ph and Conductivity for potable water
# Create histogram for pH of potable water
potable_water_ph <- ggplot(water_data_potable, aes(x = ph)) +
    geom_histogram(alpha = 0.5, bins = 25, fill = "red") +
    labs(title = "Potable Water pH Sample Distribution",
         x = "pH",
         y = "Frequency") +
    theme(text = element_text(size = 16))

# Create histogram for Conductivity of potable water
potable_water_conductivity <- ggplot(water_data_potable, aes(x = Conductivity)) +
    geom_histogram(alpha = 0.5, bins = 25, fill = "green") +
    labs(title = "Potable Water Conductivity Sample Distribution",
         x = "Conductivity (Microsiemens/cm)",
         y = "Frequency") +
    theme(text = element_text(size = 16))

# Plotting the distribution of ph and Conductivity for non-potable water
# Create histogram for pH of non-potable water
non_potable_water_ph <- ggplot(water_data_non_potable, aes(x = ph)) +
    geom_histogram(alpha = 0.5, bins = 25, fill = "red") +
    labs(title = "Non-Potable Water pH Sample Distribution",
         x = "pH",
         y = "Frequency") +
    theme(text = element_text(size = 16))

# Create histogram for Conductivity of non-potable water
non_potable_water_conductivity <- ggplot(water_data_non_potable, aes(x = Conductivity)) +
    geom_histogram(alpha = 0.5, bins = 25, fill = "green") +
    labs(title = "Non-Potable Water Conductivity Sample Distribution",
         x = "Conductivity (Microsiemens/cm)",
         y = "Frequency") +
    theme(text = element_text(size = 16))

# Combine all histograms into a single plot
combined_plot <- plot_grid(potable_water_ph, potable_water_conductivity,
                              non_potable_water_ph, non_potable_water_conductivity,
                              ncol = 2)
combined_plot
    

Then calculate the point estimates so that we have an estimate of the unknown population parameter.

In [None]:
# Calculating the mean pH and conductivity of potable water, along with variance
potable_statistics <- water_data_potable |>
    summarize(mean_ph = mean(ph),
              mean_conductivity = mean(Conductivity),
              var_ph = var(ph),
              var_conductivity = var(Conductivity),
              n = n())
potable_statistics

# Calculating the mean pH and conductivity of non-potable water, along with variance
non_potable_statistics <- water_data_non_potable |>
    summarize(mean_ph = mean(ph),
              mean_conductivity = mean(Conductivity),
              var_ph = var(ph),
              var_conductivity = var(Conductivity),
              n = n())
non_potable_statistics

In [None]:
# Calculating the difference in means of pH and conductivity of potable vs non-potable water
point_estimate_ph <- potable_statistics$mean_ph - non_potable_statistics$mean_ph
point_estimate_conductivity <- potable_statistics$mean_conductivity - non_potable_statistics$mean_conductivity

# Creating a table to present the point estimates
point_estimates <- data.frame(
    diff_in_means_ph = c(point_estimate_ph),
    diff_in_means_conductivity = c(point_estimate_conductivity)
)
point_estimates

Next we calculate the test statistic for both two-sample t-tests using the formula:
$$
T = \frac{\bar{x}_{\text{potable}} - \bar{x}_{\text{non-potable}}}{\sqrt{\frac{s^2_{\text{potable}}}{n_1}+\frac{s^2_{\text{non-potable}}}{n_2}}}
$$

In [None]:
# Calculating the test statistic for pH 
test_statistic_ph <- (point_estimates$diff_in_means_ph) / 
                        sqrt((potable_statistics$var_ph/nrow(water_data_potable)) + 
                             (non_potable_statistics$var_ph/nrow(water_data_non_potable)))

# Calculating the test statistic for Conductivity
test_statistic_conductivity <- (point_estimates$diff_in_means_conductivity) / 
                        sqrt((potable_statistics$var_conductivity/nrow(water_data_potable)) + 
                             (non_potable_statistics$var_conductivity/nrow(water_data_non_potable)))

# Print test statistics for each test
cat("Test Statistic for pH Level: ", test_statistic_ph, "\n")
cat("Test Statistic for Conductivity: ", test_statistic_conductivity)

Then we calculate the degrees of freedom using the Welch-Satterthwaite equation, since the sample sizes differ and the two sample variances are not assumed to be equal, and round the result down to the nearest integer:
$$
\nu = \frac{
    \left(\frac{s_{\text{potable}}^2}{n_1}+\frac{s_{\text{non-potable}}^2}{n_2}\right)^2
}
{
\frac{s_{\text{potable}}^4}{n_1^2(n_1-1)}+\frac{s_{\text{non-potable}}^4}{n_2^2(n_2-1)}
}
$$

In [None]:
# Degrees of Freedom for pH hypothesis test (Welch-Satterthwaite).
# Both variances are squared in the denominator, and the sample sizes are read
# from the summary tables instead of being hard-coded.
v_ph <- ((potable_statistics$var_ph/potable_statistics$n) + 
      (non_potable_statistics$var_ph/non_potable_statistics$n))^2/
     ((potable_statistics$var_ph)^2/(potable_statistics$n^2*(potable_statistics$n - 1)) + 
      (non_potable_statistics$var_ph)^2/(non_potable_statistics$n^2*(non_potable_statistics$n - 1)))
v_ph <- floor(v_ph)

# Degrees of Freedom for conductivity hypothesis test
v_conductivity <- ((potable_statistics$var_conductivity/potable_statistics$n) + 
      (non_potable_statistics$var_conductivity/non_potable_statistics$n))^2/
     ((potable_statistics$var_conductivity)^2/(potable_statistics$n^2*(potable_statistics$n - 1)) + 
      (non_potable_statistics$var_conductivity)^2/(non_potable_statistics$n^2*(non_potable_statistics$n - 1)))
v_conductivity <- floor(v_conductivity)
     
# Table for Degrees of Freedom
v <- data.frame(v_ph, v_conductivity)
names(v) <- c("dof_ph", "dof_conductivity")
v
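
As a sanity check on the two formulas above, the sketch below computes Welch's test statistic and the Welch-Satterthwaite degrees of freedom by hand on synthetic samples and compares them with base R's `t.test(..., var.equal = FALSE)`. The helper name `welch_summary()` and the simulated values are assumptions made for illustration only. Note that `t.test()` keeps the fractional degrees of freedom, whereas this report rounds them down.

```r
# Hypothetical helper: Welch's t statistic and Welch-Satterthwaite degrees of freedom
welch_summary <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  v1 <- var(x) / n1  # s_1^2 / n_1
  v2 <- var(y) / n2  # s_2^2 / n_2
  t_stat <- (mean(x) - mean(y)) / sqrt(v1 + v2)
  # v1^2 / (n1 - 1) equals s_1^4 / (n_1^2 (n_1 - 1)), and likewise for the second sample
  dof <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  c(t = t_stat, df = dof)
}

# Synthetic samples (made-up parameters, not the water data)
set.seed(1)
x <- rnorm(30, mean = 7.1, sd = 1.4)
y <- rnorm(45, mean = 7.0, sd = 1.7)

by_hand <- welch_summary(x, y)
builtin <- t.test(x, y, var.equal = FALSE)

# Both comparisons should report TRUE
all.equal(unname(by_hand["t"]), unname(builtin$statistic))
all.equal(unname(by_hand["df"]), unname(builtin$parameter))
```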

Now we have everything we need to conduct each hypothesis test to find the p-values. 

In [None]:
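# Calculate the two-sided p-values for each hypothesis test.
# pt(-abs(t)) gives the tail area whatever the sign of the test statistic, and the
# degrees of freedom are read from the table v instead of being hard-coded.
p_value_ph <- 2*pt(-abs(test_statistic_ph), df = v$dof_ph)
p_value_conductivity <- 2*pt(-abs(test_statistic_conductivity), df = v$dof_conductivity)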

# Show the p-values in a data frame
p_values <- data.frame(p_value_ph, p_value_conductivity)
p_values
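
A two-sided p-value doubles the tail area beyond the absolute value of the test statistic, so it should not depend on the sign of the statistic. The short sketch below illustrates this with a hypothetical helper `two_sided_p()` and arbitrary inputs. It also shows why doubling the lower tail directly only works for negative statistics: for a positive statistic the result exceeds 1.

```r
# Hypothetical helper: two-sided p-value for a t statistic
two_sided_p <- function(t_stat, dof) {
  2 * pt(-abs(t_stat), df = dof)
}

# Same p-value for a statistic and its mirror image (arbitrary example values)
p_pos <- two_sided_p(1.5, dof = 100)
p_neg <- two_sided_p(-1.5, dof = 100)
all.equal(p_pos, p_neg)

# Doubling the lower tail of a positive statistic is not a valid probability
naive <- 2 * pt(1.5, df = 100, lower.tail = TRUE)
naive > 1
```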

Next, we construct the confidence intervals for each variable (mean difference in pH and Conductivity), first via bootstrapping. We generate a bootstrap distribution of each statistic of interest and use it to construct the respective confidence interval.

In [None]:
set.seed(123)
# Constructing confidence interval for difference in mean pH Level
# Mapping Potability column into factors
water_data_potability_mutated <- water_data |>
    mutate(Potability_mapped = case_when(Potability == 0 ~ "Non-potable", Potability == 1 ~ "Potable")) |>
    filter(!is.na(ph), !is.na(Conductivity))

# Generate the bootstrap distribution and calculate the difference in mean pH statistic
bootstrap_dist_diff_means_ph <- water_data_potability_mutated |>
    specify(formula = ph ~ Potability_mapped) |>
    generate(reps = 1000, type = "bootstrap") |>
    calculate(stat = "diff in means", order = c("Potable", "Non-potable"))

# Get the percentile endpoints for the confidence interval
diff_means_ci_ph <- bootstrap_dist_diff_means_ph %>%
  get_ci(level = 0.95, type = "percentile")
names(diff_means_ci_ph) <- c("lower_bound_ph", "upper_bound_ph")
diff_means_ci_ph

# Visualize the confidence interval
ci_ph <- visualize(bootstrap_dist_diff_means_ph) +
    shade_ci(endpoints = diff_means_ci_ph) +
    labs(x = "Difference in Mean pH Level", 
         y = "Count", 
         title = "Confidence Interval of Difference in Mean pH Level between Potable and Non-Potable Water") +
    theme(text = element_text(size = 16))
ci_ph


# Generate the bootstrap distribution and calculate the difference in mean Conductivity statistic
bootstrap_dist_diff_means_conductivity <- water_data_potability_mutated |>
    specify(formula = Conductivity ~ Potability_mapped) |>
    generate(reps = 1000, type = "bootstrap") |>
    calculate(stat = "diff in means", order = c("Potable", "Non-potable"))

# Get the percentile endpoints for the confidence interval
diff_means_ci_conductivity <- bootstrap_dist_diff_means_conductivity %>%
  get_ci(level = 0.95, type = "percentile")
names(diff_means_ci_conductivity) <- c("lower_bound_conductivity", "upper_bound_conductivity")
diff_means_ci_conductivity

# Visualize the confidence interval
ci_conductivity <- visualize(bootstrap_dist_diff_means_conductivity) +
    shade_ci(endpoints = diff_means_ci_conductivity) +
    labs(x = "Difference in Mean Conductivity", 
         y = "Count", 
         title = "Confidence Interval of Difference in Mean Conductivity between Potable and Non-Potable Water") +
    theme(text = element_text(size = 16))
ci_conductivity
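
To show what a percentile bootstrap interval does under the hood, here is a minimal base-R sketch on synthetic data. The data frame `toy`, the helper `diff_in_means()`, and the choice to resample whole rows are assumptions made for illustration; this is not a description of how the pipeline above is implemented internally.

```r
set.seed(123)

# Synthetic stand-in for the filtered water data (values are made up)
toy <- data.frame(
  ph = c(rnorm(120, mean = 7.1, sd = 1.5), rnorm(180, mean = 7.0, sd = 1.6)),
  group = rep(c("Potable", "Non-potable"), times = c(120, 180))
)

# Hypothetical helper: difference in group means for one data frame
diff_in_means <- function(d) {
  mean(d$ph[d$group == "Potable"]) - mean(d$ph[d$group == "Non-potable"])
}

# Resample rows with replacement and recompute the statistic each time
boot_diffs <- replicate(1000, {
  resampled <- toy[sample(nrow(toy), replace = TRUE), ]
  diff_in_means(resampled)
})

# Percentile interval: the middle 95% of the bootstrap statistics
ci <- quantile(boot_diffs, probs = c(0.025, 0.975))
ci
```

The interval endpoints are simply the 2.5th and 97.5th percentiles of the 1000 resampled differences, which is why no distributional formula is needed.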

We can then compare these bootstrap confidence intervals to ones constructed using asymptotics.

In [None]:
# Using Asymptotics to construct confidence intervals
# Standard normal critical value for a 95% confidence interval
z <- qnorm(0.975)

# Constructing Confidence Interval for diff in mean pH Level
sd_diff_ph <- sqrt(potable_statistics$var_ph/potable_statistics$n + non_potable_statistics$var_ph/non_potable_statistics$n)
lower_ci <- point_estimate_ph - z*sd_diff_ph
upper_ci <- point_estimate_ph + z*sd_diff_ph
ci_ph_asymptotics <- data.frame(lower_ci, upper_ci)
names(ci_ph_asymptotics) <- c("lower_ci_ph", "upper_ci_ph")
ci_ph_asymptotics

# Constructing Confidence Interval for diff in mean Conductivity
sd_diff_conductivity <- sqrt(potable_statistics$var_conductivity/potable_statistics$n + 
                             non_potable_statistics$var_conductivity/non_potable_statistics$n)
lower_ci <- point_estimate_conductivity - z*sd_diff_conductivity
upper_ci <- point_estimate_conductivity + z*sd_diff_conductivity
ci_conductivity_asymptotics <- data.frame(lower_ci, upper_ci)
names(ci_conductivity_asymptotics) <- c("lower_ci_conductivity", "upper_ci_conductivity")
ci_conductivity_asymptotics
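
In symbols, the asymptotic interval computed in this cell is the point estimate plus or minus a normal critical value times the standard error. The notation follows the test-statistic formula used earlier, and $z_{0.975} \approx 1.96$ is the standard normal quantile returned by `qnorm(0.975)`:
$$
\left(\bar{x}_{\text{potable}} - \bar{x}_{\text{non-potable}}\right) \pm z_{0.975}\sqrt{\frac{s^2_{\text{potable}}}{n_1}+\frac{s^2_{\text{non-potable}}}{n_2}}
$$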

Finally, we can visualize the null distribution for each hypothesis test; the cell below does this for pH.

In [None]:
null_dist_ph <- water_data_potability_mutated |>
    specify(formula = ph ~ Potability_mapped) |>
    hypothesize(null = "independence") |>
    generate(reps = 1000, type = "permute") |>
    calculate(stat = "diff in means", order = c("Potable", "Non-potable"))
visualize(null_dist_ph)
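
The idea behind a permutation null distribution can be sketched in base R as follows. Shuffling the group labels breaks any link between group and measurement, so the shuffled differences in means show what the statistic looks like when the null hypothesis of independence holds. The data and the helper `diff_in_means()` are synthetic and hypothetical; this is only an illustration under those assumptions.

```r
set.seed(123)

# Synthetic measurements and group labels (made-up values)
values <- c(rnorm(40, mean = 7.2, sd = 1.5), rnorm(60, mean = 7.0, sd = 1.5))
groups <- rep(c("Potable", "Non-potable"), times = c(40, 60))

# Hypothetical helper: difference in means, Potable minus Non-potable
diff_in_means <- function(v, g) {
  mean(v[g == "Potable"]) - mean(v[g == "Non-potable"])
}

# Observed statistic
obs <- diff_in_means(values, groups)

# Null distribution: recompute the statistic after shuffling the labels
null_diffs <- replicate(1000, diff_in_means(values, sample(groups)))

# Two-sided p-value: share of shuffled statistics at least as extreme as the observed one
p_value <- mean(abs(null_diffs) >= abs(obs))
p_value
```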



## Methods: Plan
### Why our report is reliable
We ensured report reliability with random sampling, a large sample size, and consistent methods and outcomes with prior research. Robustness checks addressed potential concerns, and all results were transparently reported. All conditions of two-sample t-tests and confidence intervals were met.
<br>
Estimates offer valuable insights to stakeholders, but other factors may randomly influence the results. Additional information from existing literature and human factors may be necessary to make informed decisions about the complexity/uncertainty of the data.

### Methodology:

1. Load libraries and data
2. Visualize and analyze correlations between variables
3. Split data into potable and non-potable water samples (Independent)
4. Analyze sample distributions
5. Conduct two-sample t-test on pH and conductivity
6. Calculate 95% confidence intervals via bootstrapping and compare them with intervals from the asymptotic method (this lets us quantify the uncertainty of our estimates)
7. Interpret results by rejecting or not rejecting the null hypotheses to establish statistical significance

To ensure reproducibility, we will document analysis steps, use version control, and share our code and data.

### Expected Outcomes and Significance
We anticipate significant differences in mean pH values between potable and non-potable water, but no significant difference in mean conductivity (Sofi et al., 2014).

#### Limitations
Data may be incomplete due to Earth's vast water content, and errors like Type I or Type II may arise. Publication bias and assumption of variance homogeneity may impact research reliability.
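
To illustrate what a Type I error rate means in practice, the sketch below repeatedly draws two synthetic samples from populations with the same mean, so the null hypothesis is true by construction, and records how often a Welch two-sample t-test rejects at the 5% level. All parameters are made up for illustration. The rejection rate should land close to 0.05.

```r
set.seed(201)
alpha <- 0.05
n_sims <- 2000

# Each replicate: equal population means, unequal variances and sample sizes
rejections <- replicate(n_sims, {
  a <- rnorm(50, mean = 7, sd = 1.5)
  b <- rnorm(80, mean = 7, sd = 2.0)
  t.test(a, b, var.equal = FALSE)$p.value < alpha
})

# Proportion of false rejections (estimated Type I error rate)
type1_rate <- mean(rejections)
type1_rate
```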


#### Project Impact
The analysis may have significant implications for water treatment, public health policies, and international standards for safe water. For instance, if high conductivity reduces potability, treatment plants could adapt their methods to mitigate the issue (Kim et al., 2011).

#### Future Questions
- What specific treatment methods can be used to effectively address the effects of pH and conductivity on water potability?
- What are the long-term health effects of consuming water with varying pH and conductivity levels?











## References
- Kim, E. J., et al. (2011, March 11). Effect of pH on the concentrations of lead and trace contaminants in drinking water: A combined batch, pipe loop and sentinel home study. Water Research. Retrieved March 18, 2023, from https://www.sciencedirect.com/science/article/abs/pii/S004313541100090X

- Health Canada. (2016, March 9). Government of Canada. Canada.ca. Retrieved March 18, 2023, from https://www.canada.ca/en/health-canada/services/publications/healthy-living/guidelines-canadian-drinking-water-quality-guideline-technical-document-ph.html
- Environmental Protection Agency. (n.d.). US Environmental Protection Agency. EPA. Retrieved March 18, 2023, from https://www.epa.gov/caddis-vol2/ph

- Sofi, M. H., Gudi, R., Karumuthil-Melethil, S., Perez, N., Johnson, B. M., & Vasu, C. (2014, January 16). pH of drinking water influences the composition of gut microbiome and type 1 diabetes incidence. American Diabetes Association. Retrieved March 18, 2023, from https://diabetesjournals.org/diabetes/article/63/2/632/34242/pH-of-Drinking-Water-Influences-the-Composition-of
- What is the typical water conductivity range? Atlas Scientific. (2022, November 8). Retrieved March 18, 2023, from https://atlas-scientific.com/blog/water-conductivity-range/
- World Health Organization. (n.d.). Drinking-water. World Health Organization. Retrieved March 18, 2023, from https://www.who.int/news-room/fact-sheets/detail/drinking-water 

