mim86/WW_stat_analysis

Wastewater 16-City Microbiome Analysis — README

This repository contains three R scripts and lists the three input files required to reproduce the full genus-level paired analysis (Week 3 vs Week 21) for 16 Swedish cities served by wastewater treatment plants (WWTPs).


0. Requirements

R ≥ 4.2 with packages:

readr, dplyr, tidyr, tibble, stringr, stringi, purrr,
vegan, compositions, ggplot2, geosphere

Optional (extra outputs if installed):

ANCOMBC, phyloseq, lme4, ALDEx2

Input files required in working directory:

  • WWTP_metadata_Sheet1.tsv
  • Metadata_sverige_2.csv (semicolon-delimited; columns: ID, site, week)
  • abundance_genus.tsv (Bracken wide table with .bracken_frac and .bracken_num)

Scripts use relative paths — run from the folder containing these inputs or edit the path variables at the top of each script.
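
Before running the scripts, it can help to confirm that the packages and input files listed above are actually available. The following is a minimal sketch, not part of the repository: the helper names missing_packages and missing_files are hypothetical, and it assumes the inputs sit in the current working directory.

```r
# Package and file lists copied from the Requirements section above.
required_pkgs <- c("readr", "dplyr", "tidyr", "tibble", "stringr", "stringi",
                   "purrr", "vegan", "compositions", "ggplot2", "geosphere")
optional_pkgs <- c("ANCOMBC", "phyloseq", "lme4", "ALDEx2")
input_files   <- c("WWTP_metadata_Sheet1.tsv", "Metadata_sverige_2.csv",
                   "abundance_genus.tsv")

# Hypothetical helper: names in `pkgs` whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: names in `files` that do not exist inside `dir`.
missing_files <- function(files, dir = ".") {
  files[!file.exists(file.path(dir, files))]
}

preflight <- list(
  r_ok             = getRversion() >= "4.2.0",
  missing_required = missing_packages(required_pkgs),
  missing_optional = missing_packages(optional_pkgs),
  missing_inputs   = missing_files(input_files)
)
str(preflight)
```

Empty missing_required and missing_inputs entries suggest the three steps below should at least start; packages reported under missing_optional only affect the optional outputs.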


1. Clean WWTP Metadata

Script: tidying_WW_metadata.R
Input: WWTP_metadata_Sheet1.tsv
Output: wwtp_meta_16cities_clean_short.csv

What it does

  • Detects relevant columns (WWTP name, connected inhabitants, sewer type, sampling stage, socioeconomics, city classification, age, foreign-born %)
  • Normalizes numeric formatting, harmonizes Stockholm sub-plants
  • Maps to 16 fixed site keys and filters to those sites only
  • Writes a clean CSV for downstream analysis
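
The cleaning steps above can be sketched roughly as follows. This is an illustration only, not the code in tidying_WW_metadata.R: it assumes numbers arrive with spaces as thousands separators and a decimal comma, and the plant names, site keys, and inhabitant counts are invented placeholders.

```r
# Assumed convention: "1 234 567" and "12,5" style numbers stored as text.
normalize_number <- function(x) {
  x <- gsub("[[:space:]]", "", x)      # drop thousands-separator spaces
  x <- sub(",", ".", x, fixed = TRUE)  # decimal comma -> decimal point
  suppressWarnings(as.numeric(x))      # unparseable entries become NA
}

# Hypothetical lookup: raw plant name -> fixed site key.
# Two sub-plants collapse onto the same key, as described for Stockholm.
site_lookup <- c("Stockholm plant A" = "stockholm",
                 "Stockholm plant B" = "stockholm",
                 "City X"            = "city_x")

# Toy input; every value here is made up for the example.
raw <- data.frame(
  wwtp        = c("Stockholm plant A", "Stockholm plant B", "City X", "Elsewhere"),
  inhabitants = c("1 234 567", "350 000", "180 000", "n/a"),
  stringsAsFactors = FALSE
)

raw$site        <- unname(site_lookup[raw$wwtp])  # NA for unmapped plants
raw$inhabitants <- normalize_number(raw$inhabitants)
clean           <- raw[!is.na(raw$site), ]        # keep mapped sites only
print(clean)
```

A named character vector keeps the mapping explicit and easy to audit; plants without a key drop out at the filter step rather than failing silently later.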

2. Join Metadata with Bracken Abundance

Script: tidying_data_v1.R
Inputs:

  • Metadata_sverige_2.csv (semicolon-delimited)
  • abundance_genus.tsv (or abundance_species.tsv if desired)

Output:

  • abund_with_meta_clean_genus.csv (detected suffix: G)

What it does

  • Auto-detects metadata delimiter and abundance level from suffix
  • Drops Homo sapiens at species level (not relevant here)
  • Long-pivots frac/num columns into tidy table and joins metadata on ID
  • Outputs clean table for analysis
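
The long-pivot and join described above might look like the sketch below. It is not the code from tidying_data_v1.R: the taxon column name (name), the sample IDs, the site label, and all abundance values are assumptions made for the example; only the .bracken_frac / .bracken_num suffixes and the ID, site, week columns come from this README.

```r
library(dplyr)
library(tidyr)

# Toy Bracken-style wide table: one row per taxon, two columns per sample.
abund_wide <- tibble::tribble(
  ~name,     ~S1.bracken_frac, ~S1.bracken_num, ~S2.bracken_frac, ~S2.bracken_num,
  "genus_a", 0.60,             600,             0.25,             100,
  "genus_b", 0.40,             400,             0.75,             300
)

# Toy metadata with the columns listed for Metadata_sverige_2.csv.
meta <- tibble::tibble(
  ID   = c("S1", "S2"),
  site = c("city_x", "city_x"),
  week = c(3, 21)
)

abund_long <- abund_wide |>
  pivot_longer(
    cols          = matches("\\.bracken_(frac|num)$"),
    # Split "<ID>.bracken_<frac|num>" into the sample ID and a value column.
    names_to      = c("ID", ".value"),
    names_pattern = "^(.*)\\.bracken_(frac|num)$"
  ) |>
  left_join(meta, by = "ID")

print(abund_long)
```

The ".value" placeholder makes pivot_longer emit one frac and one num column per taxon-sample pair, so each row of the tidy table carries both the fraction and the read count alongside its metadata.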

3. Genus-Level Analysis (Paired Cities, W3 vs W21)

Script: analysis_v2.R
Inputs:

  • abund_with_meta_clean_genus.csv (Step 2 output)
  • wwtp_meta_16cities_clean_short.csv (Step 1 output)

Key parameters (top of script)

det_frac = 0.001
det_reads = 50
core_prev = 0.75
n_perm = 9999
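
This README does not spell out how the thresholds are combined, so the sketch below shows one plausible reading and should be checked against analysis_v2.R. It assumes a genus counts as detected in a sample when its fraction is at least det_frac and its read count is at least det_reads, and counts as core when detected in at least core_prev of samples. The matrices are invented toy data; n_perm is presumably the permutation count for the permutation-based tests and is not used here.

```r
det_frac  <- 0.001
det_reads <- 50
core_prev <- 0.75

# Toy data: rows = samples, columns = genera (all values invented).
frac <- matrix(c(0.0200, 0.0005, 0.0030,
                 0.0150, 0.0020, 0.0000,
                 0.0300, 0.0040, 0.0000,
                 0.0100, 0.0000, 0.0020),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("s", 1:4),
                               c("genus_a", "genus_b", "genus_c")))
reads <- round(frac * 40000)  # pretend every sample has 40,000 reads

# Assumption: both thresholds must be met for a detection call.
detected <- frac >= det_frac & reads >= det_reads

# Prevalence = share of samples in which each genus is detected.
prevalence <- colMeans(detected)

# Assumption: "core" means prevalence at or above core_prev.
core <- names(prevalence)[prevalence >= core_prev]
print(prevalence)
print(core)
```

Requiring both a fraction and a read-count floor guards against calling a genus present from a handful of reads in a shallow sample; whether the script uses "and" or "or" is not stated in this README.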

Analyses performed

  • Alpha diversity (Richness & Shannon) + paired W3→W21 Wilcoxon tests
  • Beta diversity (CLR/Aitchison, Jaccard)
  • PERMANOVA, PCoA with arrows, PERMDISP
  • Within-city turnover + core/periphery decomposition
  • Prevalence shifts (McNemar)
  • Optional differential abundance (DA) testing (ANCOM-BC2 & ALDEx2)
  • Mantel, distance-decay, PCNM, dbRDA, PROTEST
  • WWTP meta variable modeling
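
To make the first two items concrete, here is a base-R sketch of Shannon diversity with a paired Wilcoxon test and of Aitchison distance via the CLR transform. It is not taken from analysis_v2.R: the counts are simulated, the pseudocount of 0.5 is an assumption, and the helper names shannon and clr_rows are hypothetical.

```r
# Shannon index H = -sum(p * log(p)) over taxa with non-zero counts.
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Centred log-ratio transform of each row, after adding a pseudocount
# so that zero counts have a finite logarithm (0.5 is an assumed value).
clr_rows <- function(counts, pseudo = 0.5) {
  logx <- log(counts + pseudo)
  logx - rowMeans(logx)
}

set.seed(1)
n_city  <- 16
n_genus <- 30
genera  <- paste0("genus_", seq_len(n_genus))

# Simulated count tables: row i of w3 and row i of w21 are the same city.
w3  <- matrix(rpois(n_city * n_genus, lambda = 50), nrow = n_city,
              dimnames = list(paste0("city", seq_len(n_city), "_W3"), genera))
w21 <- matrix(rpois(n_city * n_genus, lambda = 50), nrow = n_city,
              dimnames = list(paste0("city", seq_len(n_city), "_W21"), genera))

# Alpha diversity per sample, then a paired test across the 16 cities.
h3  <- apply(w3, 1, shannon)
h21 <- apply(w21, 1, shannon)
alpha_test <- wilcox.test(h21, h3, paired = TRUE)

# Aitchison distance = Euclidean distance between CLR-transformed samples.
clr_all   <- clr_rows(rbind(w3, w21))
aitchison <- dist(clr_all)

# Within-city turnover: distance between each city's W3 and W21 sample.
d <- as.matrix(aitchison)
within_city <- d[cbind(seq_len(n_city), n_city + seq_len(n_city))]

print(alpha_test$p.value)
print(summary(within_city))
```

With vegan installed, diversity() and adonis2() cover the alpha-diversity and PERMANOVA steps; adonis2() takes a permutations argument, which is presumably where n_perm applies.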

Main outputs

Figures and tables are both written to ./figures_g/.

