# Groundwater Geochemistry Analysis: Yang et al. Dataset

## Learning Objectives
- Analyze real groundwater chemistry data from Yang et al. (2020)
- Apply multiple hydrogeochemical diagrams to understand water evolution
- Compare temporal and spatial variations in groundwater chemistry
- Practice data preprocessing for WQChartPy

## Dataset Description
This dataset contains groundwater chemistry data from monitoring wells, including:
- Multiple wells sampled over several years
- Seasonal variations (dry vs. wet seasons)
- Major ion concentrations (Ca, Mg, Na, K, Cl, SO4, HCO3) in mg/L
- pH and trace elements

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the Yang groundwater dataset
df_raw = pd.read_csv('./datasets/yang_groundwater_sample.csv')

print("Dataset loaded successfully!")
print(f"Dataset shape: {df_raw.shape}")
print("\nFirst 5 rows:")
df_raw.head()

## Data Exploration and Preprocessing

In [None]:
# Examine the structure of the dataset
print("Dataset Information:")
print("===================")
print(f"Number of samples: {len(df_raw)}")
print(f"Number of wells: {df_raw['Well'].nunique()}")
print(f"Wells: {df_raw['Well'].unique()}")
print(f"Years covered: {df_raw['Sampling year'].min()} - {df_raw['Sampling year'].max()}")
print(f"Seasons: {df_raw['Sampling season'].unique()}")

print("\nBasic statistics for major ions (mg/L):")
ion_cols = ['Ca', 'Mg', 'Na', 'K', 'Cl', 'SO4', 'HCO3']
df_raw[ion_cols].describe().round(2)
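
WQChartPy plotting functions expect the input DataFrame to follow the library's template. The sketch below assumes that template consists of the styling columns `Label`, `Color`, `Marker`, `Size` and `Alpha` plus the chemistry columns `pH`, `Ca`, `Mg`, `Na`, `K`, `HCO3`, `CO3`, `Cl`, `SO4` and `TDS`; check this list against the WQChartPy documentation for the installed version. The helper names and default style values are hypothetical, and the two-row table is made-up data used only to demonstrate the check.

```python
import pandas as pd

# Assumed WQChartPy template columns (verify against the library docs)
STYLE_DEFAULTS = {'Label': 'All samples', 'Color': 'black',
                  'Marker': 'o', 'Size': 30, 'Alpha': 0.6}
CHEM_COLS = ['pH', 'Ca', 'Mg', 'Na', 'K', 'HCO3', 'CO3', 'Cl', 'SO4', 'TDS']

def missing_chem_columns(df):
    """Return the assumed chemistry columns that are absent from df."""
    return [c for c in CHEM_COLS if c not in df.columns]

def add_default_style(df):
    """Return a copy of df with any missing styling column set to a default."""
    out = df.copy()
    for col, default in STYLE_DEFAULTS.items():
        if col not in out.columns:
            out[col] = default
    return out

# Made-up two-sample table, for illustration only (not from the Yang dataset)
demo = pd.DataFrame({
    'Well': ['W1', 'W2'],
    'pH': [7.1, 7.4],
    'Ca': [60.0, 45.0], 'Mg': [12.0, 9.0], 'Na': [20.0, 35.0], 'K': [2.0, 3.0],
    'HCO3': [210.0, 180.0], 'Cl': [15.0, 40.0], 'SO4': [30.0, 25.0],
})

missing = missing_chem_columns(demo)
demo_ready = add_default_style(demo)
print("Missing chemistry columns:", missing)
print(demo_ready[['Well', 'Label', 'Color', 'Marker', 'Size', 'Alpha']])
```

Missing chemistry columns are only reported, not filled, because inventing concentrations would change the data. How to handle them is a judgement to make per dataset.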

## Hydrogeochemical Analysis

### 1. Piper Diagrams - Understanding Water Types

In [None]:
# Plot all samples on a triangular Piper diagram
import os
from wqchartpy import triangle_piper

# Make sure the output folder exists before the figure is saved
os.makedirs('./plots', exist_ok=True)

triangle_piper.plot(
    df_raw, unit='mg/L',
    figname='./plots/yang_piper_diagram', figformat='png'
)
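
A Piper diagram positions each sample by the relative proportions of its major ions expressed in milliequivalents per litre, not by the raw mg/L values. As a rough illustration of what happens to the cation data before it reaches the left-hand triangle, the sketch below converts mg/L to meq/L and computes cation percentages. It is not WQChartPy's internal code; the function name and the sample values are hypothetical, and the equivalent weights are molar mass divided by ionic charge.

```python
import pandas as pd

# Equivalent weights in mg/meq: molar mass (g/mol) divided by ionic charge
EQ_WEIGHT = {'Ca': 40.078 / 2, 'Mg': 24.305 / 2, 'Na': 22.990, 'K': 39.098}

def cation_percentages(df):
    """Return %Ca, %Mg and %(Na+K) of total cations, computed in meq/L."""
    # Convert each cation column from mg/L to meq/L
    meq = pd.DataFrame({ion: df[ion] / w for ion, w in EQ_WEIGHT.items()})
    total = meq.sum(axis=1)
    return pd.DataFrame({
        'Ca_pct': 100 * meq['Ca'] / total,
        'Mg_pct': 100 * meq['Mg'] / total,
        'NaK_pct': 100 * (meq['Na'] + meq['K']) / total,
    })

# Made-up concentrations in mg/L, for illustration only
demo = pd.DataFrame({'Ca': [60.0, 20.0], 'Mg': [12.0, 5.0],
                     'Na': [20.0, 120.0], 'K': [2.0, 4.0]})

pct = cation_percentages(demo)
print(pct.round(1))
```

In this made-up example the first sample is calcium-dominated and the second is sodium-dominated, so they would plot toward different corners of the cation triangle.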

### 2. Temporal Analysis - Seasonal Variations

In [None]:
# Plot both seasons on one Piper diagram, styled by sampling season
# Work on a copy so df_raw stays unchanged
df_seasonal = df_raw.copy()
df_seasonal['Label'] = df_seasonal['Sampling season']

# Assign different colors for seasons
df_seasonal.loc[df_seasonal['Sampling season'] == 'dry', 'Color'] = 'red'
df_seasonal.loc[df_seasonal['Sampling season'] == 'wet', 'Color'] = 'blue'

# Assign different markers for seasons
df_seasonal.loc[df_seasonal['Sampling season'] == 'dry', 'Marker'] = 'o'
df_seasonal.loc[df_seasonal['Sampling season'] == 'wet', 'Marker'] = 's'

# Create Piper diagram showing seasonal differences
triangle_piper.plot(
    df_seasonal, unit='mg/L', 
    figname='./plots/yang_piper_seasonal_comparison', figformat='png'
)

print("Seasonal comparison Piper diagram created!")
print("Red circles = Dry season, Blue squares = Wet season")
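
The Piper diagram shows whether the two seasons plot in different fields; a per-season summary table can back that visual impression with numbers. The sketch below groups samples by season and averages each major ion. The column names follow the ones used above (`Sampling season` and the ion columns), but the four-row table is made-up data standing in for `df_raw`, and the variable names are hypothetical.

```python
import pandas as pd

ion_cols = ['Ca', 'Mg', 'Na', 'K', 'Cl', 'SO4', 'HCO3']

# Made-up concentrations in mg/L, for illustration only
demo = pd.DataFrame({
    'Sampling season': ['dry', 'dry', 'wet', 'wet'],
    'Ca':   [60.0, 64.0, 50.0, 54.0],
    'Mg':   [12.0, 14.0, 10.0, 12.0],
    'Na':   [30.0, 34.0, 22.0, 26.0],
    'K':    [3.0, 3.0, 2.0, 2.0],
    'Cl':   [40.0, 44.0, 28.0, 32.0],
    'SO4':  [35.0, 37.0, 30.0, 32.0],
    'HCO3': [200.0, 210.0, 180.0, 190.0],
})

# Mean concentration of each ion per season
seasonal_mean = demo.groupby('Sampling season')[ion_cols].mean()

# Difference between seasons (dry minus wet) for each ion
dry_minus_wet = seasonal_mean.loc['dry'] - seasonal_mean.loc['wet']

print(seasonal_mean.round(2))
print("\nDry minus wet (mg/L):")
print(dry_minus_wet.round(2))
```

To run this on the real data, replace `demo` with `df_raw`. Grouping by `['Well', 'Sampling season']` instead would help separate differences between wells from differences between seasons.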