# Introduction to nuee: Following the Classic vegan Tutorial

This notebook follows the structure of the classic "Introduction to Ordination in vegan" tutorial and implements the same analyses in Python with nuee. We'll cover the main topics from the original tutorial:

## Table of Contents (following intro-vegan.pdf)
1. [Ordination](#ordination)
   - 1.1 [Detrended Correspondence Analysis](#dca)
   - 1.2 [Non-metric Multidimensional Scaling](#nmds)
2. [Ordination Graphics](#graphics)
   - 2.1 [Cluttered Plots](#cluttered)
   - 2.2 [Adding Items to Ordination Plots](#adding-items)
3. [Fitting Environmental Variables](#envfit)
4. [Constrained Ordination](#constrained)
   - 4.1 [Significance Tests](#significance)
   - 4.2 [Conditioned or Partial Ordination](#partial)

We'll use the classic `varespec` (vegetation cover in lichen pastures) and `varechem` (soil chemistry at the same sites) datasets to maintain consistency with the original tutorial.

In [None]:
# Setup and imports
import sys
import warnings
warnings.filterwarnings('ignore')

# Add nuee to path (adjust as needed)
sys.path.insert(0, '..')

import nuee 
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set plotting parameters
plt.rcParams['figure.figsize'] = (10, 8)
plt.rcParams['figure.dpi'] = 100

print("✓ nuee loaded successfully")

In [None]:
# Load the classic vegan datasets bundled with nuee
# Following the original tutorial, we use varespec and varechem
varespec = nuee.datasets.varespec()
varechem = nuee.datasets.varechem()

print("Data loaded:")
print(f"  varespec: {varespec.shape} (sites × species)")
print(f"  varechem: {varechem.shape} (sites × environmental variables)")
print(f"\nFirst few rows of varespec:")
print(varespec.head())
print(f"\nEnvironmental variables in varechem:")
print(varechem.head())

## 1. Ordination {#ordination}

Ordination methods are the core tools for analyzing multivariate ecological data. We'll start with two fundamental unconstrained ordination methods.

### 1.1 Detrended Correspondence Analysis {#dca}

DCA (Detrended Correspondence Analysis) is a classic ordination method for ecological data. nuee doesn't have a dedicated DCA implementation yet, so we demonstrate CA (Correspondence Analysis), the method that DCA builds on by detrending and rescaling the CA axes.
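
To make the CA step concrete, here is a minimal NumPy sketch of correspondence analysis on a small made-up abundance table. It does not use nuee; the function name `correspondence_analysis` and the `toy` matrix are illustrative assumptions, and the table must have no all-zero rows or columns. The sketch takes the singular value decomposition of the standardized residuals from the independence model, so the squared singular values are the CA eigenvalues and their sum is the total inertia.

```python
import numpy as np

def correspondence_analysis(counts):
    """Plain CA of a sites x species table of non-negative abundances."""
    X = np.asarray(counts, dtype=float)
    P = X / X.sum()                       # correspondence matrix
    r = P.sum(axis=1)                     # row (site) masses
    c = P.sum(axis=0)                     # column (species) masses
    # Standardized residuals from the independence model r * c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    eig = sv ** 2                         # CA eigenvalues (principal inertias)
    site_scores = (U / np.sqrt(r)[:, None]) * sv      # principal coordinates
    species_scores = Vt.T / np.sqrt(c)[:, None]       # standard coordinates
    return eig, site_scores, species_scores

# Hypothetical 4 sites x 4 species table with a simple gradient
toy = np.array([[10, 5, 0, 0],
                [ 6, 8, 2, 0],
                [ 1, 6, 7, 2],
                [ 0, 1, 8, 9]])

eig, sites, species = correspondence_analysis(toy)
total_inertia = eig.sum()
print("eigenvalues:", np.round(eig, 4))
print("total inertia:", round(float(total_inertia), 4))
```

The total inertia equals the Pearson chi-square statistic of the table divided by its grand total, which is a handy check on any CA implementation.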

In [None]:
# Correspondence Analysis (CA) - foundation of DCA
print("=== CORRESPONDENCE ANALYSIS (CA) ===")
print("Note: DCA will be implemented in future versions")
print("For now, we demonstrate CA using CCA without constraints")

try:
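    # Workaround: pass a constant dummy column as the constraint matrix.
    # A constant carries no information, so the constrained part should be empty
    # and the remaining axes are plain CA (if nuee rejects this input, the except branch runs)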
    dummy_env = pd.DataFrame({'dummy': np.ones(varespec.shape[0])}, index=varespec.index)
    ca_result = nuee.cca(varespec, dummy_env)
    
    print(f"CA completed:")
    print(f"  Total inertia: {ca_result.tot_chi:.3f}")
    if ca_result.eigenvalues is not None:
        print(f"  First few eigenvalues: {ca_result.eigenvalues[:4]}")
        
except Exception as e:
    print(f"CA analysis: {e}")
    print("CA/DCA implementation is in development")
    print("Proceeding with NMDS which is more commonly used for ecological data")

### 1.2 Non-metric Multidimensional Scaling {#nmds}

NMDS is one of the most robust ordination methods for ecological community data. It uses only the rank order of the dissimilarities between sites, so it doesn't assume a linear relationship between dissimilarity and ordination distance.
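
As a small illustration of what "rank order of dissimilarities" means, the sketch below computes Bray-Curtis dissimilarities with NumPy for a made-up three-site table and shows that a monotone transformation (here a square root) leaves the ranks unchanged. The helper name `bray_curtis` and the `toy` data are assumptions for this example, not part of nuee.

```python
import numpy as np

def bray_curtis(abund):
    """Pairwise Bray-Curtis dissimilarity between rows (sites)."""
    X = np.asarray(abund, dtype=float)
    # Sum over species of |x_i - x_j| and of (x_i + x_j), for every pair of sites
    num = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    den = (X[:, None, :] + X[None, :, :]).sum(axis=2)
    return num / den

# Hypothetical 3 sites x 3 species table
toy = np.array([[10,  0, 5],
                [ 5,  5, 5],
                [ 0, 10, 0]])

D = bray_curtis(toy)
print(np.round(D, 3))

# NMDS sees only the ordering of the pairwise dissimilarities
iu = np.triu_indices_from(D, k=1)            # pairs (0,1), (0,2), (1,2)
d = D[iu]
ranks_raw = np.argsort(np.argsort(d))
ranks_sqrt = np.argsort(np.argsort(np.sqrt(d)))
print("ranks:", ranks_raw, "after sqrt:", ranks_sqrt)
```

Because the ranks are identical, an NMDS fitted to `D` and one fitted to `np.sqrt(D)` target the same ordering of site pairs.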

In [None]:
# Non-metric Multidimensional Scaling
print("=== NON-METRIC MULTIDIMENSIONAL SCALING (NMDS) ===")

# Perform NMDS - following the vegan tutorial defaults
nmds_result = nuee.metaMDS(varespec, k=2, distance='bray', trymax=20, 
                           autotransform=True, trace=True)

print(f"\nNMDS Results:")
print(f"  Dimensions: {nmds_result.ndim}")
print(f"  Stress: {nmds_result.stress:.6f}")
print(f"  Converged: {nmds_result.converged}")

# Stress interpretation (following Clarke 1993)
# Clarke's guidelines are stated for stress as a proportion (0-1); this assumes
# nmds_result.stress is reported on that scale
stress = nmds_result.stress
if stress < 0.05:
    stress_quality = "excellent representation with no prospect of misinterpretation"
elif stress < 0.10:
    stress_quality = "good ordination with no real risk of drawing false inferences"
elif stress < 0.20:
    stress_quality = "potentially useful, but should be interpreted with caution"
else:
    stress_quality = "poor representation - ordination may be misleading"

print(f"  Stress interpretation: {stress_quality}")
print(f"\nNMDS site scores (first 5 sites):")
if isinstance(nmds_result.points, pd.DataFrame):
    print(nmds_result.points.head())
else:
    print(pd.DataFrame(nmds_result.points[:5], 
                      columns=[f'NMDS{i+1}' for i in range(nmds_result.ndim)]))

In [ ]:
# Basic NMDS plot using automatic plotting API
print("Using nuee's automatic plotting (just like R vegan!):")

# Plot sites using automatic plotting
fig1 = nmds_result.plot(display="sites", type="points", figsize=(10, 8))
plt.title(f'NMDS ordination - Sites (Stress = {nmds_result.stress:.4f})')
plt.show()

# Plot species using automatic plotting
if hasattr(nmds_result, 'species') and nmds_result.species is not None:
    fig2 = nmds_result.plot(display="species", type="text", figsize=(10, 8))
    plt.title(f'NMDS ordination - Species (Stress = {nmds_result.stress:.4f})')
    plt.show()

# Plot both sites and species
fig3 = nmds_result.plot(display="both", type="points", figsize=(10, 8))
plt.title(f'NMDS ordination - Sites and Species (Stress = {nmds_result.stress:.4f})')
plt.show()

print(f"\n✨ Automatic plotting makes nuee work just like R vegan!")
print(f"   nmds_result.plot()  # Direct plotting, no separate functions needed")

## 2. Ordination Graphics {#graphics}

Effective visualization of ordination results is crucial for ecological interpretation. We'll explore different plotting approaches and how to handle cluttered plots.

### 2.1 Cluttered Plots {#cluttered}

With many species or sites, ordination plots can become cluttered. We'll demonstrate strategies to handle this.
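
One decluttering strategy is priority labelling, similar in spirit to what `orditorp` does in R's vegan: visit points from highest to lowest priority (for example total abundance) and label a point only if no already-labelled point lies too close. The sketch below is a simplified version of that idea in NumPy; the function name `select_labels`, the coordinates, and the `min_dist` threshold are all hypothetical.

```python
import numpy as np

def select_labels(xy, priority, min_dist):
    """Return indices of points to label, chosen greedily by priority.

    A point gets a label only if every point labelled so far is at least
    `min_dist` away from it in the plot coordinates.
    """
    xy = np.asarray(xy, dtype=float)
    order = np.argsort(-np.asarray(priority, dtype=float), kind="stable")
    kept = []
    for i in order:
        if all(np.hypot(*(xy[i] - xy[j])) >= min_dist for j in kept):
            kept.append(int(i))
    return sorted(kept)

# Hypothetical species scores and total abundances
xy = np.array([[0.0, 0.0], [0.05, 0.0], [1.0, 1.0], [1.0, 1.04], [2.0, 0.0]])
priority = np.array([10, 50, 5, 1, 7])

labelled = select_labels(xy, priority, min_dist=0.1)
print(labelled)  # → [1, 2, 4]
```

Points 0 and 3 stay unlabelled because each sits within `min_dist` of a higher-priority neighbour; they can still be drawn as plain markers.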

In [None]:
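# Demonstrate different approaches to handle cluttered plots

# Extract site scores and labels once; later cells reuse nmds1, nmds2 and site_labels
if isinstance(nmds_result.points, pd.DataFrame):
    nmds1, nmds2 = nmds_result.points.iloc[:, 0], nmds_result.points.iloc[:, 1]
else:
    nmds1, nmds2 = nmds_result.points[:, 0], nmds_result.points[:, 1]
site_labels = list(varespec.index)

fig, axes = plt.subplots(2, 2, figsize=(15, 12))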

# Plot 1: Sites only (no species)
axes[0, 0].scatter(nmds1, nmds2, s=100, c='blue', alpha=0.7, edgecolors='black')
axes[0, 0].set_xlabel('NMDS1')
axes[0, 0].set_ylabel('NMDS2')
axes[0, 0].set_title('Sites only')
axes[0, 0].grid(True, alpha=0.3)

# Plot 2: Sites with text labels
axes[0, 1].scatter(nmds1, nmds2, s=60, c='blue', alpha=0.5, edgecolors='black')
for i, label in enumerate(site_labels):
    axes[0, 1].text(nmds1.iloc[i] if hasattr(nmds1, 'iloc') else nmds1[i], 
                   nmds2.iloc[i] if hasattr(nmds2, 'iloc') else nmds2[i], 
                   label, fontsize=8, ha='center', va='center')
axes[0, 1].set_xlabel('NMDS1')
axes[0, 1].set_ylabel('NMDS2')
axes[0, 1].set_title('Sites with labels')
axes[0, 1].grid(True, alpha=0.3)

# Plot 3: Species only (if available)
if hasattr(nmds_result, 'species') and nmds_result.species is not None:
    if isinstance(nmds_result.species, pd.DataFrame):
        sp1, sp2 = nmds_result.species.iloc[:, 0], nmds_result.species.iloc[:, 1]
        sp_names = nmds_result.species.index
    else:
        sp1, sp2 = nmds_result.species[:, 0], nmds_result.species[:, 1]
        sp_names = [f'Sp{i+1}' for i in range(len(sp1))]
    
    axes[1, 0].scatter(sp1, sp2, s=30, c='red', marker='^', alpha=0.7)
    
    # Add labels for abundant species only (top 10 by total abundance)
    species_totals = varespec.sum().sort_values(ascending=False)
    top_species = species_totals.head(10).index
    
    for i, sp_name in enumerate(sp_names):
        if sp_name in top_species:
            axes[1, 0].annotate(sp_name[:8], 
                              (sp1.iloc[i] if hasattr(sp1, 'iloc') else sp1[i],
                               sp2.iloc[i] if hasattr(sp2, 'iloc') else sp2[i]),
                              xytext=(2, 2), textcoords='offset points',
                              fontsize=8, color='red')
    
    axes[1, 0].set_xlabel('NMDS1')
    axes[1, 0].set_ylabel('NMDS2')
    axes[1, 0].set_title('Species only (abundant species labeled)')
    axes[1, 0].grid(True, alpha=0.3)
else:
    axes[1, 0].text(0.5, 0.5, 'Species scores\nnot available', 
                   ha='center', va='center', transform=axes[1, 0].transAxes)

# Plot 4: Combined with selective labeling
axes[1, 1].scatter(nmds1, nmds2, s=80, c='blue', alpha=0.7, 
                  edgecolors='black', label='Sites')

if hasattr(nmds_result, 'species') and nmds_result.species is not None:
    axes[1, 1].scatter(sp1, sp2, s=20, c='red', marker='^', 
                      alpha=0.6, label='Species')

# Label only every 3rd site to reduce clutter
for i in range(0, len(site_labels), 3):
    axes[1, 1].annotate(site_labels[i], 
                       (nmds1.iloc[i] if hasattr(nmds1, 'iloc') else nmds1[i],
                        nmds2.iloc[i] if hasattr(nmds2, 'iloc') else nmds2[i]),
                       xytext=(3, 3), textcoords='offset points', fontsize=8)

axes[1, 1].set_xlabel('NMDS1')
axes[1, 1].set_ylabel('NMDS2')
axes[1, 1].set_title('Combined (selective labeling)')
axes[1, 1].grid(True, alpha=0.3)
axes[1, 1].legend()

plt.tight_layout()
plt.show()

print("Strategies for cluttered plots:")
print("  1. Plot sites and species separately")
print("  2. Use selective labeling (e.g., every nth site)")
print("  3. Label only abundant/important species")
print("  4. Use different symbols and colors")
print("  5. Consider interactive plots for detailed exploration")

### 2.2 Adding Items to Ordination Plots {#adding-items}

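We can enhance ordination plots by adding environmental information, convex hulls, ellipses, and other graphical elements. The cell below computes per-site summaries (Shannon diversity and species richness) and plots them alongside the NMDS site scores.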
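
For the convex hulls mentioned above, a hull around each group of sites can be computed without extra dependencies. The sketch below implements Andrew's monotone chain algorithm; the function name `convex_hull` and the example points are assumptions, and the notebook itself defines no grouping of sites, so any grouping factor you use would be your own.

```python
import numpy as np

def convex_hull(points):
    """Convex hull of 2-D points (Andrew's monotone chain).

    Returns the hull vertices in counter-clockwise order.
    """
    pts = sorted(set(map(tuple, np.asarray(points, dtype=float))))
    if len(pts) <= 2:
        return np.array(pts)

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Last point of each chain is the first point of the other
    return np.array(lower[:-1] + upper[:-1])

# Hypothetical site scores for one group: a unit square plus an interior point
group_scores = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.5]])
hull = convex_hull(group_scores)
print(hull)
```

On a matplotlib axis, `ax.fill(hull[:, 0], hull[:, 1], alpha=0.2)` would shade the hull for that group; repeating this per group gives one polygon per group.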

In [ ]:
# Enhanced ordination plots using automatic plotting API
print("Enhanced NMDS plots with automatic plotting:")

# Shannon diversity with automatic plotting - just like R vegan!
shannon_div = nuee.shannon(varespec)
print(f"Shannon diversity calculated: {type(shannon_div)}")
print(f"Mean Shannon diversity: {shannon_div.mean():.3f}")

# Plot 1: Shannon diversity histogram - automatic plotting!
fig1 = shannon_div.plot(kind="hist", bins=15, alpha=0.7, color='skyblue', figsize=(10, 6))
plt.title("Shannon Diversity Distribution - Automatic Plotting")
plt.show()

# Plot 2: Shannon diversity by sample - automatic plotting!
fig2 = shannon_div.plot(kind="bar", color='lightgreen', alpha=0.8, figsize=(12, 6))
plt.title("Shannon Diversity by Sample - Automatic Plotting")
plt.xticks(rotation=45)
plt.show()

# Plot 3: NMDS with automatic plotting
fig3 = nmds_result.plot(display="sites", type="points", figsize=(10, 8))
plt.title("NMDS Sites - Automatic Plotting")
plt.show()

# Plot 4: Species richness with automatic plotting
richness = nuee.specnumber(varespec)
fig4 = richness.plot(kind="bar", color='orange', alpha=0.7, figsize=(12, 6))
plt.title("Species Richness by Sample - Automatic Plotting")
plt.xticks(rotation=45)
plt.show()

print("\n🎉 All plots created using automatic plotting API!")
print("✨ Just like R vegan: result.plot() for direct plotting")
print("📊 Multiple plot types: histogram, bar, box, violin")
print("🗺️ Ordination plots: sites, species, both")

## 3. Fitting Environmental Variables {#envfit}

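Environmental fitting relates environmental variables to the ordination configuration. For each variable it finds the direction in ordination space along which the variable changes most strongly and reports the squared correlation (r²) as a goodness-of-fit measure; whether the fit is significant is usually judged with a permutation test. This is one of the most important analyses in community ecology.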
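
A regression-based version of this fit, with a permutation p-value, can be sketched in NumPy as follows. It is written in the spirit of vector fitting but is not nuee's or vegan's implementation; the function name `fit_vector` and the synthetic scores and variable are assumptions for illustration.

```python
import numpy as np

def fit_vector(scores, env, n_perm=999, seed=0):
    """Fit one environmental variable onto ordination axes.

    Returns the unit direction of steepest increase, the regression r²,
    and a permutation p-value for r².
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(scores, dtype=float)
    X = X - X.mean(axis=0)                 # centered ordination axes
    y = np.asarray(env, dtype=float)
    y = y - y.mean()                       # centered variable

    def r_squared(yy):
        beta, *_ = np.linalg.lstsq(X, yy, rcond=None)
        fitted = X @ beta
        return (fitted @ fitted) / (yy @ yy), beta

    r2, beta = r_squared(y)
    direction = beta / np.linalg.norm(beta)
    # Null distribution: shuffle the variable across sites
    perm_r2 = np.array([r_squared(rng.permutation(y))[0] for _ in range(n_perm)])
    p_value = (np.sum(perm_r2 >= r2) + 1) / (n_perm + 1)
    return direction, r2, p_value

# Synthetic example: the variable increases along axis 1 and decreases along axis 2
rng = np.random.default_rng(42)
scores = rng.normal(size=(30, 2))
env = 2.0 * scores[:, 0] - 1.0 * scores[:, 1] + rng.normal(scale=0.1, size=30)

direction, r2, p_value = fit_vector(scores, env)
print("direction:", np.round(direction, 3))
print("r2:", round(float(r2), 3), "p:", p_value)
```

Unlike summing squared per-axis correlations, the regression r² stays valid when the ordination axes are correlated with each other.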

In [None]:
# Environmental fitting to NMDS ordination
print("=== FITTING ENVIRONMENTAL VARIABLES ===")

# Manual environmental fitting (envfit function is under development)
# Calculate correlations between environmental variables and NMDS axes

env_results = []
for var in varechem.columns:
    # Correlations with NMDS axes
    corr1 = np.corrcoef(varechem[var], nmds1)[0, 1]
    corr2 = np.corrcoef(varechem[var], nmds2)[0, 1]
    
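    # Goodness of fit: the sum of squared axis correlations equals the regression
    # r² only if the two NMDS axes are uncorrelated with each other
    r_squared = corr1**2 + corr2**2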
    
    env_results.append({
        'Variable': var,
        'NMDS1': corr1,
        'NMDS2': corr2,
        'r2': r_squared
    })

env_df = pd.DataFrame(env_results)
env_df = env_df.sort_values('r2', ascending=False)

print("Environmental variable fitting results:")
print(env_df.round(4))

print(f"\nMost important environmental variables (r² > 0.2):")
important_vars = env_df[env_df['r2'] > 0.2]
if len(important_vars) > 0:
    for _, row in important_vars.iterrows():
        print(f"  {row['Variable']}: r² = {row['r2']:.3f}")
else:
    print("  No variables with r² > 0.2")
    print(f"  Strongest: {env_df.iloc[0]['Variable']} (r² = {env_df.iloc[0]['r2']:.3f})")

In [None]:
# Visualize environmental fitting results
fig, axes = plt.subplots(1, 2, figsize=(15, 6))

# Plot 1: NMDS with environmental vectors
axes[0].scatter(nmds1, nmds2, s=100, c='steelblue', alpha=0.7, 
               edgecolors='black', label='Sites')

# Add environmental vectors
# Scale factor for arrow visibility
arrow_scale = 0.7

for _, row in env_df.iterrows():
    var_name = row['Variable']
    
    # Only show vectors with meaningful correlations
    if row['r2'] > 0.1:  # threshold for display
        arrow_x = row['NMDS1'] * arrow_scale
        arrow_y = row['NMDS2'] * arrow_scale
        
        # Draw arrow
        axes[0].arrow(0, 0, arrow_x, arrow_y, head_width=0.02, head_length=0.03,
                     fc='red', ec='red', alpha=0.8, linewidth=2)
        
        # Add label
        axes[0].text(arrow_x*1.1, arrow_y*1.1, var_name, fontsize=12,
                    color='red', weight='bold',
                    bbox=dict(boxstyle='round,pad=0.3', facecolor='white', alpha=0.8))

axes[0].set_xlabel('NMDS1')
axes[0].set_ylabel('NMDS2')
axes[0].set_title('NMDS with Environmental Vectors')
axes[0].grid(True, alpha=0.3)
axes[0].legend()

# Add stress and interpretation
textstr = f'Stress = {nmds_result.stress:.4f}\nRed arrows: environmental variables\nLength ∝ correlation strength'
axes[0].text(0.02, 0.98, textstr, transform=axes[0].transAxes, fontsize=10,
            verticalalignment='top', bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8))

# Plot 2: Bar plot of R-squared values
bars = axes[1].bar(env_df['Variable'], env_df['r2'], color='steelblue', alpha=0.7, edgecolor='black')
axes[1].set_xlabel('Environmental Variables')
axes[1].set_ylabel('r² (goodness of fit)')
axes[1].set_title('Environmental Variable Importance')
axes[1].tick_params(axis='x', rotation=45)
axes[1].grid(True, alpha=0.3, axis='y')

# Highlight variables above the r² = 0.2 threshold (a display cutoff, not a significance test)
for i, bar in enumerate(bars):
    if env_df.iloc[i]['r2'] > 0.2:
        bar.set_color('red')
        bar.set_alpha(0.8)

# Add the r² = 0.2 threshold line
axes[1].axhline(y=0.2, color='red', linestyle='--', alpha=0.8, label='r² = 0.2 threshold')
axes[1].legend()

plt.tight_layout()
plt.show()

print("\nInterpretation of environmental vectors:")
print("  • Arrow length indicates correlation strength with ordination")
print("  • Arrow direction shows gradient direction in ordination space")
print("  • Longer arrows = stronger environmental-community relationships")
print("  • Sites along arrow direction have higher values of that variable")

## 4. Constrained Ordination {#constrained}

Constrained ordination methods (RDA, CCA) directly incorporate environmental information into the ordination, allowing us to explore how environmental variables explain community patterns.
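
The core of RDA can be sketched in a few lines of NumPy: regress the centered community matrix on the centered environmental matrix, then decompose the fitted values (constrained axes) and the residuals (unconstrained axes). This is a simplified illustration on synthetic data, not nuee's implementation; `rda_sketch` and the generated matrices are assumptions.

```python
import numpy as np

def rda_sketch(Y, X):
    """Minimal RDA: eigenvalues of fitted and residual parts plus total inertia."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)               # centered responses (species)
    Xc = X - X.mean(axis=0)               # centered constraints (environment)
    B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ B                       # part of Y explained by X
    resid = Yc - fitted                   # part left unexplained
    constrained_eig = np.linalg.svd(fitted, compute_uv=False) ** 2 / (n - 1)
    residual_eig = np.linalg.svd(resid, compute_uv=False) ** 2 / (n - 1)
    total_inertia = Yc.var(axis=0, ddof=1).sum()
    return constrained_eig, residual_eig, total_inertia

# Synthetic data: 20 sites, 5 species driven by 2 environmental variables
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))
Y = X @ rng.normal(size=(2, 5)) + rng.normal(scale=0.5, size=(20, 5))

constrained_eig, residual_eig, total_inertia = rda_sketch(Y, X)
explained = constrained_eig.sum() / total_inertia * 100
print("constrained eigenvalues:", np.round(constrained_eig[:2], 4))
print(f"explained by constraints: {explained:.1f}%")
```

With two constraints only two constrained eigenvalues are non-zero, and constrained plus residual inertia add up to the total inertia.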

In [None]:
# Redundancy Analysis (RDA) - Constrained ordination
print("=== REDUNDANCY ANALYSIS (RDA) ===")

# Perform RDA with all environmental variables
rda_result = nuee.rda(varespec, varechem, scale=False)

print(f"RDA Results:")
print(f"  Total inertia: {rda_result.tot_chi:.4f}")
print(f"  Constrained axes: {rda_result.rank}")

if rda_result.constrained_eig is not None:
    print(f"  Constrained eigenvalues: {rda_result.constrained_eig[:4].round(4)}")
    
    # Calculate variance explained
    constrained_var = rda_result.constrained_eig / rda_result.tot_chi * 100
    print(f"  Variance explained by constrained axes (%): {constrained_var[:4].round(2)}")
    print(f"  Cumulative variance explained (%): {np.cumsum(constrained_var[:4]).round(2)}")
    
    # Total constrained variance
    total_constrained = np.sum(constrained_var)
    print(f"  Total variance explained by environment: {total_constrained:.2f}%")

if rda_result.unconstrained_eig is not None and len(rda_result.unconstrained_eig) > 0:
    unconstrained_var = rda_result.unconstrained_eig / rda_result.tot_chi * 100
    print(f"  Residual (unconstrained) variance: {np.sum(unconstrained_var):.2f}%")

print(f"\nFirst few RDA site scores:")
if isinstance(rda_result.points, pd.DataFrame):
    print(rda_result.points.head())
else:
    rda_df = pd.DataFrame(rda_result.points[:5, :4], 
                         columns=[f'RDA{i+1}' for i in range(min(4, rda_result.points.shape[1]))])
    print(rda_df)

In [ ]:
# Plot RDA results using automatic plotting API
print("RDA plotting with automatic plotting API:")

# Extract variance explained for axis labels
constrained_var = rda_result.constrained_eig / rda_result.tot_chi * 100

# Plot 1: RDA site scores - automatic plotting!
fig1 = rda_result.plot(display="sites", figsize=(10, 8))
plt.title('RDA Site Scores - Automatic Plotting')
plt.xlabel(f'RDA1 ({constrained_var[0]:.1f}%)')
plt.ylabel(f'RDA2 ({constrained_var[1]:.1f}%)')
plt.show()

# Plot 2: RDA biplot with environmental arrows - automatic plotting!
fig2 = rda_result.biplot(figsize=(12, 10))
plt.title('RDA Biplot - Automatic Plotting')
plt.xlabel(f'RDA1 ({constrained_var[0]:.1f}%)')
plt.ylabel(f'RDA2 ({constrained_var[1]:.1f}%)')
plt.show()

# Plot 3: RDA species plot - automatic plotting!
fig3 = rda_result.plot(display="species", figsize=(10, 8))
plt.title('RDA Species Scores - Automatic Plotting')
plt.xlabel(f'RDA1 ({constrained_var[0]:.1f}%)')
plt.ylabel(f'RDA2 ({constrained_var[1]:.1f}%)')
plt.show()

print("\n✨ RDA plotting now works just like R vegan:")
print("   rda_result.plot()     # Site scores")
print("   rda_result.biplot()   # Biplot with environmental arrows")
print("   rda_result.plot(display='species')  # Species scores")

total_constrained = np.sum(constrained_var)
print(f"\nRDA Interpretation:")
print(f"  • Environmental variables explain {total_constrained:.1f}% of community variation")
print(f"  • RDA1 and RDA2 together explain {np.sum(constrained_var[:2]):.1f}% of total variation")
print(f"  • Red arrows show environmental gradients in community space")
print(f"  • Sites positioned along arrows have high values of those variables")

### 4.1 Significance Tests {#significance}

Permutation tests help determine if the environmental variables significantly explain community patterns.
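
A basic permutation test for the overall model can be sketched as follows: compute the pseudo-F of the observed data, then recompute it many times after shuffling the rows of the community matrix relative to the environmental matrix. This NumPy sketch is a simplified stand-in for a full test such as vegan's `anova.cca`; the function names and synthetic data are assumptions.

```python
import numpy as np

def _sums_of_squares(Yc, Xc):
    """Fitted and residual sums of squares of Yc regressed on Xc."""
    B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ B
    return np.sum(fitted ** 2), np.sum((Yc - fitted) ** 2)

def permutation_test_rda(Y, X, n_perm=999, seed=0):
    """Pseudo-F and permutation p-value for the overall RDA model."""
    rng = np.random.default_rng(seed)
    Yc = np.asarray(Y, dtype=float) - np.mean(Y, axis=0)
    Xc = np.asarray(X, dtype=float) - np.mean(X, axis=0)
    n = Yc.shape[0]
    q = np.linalg.matrix_rank(Xc)          # model degrees of freedom

    def pseudo_f(Yp):
        ss_fit, ss_res = _sums_of_squares(Yp, Xc)
        return (ss_fit / q) / (ss_res / (n - q - 1))

    f_obs = pseudo_f(Yc)
    # Shuffle sites in Y to break any link with X
    f_perm = np.array([pseudo_f(Yc[rng.permutation(n)]) for _ in range(n_perm)])
    p_value = (np.sum(f_perm >= f_obs) + 1) / (n_perm + 1)
    return f_obs, p_value

# Synthetic data with a real species-environment relationship
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))
Y = X @ rng.normal(size=(2, 5)) + rng.normal(scale=0.3, size=(20, 5))

f_obs, p_value = permutation_test_rda(Y, X)
print(f"pseudo-F = {f_obs:.2f}, p = {p_value:.3f}")
```

The smallest attainable p-value is `1 / (n_perm + 1)`, so the number of permutations sets the resolution of the test.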

In [None]:
# Significance testing for RDA
print("=== SIGNIFICANCE TESTS FOR CONSTRAINED ORDINATION ===")

try:
    # Test overall significance of the RDA model
    # This would typically be done with anova.cca() in R's vegan
    print("Overall RDA model test:")
    print(f"  Pseudo-F ratio: [under development]")
    print(f"  P-value: [under development]")
    
    # Test significance of individual axes
    print("\nIndividual axis tests:")
    for i, eig in enumerate(rda_result.constrained_eig[:3]):
        print(f"  RDA{i+1}: eigenvalue = {eig:.4f} [p-value under development]")
    
    # Test significance of individual environmental variables
    print("\nIndividual variable tests (marginal effects):")
    for var in varechem.columns:
        print(f"  {var}: [F-ratio and p-value under development]")
        
except Exception as e:
    print(f"Significance testing: {e}")
    print("Note: Permutation tests are under development")

# Manual F-ratio calculation for the overall model
print("\nManual model evaluation:")
if rda_result.constrained_eig is not None and rda_result.unconstrained_eig is not None:
    # Calculate pseudo F-ratio
    n_sites = varespec.shape[0]
    n_env_vars = varechem.shape[1]
    
    constrained_variance = np.sum(rda_result.constrained_eig)
    unconstrained_variance = np.sum(rda_result.unconstrained_eig)
    
    # Degrees of freedom
    df_constrained = n_env_vars
    df_residual = n_sites - n_env_vars - 1
    
    # Pseudo F-ratio
    f_ratio = (constrained_variance / df_constrained) / (unconstrained_variance / df_residual)
    
    print(f"  Constrained variance: {constrained_variance:.4f}")
    print(f"  Residual variance: {unconstrained_variance:.4f}")
    print(f"  Pseudo F-ratio: {f_ratio:.4f}")
    print(f"  Degrees of freedom: {df_constrained}, {df_residual}")
    
    # Rough reading only: F > 2 is a rule of thumb, not a significance test
    if f_ratio > 2:
        print(f"  Interpretation: F > 2, possibly a real effect (confirm with a permutation test)")
    else:
        print(f"  Interpretation: F < 2, weak effect (confirm with a permutation test)")

print("\nNote: In practice, p-values should be calculated using permutation tests")
print("This ensures proper statistical inference for ecological data")

### 4.2 Conditioned or Partial Ordination {#partial}

Partial ordination allows us to control for certain variables while examining the effects of others. This is useful for controlling for spatial or temporal effects.
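
Conceptually, partial RDA first removes the effect of the conditioning variables from both the community matrix and the constraints, then runs an ordinary RDA on the residuals. The NumPy sketch below shows that decomposition on synthetic data; `partial_rda_sketch`, `_residualize`, and the generated matrices are assumptions, not nuee's API.

```python
import numpy as np

def _residualize(A, Z):
    """Residuals of A after least-squares regression on Z (both centered)."""
    B, *_ = np.linalg.lstsq(Z, A, rcond=None)
    return A - Z @ B

def partial_rda_sketch(Y, X, Z):
    """Split total inertia into conditioned, constrained and residual parts."""
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Zc = Z - Z.mean(axis=0)
    total = np.sum(Yc ** 2) / (n - 1)
    Y_res = _residualize(Yc, Zc)          # community data with Z partialled out
    X_res = _residualize(Xc, Zc)          # constraints with Z partialled out
    conditioned = total - np.sum(Y_res ** 2) / (n - 1)
    B, *_ = np.linalg.lstsq(X_res, Y_res, rcond=None)
    fitted = X_res @ B
    constrained = np.sum(fitted ** 2) / (n - 1)
    residual = np.sum((Y_res - fitted) ** 2) / (n - 1)
    return {"total": total, "conditioned": conditioned,
            "constrained": constrained, "residual": residual}

# Synthetic data: 24 sites, 6 species, 3 constraints, 2 conditioning variables
rng = np.random.default_rng(3)
Z = rng.normal(size=(24, 2))
X = 0.5 * Z @ rng.normal(size=(2, 3)) + rng.normal(size=(24, 3))
Y = (Z @ rng.normal(size=(2, 6)) + X @ rng.normal(size=(3, 6))
     + rng.normal(scale=0.5, size=(24, 6)))

parts = partial_rda_sketch(Y, X, Z)
for name, value in parts.items():
    print(f"{name:12s} {value:8.4f}  ({value / parts['total'] * 100:5.1f}%)")
```

The constrained part computed this way equals the inertia explained by a model with all variables minus the inertia explained by the conditioning variables alone, which is why sequential model comparison gives the same number.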

In [None]:
# Partial/Conditioned ordination
print("=== PARTIAL (CONDITIONED) ORDINATION ===")

# Example: partial out the effects of pH and N,
# then examine the effect of the remaining variables

try:
    # Conditioning variables (variables to partial out)
    conditioning_vars = varechem[['pH', 'N']]
    
    # Variables of interest (remaining environmental variables)
    remaining_vars = varechem.drop(['pH', 'N'], axis=1)
    
    print(f"Conditioning on: {list(conditioning_vars.columns)}")
    print(f"Testing effect of: {list(remaining_vars.columns)}")
    
    # Partial RDA
    partial_rda = nuee.rda(varespec, remaining_vars, conditioning_vars)
    
    print(f"\nPartial RDA Results:")
    print(f"  Total inertia: {partial_rda.tot_chi:.4f}")
    
    if hasattr(partial_rda, 'partial_chi') and partial_rda.partial_chi is not None:
        print(f"  Conditioned (partial) inertia: {partial_rda.partial_chi:.4f}")
        conditioned_percent = partial_rda.partial_chi / partial_rda.tot_chi * 100
        print(f"  Variance explained by conditioning vars: {conditioned_percent:.1f}%")
    
    if partial_rda.constrained_eig is not None:
        constrained_inertia = np.sum(partial_rda.constrained_eig)
        constrained_percent = constrained_inertia / partial_rda.tot_chi * 100
        print(f"  Constrained inertia (after conditioning): {constrained_inertia:.4f}")
        print(f"  Variance explained by remaining vars: {constrained_percent:.1f}%")
    
    if partial_rda.unconstrained_eig is not None and len(partial_rda.unconstrained_eig) > 0:
        residual_inertia = np.sum(partial_rda.unconstrained_eig)
        residual_percent = residual_inertia / partial_rda.tot_chi * 100
        print(f"  Residual inertia: {residual_inertia:.4f}")
        print(f"  Unexplained variance: {residual_percent:.1f}%")
    
except Exception as e:
    print(f"Partial RDA: {e}")
    print("Partial ordination implementation is under development")
    
    # Alternative: Sequential analysis
    print("\nAlternative approach: Sequential model comparison")
    
    # Model 1: Only conditioning variables
    rda_cond = nuee.rda(varespec, conditioning_vars)
    cond_variance = np.sum(rda_cond.constrained_eig) if rda_cond.constrained_eig is not None else 0
    
    # Model 2: All variables
    rda_full = nuee.rda(varespec, varechem)
    full_variance = np.sum(rda_full.constrained_eig) if rda_full.constrained_eig is not None else 0
    
    # Pure effect of remaining variables
    pure_effect = full_variance - cond_variance
    pure_percent = pure_effect / rda_full.tot_chi * 100
    
    print(f"  Conditioning variables explain: {cond_variance/rda_full.tot_chi*100:.1f}%")
    print(f"  All variables explain: {full_variance/rda_full.tot_chi*100:.1f}%")
    print(f"  Pure effect of remaining variables: {pure_percent:.1f}%")

print("\nPartial ordination is useful for:")
print("  • Controlling for spatial autocorrelation")
print("  • Removing the effect of known confounding variables")
print("  • Testing pure effects of variable groups")
print("  • Variance partitioning studies")

In [None]:
# Variance partitioning example
print("=== VARIANCE PARTITIONING EXAMPLE ===")

# Partition variance between two groups of environmental variables
# Group 1: Chemical variables (pH, N, P, K)
# Group 2: Other variables (Ca, Mg)

chem_vars = varechem[['pH', 'N', 'P', 'K']]
other_vars = varechem[['Ca', 'Mg']]

print(f"Group 1 (Chemical): {list(chem_vars.columns)}")
print(f"Group 2 (Other): {list(other_vars.columns)}")

# RDA with chemical variables only
rda_chem = nuee.rda(varespec, chem_vars)
var_chem = np.sum(rda_chem.constrained_eig) if rda_chem.constrained_eig is not None else 0

# RDA with other variables only  
rda_other = nuee.rda(varespec, other_vars)
var_other = np.sum(rda_other.constrained_eig) if rda_other.constrained_eig is not None else 0

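# RDA with both groups combined (only the variables being partitioned,
# otherwise the "pure" effects would absorb the other varechem columns)
both_vars = pd.concat([chem_vars, other_vars], axis=1)
rda_all = nuee.rda(varespec, both_vars)
var_all = np.sum(rda_all.constrained_eig) if rda_all.constrained_eig is not None else 0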

# Calculate variance components
total_inertia = rda_all.tot_chi

# [a] Pure effect of chemical variables = var_all - var_other
pure_chem = var_all - var_other

# [b] Pure effect of other variables = var_all - var_chem  
pure_other = var_all - var_chem

# [c] Shared effect = var_chem + var_other - var_all
shared = var_chem + var_other - var_all

# [d] Residual = total_inertia - var_all
residual = total_inertia - var_all

print(f"\nVariance Partitioning Results:")
print(f"  [a] Pure Chemical effect: {pure_chem:.4f} ({pure_chem/total_inertia*100:.1f}%)")
print(f"  [b] Pure Other effect: {pure_other:.4f} ({pure_other/total_inertia*100:.1f}%)")
print(f"  [c] Shared effect: {shared:.4f} ({shared/total_inertia*100:.1f}%)")
print(f"  [d] Residual: {residual:.4f} ({residual/total_inertia*100:.1f}%)")
print(f"  Total: {total_inertia:.4f} (100.0%)")

# Visualize variance partitioning
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Pie chart
components = [pure_chem, pure_other, shared, residual]
labels = ['Pure Chemical', 'Pure Other', 'Shared', 'Residual']
colors = ['lightblue', 'lightcoral', 'lightgreen', 'lightgray']

# Only plot positive components
pos_components = [max(0, comp) for comp in components]
percentages = [comp/total_inertia*100 for comp in pos_components]

wedges, texts, autotexts = ax1.pie(pos_components, labels=labels, colors=colors, 
                                  autopct='%1.1f%%', startangle=90)
ax1.set_title('Variance Partitioning')

# Bar chart
ax2.bar(labels, percentages, color=colors, alpha=0.7, edgecolor='black')
ax2.set_ylabel('Percentage of Total Variance')
ax2.set_title('Variance Components')
ax2.tick_params(axis='x', rotation=45)
ax2.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()

print("\nInterpretation:")
if pure_chem > pure_other:
    print(f"  • Chemical variables have stronger pure effect than other variables")
else:
    print(f"  • Other variables have stronger pure effect than chemical variables")

if shared > 0:
    print(f"  • Positive shared effect suggests variables are correlated")
else:
    print(f"  • Negative shared effect suggests suppression or confounding")

explained_total = (pure_chem + pure_other + shared) / total_inertia * 100
print(f"  • Total explained variance: {explained_total:.1f}%")
print(f"  • Environmental variables explain a {'substantial' if explained_total > 50 else 'moderate' if explained_total > 20 else 'small'} portion of community variation")

## Summary and Conclusions

This tutorial has demonstrated the main ordination techniques available in nuee, following the structure of the classic vegan introduction. We covered:

### Key Techniques Demonstrated:

1. **Unconstrained Ordination**
   - NMDS for robust community analysis
   - Stress interpretation and quality assessment

2. **Ordination Graphics**
   - Strategies for handling cluttered plots
   - Adding environmental information and group structure
   - Creating publication-ready figures

3. **Environmental Fitting**
   - Correlating environmental variables with ordination patterns
   - Visualizing environmental gradients as vectors
   - Interpreting goodness-of-fit measures

4. **Constrained Ordination**
   - RDA for linear relationships
   - Variance partitioning between variable groups
   - Understanding constrained vs. residual variation

5. **Statistical Testing** (framework established)
   - Significance testing approaches
   - Partial ordination concepts
   - Model comparison strategies

### Ecological Insights:

- Environmental variables explain a meaningful portion of lichen community variation
- Chemical gradients (pH, nutrients) are important drivers of community structure
- Both pure and shared effects of environmental variables contribute to community patterns
- NMDS provides a robust visualization of community relationships

### Next Steps:

- Apply these methods to your own ecological datasets
- Explore additional ordination methods (CCA for unimodal responses)
- Investigate temporal and spatial patterns in community data
- Combine ordination with other multivariate techniques
- Consider functional diversity and phylogenetic approaches

nuee provides a toolkit for ecological ordination analysis in Python, following the approach of the original vegan package while building on Python's scientific computing ecosystem. DCA, `envfit`, and permutation tests were still under development when this notebook was written.

In [ ]:
# nuee vs R vegan: Automatic Plotting Comparison
print("🔄 nuee vs R vegan: AUTOMATIC PLOTTING COMPARISON")
print("=" * 60)

print("\nR vegan workflow:")
print("  library(vegan)")
print("  data(varespec)")
print("  data(varechem)")
print("  ")
print("  # Diversity analysis")
print("  shannon_div <- diversity(varespec, index='shannon')")
print("  plot(shannon_div)  # Automatic plotting")
print("  ")
print("  # Ordination analysis")
print("  nmds_result <- metaMDS(varespec)")
print("  plot(nmds_result)  # Automatic plotting")
print("  ")
print("  # Constrained ordination")
print("  rda_result <- rda(varespec ~ N + P + K, data=varechem)")
print("  plot(rda_result)   # Automatic plotting")

print("\nnuee workflow (now with automatic plotting!):")
print("  import nuee ")
print("  species = nuee.datasets.varespec()")
print("  environment = nuee.datasets.varechem()")
print("  ")
print("  # Diversity analysis")
print("  shannon_div = nuee.shannon(species)")
print("  shannon_div.plot()  # Automatic plotting!")
print("  ")
print("  # Ordination analysis")
print("  nmds_result = nuee.metaMDS(species)")
print("  nmds_result.plot()  # Automatic plotting!")
print("  ")
print("  # Constrained ordination")
print("  rda_result = nuee.rda(species, environment[['N', 'P', 'K']])")
print("  rda_result.biplot()  # Automatic plotting!")

print("\n✨ Live demonstration of automatic plotting:")

# Shannon diversity
shannon_demo = nuee.shannon(varespec)
print(f"\n1. Shannon diversity: {type(shannon_demo)}")
print(f"   Has plot method: {hasattr(shannon_demo, 'plot')}")
fig1 = shannon_demo.plot(kind="box", figsize=(8, 5))
plt.title("Shannon Diversity - Automatic Plotting Demo")
plt.show()

# NMDS
nmds_demo = nuee.metaMDS(varespec, k=2, trace=False)
print(f"\n2. NMDS result: {type(nmds_demo)}")
print(f"   Has plot method: {hasattr(nmds_demo, 'plot')}")
fig2 = nmds_demo.plot(display="sites", figsize=(8, 5))
plt.title("NMDS Sites - Automatic Plotting Demo")
plt.show()

# RDA
rda_demo = nuee.rda(varespec, varechem[['N', 'P', 'K']])
print(f"\n3. RDA result: {type(rda_demo)}")
print(f"   Has biplot method: {hasattr(rda_demo, 'biplot')}")
fig3 = rda_demo.biplot(figsize=(8, 5))
plt.title("RDA Biplot - Automatic Plotting Demo")
plt.show()

print("\n🎉 SUCCESS! nuee now works just like R vegan!")
print("📊 All analysis results can be plotted directly")
print("🔄 Syntax is now very similar to R vegan")
print("✨ No need for separate plotting functions")