# 📊 Community Structure & Diversity
## Measuring and Understanding Ecological Communities

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/The-Pattern-Hunter/interactive-ecology-biometry/blob/main/unit-3-community/notebooks/02_community_structure_diversity.ipynb)

---

> *"Diversity is the magic. It is the first manifestation, the first beginning of the differentiation of a thing and of simple identity."* - Henri Bergson

### 🎯 Learning Objectives

By the end of this notebook, you will:
1. Distinguish between **species richness** and **species evenness**
2. Calculate the **Shannon diversity index** (H')
3. Calculate the **Simpson diversity index** (D)
4. Interpret **rank-abundance curves**
5. Understand **dominance** vs **evenness**
6. Compare diversity across different communities
7. Apply diversity indices to real ecological data

In [None]:
# Setup
!pip install numpy pandas plotly matplotlib scipy -q

import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from scipy.stats import entropy

print("✅ Ready to explore community diversity!")
print("📊 Let's measure biodiversity!")

---

## 📚 Part 1: What is Diversity?

### Two Components of Diversity:

#### 1️⃣ **Species Richness (S)**
- **Definition**: Number of different species
- **Simple count**: Just count how many species you found
- **Example**: Forest A has 20 species, Forest B has 50 species
- **Limitation**: Doesn't consider abundance

#### 2️⃣ **Species Evenness (E)**
- **Definition**: How evenly individuals are distributed among species
- **High evenness**: All species equally common
- **Low evenness**: Few species dominant, most rare
- **Example**: 10 species with 10 individuals each (high) vs 1 species with 91 individuals and 9 species with 1 each (low)

### The Complete Picture:

**Diversity = Richness + Evenness**

```
HIGH Diversity = Many species + Evenly distributed
LOW Diversity = Few species OR Unevenly distributed
```
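
To make the two components concrete, here is a minimal sketch that reports richness and the share of the dominant species for the two communities from the evenness example above. The helper name `describe_community` is illustrative only and is not used elsewhere in this notebook.

```python
def describe_community(counts):
    """Return richness and the share held by the most abundant species."""
    present = [c for c in counts if c > 0]   # ignore species with zero individuals
    richness = len(present)                  # S: number of species observed
    total = sum(present)                     # N: total individuals
    top_share = max(present) / total         # proportion held by the dominant species
    return richness, top_share

even_community = [10] * 10          # 10 species, 10 individuals each
uneven_community = [91] + [1] * 9   # 1 dominant species, 9 rare ones

for label, counts in [("even", even_community), ("uneven", uneven_community)]:
    S, top = describe_community(counts)
    print(f"{label}: S = {S}, dominant species holds {top:.0%} of individuals")
```

Both communities have the same richness, so richness alone cannot tell them apart. Only the abundance distribution reveals the difference.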

In [None]:
# Compare richness vs evenness
def create_comparison_scenarios():
    """
    Create 4 scenarios showing richness vs evenness
    """
    scenarios = {
        'High Richness\nHigh Evenness': {
            'species': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
            'counts': [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
        },
        'High Richness\nLow Evenness': {
            'species': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
            'counts': [85, 5, 2, 2, 1, 1, 1, 1, 1, 1]
        },
        'Low Richness\nHigh Evenness': {
            'species': ['A', 'B', 'C'],
            'counts': [33, 33, 34]
        },
        'Low Richness\nLow Evenness': {
            'species': ['A', 'B', 'C'],
            'counts': [95, 3, 2]
        }
    }
    
    return scenarios

scenarios = create_comparison_scenarios()

# Create 2x2 subplot
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=list(scenarios.keys()),
    specs=[[{'type': 'bar'}, {'type': 'bar'}],
           [{'type': 'bar'}, {'type': 'bar'}]]
)

positions = [(1,1), (1,2), (2,1), (2,2)]
colors_scenarios = ['green', 'orange', 'blue', 'red']

for (scenario_name, data), (row, col), color in zip(scenarios.items(), positions, colors_scenarios):
    fig.add_trace(
        go.Bar(
            x=data['species'],
            y=data['counts'],
            marker_color=color,
            showlegend=False,
            text=data['counts'],
            textposition='outside'
        ),
        row=row, col=col
    )

fig.update_xaxes(title_text="Species")
fig.update_yaxes(title_text="Abundance", range=[0, 110])

fig.update_layout(
    title="📊 Richness vs Evenness: Four Scenarios<br><sub>Each community has 100 total individuals</sub>",
    height=700,
    template='plotly_white'
)

fig.show()

print("\n📊 Scenario Analysis:")
print("\n🟢 Top-Left (BEST): High richness (10 spp) + High evenness")
print("   → HIGHEST diversity")
print("   → All species equally common")
print("\n🟠 Top-Right: High richness (10 spp) + Low evenness")
print("   → Moderate diversity")
print("   → One species dominates")
print("\n🔵 Bottom-Left: Low richness (3 spp) + High evenness")
print("   → Moderate diversity")
print("   → Few species but balanced")
print("\n🔴 Bottom-Right (WORST): Low richness (3 spp) + Low evenness")
print("   → LOWEST diversity")
print("   → One species dominates everything")

---

## 🧮 Part 2: Shannon Diversity Index (H')

### The Most Popular Diversity Measure!

### Formula:

**H' = -Σ (pᵢ × ln(pᵢ))**

Where:
- **H'** = Shannon diversity index
- **pᵢ** = Proportion of species i (nᵢ/N)
- **nᵢ** = Number of individuals of species i
- **N** = Total number of individuals
- **Σ** = Sum across all species
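
The formula can be followed term by term with a small hand-sized example. The sketch below uses a hypothetical three-species community (the counts are made up for illustration) and prints each species' contribution before summing.

```python
import math

counts = [50, 30, 20]                      # hypothetical community: n_i for three species
N = sum(counts)                            # total individuals
proportions = [n / N for n in counts]      # p_i = n_i / N

# Each species contributes -p_i * ln(p_i) to the index
terms = [-p * math.log(p) for p in proportions]
H = sum(terms)

for n, p, t in zip(counts, proportions, terms):
    print(f"n = {n:>2}, p = {p:.2f}, -p ln p = {t:.4f}")
print(f"H' = {H:.4f}")
```

Every term is positive because ln(pᵢ) is negative for any proportion below 1, which is why the formula carries a leading minus sign.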

### Interpretation:

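- **H' = 0**: Only ONE species (no diversity)
- **H' = 0-2**: Low diversity
- **H' = 2-3**: Moderate diversity
- **H' = 3-4**: High diversity
- **H' > 4**: Very high diversity (rare!)

These bands are rough rules of thumb. Because H' cannot exceed ln(S), a perfectly even community with few species still scores low.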

### Maximum Value:

**H'max = ln(S)** where S = number of species

Achieved when all species are equally abundant (perfect evenness).
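
As a quick check of this upper bound, the sketch below compares a perfectly even community with a skewed one of the same richness. The local `shannon` helper is a standalone stand-in written for this example. It assumes natural logarithms, as in the formula above.

```python
import math

def shannon(counts):
    """Shannon index H' using natural logarithms; zero counts are skipped."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

S = 10
even = [10] * S                 # every species equally abundant
skewed = [91] + [1] * (S - 1)   # same richness, one dominant species

print(f"ln(S)        = {math.log(S):.3f}")
print(f"H' (even)    = {shannon(even):.3f}")
print(f"H' (skewed)  = {shannon(skewed):.3f}")
```

The even community reaches ln(S), while the skewed one falls well below it even though both contain ten species.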

### Why Shannon Index?

✅ Considers both richness and evenness  
✅ Widely used and understood  
✅ Good for comparing communities  
✅ Based on information theory  

In [None]:
# Calculate Shannon Diversity Index
def calculate_shannon(counts):
    """
    Calculate Shannon diversity index H'
    """
    counts = np.array(counts)
    proportions = counts / counts.sum()
    # Remove zeros (log of 0 is undefined)
    proportions = proportions[proportions > 0]
    H = -np.sum(proportions * np.log(proportions))
    return H

def calculate_simpson(counts):
    """
    Calculate Simpson diversity index D
    """
    counts = np.array(counts)
    n = counts.sum()
    D = 1 - np.sum((counts / n) ** 2)
    return D

def calculate_evenness(H, S):
    """
    Calculate evenness (Pielou's J)
    J = H / ln(S)
    """
    if S <= 1:
        return 0
    return H / np.log(S)

# Calculate for all scenarios
results = []

for name, data in scenarios.items():
    counts = data['counts']
    S = len(counts)
    H = calculate_shannon(counts)
    D = calculate_simpson(counts)
    J = calculate_evenness(H, S)
    
    results.append({
        'Scenario': name.replace('\n', ' '),
        'Richness (S)': S,
        'Shannon (H\')': round(H, 3),
        'Simpson (D)': round(D, 3),
        'Evenness (J)': round(J, 3)
    })

df_results = pd.DataFrame(results)

# Display as table
fig = go.Figure(data=[go.Table(
    header=dict(
        values=['<b>' + col + '</b>' for col in df_results.columns],
        fill_color='lightblue',
        align='left',
        font=dict(size=13)
    ),
    cells=dict(
        values=[df_results[col] for col in df_results.columns],
        fill_color=[['white', 'lightgray', 'white', 'lightgray']],
        align='left',
        font=dict(size=12),
        height=35
    )
)])

fig.update_layout(
    title="📊 Diversity Indices for All Scenarios",
    height=300
)

fig.show()

print("\n📊 Index Interpretation:")
print("\n   Shannon (H'):")
print("      • Higher = More diverse")
print("      • Considers richness + evenness")
print("      • Range: 0 to ln(S)")
print("\n   Simpson (D):")
print("      • Higher = More diverse")
print("      • Range: 0 to 1")
print("      • Less sensitive to rare species")
print("\n   Evenness (J):")
print("      • 1.0 = Perfect evenness")
print("      • 0.0 = One species dominates")
print("      • Range: 0 to 1")

---

## 🔢 Part 3: Simpson Diversity Index (D)

### Alternative Diversity Measure

### Formula:

**D = 1 - Σ(pᵢ²)**

Or equivalently:

**λ = Σ(pᵢ²)** (Simpson's dominance)  
**D = 1 - λ** (Simpson's diversity)

### Interpretation:

**D** represents the probability that two randomly selected individuals (drawn with replacement) belong to DIFFERENT species

- **D = 0**: No diversity (all same species)
- **D = 0.5**: Moderate diversity
- **D = 0.9**: High diversity
- **D → 1**: Maximum diversity (for S species the ceiling is 1 - 1/S, reached at perfect evenness)
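
The probability interpretation can be checked by simulation. The sketch below is a minimal illustration with a hypothetical community. It draws pairs of individuals with replacement, which is the sampling scheme that matches the 1 - Σ(pᵢ²) form of the index. The seed and number of trials are arbitrary choices.

```python
import random

counts = {"A": 50, "B": 30, "C": 20}   # hypothetical community
N = sum(counts.values())

# Analytical value: D = 1 - sum of squared proportions
D = 1 - sum((n / N) ** 2 for n in counts.values())

# Simulation: draw two individuals (with replacement) many times
rng = random.Random(42)                # fixed seed for reproducibility
species = list(counts.keys())
weights = list(counts.values())
trials = 100_000
different = 0
for _ in range(trials):
    first, second = rng.choices(species, weights=weights, k=2)
    if first != second:
        different += 1

print(f"Analytical D = {D:.3f}")
print(f"Simulated  D = {different / trials:.3f}")
```

The two numbers should agree to about two decimal places. The small gap that remains is sampling noise, which shrinks as `trials` grows.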

### Simpson vs Shannon:

| Feature | Shannon (H') | Simpson (D) |
|---------|--------------|-------------|
| **Range** | 0 to ln(S) | 0 to 1 |
| **Interpretation** | Information content | Probability different |
| **Rare species** | More sensitive | Less sensitive |
| **Common use** | Ecology | Ecology, genetics |
| **Calculation** | More complex | Simpler |
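
The "rare species" row of the table can be illustrated numerically. The sketch below adds five singleton species to a hypothetical community and reports the relative change in each index. Both helper functions are standalone stand-ins written for this example, and the counts are made up.

```python
import math

def shannon(counts):
    """Shannon index H' with natural logarithms."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson(counts):
    """Simpson diversity D = 1 - sum of squared proportions."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

base = [40, 30, 30]                 # hypothetical community, three common species
with_rare = base + [1, 1, 1, 1, 1]  # same community plus five singleton species

for name, index in [("Shannon H'", shannon), ("Simpson D", simpson)]:
    before, after = index(base), index(with_rare)
    change = (after - before) / before
    print(f"{name}: {before:.3f} -> {after:.3f} ({change:+.1%})")
```

In this example H' rises by a much larger percentage than D. Squaring the proportions in Simpson's formula makes the contribution of very rare species almost vanish.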

In [None]:
# Interactive diversity calculator
def diversity_calculator(species_data):
    """
    Calculate all diversity metrics for a community
    """
    species = list(species_data.keys())
    counts = list(species_data.values())
    
    # Basic metrics
    S = len(species)
    N = sum(counts)
    
    # Diversity indices
    H = calculate_shannon(counts)
    D = calculate_simpson(counts)
    J = calculate_evenness(H, S)
    
    # Maximum possible Shannon
    H_max = np.log(S)
    
    # Create visualization
    fig = make_subplots(
        rows=2, cols=2,
        subplot_titles=(
            'Species Abundance',
            'Diversity Metrics',
            'Proportional Abundance',
            'Evenness Score'
        ),
        specs=[
            [{'type': 'bar'}, {'type': 'indicator'}],
            [{'type': 'pie'}, {'type': 'indicator'}]
        ]
    )
    
    # 1. Bar chart
    fig.add_trace(
        go.Bar(x=species, y=counts, marker_color='lightblue', showlegend=False),
        row=1, col=1
    )
    
    # 2. Shannon gauge
    fig.add_trace(
        go.Indicator(
            mode="gauge+number+delta",
            value=H,
            title={'text': f"Shannon H'<br><sub>Max: {H_max:.2f}</sub>"},
            delta={'reference': H_max},
            gauge={
                'axis': {'range': [0, H_max]},
                'bar': {'color': "darkblue"},
                'steps': [
                    {'range': [0, H_max*0.33], 'color': "red"},
                    {'range': [H_max*0.33, H_max*0.67], 'color': "yellow"},
                    {'range': [H_max*0.67, H_max], 'color': "green"}
                ]
            }
        ),
        row=1, col=2
    )
    
    # 3. Pie chart
    fig.add_trace(
        go.Pie(labels=species, values=counts, showlegend=False),
        row=2, col=1
    )
    
    # 4. Evenness gauge
    fig.add_trace(
        go.Indicator(
            mode="gauge+number",
            value=J,
            title={'text': "Evenness (J)<br><sub>1.0 = Perfect</sub>"},
            gauge={
                'axis': {'range': [0, 1]},
                'bar': {'color': "purple"},
                'steps': [
                    {'range': [0, 0.33], 'color': "red"},
                    {'range': [0.33, 0.67], 'color': "yellow"},
                    {'range': [0.67, 1], 'color': "green"}
                ]
            }
        ),
        row=2, col=2
    )
    
    fig.update_layout(
        title=f"📊 Community Analysis: {S} Species, {N} Individuals",
        height=800
    )
    
    return fig, {'S': S, 'N': N, 'H': H, 'D': D, 'J': J, 'H_max': H_max}

# Example: Tropical Rainforest Birds
rainforest_birds = {
    'Toucan': 12,
    'Parrot': 15,
    'Hummingbird': 20,
    'Macaw': 8,
    'Eagle': 5,
    'Woodpecker': 10,
    'Tanager': 18,
    'Flycatcher': 14
}

fig, metrics = diversity_calculator(rainforest_birds)
fig.show()

print("\n📊 Rainforest Bird Community Metrics:")
print(f"   Species Richness (S): {metrics['S']}")
print(f"   Total Individuals (N): {metrics['N']}")
print(f"   Shannon Index (H'): {metrics['H']:.3f}")
print(f"   Simpson Index (D): {metrics['D']:.3f}")
print(f"   Evenness (J): {metrics['J']:.3f}")
print(f"\n💡 Interpretation:")
print(f"   • H' = {metrics['H']:.2f} is close to the maximum of {metrics['H_max']:.2f} for {metrics['S']} species")
print(f"   • High evenness (J = {metrics['J']:.2f})")
print(f"   • Balanced community with no single dominant species")

---

## 📈 Part 4: Rank-Abundance Curves

### Visual Representation of Community Structure

### How to Create:
1. **Rank** species from most to least abundant
2. **Plot** rank (x-axis) vs abundance (y-axis)
3. **Interpret** the shape

### Curve Shapes:

#### **Steep Curve** (Geometric/Niche Preemption)
- Few dominant species
- Many rare species
- Low evenness
- Harsh environments

#### **Moderate Curve** (Log-Normal)
- Most common pattern
- Moderate evenness
- Typical communities

#### **Flat Curve** (Broken Stick)
- High evenness
- No dominant species
- Rare in nature
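
The two extreme shapes can be generated from their textbook formulas. The sketch below is an illustration under stated assumptions: the geometric series gives rank i a share proportional to k(1 - k)^(i-1), and the broken-stick model gives rank i an expected share of (1/S) Σ 1/j for j from i to S. The function names and the value k = 0.5 are arbitrary choices for this example, not part of the notebook's later code.

```python
def geometric_series(S, k):
    """Relative abundances where each rank takes a fraction k of what remains."""
    raw = [k * (1 - k) ** (rank - 1) for rank in range(1, S + 1)]
    total = sum(raw)
    return [r / total for r in raw]          # rescale so proportions sum to 1

def broken_stick(S):
    """Expected relative abundances under the broken-stick model."""
    return [sum(1 / j for j in range(rank, S + 1)) / S for rank in range(1, S + 1)]

S = 10
steep = geometric_series(S, k=0.5)   # k = 0.5 is an arbitrary illustrative value
flat = broken_stick(S)

print("rank  geometric  broken stick")
for rank, (g, b) in enumerate(zip(steep, flat), start=1):
    print(f"{rank:>4}  {g:>9.4f}  {b:>12.4f}")
```

With these settings the geometric series gives the top-ranked species about half of all individuals, while the broken-stick shares decline far more gently across ranks.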

In [None]:
# Create rank-abundance curves for different community types
def create_rank_abundance_curve(counts, name, color):
    """
    Create rank-abundance data
    """
    # Sort in descending order
    sorted_counts = sorted(counts, reverse=True)
    ranks = list(range(1, len(sorted_counts) + 1))
    
    return ranks, sorted_counts

# Three community types
communities = {
    'Low Evenness\n(Harsh Environment)': {
        'counts': [80, 40, 20, 10, 5, 3, 2, 1, 1, 1],
        'color': 'red'
    },
    'Moderate Evenness\n(Typical Community)': {
        'counts': [35, 28, 22, 18, 15, 12, 10, 8, 6, 4],
        'color': 'orange'
    },
    'High Evenness\n(Stable Environment)': {
        'counts': [18, 17, 16, 15, 14, 13, 12, 11, 10, 9],
        'color': 'green'
    }
}

fig = go.Figure()

for name, data in communities.items():
    ranks, abundances = create_rank_abundance_curve(data['counts'], name, data['color'])
    
    fig.add_trace(go.Scatter(
        x=ranks,
        y=abundances,
        mode='lines+markers',
        line=dict(width=3, color=data['color']),
        marker=dict(size=8),
        name=name
    ))

fig.update_layout(
    title="📈 Rank-Abundance Curves: Three Community Types<br><sub>Steeper = Less even, Flatter = More even</sub>",
    xaxis_title="Species Rank (1 = Most abundant)",
    yaxis_title="Abundance (log scale)",
    yaxis_type="log",
    height=500,
    template='plotly_white'
)

fig.show()

print("\n📈 Rank-Abundance Interpretation:")
print("\n🔴 Red (Steep):")
print("   • Few dominant species")
print("   • Low diversity and evenness")
print("   • Examples: Deserts, early succession")
print("\n🟠 Orange (Moderate):")
print("   • Some dominant, some rare")
print("   • Moderate diversity")
print("   • Examples: Most natural communities")
print("\n🟢 Green (Flat):")
print("   • All species similar abundance")
print("   • High diversity and evenness")
print("   • Examples: Coral reefs, rainforests")

---

## 🌍 Part 5: Real-World Example - Comparing Ecosystems

In [None]:
# Compare 3 real ecosystems
ecosystems = {
    'Tropical Rainforest\n(High Diversity)': {
        'Orchids': 25, 'Ferns': 22, 'Bromeliads': 20, 'Palms': 18,
        'Vines': 15, 'Mosses': 12, 'Figs': 10, 'Bamboo': 8,
        'Heliconias': 6, 'Tree ferns': 5, 'Gingers': 4, 'Begonias': 3
    },
    'Temperate Forest\n(Moderate Diversity)': {
        'Oak': 45, 'Maple': 35, 'Pine': 25, 'Birch': 20,
        'Ash': 15, 'Beech': 10, 'Hemlock': 5
    },
    'Arctic Tundra\n(Low Diversity)': {
        'Sedges': 80, 'Mosses': 40, 'Lichens': 30,
        'Dwarf willows': 15, 'Grasses': 10
    }
}

# Calculate metrics for each
comparison_results = []

for name, species_dict in ecosystems.items():
    counts = list(species_dict.values())
    S = len(counts)
    N = sum(counts)
    H = calculate_shannon(counts)
    D = calculate_simpson(counts)
    J = calculate_evenness(H, S)
    
    comparison_results.append({
        'Ecosystem': name.replace('\n', ' '),
        'Species (S)': S,
        'Individuals (N)': N,
        'Shannon (H\')': round(H, 3),
        'Simpson (D)': round(D, 3),
        'Evenness (J)': round(J, 3)
    })

df_comparison = pd.DataFrame(comparison_results)

# Create comparison visualization
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=('Species Richness', 'Shannon Diversity', 'Simpson Diversity', 'Evenness'),
    specs=[[{'type': 'bar'}, {'type': 'bar'}],
           [{'type': 'bar'}, {'type': 'bar'}]]
)

ecosystem_names = [name.replace('\n', ' ') for name in ecosystems.keys()]
colors_eco = ['green', 'orange', 'lightblue']

# Richness
fig.add_trace(
    go.Bar(x=ecosystem_names, y=df_comparison['Species (S)'],
           marker_color=colors_eco, showlegend=False),
    row=1, col=1
)

# Shannon
fig.add_trace(
    go.Bar(x=ecosystem_names, y=df_comparison['Shannon (H\')'],
           marker_color=colors_eco, showlegend=False),
    row=1, col=2
)

# Simpson
fig.add_trace(
    go.Bar(x=ecosystem_names, y=df_comparison['Simpson (D)'],
           marker_color=colors_eco, showlegend=False),
    row=2, col=1
)

# Evenness
fig.add_trace(
    go.Bar(x=ecosystem_names, y=df_comparison['Evenness (J)'],
           marker_color=colors_eco, showlegend=False),
    row=2, col=2
)

fig.update_layout(
    title="🌍 Diversity Comparison: Three Major Ecosystems",
    height=700,
    template='plotly_white'
)

fig.show()

# Display detailed table
fig_table = go.Figure(data=[go.Table(
    header=dict(
        values=['<b>' + col + '</b>' for col in df_comparison.columns],
        fill_color='lightblue',
        align='left',
        font=dict(size=13)
    ),
    cells=dict(
        values=[df_comparison[col] for col in df_comparison.columns],
        fill_color=[colors_eco] * len(df_comparison.columns),
        align='left',
        font=dict(size=12),
        height=35
    )
)])

fig_table.update_layout(
    title="📊 Detailed Metrics Table",
    height=250
)

fig_table.show()

print("\n🌍 Ecosystem Diversity Analysis:")
print("\n🟢 Tropical Rainforest:")
print("   • HIGHEST species richness (12 species)")
print("   • HIGHEST Shannon diversity (H' ≈ 2.31)")
print("   • High evenness (J ≈ 0.93)")
print("   • Most balanced community")
print("\n🟠 Temperate Forest:")
print("   • MODERATE richness (7 species)")
print("   • Moderate diversity (H' ≈ 1.77)")
print("   • Moderate evenness (J ≈ 0.91)")
print("   • Some dominance by oak and maple")
print("\n🔵 Arctic Tundra:")
print("   • LOWEST richness (5 species)")
print("   • LOWEST diversity (H' ≈ 1.37)")
print("   • Lowest evenness of the three (J ≈ 0.85)")
print("   • Harsh environment limits diversity")
print("   • Sedges strongly dominate")

---

## 🎓 Summary

### Key Takeaways:

✅ **Diversity = Richness + Evenness**  
✅ **Shannon Index (H')**: Most popular, considers both components  
✅ **Simpson Index (D)**: Probability-based, less sensitive to rare species  
✅ **Evenness (J)**: How equally distributed species are  
✅ **Rank-abundance curves**: Visual representation of community structure  
✅ **Environmental stress**: Generally reduces diversity  
✅ **Tropical rainforests**: Among the most diverse ecosystems on Earth  

### Diversity Index Quick Reference:

| Index | Formula | Range | When to Use |
|-------|---------|-------|-------------|
| **Richness (S)** | Count species | 1 to ∞ | Simple comparison |
| **Shannon (H')** | -Σ(pᵢ ln pᵢ) | 0 to ln(S) | Standard diversity |
| **Simpson (D)** | 1 - Σ(pᵢ²) | 0 to 1 | Dominance focus |
| **Evenness (J)** | H'/ln(S) | 0 to 1 | Distribution check |

### When to Use Which Index:

**Shannon (H')**:
- General biodiversity studies
- Conservation assessments
- Comparing multiple sites
- When rare species matter

**Simpson (D)**:
- Dominance analysis
- Genetics (heterozygosity)
- When common species matter more
- Simpler calculation needed

### Real-World Applications:

#### 🌳 **Conservation**:
- Prioritize high-diversity areas
- Monitor ecosystem health
- Track biodiversity loss

#### 🌾 **Agriculture**:
- Crop diversity planning
- Pest management
- Soil health assessment

#### 🏥 **Medicine**:
- Gut microbiome diversity
- Disease resistance
- Probiotic effectiveness

### Next Notebook:

**03_ecological_succession.ipynb** - How communities change over time!

---

<div align="center">

**Made with 💚 by The Pattern Hunter Team**

[📓 Previous: Species Interactions](01_species_interactions.ipynb) | 
[📓 Next: Ecological Succession](03_ecological_succession.ipynb)

</div>