# Agroforestry Plot Generator

Before modelling climate risks, we first define **stylised, “typical” agroforestry plots** for the crops/regions of interest.  
These serve as **reference systems** for all subsequent analyses in this notebook.

The plots represent **archetypes**, not individual farms.  
They allow us to:
- Quantify shade-providing canopy and its vulnerability to hazards.
- Link risk estimates to tangible, management-relevant agroforestry configurations.

We base them on both literature and expert guidance, depending on the crop and region.

---


In [5]:
from config import DATA_DIR
import pandas as pd
import numpy as np
import requests
import time
import math
import re
from typing import Optional, Dict, List

# 🌴 Dominican Republic Agroforestry Plot Generator

Generates **stylised agroforestry plots** for coffee and cacao systems in the Dominican Republic, based on **internal guidance from CEDAE/Centro Naturaleza (2023)** and simplified ecological assumptions.  
Outputs a `pandas.DataFrame` (or Excel workbook) describing the main crop and associated shade trees, including densities, shading, and basic yields.  

---

## 📚 Data Sources & Assumptions

- **Core reference**: Internal guidance document from CEDAE/Centro Naturaleza (2023), prepared in response to UNU consultant questions.  
- Reflects expert knowledge of biodiversity-based agroforestry systems in DR: typical species composition, densities, shading, and yields.  
- Not publicly available.  
- Locations are random points sampled within a bounding box around the Cibao Noroeste region (10 per system) and reused across all scenario sheets.

---

## 🌳 Systems Overview

**Coffee (upper watershed)**  
- Main crop: *Coffea arabica*  
- Permanent shade: *Inga spp.* (Guama)  
- Fruit trees: avocado, citrus, sapote, breadfruit  

**Cacao (middle watershed)**  
- Main crop: *Theobroma cacao*  
- Permanent shade: *Gliricidia sepium* (Piñón cubano)  
- Fruit trees: avocado, citrus, sapote, breadfruit  

---

## 📏 Plot & Species Parameters

| System   | Plot size (ha) | Shade trees/ha | Yield (t/ha/year) |
|----------|----------------|----------------|-------------------|
| Coffee   | 1.0            | ~144           | ~0.73             |
| Cacao    | 1.0            | ~144           | ~0.50             |

**Shading ranges:**  
- *Inga spp.*: 30–60%  
- *Gliricidia sepium*: 40–60%  
- Fruit trees: 30–50%  

Shading is randomised within these ranges for each generated plot.

---

## 🔀 Alternatives Overview

Each Excel workbook contains a **baseline plot** (`present`) plus several **alternatives**:

- **Baseline**
  - `present`: A typical coffee or cacao agroforestry system with randomly allocated shade trees.

- **Shading adjustments**
  - `shade_-20`: Shade tree densities reduced by 20%  
  - `shade_-10`: Shade tree densities reduced by 10%  
  - `shade_-5`: Shade tree densities reduced by 5%  
  - `shade_+5`: Shade tree densities increased by 5%  
  - `shade_+10`: Shade tree densities increased by 10%  
  - `shade_+20`: Shade tree densities increased by 20%  

  ➡️ These variations allow exploring how **more or less canopy cover** changes the agroforestry system.

- **Crop switch**
  - `coffee_to_cacao`: Coffee plots re-generated as cacao plots at the same location  
  - `cacao_to_coffee`: Cacao plots re-generated as coffee plots at the same location  

  ➡️ These alternatives simulate a **land-use switch** between the two main agroforestry systems.
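
As a rough illustration of the shading adjustments, the sketch below scales shade-tree densities on a toy plot table. The table values and the helper name `scale_shade_trees` are made up for this example; only the scenario names and factors come from the list above.

```python
import pandas as pd

# Toy plot: one main-crop row (no density) and two shade-tree rows.
# The densities are illustrative only, not values from the guidance document.
plot = pd.DataFrame({
    "Species": ["Coffee (main crop)", "Guama", "Avocado"],
    "Plants/ha": [float("nan"), 100.0, 44.0],
})

# Scenario name -> multiplicative factor applied to shade-tree density.
FACTORS = {
    "shade_-20": 0.80, "shade_-10": 0.90, "shade_-5": 0.95,
    "shade_+5": 1.05, "shade_+10": 1.10, "shade_+20": 1.20,
}

def scale_shade_trees(df: pd.DataFrame, factor: float) -> pd.DataFrame:
    """Return a copy with shade-tree densities scaled; the main-crop row is untouched."""
    out = df.copy()
    is_shade = ~out["Species"].str.endswith("(main crop)")
    out.loc[is_shade, "Plants/ha"] = (out.loc[is_shade, "Plants/ha"] * factor).round()
    return out

scenarios = {name: scale_shade_trees(plot, f) for name, f in FACTORS.items()}
print(scenarios["shade_-20"])
```

Because the factor is applied per species and then rounded, small species counts can change by less than the nominal percentage.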

---


## 📝 Sampling point

Instead of fixing each system to a single location, we generate multiple random points 
within the target region (e.g. Cibao Noroeste in the Dominican Republic).  
This ensures that the synthetic agroforestry plots:  

- capture some of the **geographic variability** within the region,  
- can be linked to realistic **altitude and climate conditions**,  
- and allow direct comparison across scenarios at several representative sites, 
  rather than relying on just one arbitrary coordinate.  

The sampled points are then used consistently across all scenario sheets for each system.
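
A minimal sketch of this sampling step, assuming points are drawn uniformly inside a rectangular bounding box. The box values match the `DR_BBOX` constant defined in the code cell of this notebook; the function name `sample_points_seeded` and the seeded `numpy` generator are illustrative choices, not the notebook's own implementation.

```python
import numpy as np

# (lon_min, lat_min, lon_max, lat_max), same values as DR_BBOX in this notebook.
BBOX = (-71.8, 19.0, -70.9, 20.1)

def sample_points_seeded(n, bbox=BBOX, seed=42):
    """Draw n (lat, lon) pairs uniformly inside bbox, reproducibly."""
    lon_min, lat_min, lon_max, lat_max = bbox
    rng = np.random.default_rng(seed)  # local generator, global RNG state untouched
    lats = rng.uniform(lat_min, lat_max, size=n)
    lons = rng.uniform(lon_min, lon_max, size=n)
    return list(zip(lats.tolist(), lons.tolist()))

# Sample once, then reuse the same list for the baseline and every alternative.
points = sample_points_seeded(10)
```

Uniform sampling in a rectangle can land outside the region or offshore; filtering against a region polygon would be a natural extension but is not shown here.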


## 🛠️ Technical Notes

- Shade tree counts are allocated randomly across typical species using a Dirichlet distribution, so composition varies between plots.  
- Sensitivity alternatives (`shade_±…`) are applied by scaling the number of shade trees, while the main crop remains unchanged.  
- Crop switch alternatives fully regenerate the plot with the other system’s composition.  
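
Rounding each Dirichlet share independently can leave the counts a tree or two away from the target total. The sketch below is one possible variant (not the helper used in the code cell) that draws Dirichlet proportions and then applies largest-remainder rounding, so the counts always sum to the target. The function name and the seed handling are assumptions for illustration.

```python
import numpy as np

def dirichlet_counts_exact(total, n_species, seed=None):
    """Split `total` trees across n_species with random shares, summing exactly to total."""
    if total <= 0 or n_species <= 0:
        return np.zeros(max(n_species, 0), dtype=int)
    rng = np.random.default_rng(seed)
    raw = rng.dirichlet(np.ones(n_species)) * total  # fractional tree counts
    counts = np.floor(raw).astype(int)
    shortfall = int(total - counts.sum())            # trees lost by flooring
    # Hand the missing trees to the species with the largest fractional parts.
    order = np.argsort(-(raw - counts))
    counts[order[:shortfall]] += 1
    return counts

counts = dirichlet_counts_exact(144, 6, seed=0)
print(counts, counts.sum())
```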

---

## 📝 Output Table

| Column | Description |
|--------|-------------|
| Region | Always "DR - Cibao Noroeste" |
| System | Coffee / Cacao |
| Plot size (ha) | Typical plot size |
| Species | Common name |
| Scientific name | Botanical name |
| Plants/ha | Density (adjusted in sensitivity runs) |
| Yield (t/ha/year) | Main crop yield only |
| Per-tree shading (%) | Randomised within plausible range |
| Latitude / Longitude | Coordinates of the sampled point |
| Altitude (m) | Elevation from the Open-Elevation API (empty if the lookup fails) |

---


## ⚠️ Disclaimer

This is a **stylised representation of typical agroforestry plots**, not a farm census.  
Values reflect average conditions and expert input from CEDAE, with added randomisation to capture variability.  


In [6]:
# === All-in-one cell for DR (Coffee & Cacao, multiple points) ===
from pathlib import Path
from datetime import datetime
from typing import Optional, List, Dict
import numpy as np
import pandas as pd
import requests

# --- helper for shade tree distribution ---
def dirichlet_counts(total: int, n: int) -> np.ndarray:
    if total <= 0 or n <= 0:
        return np.zeros(n, dtype=int)
    props = np.random.dirichlet(np.ones(n))
    return np.round(props * total).astype(int)

# --- constants ---
MAIN_CROP_SCI = {
    "Coffee": "Coffea arabica",
    "Cacao":  "Theobroma cacao",
}

DR_BBOX = (-71.8, 19.0, -70.9, 20.1)  # lon_min, lat_min, lon_max, lat_max

DR_TYPICAL_YIELD = {"Coffee": 0.73, "Cacao": 0.50}

DR_SPECIES = {
    "Coffee": [
        {"Species":"Guama",          "Scientific name":"Inga spp.",          "Shade range":(30,60)},
        {"Species":"Bitter orange",  "Scientific name":"Citrus aurantium",   "Shade range":(30,50)},
        {"Species":"Sweet orange",   "Scientific name":"Citrus sinensis",    "Shade range":(30,50)},
        {"Species":"Sapote",         "Scientific name":"Pouteria sapota",    "Shade range":(30,50)},
        {"Species":"Breadfruit",     "Scientific name":"Artocarpus altilis", "Shade range":(30,50)},
        {"Species":"Avocado",        "Scientific name":"Persea americana",   "Shade range":(30,50)},
    ],
    "Cacao": [
        {"Species":"Gliricidia (Piñón cubano)", "Scientific name":"Gliricidia sepium", "Shade range":(40,60)},
        {"Species":"Bitter orange",  "Scientific name":"Citrus aurantium",   "Shade range":(30,50)},
        {"Species":"Sweet orange",   "Scientific name":"Citrus sinensis",    "Shade range":(30,50)},
        {"Species":"Sapote",         "Scientific name":"Pouteria sapota",    "Shade range":(30,50)},
        {"Species":"Breadfruit",     "Scientific name":"Artocarpus altilis", "Shade range":(30,50)},
        {"Species":"Avocado",        "Scientific name":"Persea americana",   "Shade range":(30,50)},
    ],
}

# --- altitude lookup via Open-Elevation ---
def get_elevation(lat: float, lon: float) -> Optional[float]:
    try:
        url = "https://api.open-elevation.com/api/v1/lookup"
        r = requests.get(url, params={"locations": f"{lat},{lon}"}, timeout=10)
        if r.ok:
            return r.json()["results"][0]["elevation"]
    except Exception:
        return None
    return None

# --- generator for one plot ---
def generate_dr_plot(system: str, lat: float, lon: float,
                     *, seed: Optional[int] = None,
                     total_shade_trees: int = 144) -> pd.DataFrame:
    np.random.seed(seed)
    sys = system.capitalize()
    if sys not in ("Coffee", "Cacao"):
        raise ValueError("system must be one of: Coffee, Cacao")

    plot_size = 1.0
    fixed_yield = DR_TYPICAL_YIELD[sys]

    rows: List[Dict] = [{
        "Region": "DR - Cibao Noroeste",
        "System": sys,
        "Plot size (ha)": plot_size,
        "Species": f"{sys} (main crop)",
        "Scientific name": MAIN_CROP_SCI[sys],
        "Plants/ha": None,
        "Yield (t/ha/year)": round(fixed_yield, 3),
        "Per-tree shading (%)": None,
        "Latitude": lat,
        "Longitude": lon,
        "Altitude (m)": get_elevation(lat, lon),
    }]

    if total_shade_trees > 0:
        species = DR_SPECIES[sys]
        counts = dirichlet_counts(total_shade_trees, len(species))
        for sp, n in zip(species, counts):
            shade = np.random.uniform(*sp["Shade range"]) if sp["Shade range"] else None
            rows.append({
                "Region": "DR - Cibao Noroeste",
                "System": sys,
                "Plot size (ha)": plot_size,
                "Species": sp["Species"],
                "Scientific name": sp["Scientific name"],
                "Plants/ha": int(n),
                "Yield (t/ha/year)": None,
                "Per-tree shading (%)": (round(shade, 1) if shade is not None else None),
                "Latitude": lat,
                "Longitude": lon,
                "Altitude (m)": rows[0]["Altitude (m)"],
            })

    return pd.DataFrame(rows)

# --- adjust shading by scaling shade tree counts ---
def adjust_shading(df: pd.DataFrame, factor: float) -> pd.DataFrame:
    df2 = df.copy()
    main_label = f"{df['System'].iloc[0]} (main crop)"
    mask = df2["Species"] != main_label
    df2.loc[mask, "Plants/ha"] = (df2.loc[mask, "Plants/ha"] * factor).round().astype(int)
    return df2

# --- sample 10 points within bbox ---
def sample_points(n: int = 10) -> List[tuple]:
    lon_min, lat_min, lon_max, lat_max = DR_BBOX
    return [(np.random.uniform(lat_min, lat_max),
             np.random.uniform(lon_min, lon_max)) for _ in range(n)]

# --- write outputs ---
output_dir = Path("agroforestry_systems")
output_dir.mkdir(exist_ok=True)
date_tag = datetime.now().strftime("%Y%m%d")

factors = {
    "shade_-20": 0.8,
    "shade_-10": 0.9,
    "shade_-5":  0.95,
    "shade_+5":  1.05,
    "shade_+10": 1.10,
    "shade_+20": 1.20,
}

for sys in ("Coffee", "Cacao"):
    points = sample_points(10)  # 10 locations
    present_all = pd.concat([
        generate_dr_plot(sys, lat, lon, seed=i) for i, (lat, lon) in enumerate(points)
    ], ignore_index=True)

    # crop switch → same coords, other crop
    target = "Cacao" if sys == "Coffee" else "Coffee"
    switched_all = pd.concat([
        generate_dr_plot(target, lat, lon, seed=100+i) for i, (lat, lon) in enumerate(points)
    ], ignore_index=True)

    # save workbook
    output_file = output_dir / f"{date_tag}__{sys}__DR_Cibao_Noroeste.xlsx"
    with pd.ExcelWriter(output_file) as w:
        present_all.to_excel(w, sheet_name="present", index=False)
        for name, factor in factors.items():
            adjust_shading(present_all, factor).to_excel(w, sheet_name=name, index=False)
        switched_all.to_excel(w, sheet_name=f"{sys.lower()}_to_{target.lower()}", index=False)

    print(f"✅ Saved {sys}: {output_file}")


✅ Saved Coffee: agroforestry_systems/20250819__Coffee__DR_Cibao_Noroeste.xlsx
✅ Saved Cacao: agroforestry_systems/20250819__Cacao__DR_Cibao_Noroeste.xlsx


# 🇬🇹 Guatemala — Coffee & Cacao Agroforestry Plot Generators

These scripts generate **stylised agroforestry plots** for coffee (*Coffea arabica*) in the Western Highlands and cacao (*Theobroma cacao*) in Alta Verapaz, Guatemala, using published field studies as baselines.  
Outputs are `pandas.DataFrame` objects describing the main crop and associated shade-tree species, with densities, shading values, and location metadata.

---

## 📚 Data Sources

### Coffee (Western Highlands)
- **Rice, R. A. (2008; 2011)** — surveys of 153 coffee farmers in Guatemala’s Western Highlands.
  - Typical farm: ~3.5 ha; elevation: 900–1,400 m; shade cover: ~30–50%
  - Yield: ~0.78 t/ha/year
  - Shade tree density: ~184 trees/ha
  - *Inga spp.* dominant (~60% of shade trees), plus timber and minor fruit species

### Cacao (Alta Verapaz)
- **Villanueva-González, C. E., et al. (2023)**. *Botanical diversity, structure and composition in cocoa agroforest systems in Alta Verapaz, Guatemala.* Scientia Agropecuaria, 14(2), 223–234.  
  [https://doi.org/10.17268/sci.agropecu.2023.020](https://doi.org/10.17268/sci.agropecu.2023.020)
  - Based on 70 cacao agroforestry plots across 4 municipalities
  - Reports species richness, tree density, and structural traits

---

## 🌳 Plot Composition

### Coffee — Western Highlands
| Species | Scientific name | Target density (trees/ha) | Per-tree shading (%) |
|---|---|---:|---:|
| Guama | *Inga spp.* | 111 | 45–60 |
| Cedar | *Cedrela odorata* | 30 | 30–40 |
| Oak | *Quercus spp.* | 20 | 30–40 |
| Walnut | *Juglans spp.* | 15 | 30–40 |
| Banana | *Musa spp.* | 10 | 20–30 |
| Avocado | *Persea americana* | 8 | 30–40 |
| Citrus | *Citrus spp.* | 6 | 30–40 |
| Pacaya | *Chamaedorea spp.* | 5 | 20–30 |
| Hog plum | *Spondias mombin* | 5 | 20–30 |

---

### Cacao — Alta Verapaz
| Species | Scientific name | Shade range (%) |
|---|---|---:|
| Gliricidia | *Gliricidia sepium* | 40–60 |
| Mahogany | *Swietenia macrophylla* | 50–70 |
| Cedar | *Cedrela odorata* | 40–60 |
| Copal | *Protium copal* | 30–50 |
| Cordia | *Cordia alliodora* | 40–60 |
| Inga | *Inga sapindoides* | 50–70 |
| Avocado | *Persea americana* | 40–60 |
| Sapote | *Pouteria sapota* | 40–60 |
| Mango | *Mangifera indica* | 50–70 |

---

## 📏 Core Assumptions

| Parameter           | Coffee (WH)       | Cacao (Alta Verapaz) |
|---------------------|-------------------|----------------------|
| Plot size           | 3.5 ha            | 1.0 ha               |
| Yield               | 0.78 t/ha/year    | 0.50 t/ha/year       |
| Shade tree density  | ~184 trees/ha     | 151 trees/ha         |
| Elevation range     | 900–1,400 m       | 132–500 m            |
| Lat/lon range       | given point       | 15.4–15.8° N / –90.3 to –89.8° |

---

## 🛠️ Generator Behaviour

- **Location:**  
  - Both generators take a **given point** (lat, lon).  
  - The code cell uses one fixed reference point per system: (15.0, -91.0) for coffee and (15.6, -90.2) for cacao.
- **Shade densities:**  
  - Coffee: each species is assigned its target density from the composition table.  
  - Cacao: total shade trees = 151, split across species by fixed weights with largest-remainder rounding.
- **Per-tree shading:**  
  - Set to the midpoint of each species-specific range.
- **Elevation:**  
  - Not looked up for Guatemala; the elevation ranges under Core Assumptions are literature context only.
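
A small worked sketch of the largest-remainder split that the Guatemala cacao code cell applies to its 151 shade trees. The weights are copied from the `w` values in `GTM_AV_SPECIES`; the helper name `apportion` is hypothetical and written here in plain Python for readability.

```python
import math

# Weights copied from GTM_AV_SPECIES in the Guatemala code cell.
WEIGHTS = {
    "Gliricidia": 0.25, "Mahogany": 0.12, "Cedar": 0.10, "Copal": 0.08,
    "Cordia": 0.10, "Inga": 0.15, "Avocado": 0.08, "Sapote": 0.06, "Mango": 0.06,
}

def apportion(total, weights):
    """Largest-remainder split of `total` into integer counts proportional to weights."""
    scale = total / sum(weights.values())
    raw = {name: w * scale for name, w in weights.items()}
    counts = {name: math.floor(v) for name, v in raw.items()}
    leftover = total - sum(counts.values())
    # Give one extra tree to the species with the largest fractional parts.
    ranked = sorted(raw, key=lambda name: raw[name] - counts[name], reverse=True)
    for name in ranked[:leftover]:
        counts[name] += 1
    return counts

counts = apportion(151, WEIGHTS)
print(counts)
```

With these weights, flooring allocates 149 trees; the two leftover trees go to Gliricidia and Inga (fractional parts 0.75 and 0.65), giving 38 and 23 trees respectively.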

---


## 📄 Output Columns

Region / System / Plot size (ha) / Species / Scientific name /
Plants/ha / Yield (t/ha/year) / Per-tree shading (%) /
Latitude / Longitude


---

## ⚠️ Notes

- These are **stylised representations** from literature, **not** actual farm censuses.
- Randomisation ensures plausible heterogeneity between generated plots.
- Intended for **scenario building and modelling**, not exact inventories.

---

## 📑 Suggested Citations

Rice, R. A. (2011). *Agroforestry Systems*, 83(1), 41–49. https://doi.org/10.1007/s10457-011-9385-4  
Rice, R. A. (2008). *Agricultural intensification within agroforestry: Shade coffee in Central America.* University of California.  
Villanueva-González, C. E., et al. (2023). *Botanical diversity, structure and composition in cocoa agroforest systems in Alta Verapaz, Guatemala.* Scientia Agropecuaria, 14(2), 223–234.


In [7]:
# === Guatemala Coffee & Cacao Plots with Alternatives ===
from pathlib import Path
from datetime import datetime
import numpy as np
import pandas as pd
from config import DATA_DIR
# -------------------------------
# 1) Config
# -------------------------------
output_dir = Path("agroforestry_systems")  # defined here so the cell also runs on its own
output_dir.mkdir(exist_ok=True)
date_tag = datetime.now().strftime("%Y%m%d")

MAIN_CROP_SCI = {
    "Coffee": "Coffea arabica",
    "Cacao":  "Theobroma cacao",
}

# -------------------------------
# 2) Guatemala Coffee (Western Highlands)
# -------------------------------
GTM_WH_REGION     = "GTM - Western Highlands"
GTM_WH_PLOT_SIZE  = 3.5
GTM_WH_YIELD      = 0.78
GTM_WH_SPECIES = [
    {"Species": "Guama",     "Scientific name": "Inga spp.",        "Density/ha": 111, "Shade range": (45, 60)},
    {"Species": "Cedar",     "Scientific name": "Cedrela odorata",  "Density/ha":  30, "Shade range": (30, 40)},
    {"Species": "Oak",       "Scientific name": "Quercus spp.",     "Density/ha":  20, "Shade range": (30, 40)},
    {"Species": "Walnut",    "Scientific name": "Juglans spp.",     "Density/ha":  15, "Shade range": (30, 40)},
    {"Species": "Banana",    "Scientific name": "Musa paradisiaca", "Density/ha":  10, "Shade range": (20, 30)},
    {"Species": "Avocado",   "Scientific name": "Persea americana", "Density/ha":   8, "Shade range": (30, 40)},
    {"Species": "Citrus",    "Scientific name": "Citrus sinensis",  "Density/ha":   6, "Shade range": (30, 40)},
    {"Species": "Pacaya",    "Scientific name": "Chamaedorea spp.", "Density/ha":   5, "Shade range": (20, 30)},
    {"Species": "Hog plum",  "Scientific name": "Spondias mombin",  "Density/ha":   5, "Shade range": (20, 30)},
]

def generate_coffee_plot(lat, lon):
    rows = [{
        "Region": GTM_WH_REGION, "System": "Coffee", "Plot size (ha)": GTM_WH_PLOT_SIZE,
        "Species": "Coffee (main crop)", "Scientific name": MAIN_CROP_SCI["Coffee"],
        "Plants/ha": None, "Yield (t/ha/year)": GTM_WH_YIELD,
        "Per-tree shading (%)": None, "Latitude": lat, "Longitude": lon
    }]
    shade_mid = lambda a, b: round((a+b)/2, 1)
    for sp in GTM_WH_SPECIES:
        rows.append({
            "Region": GTM_WH_REGION, "System": "Coffee", "Plot size (ha)": GTM_WH_PLOT_SIZE,
            "Species": sp["Species"], "Scientific name": sp["Scientific name"],
            "Plants/ha": sp["Density/ha"], "Yield (t/ha/year)": None,
            "Per-tree shading (%)": shade_mid(*sp["Shade range"]),
            "Latitude": lat, "Longitude": lon
        })
    return pd.DataFrame(rows)

# -------------------------------
# 3) Guatemala Cacao (Alta Verapaz)
# -------------------------------
GTM_AV_REGION   = "GTM - Alta Verapaz"
GTM_CACAO_PLOT  = 1.0
GTM_CACAO_YIELD = 0.50
TOTAL_TREES     = 151
GTM_AV_SPECIES = [
    {"Species": "Gliricidia", "Scientific name": "Gliricidia sepium", "Shade range": (40,60), "w":0.25},
    {"Species": "Mahogany",   "Scientific name": "Swietenia macrophylla", "Shade range": (50,70), "w":0.12},
    {"Species": "Cedar",      "Scientific name": "Cedrela odorata", "Shade range": (40,60), "w":0.10},
    {"Species": "Copal",      "Scientific name": "Protium copal",   "Shade range": (30,50), "w":0.08},
    {"Species": "Cordia",     "Scientific name": "Cordia alliodora","Shade range": (40,60), "w":0.10},
    {"Species": "Inga",       "Scientific name": "Inga sapindoides","Shade range": (50,70), "w":0.15},
    {"Species": "Avocado",    "Scientific name": "Persea americana","Shade range": (40,60), "w":0.08},
    {"Species": "Sapote",     "Scientific name": "Pouteria sapota","Shade range": (40,60), "w":0.06},
    {"Species": "Mango",      "Scientific name": "Mangifera indica","Shade range": (50,70), "w":0.06},
]

def _det_counts(total, weights):
    arr = np.array(weights) / sum(weights)
    raw = arr * total
    floored = np.floor(raw).astype(int)
    remainder = total - floored.sum()
    frac_idx = np.argsort(-(raw - floored))
    for k in range(remainder):
        floored[frac_idx[k]] += 1
    return floored.tolist()

def generate_cacao_plot(lat, lon):
    rows = [{
        "Region": GTM_AV_REGION, "System": "Cacao", "Plot size (ha)": GTM_CACAO_PLOT,
        "Species": "Cacao (main crop)", "Scientific name": MAIN_CROP_SCI["Cacao"],
        "Plants/ha": None, "Yield (t/ha/year)": GTM_CACAO_YIELD,
        "Per-tree shading (%)": None, "Latitude": lat, "Longitude": lon
    }]
    shade_mid = lambda a, b: round((a+b)/2, 1)
    weights = [sp["w"] for sp in GTM_AV_SPECIES]
    counts = _det_counts(TOTAL_TREES, weights)
    for sp, n in zip(GTM_AV_SPECIES, counts):
        rows.append({
            "Region": GTM_AV_REGION, "System": "Cacao", "Plot size (ha)": GTM_CACAO_PLOT,
            "Species": sp["Species"], "Scientific name": sp["Scientific name"],
            "Plants/ha": n, "Yield (t/ha/year)": None,
            "Per-tree shading (%)": shade_mid(*sp["Shade range"]),
            "Latitude": lat, "Longitude": lon
        })
    return pd.DataFrame(rows)

# -------------------------------
# 4) Shading adjustment
# -------------------------------
def adjust_shading(df, factor):
    df2 = df.copy()
    main_label = f"{df['System'].iloc[0]} (main crop)"
    mask = df2["Species"] != main_label
    df2.loc[mask, "Plants/ha"] = (df2.loc[mask, "Plants/ha"] * factor).round().astype(int)
    return df2

factors = {
    "shade_-20": 0.8,
    "shade_-10": 0.9,
    "shade_-5": 0.95,
    "shade_+5": 1.05,
    "shade_+10": 1.10,
    "shade_+20": 1.20,
}

# -------------------------------
# 5) Generate & Save
# -------------------------------
for sys in ("Coffee", "Cacao"):
    lat, lon = (15.0, -91.0) if sys == "Coffee" else (15.6, -90.2)
    present = generate_coffee_plot(lat, lon) if sys=="Coffee" else generate_cacao_plot(lat, lon)

    # Crop swap
    swapped = generate_cacao_plot(lat, lon) if sys=="Coffee" else generate_coffee_plot(lat, lon)
    swap_name = f"{sys.lower()}_to_{'cacao' if sys=='Coffee' else 'coffee'}"

    # Save workbook
    output_file = output_dir / f"{date_tag}__{sys}__Guatemala.xlsx"
    with pd.ExcelWriter(output_file) as w:
        present.to_excel(w, sheet_name="present", index=False)
        for name, factor in factors.items():
            adjust_shading(present, factor).to_excel(w, sheet_name=name, index=False)
        swapped.to_excel(w, sheet_name=swap_name, index=False)

    print(f"✅ Saved {sys}: {output_file}")


✅ Saved Coffee: agroforestry_systems/20250819__Coffee__Guatemala.xlsx
✅ Saved Cacao: agroforestry_systems/20250819__Cacao__Guatemala.xlsx


# 🌳 Veracruz Agroforestry Coffee Plot Generator

This script simulates **1-hectare coffee agroforestry plots** in the **Veracruz region of Mexico**, based on the systems described in:

> *"Tree species diversity and vegetation structure in shade coffee farms in Veracruz, Mexico"*  
> López‐Gómez, A. M., Williams‐Linera, G., & Manson, R. H. (2008). Agriculture, Ecosystems & Environment, 124(3–4), 160–172.  
> [https://doi.org/10.1016/j.agee.2007.09.008](https://doi.org/10.1016/j.agee.2007.09.008)

---  
It samples **10 random plot locations** per system (DP, SP, SM) within a bounding box and elevation range, 
and creates an Excel file describing tree composition, shading, and management type.

---

## Systems
- **SM** – Shaded monoculture (simple system): lower tree density, low basal area, low mean height  
- **SP** – Simple polyculture (semi-diverse system): medium density, medium basal area, medium mean height  
- **DP** – Diverse polyculture (diversified system): higher density, high basal area, high mean height  

---

## Main crop
- Default: **Coffee (Coffea arabica)**  
- Scenario: switch to **Cacao (Theobroma cacao)**  

---

## Shade trees
Species lists are based on literature for Veracruz systems.  
For each plot:
- Species counts distributed stochastically (Dirichlet-based)  
- Per-tree shading values sampled within species-specific ranges  

---

## Scenarios generated
Each system produces 10 plots, which are concatenated and written to one **Excel sheet** per scenario:

- `DP_present` – baseline diversified plots  
- `DP_shade_-20`, `DP_shade_-10`, `DP_shade_-5` – shade reduction scenarios  
- `DP_shade_+5`, `DP_shade_+10`, `DP_shade_+20` – shade increase scenarios  
- `DP_switch_to_cacao` – main crop replaced by cacao  
- `SP_present` – semi-diverse system  
- `SM_present` – simple system  

---

## Output
- Excel file written to: `agroforestry_systems/<YYYYMMDD>__Veracruz_systems.xlsx`  
- Each sheet contains:
  - Region, System, Plot size  
  - Species + Scientific name  
  - Plants/ha, Yield  
  - Per-tree shading (%)  
  - Latitude, Longitude, Elevation  
  - Basal area, Mean height  
  - Plot_ID (e.g. `DP_1`, `SP_5`)  

---

## Notes
- Elevation retrieved via the **Open-Elevation API**  
- Locations sampled randomly within Veracruz coffee belt (900–1500 m asl)  
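- Tree counts are rounded to integers per species, so plot totals can differ slightly from the system's target density  
- Shade adjustments are applied as absolute deltas (percentage points added to per-tree shading) and clipped to [0, 100]; tree counts are left unchanged  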
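
A minimal sketch of the absolute-delta adjustment, using made-up shading values. The function name `shift_shading` is illustrative and not part of the notebook.

```python
import pandas as pd

def shift_shading(shading: pd.Series, delta_pct: float) -> pd.Series:
    """Add delta_pct percentage points to each value and keep results within 0-100."""
    return (shading + delta_pct).clip(lower=0, upper=100)

# Illustrative per-tree shading values (%), not taken from the literature.
shading = pd.Series([12.0, 45.5, 90.0])

print(shift_shading(shading, +20).tolist())  # [32.0, 65.5, 100.0]
print(shift_shading(shading, -20).tolist())  # [0.0, 25.5, 70.0]
```

Values near the bounds saturate, so a `+20` and a `-20` scenario are not symmetric for trees that start close to 0% or 100%.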


In [None]:
# =========================================
# Veracruz coffee systems (DP, SP, SM with 10 points each)
# =========================================
from pathlib import Path
from datetime import datetime
import numpy as np
import pandas as pd
import requests, time

# --- helpers ---
def _get_elevation(lat, lon, timeout=5):
    try:
        r = requests.get("https://api.open-elevation.com/api/v1/lookup",
                         params={"locations": f"{lat},{lon}"}, timeout=timeout)
        if r.status_code == 200:
            return float(r.json()["results"][0]["elevation"])
    except Exception:
        pass
    return None

def _sample_valid_location(bbox, elev_range, max_attempts=30, sleep_sec=0.3):
    for _ in range(max_attempts):
        lat = float(np.random.uniform(*bbox["lat"]))
        lon = float(np.random.uniform(*bbox["lon"]))
        elev = _get_elevation(lat, lon)
        if elev is not None and elev_range[0] <= elev <= elev_range[1]:
            return lat, lon, elev
        time.sleep(sleep_sec)
    raise RuntimeError("Could not find valid location.")

# --- definitions ---
VERA_BBOX = {"lat": (19.15, 19.45), "lon": (-96.97, -96.47)}
VERA_ELEV_RANGE = (900, 1500)  # m

VERA_SYSTEM_INFO = {
    "SM": {"plot_size_ha": 1.0, "tree_density": 80,  "basal_area": "low",    "mean_height": "low"},
    "SP": {"plot_size_ha": 1.0, "tree_density": 100, "basal_area": "medium", "mean_height": "medium"},
    "DP": {"plot_size_ha": 1.0, "tree_density": 130, "basal_area": "high",   "mean_height": "high"},
}

VERA_SPECIES = {
    "SM": [
        {"Species": "Inga vera",        "Scientific name": "Inga vera",        "Shade range": (40, 60)},
        {"Species": "Mimosa scabrella", "Scientific name": "Mimosa scabrella", "Shade range": (30, 50)},
    ],
    "SP": [
        {"Species": "Inga vera",        "Scientific name": "Inga vera",        "Shade range": (40, 60)},
        {"Species": "Cedrela odorata",  "Scientific name": "Cedrela odorata",  "Shade range": (30, 50)},
        {"Species": "Persea americana", "Scientific name": "Persea americana", "Shade range": (30, 50)},
        {"Species": "Citrus sinensis",  "Scientific name": "Citrus sinensis",  "Shade range": (25, 40)},
    ],
    "DP": [
        {"Species": "Inga vera",           "Scientific name": "Inga vera",           "Shade range": (40, 60)},
        {"Species": "Cedrela odorata",     "Scientific name": "Cedrela odorata",     "Shade range": (30, 50)},
        {"Species": "Quercus sapotifolia", "Scientific name": "Quercus sapotifolia", "Shade range": (35, 55)},
        {"Species": "Enterolobium cyclocarpum","Scientific name":"Enterolobium cyclocarpum","Shade range": (30, 50)},
        {"Species": "Persea americana",    "Scientific name": "Persea americana",    "Shade range": (30, 50)},
        {"Species": "Mangifera indica",    "Scientific name": "Mangifera indica",    "Shade range": (25, 45)},
    ],
}

def build_veracruz_plot(system, lat, lon, elev, seed=None):
    np.random.seed(seed)
    info = VERA_SYSTEM_INFO[system]
    sp_list = VERA_SPECIES[system]

    rows = []
    # coffee row
    rows.append({
        "Region": "MEX - Veracruz",
        "System": system,
        "Plot size (ha)": info["plot_size_ha"],
        "Species": "Coffee (main crop)",
        "Scientific name": "Coffea arabica",
        "Plants/ha": None,
        "Yield (t/ha/year)": 0.55,
        "Per-tree shading (%)": None,
        "Latitude": lat,
        "Longitude": lon,
        "Elevation (m)": elev,
        "Basal area": info["basal_area"],
        "Mean height": info["mean_height"],
    })

    # shade trees
    proportions = np.random.dirichlet(np.ones(len(sp_list)))
    counts = np.round(proportions * info["tree_density"]).astype(int)
    for sp, n in zip(sp_list, counts):
        shade = float(np.random.uniform(*sp["Shade range"]))
        rows.append({
            "Region": "MEX - Veracruz",
            "System": system,
            "Plot size (ha)": info["plot_size_ha"],
            "Species": sp["Species"],
            "Scientific name": sp["Scientific name"],
            "Plants/ha": int(n),
            "Yield (t/ha/year)": None,
            "Per-tree shading (%)": round(shade, 1),
            "Latitude": lat,
            "Longitude": lon,
            "Elevation (m)": elev,
            "Basal area": info["basal_area"],
            "Mean height": info["mean_height"],
        })
    return pd.DataFrame(rows)

def adjust_shade(df, delta_pct):
    df2 = df.copy()
    mask = df2["Species"] != "Coffee (main crop)"
    df2.loc[mask, "Per-tree shading (%)"] = (
        df2.loc[mask, "Per-tree shading (%)"] + delta_pct
    ).clip(lower=0, upper=100)
    return df2

def switch_to_cacao(df):
    df2 = df.copy()
    df2.loc[df2["Species"] == "Coffee (main crop)", ["Species","Scientific name"]] = ["Cacao (main crop)", "Theobroma cacao"]
    return df2

# --- generate & save ---
date_tag = datetime.now().strftime("%Y%m%d")
output_dir = Path("agroforestry_systems")  # defined here so the cell also runs on its own
output_dir.mkdir(exist_ok=True, parents=True)
output_file = output_dir / f"{date_tag}__Veracruz_systems.xlsx"

shade_deltas = [-20, -10, -5, +5, +10, +20]

with pd.ExcelWriter(output_file) as w:
    for sys in ("DP", "SP", "SM"):
        # 10 plots for each system
        all_plots = []
        for i in range(10):
            lat, lon, elev = _sample_valid_location(VERA_BBOX, VERA_ELEV_RANGE)
            df = build_veracruz_plot(sys, lat, lon, elev, seed=i)
            df["Plot_ID"] = f"{sys}_{i+1}"
            all_plots.append(df)
        present = pd.concat(all_plots, ignore_index=True)

        if sys == "DP":
            # DP baseline
            present.to_excel(w, sheet_name="DP_present", index=False)
            # Shade deltas
            for d in shade_deltas:
                adjust_shade(present, d).to_excel(w, sheet_name=f"DP_shade_{d:+d}", index=False)
            # Cacao switch
            switch_to_cacao(present).to_excel(w, sheet_name="DP_switch_to_cacao", index=False)
        else:
            present.to_excel(w, sheet_name=f"{sys}_present", index=False)

print(f"✅ Saved Veracruz systems with 10 plots each: {output_file}")
