# PyCIAT Advanced Features Tutorial
## Soil Integration and Advanced Simulation Features

This notebook demonstrates how to use PyCIAT's advanced features, including:
- Soil data integration
- HYDRUS-1D coupling
- Biotic stress simulation
- Soil carbon modeling
- Machine learning surrogates

In [None]:
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Add project root to path if needed
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
if project_root not in sys.path:
    sys.path.insert(0, project_root)

from src import load_config
from src.soil_processing import load_soil_data, process_soil_profiles
from src.advanced_modules import biotic_stress, soil_carbon
from src.surrogate_model.feature_engineering import engineer_features
from src.surrogate_model.model_selection import train_surrogate_model

## 1. Load Configuration and Setup

In [None]:
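# Load config (resolved against project_root so the cell also works
# when the notebook is run from a subdirectory)
config = load_config(os.path.join(project_root, 'config', 'config.yaml'))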

# Set paths
soil_shapefile = config['paths']['soil_shapefile']
soil_profiles = config['paths']['soil_profiles']
experimental_data = config['paths']['experimental_data']
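
The cells in this notebook read several nested keys from the config, so a missing section only surfaces as a `KeyError` partway through. The following is a minimal sketch of an up-front check. The helper `find_missing_keys` and the stand-in `example_config` (including its file names) are hypothetical and not part of PyCIAT; the key paths listed are the ones this notebook reads.

```python
def find_missing_keys(config, required_paths):
    """Return the dotted key paths from required_paths that are absent in config."""
    missing = []
    for dotted in required_paths:
        node = config
        for key in dotted.split('.'):
            if not isinstance(node, dict) or key not in node:
                missing.append(dotted)
                break
            node = node[key]
    return missing


# Key paths read by the cells in this notebook
REQUIRED_KEYS = [
    'paths.soil_shapefile',
    'paths.soil_profiles',
    'paths.experimental_data',
    'paths.analysis_output_dir',
    'simulation.biotic_stress_parameters',
    'simulation.soil_carbon_parameters',
    'surrogate_model.targets',
]

# Hypothetical stand-in for a loaded config, with sections deliberately left out
example_config = {
    'paths': {'soil_shapefile': 'soil.shp', 'soil_profiles': 'profiles.csv'},
    'simulation': {'soil_carbon_parameters': {}},
}

missing = find_missing_keys(example_config, REQUIRED_KEYS)
print("Missing config keys:", missing)
```

In the notebook itself, the same call could be made on the real `config` object right after loading, assuming it behaves like a nested dictionary.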

## 2. Soil Data Integration

This section shows how to load and process soil data for integration with crop models.

In [None]:
# Load soil data
soil_df = load_soil_data(soil_shapefile)

# Process soil profiles
soil_profiles_df = process_soil_profiles(soil_profiles)

# Display summary
print("\nSoil Data Summary:")
print(f"Number of soil units: {len(soil_df)}")
print(f"Number of soil profiles: {len(soil_profiles_df)}")

# Plot soil texture distribution
plt.figure(figsize=(10, 6))
plt.scatter(soil_profiles_df['clay_pct'], soil_profiles_df['sand_pct'], alpha=0.6)
plt.xlabel('Clay %')
plt.ylabel('Sand %')
plt.title('Soil Texture Distribution')
plt.grid(True)
plt.show()
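
Before plotting texture data it can help to check that the fractions are physically plausible. The sketch below derives silt as the remainder of clay and sand and flags rows where the two exceed 100 %. It assumes `clay_pct` and `sand_pct` are percentages of the same fine-earth fraction, which the source does not state; `add_silt_pct` and its `tolerance` parameter are hypothetical, not PyCIAT functions.

```python
def add_silt_pct(profiles, tolerance=0.5):
    """Derive silt % as 100 - clay - sand and flag implausible rows.

    profiles: iterable of (clay_pct, sand_pct) pairs.
    tolerance: allowed rounding overshoot (in percentage points) of clay + sand above 100.
    """
    checked = []
    for clay, sand in profiles:
        silt = 100.0 - clay - sand
        valid = clay >= 0 and sand >= 0 and silt >= -tolerance
        checked.append({
            'clay_pct': clay,
            'sand_pct': sand,
            'silt_pct': max(silt, 0.0),  # clip small negative remainders to zero
            'valid': valid,
        })
    return checked


# Illustrative values only: the second row has clay + sand > 100
rows = add_silt_pct([(20.0, 40.0), (35.0, 70.0)])
for row in rows:
    print(row)
```

With a DataFrame such as `soil_profiles_df`, the same check could be applied to `zip(df['clay_pct'], df['sand_pct'])`.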

## 3. Advanced Simulation Features

### 3.1 Biotic Stress Simulation

In [None]:
# Biotic stress parameters from the config (loaded for reference;
# the example below passes explicit values instead)
stress_config = config['simulation']['biotic_stress_parameters']

# Example stress event simulation
stress_impact = biotic_stress.simulate_stress_impact(
    target_variable='yield',
    stress_timing=60,  # Days after sowing
    stress_duration=30,
    stress_intensity=0.15
)

# Plot stress impact
plt.figure(figsize=(10, 6))
plt.plot(range(len(stress_impact)), stress_impact)  # x-axis follows the returned profile length
plt.xlabel('Days After Sowing')
plt.ylabel('Stress Factor')
plt.title('Biotic Stress Impact Profile')
plt.grid(True)
plt.show()
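
To make the plotted "stress factor" concrete, here is a minimal sketch of one possible profile shape: a step function that multiplies the target variable by `1 - stress_intensity` during the stress window and by 1.0 otherwise. This is an illustrative assumption, not the behaviour of `biotic_stress.simulate_stress_impact`, whose internals are not shown here; `step_stress_profile` and `n_days` are hypothetical names.

```python
def step_stress_profile(n_days, stress_timing, stress_duration, stress_intensity):
    """Daily multiplier: 1.0 without stress, 1 - intensity inside the stress window."""
    if not 0.0 <= stress_intensity <= 1.0:
        raise ValueError('stress_intensity must be between 0 and 1')
    end = stress_timing + stress_duration  # first day after the stress window
    return [
        1.0 - stress_intensity if stress_timing <= day < end else 1.0
        for day in range(n_days)
    ]


# Same timing, duration and intensity as the cell above, over a 120-day season
profile = step_stress_profile(120, stress_timing=60, stress_duration=30, stress_intensity=0.15)
stressed_days = sum(1 for factor in profile if factor < 1.0)
print(f"Stressed days: {stressed_days}, minimum factor: {min(profile):.2f}")
```

A real pest or disease response would more likely ramp up and recover gradually; the step shape is only the simplest case to reason about.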

### 3.2 Soil Carbon Modeling

In [None]:
# Initialize soil carbon model
soc_config = config['simulation']['soil_carbon_parameters']
initial_soc = soc_config['initial_soc_ton_ha']
residue_incorporation = soc_config['residue_incorporation_pct']

# Simulate carbon dynamics
years = np.arange(2025, 2051)
soc_trajectory = soil_carbon.simulate_soc_dynamics(
    initial_soc=initial_soc,
    residue_incorporation_rate=residue_incorporation / 100,  # percent -> fraction
    years=len(years)
)

# Plot SOC trajectory
plt.figure(figsize=(10, 6))
plt.plot(years, soc_trajectory)
plt.xlabel('Year')
plt.ylabel('Soil Organic Carbon (t/ha)')
plt.title('Soil Carbon Dynamics Under Management')
plt.grid(True)
plt.show()
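
For intuition about what an SOC trajectory of this kind looks like, the sketch below implements a generic single-pool, first-order model: each year the pool gains a fixed carbon input and loses a fixed fraction through decomposition, so it approaches the equilibrium `annual_c_input / decay_rate`. This is a textbook simplification and an assumption on my part, not the model behind `soil_carbon.simulate_soc_dynamics`. The function name, the parameters `annual_c_input` and `decay_rate`, and all numeric values are hypothetical and uncalibrated.

```python
def simulate_soc_first_order(initial_soc, annual_c_input, decay_rate, years):
    """Return SOC (t/ha) for each of `years` annual steps, starting at initial_soc.

    Update rule: soc_next = soc + annual_c_input - decay_rate * soc
    """
    trajectory = [initial_soc]
    for _ in range(years - 1):
        soc = trajectory[-1]
        trajectory.append(soc + annual_c_input - decay_rate * soc)
    return trajectory


# Illustrative, uncalibrated values; 26 steps matches the 2025-2050 range above
trajectory = simulate_soc_first_order(
    initial_soc=40.0,    # t/ha
    annual_c_input=1.0,  # t/ha/yr reaching the soil pool (assumed)
    decay_rate=0.02,     # fraction of the pool lost per year (assumed)
    years=26,
)
equilibrium = 1.0 / 0.02  # annual_c_input / decay_rate
print(f"Final SOC: {trajectory[-1]:.2f} t/ha (equilibrium {equilibrium:.1f} t/ha)")
```

How a residue incorporation rate translates into an annual carbon input depends on residue biomass and carbon content, which this sketch leaves out.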

## 4. Machine Learning Integration

This section shows how to train surrogate models on simulation results for rapid impact assessment.

In [None]:
# Load simulation results
results_file = os.path.join(
    config['paths']['analysis_output_dir'],
    'combined_results_std_vars.parquet'
)
df_results = pd.read_parquet(results_file)

# Engineer features
df_engineered, feature_list = engineer_features(df_results, config)

# Train surrogate model
trained_models = train_surrogate_model(
    df_engineered[feature_list],
    df_engineered[config['surrogate_model']['targets']],
    feature_list,
    config['surrogate_model']['targets'],
    config
)

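# Display feature importance
# Assumes 'yield' is one of the configured targets and that its model
# exposes feature_importances_ (as tree-based estimators do)
importances = pd.DataFrame({
    'feature': feature_list,
    'importance': trained_models['yield'].feature_importances_
}).sort_values('importance', ascending=False)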

plt.figure(figsize=(10, 6))
sns.barplot(data=importances.head(10), x='importance', y='feature')
plt.title('Top 10 Feature Importances for Yield Prediction')
plt.show()
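
The importance plot relies on the trained model having a `feature_importances_` attribute, which scikit-learn style tree ensembles provide but linear models do not. A minimal sketch of a more defensive ranking helper follows. `rank_importances`, the dummy model class, and the feature names are hypothetical and used only for illustration; the fallback to absolute `coef_` values assumes a single-target linear model with a 1-D coefficient array.

```python
def rank_importances(model, feature_names, top_n=10):
    """Return up to top_n (feature, score) pairs sorted by descending score."""
    if hasattr(model, 'feature_importances_'):
        scores = list(model.feature_importances_)
    elif hasattr(model, 'coef_'):
        # Fallback for linear models: coefficient magnitude (1-D coef_ only)
        scores = [abs(c) for c in model.coef_]
    else:
        raise AttributeError('model exposes neither feature_importances_ nor coef_')
    if len(scores) != len(feature_names):
        raise ValueError('number of scores does not match number of features')
    ranked = sorted(zip(feature_names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


class DummyTreeModel:
    """Stand-in for a fitted tree-based estimator (illustration only)."""
    feature_importances_ = [0.1, 0.6, 0.3]


# Hypothetical feature names
ranked = rank_importances(DummyTreeModel(), ['tmax_mean', 'precip_total', 'clay_pct'])
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Coefficient magnitudes are only comparable across features when the inputs are on a common scale, so the fallback is a rough guide rather than a substitute for tree-based importances.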

## 5. Summary and Next Steps

This notebook demonstrated PyCIAT's advanced features, including:
- Soil data integration and processing
- Biotic stress simulation capabilities
- Soil carbon modeling
- Machine learning surrogate model development

These features enable more comprehensive climate impact assessments by considering:
- Soil-specific responses
- Pest/disease impacts
- Carbon sequestration potential
- Rapid scenario exploration

For more details, refer to the documentation and other example notebooks.