# Example 04: Complete HMS Workflow

This notebook demonstrates an end-to-end HEC-HMS workflow using hms-commander:

1. **Project Initialization** - Load and explore HMS project structure
2. **DataFrame Exploration** - Access basin, subbasin, run, gage, and control data
3. **Run Configuration Management** - Modify run metadata and components with validation
4. **DSS File Operations** - Read input and result time series
5. **Results Analysis** - Extract and compare computed vs observed results
6. **Cross-Reference Queries** - Use run configurations to access related data

In [None]:
# Setup - use development version of hms-commander
import sys
from pathlib import Path

current_dir = Path.cwd()
parent_dir = current_dir.parent
if str(parent_dir) not in sys.path:
    sys.path.insert(0, str(parent_dir))

from hms_commander import init_hms_project, HmsExamples
from hms_commander.dss import DssCore
import pandas as pd
import matplotlib.pyplot as plt

## 1. Project Initialization

Extract the bundled `castro` example project and initialize it.

In [None]:
# Extract castro project
project_path = HmsExamples.extract_project(
    "castro",
    output_path=Path.cwd() / 'example_projects' / 'castro'
)
print(f"Project extracted to: {project_path}")

# Initialize the project
hms = init_hms_project(project_path)

# Display project summary
print(f"\n{hms}")

## 2. Project Attributes (hms_df)

The `hms_df` DataFrame contains project-level metadata parsed from the `.hms` file.

In [None]:
# View project attributes
hms.hms_df

## 3. Basin Models (basin_df)

Summary of basin models with element counts and hydrologic methods used.

In [None]:
# View basin models
hms.basin_df[['name', 'num_subbasins', 'num_reaches', 'num_junctions', 
              'total_area', 'loss_methods', 'transform_methods']]

## 4. Detailed Subbasin Parameters (subbasin_df)

The `subbasin_df` provides detailed hydrologic parameters for each subbasin - useful for sensitivity analysis and calibration.

In [None]:
# View all subbasins
cols = ['name', 'basin_model', 'area', 'loss_method', 'transform_method', 
        'initial_deficit', 'snyder_tp', 'snyder_cp']
available_cols = [c for c in cols if c in hms.subbasin_df.columns]
hms.subbasin_df[available_cols]

In [None]:
# Filter subbasins for a specific basin model
hms.get_subbasin_entries(basin_name='Castro 1')
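
As a rough illustration of how a parameter table like this might feed a sensitivity analysis, the sketch below scales one parameter by a set of factors and builds one row per subbasin and factor. It uses a small hand-made DataFrame rather than `hms.subbasin_df`; the values, the `perturb_parameter` helper, and the scaling factors are all hypothetical, and only the column names come from the cell above.

```python
import pandas as pd

# Hypothetical stand-in for hms.subbasin_df; values are illustrative only.
subbasins = pd.DataFrame({
    "name": ["Subbasin-1", "Subbasin-2"],
    "basin_model": ["Castro 1", "Castro 1"],
    "initial_deficit": [0.5, 1.0],
})


def perturb_parameter(df, column, factors):
    """Return one row per (subbasin, factor) with the scaled parameter value."""
    frames = []
    for factor in factors:
        scaled = df[["name", column]].copy()
        scaled["factor"] = factor
        scaled[f"{column}_scaled"] = scaled[column] * factor
        frames.append(scaled)
    return pd.concat(frames, ignore_index=True)


# Hypothetical scaling factors: half, unchanged, and one and a half times.
scenarios = perturb_parameter(subbasins, "initial_deficit", [0.5, 1.0, 1.5])
print(scenarios)
```

Each row of `scenarios` could then drive one model run in a sensitivity study; how the scaled values are written back to the basin file is outside the scope of this sketch.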

## 5. Simulation Runs (run_df)

The `run_df` shows all simulation runs with their basin, met, and control references.

In [None]:
# View simulation runs
hms.run_df[['name', 'basin_model', 'met_model', 'control_spec', 'dss_file']]

In [None]:
# Get complete configuration for a specific run
config = hms.get_run_configuration('Current')
print("Run Configuration:")
for key, value in config.items():
    print(f"  {key}: {value}")
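
The cross-references in `run_df` can also be resolved with an ordinary pandas join. The sketch below merges a hand-made run table with a hand-made control table on the control specification name. The rows are invented for illustration, and only the column names are taken from the `run_df` and `control_df` cells in this notebook.

```python
import pandas as pd

# Hypothetical stand-ins for hms.run_df and hms.control_df.
runs = pd.DataFrame({
    "name": ["Current", "Future"],
    "basin_model": ["Castro 1", "Castro 2"],
    "control_spec": ["Jan73", "Jan73"],
})
controls = pd.DataFrame({
    "name": ["Jan73"],
    "time_interval": [15],
})

# A left join keeps every run even if its control spec is missing,
# so dangling references show up as "left_only" in the _merge column.
resolved = runs.merge(
    controls.rename(columns={"name": "control_spec"}),
    on="control_spec",
    how="left",
    indicator=True,
)
dangling = resolved[resolved["_merge"] == "left_only"]

print(resolved[["name", "basin_model", "control_spec", "time_interval"]])
print(f"Runs with missing control specs: {len(dangling)}")
```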

## 6. Run Configuration Management (NEW)

**Phase 1 Feature**: Modify run configurations programmatically with built-in validation.

### Why This Matters

⚠️ **CRITICAL**: HEC-HMS automatically **deletes runs** with invalid component references when opening a project. 

The new `HmsRun` set methods prevent this by validating that components exist before modifying the run file.

### Viewing Available Components

Before modifying runs, check what components are available:

In [None]:
from hms_commander import HmsRun

# Display available components
print("Available Components:")
print("=" * 60)
print(f"Basins:        {hms.list_basin_names()}")
print(f"Met Models:    {hms.list_met_names()}")
print(f"Control Specs: {hms.list_control_names()}")
print(f"Runs:          {hms.list_run_names()}")

### Modifying Run Metadata

Update run description, log file, and DSS output file:

In [None]:
# Get the first run name
run_name = hms.list_run_names()[0]
print(f"Modifying run: '{run_name}'")
print("=" * 60)

# Set run description
HmsRun.set_description(
    run_name=run_name,
    description="Example workflow - demonstrating Phase 1 run management",
    hms_object=hms
)
print("[OK] Updated run description")

# Set log file
HmsRun.set_log_file(
    run_name=run_name,
    log_file="example_workflow.log",
    hms_object=hms
)
print("[OK] Updated log file")

# Set DSS output file
HmsRun.set_dss_file(
    run_name=run_name,
    dss_file="example_workflow_output.dss",
    hms_object=hms
)
print("[OK] Updated DSS output file")

### Modifying Run Components (with Validation)

Update which basin, met model, or control spec a run uses. These methods **validate** that the component exists!

In [None]:
# Get available basin models
basin_names = hms.list_basin_names()
print(f"Available basins: {basin_names}")

if basin_names:
    # Set basin model (validates it exists first!)
    try:
        HmsRun.set_basin(
            run_name=run_name,
            basin_model=basin_names[0],
            hms_object=hms
        )
        print(f"[OK] Set basin to '{basin_names[0]}'")
    except ValueError as e:
        print(f"[ERROR] {e}")

### Demonstrating Validation Protection

Try to set an invalid component - validation will catch it:

In [None]:
# Try to set a non-existent basin (will fail safely)
try:
    HmsRun.set_basin(
        run_name=run_name,
        basin_model="NonExistentBasin",
        hms_object=hms
    )
    print("[ERROR] Should have failed validation!")
except ValueError as e:
    print("[OK] Validation prevented invalid configuration:")
    print(f"     {str(e)[:100]}...")
    print("")
    print("THIS IS CRITICAL! Without validation, HMS would silently delete")
    print("this run when you next open the project!")
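
The validate-before-write pattern demonstrated above can be sketched in plain Python. This is not hms-commander's implementation; `set_component` and the in-memory `run` dictionary are hypothetical and only illustrate why checking a name against the known components before writing keeps a configuration consistent.

```python
def set_component(run, key, value, available):
    """Assign run[key] = value only if value is a known component name."""
    if value not in available:
        raise ValueError(
            f"{key} '{value}' not found. Available: {sorted(available)}"
        )
    run[key] = value
    return run


# Hypothetical run record and component list.
run = {"name": "Current", "basin_model": "Castro 1"}
basins = ["Castro 1", "Castro 2"]

# A known basin name is accepted and written.
set_component(run, "basin_model", "Castro 2", basins)

# An unknown name raises before anything is written, so run is unchanged.
try:
    set_component(run, "basin_model", "NonExistentBasin", basins)
except ValueError as err:
    message = str(err)

print(run["basin_model"])
print(message)
```

Because the check runs before the assignment, a failed call leaves the record exactly as it was, which is the property that protects a run from referencing a component that does not exist.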

### Verify Changes

Reinitialize the project to see the updated run configuration:

In [None]:
# Reinitialize to refresh DataFrames
from hms_commander import init_hms_project
init_hms_project(hms.project_folder, hms_object=hms)

# Display updated run configuration
print(f"Updated Run Configuration for '{run_name}':")
print("=" * 60)
config = HmsRun.get_dss_config(run_name, hms_object=hms)
for key, value in config.items():
    print(f"  {key:20s}: {value}")

## 7. Time Series Gages (gage_df)

The `gage_df` contains observed data references with DSS file paths.

In [None]:
# View gages
hms.gage_df[['name', 'gage_type', 'dss_file', 'dss_pathname', 'has_dss_reference']]

In [None]:
# Get flow gages only
flow_gages = hms.list_gage_names(gage_type='Flow')
print(f"Flow gages: {flow_gages}")

## 8. DSS File Operations

Use the standalone `DssCore` class to read DSS files. This works independently of ras-commander.

In [None]:
# Get the result DSS file for the 'Current' run
result_dss = hms.get_run_dss_file('Current')
print(f"Result DSS file: {result_dss}")
print(f"File exists: {result_dss.exists() if result_dss else False}")

In [None]:
# Get DSS catalog (requires pyjnius and Java)
if result_dss and result_dss.exists() and DssCore.is_available():
    catalog = DssCore.get_catalog(result_dss)
    print(f"Found {len(catalog)} paths in DSS file")
    
    # Show first 10 paths
    print("\nSample paths:")
    for path in catalog[:10]:
        print(f"  {path}")
else:
    print("DSS not available or file not found")

In [None]:
# Filter catalog for flow results
if result_dss and result_dss.exists() and DssCore.is_available():
    flow_paths = DssCore.filter_catalog(catalog, data_type='FLOW')
    print(f"Flow paths: {len(flow_paths)}")
    for path in flow_paths:
        print(f"  {path}")
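
HEC-DSS pathnames have six slash-delimited parts, conventionally labelled A through F, with part C holding the parameter name (for example `FLOW`). Matching on part C exactly avoids the false positives a substring test can produce, such as `FLOW-OBSERVED` matching a search for `FLOW`. The sketch below is plain Python and does not use `DssCore`; the `split_dss_path` helper and the sample pathnames are hypothetical.

```python
def split_dss_path(pathname):
    """Split '/A/B/C/D/E/F/' into a dict keyed by part letter."""
    fields = pathname.split("/")
    # A well-formed pathname starts and ends with '/', giving 8 fields
    # with empty strings at both ends.
    if len(fields) != 8 or fields[0] or fields[-1]:
        raise ValueError(f"Not a six-part DSS pathname: {pathname!r}")
    return dict(zip("ABCDEF", fields[1:7]))


# Hypothetical catalog entries, for illustration only.
sample_catalog = [
    "//OUTLET/FLOW/01JAN1973/15MIN/RUN:CURRENT/",
    "//OUTLET/FLOW-OBSERVED/01JAN1973/15MIN/RUN:CURRENT/",
    "//SUBBASIN-1/PRECIP-INC/01JAN1973/15MIN/RUN:CURRENT/",
]

# Exact match on part C, rather than a substring test on the whole path.
flow_only = [p for p in sample_catalog if split_dss_path(p)["C"] == "FLOW"]
print(flow_only)
```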

## 9. Reading Time Series Data

Read and visualize flow results from the DSS file.

In [None]:
# Read computed flow at outlet
if result_dss and result_dss.exists() and DssCore.is_available():
    # Find the outlet flow path
    outlet_paths = [p for p in catalog if 'OUTLET' in p and 'FLOW' in p and 'OBSERVED' not in p]
    
    if outlet_paths:
        computed_path = outlet_paths[0]
        print(f"Reading: {computed_path}")
        
        df_computed = DssCore.read_timeseries(result_dss, computed_path)
        print(f"Shape: {df_computed.shape}")
        print(f"Units: {df_computed.attrs.get('units', 'N/A')}")
        print(f"\n{df_computed.head()}")

In [None]:
# Read observed flow for comparison
if result_dss and result_dss.exists() and DssCore.is_available():
    observed_paths = [p for p in catalog if 'OUTLET' in p and 'FLOW-OBSERVED' in p]
    
    if observed_paths:
        observed_path = observed_paths[0]
        print(f"Reading: {observed_path}")
        
        df_observed = DssCore.read_timeseries(result_dss, observed_path)
        print(f"Shape: {df_observed.shape}")
        print(f"\n{df_observed.head()}")

## 10. Compare Computed vs Observed

Plot and analyze the comparison between computed and observed hydrographs.

In [None]:
# Plot comparison (runs only if both series were read in the cells above)
if 'df_computed' in globals() and 'df_observed' in globals():
    # Use the units stored with the series; fall back to CFS if absent
    units = df_computed.attrs.get('units', 'CFS')

    fig, ax = plt.subplots(figsize=(12, 6))

    ax.plot(df_observed.index, df_observed['value'], 'b-', label='Observed', linewidth=2)
    ax.plot(df_computed.index, df_computed['value'], 'r--', label='Computed', linewidth=2)

    ax.set_xlabel('Time')
    ax.set_ylabel(f'Flow ({units})')
    ax.set_title('Castro Valley Outlet - Computed vs Observed Flow')
    ax.legend()
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()

    # Calculate peak-flow statistics
    observed_peak = df_observed['value'].max()
    computed_peak = df_computed['value'].max()
    print("\nStatistics:")
    print(f"  Observed Peak: {observed_peak:.1f} {units}")
    print(f"  Computed Peak: {computed_peak:.1f} {units}")
    print(f"  Peak Difference: {computed_peak - observed_peak:.1f} {units}")
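
Beyond peak flow, calibration work commonly reports the Nash-Sutcliffe efficiency (NSE) and the timing of the peak. The sketch below computes both with pandas on two short synthetic series. The numbers are invented, and the `value` column name is assumed to match the DataFrames read above. The series are aligned on their shared timestamps first, since computed and observed records do not always cover identical times.

```python
import pandas as pd

# Synthetic hourly series, for illustration only.
index = pd.date_range("2000-01-01 00:00", periods=5, freq=pd.Timedelta(hours=1))
observed = pd.DataFrame({"value": [10.0, 20.0, 40.0, 30.0, 10.0]}, index=index)
computed = pd.DataFrame({"value": [10.0, 15.0, 30.0, 40.0, 15.0]}, index=index)

# Keep only timestamps present in both series and drop any gaps.
paired = observed.join(
    computed, how="inner", lsuffix="_obs", rsuffix="_sim"
).dropna()
obs, sim = paired["value_obs"], paired["value_sim"]

# NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2); 1.0 is a perfect fit.
nse = 1.0 - ((obs - sim) ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

# Positive lag means the computed peak arrives after the observed peak.
peak_lag = sim.idxmax() - obs.idxmax()

print(f"NSE: {nse:.3f}")
print(f"Peak timing error: {peak_lag}")
```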

## 11. Control Specifications

View simulation time windows with parsed dates.

In [None]:
# View control specifications
hms.control_df[['name', 'start_date', 'end_date', 'time_interval', 'duration_hours']]
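
A duration column like `duration_hours` can be derived from the start and end timestamps. The sketch below shows that arithmetic on a hand-made table. The dates and the minutes-based `time_interval` are invented, and it is an assumption, not something this notebook confirms, that hms-commander computes the column this way.

```python
import pandas as pd

# Hypothetical control specification; values are illustrative only.
controls = pd.DataFrame({
    "name": ["Example"],
    "start_date": pd.to_datetime(["1973-01-16 03:00"]),
    "end_date": pd.to_datetime(["1973-01-17 03:00"]),
    "time_interval": [15],  # minutes (assumed unit)
})

# Elapsed time in seconds between start and end.
span_seconds = (controls["end_date"] - controls["start_date"]).dt.total_seconds()

controls["duration_hours"] = span_seconds / 3600

# Number of time steps, counting both the first and the last timestamp.
controls["num_steps"] = (
    span_seconds // (controls["time_interval"] * 60)
).astype(int) + 1

print(controls[["name", "duration_hours", "num_steps"]])
```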

## 12. Computed Properties

Access aggregated project information.

In [None]:
# Total area across all basins
print(f"Total project area: {hms.total_area:.2f} sq mi")

# All DSS files
print(f"\nDSS files referenced: {len(hms.dss_files)}")
for dss_file in hms.dss_files:
    exists = dss_file.exists()
    print(f"  [{'+' if exists else '-'}] {dss_file.name}")

# Hydrologic methods used
print("\nHydrologic methods used:")
for method_type, methods in hms.available_methods.items():
    if methods:
        print(f"  {method_type.title()}: {', '.join(methods)}")

## Summary

This notebook demonstrated the key hms-commander capabilities:

| Feature | Description |
|---------|-------------|
| `init_hms_project()` | Initialize HMS project with dataframes |
| `hms_df` | Project-level attributes |
| `basin_df` | Basin model summary with methods |
| `subbasin_df` | Detailed subbasin parameters |
| `run_df` | Simulation runs with cross-references |
| `gage_df` | Observed data gages with DSS references |
| `control_df` | Control specs with parsed dates |
| `DssCore` | Standalone DSS file reading |
| `get_run_configuration()` | Complete run details |
| `get_run_dss_file()` | Result file path lookup |
| `HmsRun` set methods | Validated changes to run metadata and components |