# Project DataFrames and Discovery

This notebook demonstrates the `HmsPrj` class and the DataFrames it builds for exploring the structure of a HEC-HMS project.

After initializing a project, you have access to:

| DataFrame | Description |
|-----------|-------------|
| `hms_df` | Project-level attributes from the .hms file |
| `basin_df` | Basin models with element counts and hydrologic methods |
| `met_df` | Meteorologic models with precipitation, evapotranspiration, and snowmelt methods |
| `control_df` | Control specifications with parsed time windows |
| `run_df` | Simulation runs with cross-references |
| `gage_df` | Time-series gages with DSS references |
| `pdata_df` | Paired data tables (storage-outflow, etc.) |

In [None]:
# pip install hms-commander

**For Development**: If you are working on the hms-commander source code, use the `hmscmdr_local` conda environment (editable install) instead of `pip install`.

In [None]:
from pathlib import Path
from hms_commander import init_hms_project, HmsExamples, HmsPrj

print("hms-commander loaded")

## 1. Extract Example Project

We use the `castro` example project, which has multiple basin models, runs, and gages, to exercise every DataFrame.

In [None]:
# Extract the castro example project
project_path = HmsExamples.extract_project(
    "castro",
    output_path=Path.cwd() / 'hms_example_projects' / 'castro_dataframes'
)
print(f"Extracted to: {project_path}")

## 2. Initialize Project

The `init_hms_project()` function initializes the global `hms` object and builds all DataFrames.

In [None]:
# Initialize the project
hms = init_hms_project(project_path)

# Display project summary
print(hms)

## 3. Project Attributes (hms_df)

The `hms_df` contains all project-level attributes from the `.hms` file.

In [None]:
# Display project attributes
print("Project Attributes:")
print("=" * 60)
hms.hms_df

In [None]:
# Access individual attributes
print(f"Project Name: {hms.get_project_attribute('name')}")
print(f"HMS Version: {hms.hms_version}")
print(f"Description: {hms.get_project_attribute('Description')}")
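
For intuition about where these attributes come from: an `.hms` file is plain text made of blocks of indented `Key: Value` lines, each closed by `End:`. The sketch below reads the first `Project:` block into a dictionary using only the standard library. The helper `parse_project_block` and the sample text are illustrative assumptions, not hms-commander's actual parser, and a real file will contain fields not shown here.

```python
def parse_project_block(text):
    """Collect Key: Value pairs from the first 'Project:' block (illustrative only)."""
    attrs = {}
    in_block = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Project:"):
            in_block = True
            attrs["Project"] = line.partition(":")[2].strip()
            continue
        if in_block and line == "End:":
            break  # stop at the end of the Project block
        if in_block:
            # Split on the first colon only, so values such as "03:00" stay intact
            key, sep, value = line.partition(":")
            if sep:
                attrs[key.strip()] = value.strip()
    return attrs


# Hypothetical sample content, made up for this example
sample = """Project: Demo
     Description: Example watershed
     Version: 4.9
End:

Basin: Basin 1
     Filename: Basin_1.basin
End:
"""

project_attrs = parse_project_block(sample)
print(project_attrs)
```

Splitting on the first colon only is the one design choice worth noting: values that themselves contain colons, such as times, would otherwise be truncated.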

## 4. Basin Models (basin_df)

The `basin_df` includes element counts and hydrologic methods used in each basin model.

In [None]:
# Display basin models
print("Basin Models:")
print("=" * 60)
hms.basin_df[['name', 'num_subbasins', 'num_reaches', 'num_junctions', 
              'total_area', 'loss_methods', 'transform_methods']]

In [None]:
# Computed property: total area across all basins
print(f"Total project area: {hms.total_area:.2f} sq mi")

## 5. Meteorologic Models (met_df)

The `met_df` shows precipitation, evapotranspiration, and snowmelt methods.

In [None]:
# Display met models
print("Meteorologic Models:")
print("=" * 60)
hms.met_df[['name', 'precip_method', 'et_method', 'snowmelt_method']]

## 6. Control Specifications (control_df)

The `control_df` includes parsed start/end dates and time intervals.

In [None]:
# Display control specifications
print("Control Specifications:")
print("=" * 60)
hms.control_df[['name', 'start_date', 'end_date', 'time_interval', 
                'time_interval_minutes', 'duration_hours']]
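
To make the derived columns concrete, the sketch below shows how a duration in hours and a step count could be computed from a start, an end, and an interval in minutes. It assumes dates written like `16 January 1973` and times like `03:00`; the function `summarize_window` and the sample values are hypothetical and are not taken from the library or from the `castro` project.

```python
from datetime import datetime

# Assumed layout of the combined date and time text, e.g. "16 January 1973 03:00"
DATE_FORMAT = "%d %B %Y %H:%M"


def summarize_window(start_date, start_time, end_date, end_time, interval_minutes):
    """Parse a simulation time window and derive its duration and step count."""
    start = datetime.strptime(f"{start_date} {start_time}", DATE_FORMAT)
    end = datetime.strptime(f"{end_date} {end_time}", DATE_FORMAT)
    if end <= start:
        raise ValueError("end must be after start")
    total_minutes = (end - start).total_seconds() / 60
    return {
        "start": start,
        "end": end,
        "duration_hours": total_minutes / 60,
        "n_intervals": int(total_minutes // interval_minutes),
    }


# Hypothetical two-day window with a 15-minute interval
window = summarize_window("16 January 1973", "03:00", "18 January 1973", "03:00", 15)
print(window["duration_hours"], window["n_intervals"])
```

This sketch only accepts times from `00:00` to `23:59` and month names in English; anything else would need extra handling.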

## 7. Simulation Runs (run_df)

The `run_df` links basin models, met models, and control specifications together.

In [None]:
# Display simulation runs
print("Simulation Runs:")
print("=" * 60)
hms.run_df[['name', 'basin_model', 'met_model', 'control_spec', 'dss_file']]

## 8. Time Series Gages (gage_df)

The `gage_df` contains gage information with DSS file references.

In [None]:
# Display gages
print("Time Series Gages:")
print("=" * 60)
hms.gage_df[['name', 'gage_type', 'dss_file', 'has_dss_reference']]

## 9. Paired Data Tables (pdata_df)

The `pdata_df` contains the project's paired data tables, such as storage-outflow and stage-discharge relationships.

In [None]:
# Display paired data tables
print("Paired Data Tables:")
print("=" * 60)
if not hms.pdata_df.empty:
    display(hms.pdata_df[['name', 'table_type', 'x_units', 'y_units']])
else:
    print("No paired data tables in this project.")

## 10. Accessor Methods for Component Names

The `HmsPrj` class provides convenient accessor methods for getting lists of component names. These methods are particularly useful when you need to:
- Check if a component exists before modifying run configurations
- Display available components to users
- Validate component names programmatically

In [None]:
# Display all available components using accessor methods
print("Available Components in Project:")
print("=" * 60)
print(f"Basins:        {hms.list_basin_names()}")
print(f"Met Models:    {hms.list_met_names()}")
print(f"Control Specs: {hms.list_control_names()}")
print(f"Runs:          {hms.list_run_names()}")
print(f"Gages:         {hms.list_gage_names()}")

In [None]:
# Filter gages by type
print("\nPrecipitation Gages:")
print("=" * 60)
precip_gages = hms.list_gage_names(gage_type='Precipitation')
print(f"Count: {len(precip_gages)}")
print(f"Names: {precip_gages}")

print("\nFlow Gages:")
print("=" * 60)
flow_gages = hms.list_gage_names(gage_type='Flow')
print(f"Count: {len(flow_gages)}")
print(f"Names: {flow_gages}")

### Why Accessor Methods Matter

These methods provide a **consistent API** for getting component names across all component types:

```python
# Consistent pattern for all components
hms.list_basin_names()     # Returns: ['Basin1', 'Basin2', ...]
hms.list_met_names()       # Returns: ['Met1', 'Met2', ...]  
hms.list_control_names()   # Returns: ['Control1', 'Control2', ...]
hms.list_run_names()       # Returns: ['Run1', 'Run2', ...]
hms.list_gage_names()      # Returns: ['Gage1', 'Gage2', ...]
```

**Critical for validation**: When modifying run configurations (covered in 04_run_management.ipynb), verify that each component exists before assigning it to a run. HEC-HMS silently deletes runs that reference invalid components.
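
A minimal sketch of such a check, written against plain lists so it runs without a project loaded. The helper `find_missing_components` and all component names are hypothetical; in the notebook, the available names would come from `hms.list_basin_names()`, `hms.list_met_names()`, and `hms.list_control_names()`.

```python
def find_missing_components(proposed, available):
    """Return {component_type: name} for every proposed name that is not available."""
    return {
        kind: name
        for kind, name in proposed.items()
        if name not in available.get(kind, [])
    }


# Hypothetical names standing in for the accessor method results
available = {
    "basin": ["Basin1", "Basin2"],
    "met": ["Met1"],
    "control": ["Control1"],
}
proposed = {"basin": "Basin2", "met": "Met2", "control": "Control1"}

missing = find_missing_components(proposed, available)
if missing:
    print(f"Refusing to update run, unknown components: {missing}")
else:
    print("All components exist; safe to update the run.")
```

Returning the offending names, rather than a bare boolean, lets the caller report exactly which reference would have been invalid.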

## 11. Computed Properties

`HmsPrj` also exposes properties computed from the DataFrames, including `dss_files` and `available_methods`.

In [None]:
# All DSS files referenced in the project
print("Referenced DSS Files:")
print("=" * 60)
for dss_file in hms.dss_files:
    exists = dss_file.exists()
    status = "[EXISTS]" if exists else "[NOT FOUND]"
    print(f"  {status} {dss_file.name}")
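
A `[NOT FOUND]` result can come from a path separator mismatch rather than a missing file: project files written on Windows can store references with backslashes, which `pathlib` does not treat as separators on Linux or macOS. Whether hms-commander normalizes these for you is not stated here, so the sketch below shows one way to do it by hand. The helper `resolve_reference` and the sample paths are hypothetical.

```python
from pathlib import Path, PureWindowsPath


def resolve_reference(project_dir, raw_reference):
    """Resolve a file reference that may use Windows separators."""
    # PureWindowsPath accepts both "\\" and "/" as separators on every platform
    ref = PureWindowsPath(raw_reference)
    if ref.is_absolute():
        return Path(ref.as_posix())
    # Relative references are taken to be relative to the project folder (assumption)
    return Path(project_dir).joinpath(*ref.parts)


# Hypothetical project folder and reference
resolved = resolve_reference("/tmp/demo_project", "data\\gages.dss")
print(resolved)
```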

In [None]:
# All hydrologic methods used in the project
print("Hydrologic Methods Used:")
print("=" * 60)
methods = hms.available_methods
for method_type, method_list in methods.items():
    if method_list:
        print(f"  {method_type.title()}: {', '.join(method_list)}")
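
The dictionary printed above maps a method type to a list of method names. As an illustration of how such a summary could be assembled from per-basin method lists, the sketch below merges them into sorted, de-duplicated lists. The function `collect_methods`, the record layout, and the sample method names are assumptions made for this example; the library may build `available_methods` differently.

```python
def collect_methods(basins):
    """Merge per-basin method lists into sorted, de-duplicated lists per method type."""
    merged = {}
    for basin in basins:
        for method_type, names in basin.items():
            merged.setdefault(method_type, set()).update(names)
    return {method_type: sorted(names) for method_type, names in merged.items()}


# Hypothetical per-basin records; the values are sample data only
basins = [
    {"loss": ["Initial+Constant"], "transform": ["SCS"]},
    {"loss": ["Initial+Constant", "Green and Ampt"], "transform": ["Clark"]},
]

methods = collect_methods(basins)
for method_type, method_list in methods.items():
    print(f"  {method_type.title()}: {', '.join(method_list)}")
```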

## 12. Working with Multiple Projects

You can work with multiple projects by creating separate `HmsPrj` instances and passing each one to `init_hms_project()` through the `hms_object` argument.

In [None]:
# Extract a second project
tifton_path = HmsExamples.extract_project(
    "tifton",
    output_path=Path.cwd() / 'hms_example_projects' / 'tifton_dataframes'
)

# Create separate instances
castro_prj = HmsPrj()
init_hms_project(project_path, hms_object=castro_prj)

tifton_prj = HmsPrj()
init_hms_project(tifton_path, hms_object=tifton_prj)

# Compare projects
print("Project Comparison:")
print("=" * 60)
print(f"Castro:  {castro_prj.total_area:.2f} sq mi, {len(castro_prj.basin_df)} basins")
print(f"Tifton:  {tifton_prj.total_area:.2f} sq mi, {len(tifton_prj.basin_df)} basins")

## Summary

The `HmsPrj` class provides:

| Feature | Description |
|---------|-------------|
| **7 DataFrames** | Comprehensive views of all project components |
| **Accessor methods** | `list_basin_names()`, `list_met_names()`, `list_control_names()`, etc. |
| **Automatic parsing** | Dates, intervals, methods extracted from HMS files |
| **Computed properties** | `total_area`, `dss_files`, `available_methods` |
| **Cross-reference validation** | Links between runs, basin models, met models, and control specifications |
| **Multi-project support** | Separate `HmsPrj` instances for each project |

## Next Steps

- **03_file_ops_basin_met_control_gage.ipynb**: Work with individual HMS files
- **04_run_management.ipynb**: Configure and validate simulation runs
- **05_clone_workflow.ipynb**: Non-destructive model modifications