# New Core User Interface Demonstration

## Overview of Available Features

This notebook demonstrates the capabilities of the `new_core` user interface for the climakitae library. The interface provides a fluent API for accessing climate data with the following features:

### Available Processors:
- `concatenate`: Concatenate datasets along specified dimensions
- `filter_unbiased_models`: Remove or include unbiased models from the dataset
- `update_attributes`: Update dataset attributes

### Key Capabilities:
- **Fluent Interface**: Chain method calls for readable query building
- **Progressive Exploration**: Discover available options as you build queries
- **Error Handling**: Clear feedback when queries fail or have invalid parameters
- **Flexible Data Access**: Support for multiple catalogs and data types
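
As a rough illustration of the fluent pattern, the sketch below uses a hypothetical `ToyQuery` class (not part of climakitae) in which each setter stores a value and returns the object itself, which is what makes chaining possible. How `ClimateData` implements this internally is not shown in this notebook, so treat the class and its internals as assumptions.

```python
class ToyQuery:
    """Hypothetical stand-in for a fluent query builder (not climakitae code)."""

    def __init__(self):
        self._query = {}

    def _set(self, key, value):
        # Store the parameter, then return self so calls can be chained.
        self._query[key] = value
        return self

    def catalog(self, name):
        return self._set("catalog", name)

    def installation(self, name):
        return self._set("installation", name)

    def variable(self, name):
        return self._set("variable", name)

    def show_query(self):
        # Return a copy so callers cannot mutate the builder's state.
        return dict(self._query)


toy = ToyQuery().catalog("renewables").installation("pv_utility").variable("cf")
print(toy.show_query())
```

Because every setter returns the same object, the chain reads top to bottom like the queries used later in this notebook.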

### Demonstration Structure:
1. **Data Exploration** - Initialize the interface and explore available options
2. **Basic Data Retrieval** - Retrieve data from the `renewables` and `data` catalogs
3. **Working with Processors** - Demonstrate the `concatenate` and `filter_unbiased_models` processors
4. **Error Handling and Debugging** - Show what happens when things go wrong
5. **Best Practices and Tips** - Tips for effective usage

---

In [None]:
# Import necessary libraries
import matplotlib.pyplot as plt

# Import the new_core user interface
from climakitae.new_core.user_interface import ClimateData

# Initialize the interface
cd = ClimateData()

## 1. Data Exploration

The interface provides several methods to explore available data options. Let's start with a comprehensive overview:

In [None]:
# Get a comprehensive overview of all available options
cd.show_all_options()

In [None]:
# Explore options step by step
print("=== Available Catalogs ===")
cd.show_catalog_options()

print("\n=== Choose 'renewables' catalog and explore installations ===")
renewables_explorer = cd.catalog("renewables")
renewables_explorer.show_installation_options()

print("\n=== Choose 'pv_utility' installation and explore variables ===")
pv_explorer = renewables_explorer.installation("pv_utility")
pv_explorer.show_variable_options()

## 2. Basic Data Retrieval

Now let's retrieve some actual climate data using the fluent interface pattern:

In [None]:
# Simple data retrieval example
print("Retrieving PV utility capacity factor data...")
basic_data = (cd
    .catalog("renewables")
    .installation("pv_utility")
    .experiment_id("historical")
    .table_id("day")
    .grid_label("d03")
    .variable("cf")
    .get()
)

print(f"Data retrieved successfully: {basic_data is not None}")
if basic_data is not None:
    print(f"Data shape:\n {basic_data.dims}")
    print(f"Variables:\n {list(basic_data.data_vars)}")
    print(f"Coordinates:\n {list(basic_data.coords)}")
    print(f"Data size:\n {basic_data.nbytes / (1024**2):.1f} MB")
    print(basic_data)

In [None]:
# Retrieve data from the climate data catalog
print("Retrieving temperature data from climate catalog...")
climate_data = (cd
    .catalog("data")
    .activity_id("WRF")
    .experiment_id("historical")
    .table_id("mon")
    .grid_label("d01")
    .variable("t2max")
    .get()
)

print(f"Climate data retrieved successfully: {climate_data is not None}")
if climate_data is not None:
    print(f"Data shape: {climate_data.dims}")
    print(f"Time range: {climate_data.time.min().values} to {climate_data.time.max().values}")
    print(f"Simulations: {list(climate_data.sim.values)}")
    print(climate_data)

## 3. Working with Processors

The new_core interface includes three processors that can be applied to your data:

### Available Processors:
1. **`concatenate`** - Concatenate datasets along a specified dimension. By default it concatenates along "time", using a historical+ssp approach.
2. **`filter_unbiased_models`** - Remove or include unbiased models (default: "yes", which removes them)
3. **`update_attributes`** - Update dataset attributes. Not intended to be called by users; it runs in the background to keep a transparent record of how the data was manipulated.

Let's demonstrate the two user-facing processors:

In [None]:
# Processor 1: Concatenate
# The following overrides the default behavior of the concatenate processor
# and concatenates along the simulation dimension instead of the default
# "time" dimension.
print("=== Demonstrating 'concatenate' processor ===")
concat_data = (cd
    .catalog("renewables")
    .installation("pv_utility")
    .experiment_id("historical")
    .table_id("day")
    .grid_label("d03")
    .variable("cf")
    .processes({
        "concatenate": "sim"  # Concatenate along simulation dimension
    })
    .get()
)

print(f"Concatenated data retrieved: {concat_data is not None}")
if concat_data is not None:
    print(f"Data shape: {concat_data.dims}")
    print(f"Simulation dimension: {concat_data.sim.values}")
    print(f"Number of simulations: {len(concat_data.sim)}")
    print(concat_data)

In [None]:
# Processor 2: Filter unbiased models
print("=== Demonstrating 'filter_unbiased_models' processor ===")

# Default behavior (filter_unbiased_models = "yes")
print("Default behavior - filtering unbiased models:")
default_filtered = (cd
    .catalog("data")
    .activity_id("WRF")
    .table_id("mon")
    .grid_label("d01")
    .variable("t2max")
    .get()
)

print(f"Default filtered data retrieved: {default_filtered is not None}")
if default_filtered is not None:
    print(f"Simulations with default filtering: {len(default_filtered.sim)}")
    print(f"Simulation names: {list(default_filtered.sim.values)}")

# Explicitly including unbiased models
print("\nExplicitly including unbiased models:")
unfiltered_data = (cd
    .catalog("data")
    .activity_id("WRF")
    .table_id("mon")
    .grid_label("d01")
    .variable("t2max")
    .processes({
        "filter_unbiased_models": "no"
    })
    .get()
)

print(f"Unfiltered data retrieved: {unfiltered_data is not None}")
if unfiltered_data is not None:
    print(f"Simulations without filtering: {len(unfiltered_data.sim)}")
    print(f"Simulation names: {list(unfiltered_data.sim.values)}")
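
To see which simulations the filter actually drops, the two name lists can be compared directly. The helper below is a hypothetical convenience function (not part of climakitae), and the simulation names are placeholders made up for illustration.

```python
def removed_by_filter(unfiltered_names, filtered_names):
    """Return names present without filtering but absent with filtering."""
    kept = set(filtered_names)
    return sorted(name for name in set(unfiltered_names) if name not in kept)


# Placeholder simulation names, made up for illustration only.
unfiltered = ["sim_a", "sim_b", "sim_c", "sim_d"]
filtered = ["sim_a", "sim_c"]
print(removed_by_filter(unfiltered, filtered))
```

With real results from the cell above, the inputs would be `list(unfiltered_data.sim.values)` and `list(default_filtered.sim.values)`.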

## 4. Error Handling and Debugging

Now let's explore what happens when things go wrong. The interface provides helpful error messages and suggestions for common mistakes:

In [None]:
# Error 1: Misspelled catalog name
print("=== Error Example 1: Misspelled Catalog ===")
try:
    bad_catalog = (cd
        .catalog("renewbles")  # Typo: missing 'a'
        .installation("pv_utility")
        .grid_label("d03")
        .table_id("day")
        .variable("cf")
        .get()
    )
    print(f"Result: {bad_catalog}")
except Exception as e:
    print(f"Exception caught: {str(e)}")

# The interface should provide helpful suggestions
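
For intuition about how a "did you mean" suggestion can be produced, here is a minimal sketch using Python's standard `difflib` module. This is not necessarily how climakitae generates its messages; the `suggest` helper and the `VALID_CATALOGS` list are assumptions for illustration.

```python
import difflib

# Hypothetical list of valid names; the real list comes from the interface.
VALID_CATALOGS = ["renewables", "data"]


def suggest(name, valid_names, cutoff=0.6):
    """Return the closest valid names to a possibly misspelled input."""
    return difflib.get_close_matches(name, valid_names, n=3, cutoff=cutoff)


print(suggest("renewbles", VALID_CATALOGS))
```

Raising `cutoff` makes the matching stricter, which reduces misleading suggestions at the cost of missing heavier typos.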

In [None]:
# Error 2: Misspelled installation
print("=== Error Example 2: Misspelled Installation ===")
try:
    bad_installation = (cd
        .catalog("renewables")
        .installation("pv_utolity")  # Typo: 'utolity' instead of 'utility'
        .grid_label("d03")
        .table_id("day")
        .variable("cf")
        .get()
    )
    print(f"Result: {bad_installation}")
    if bad_installation is not None:
        print(f"Dataset shape: {bad_installation.dims}")
        print("This returns an empty dataset because no matching data was found")
except Exception as e:
    print(f"Exception caught: {str(e)}")

In [None]:
# Error 3: Invalid parameter combination
print("=== Error Example 3: Invalid Parameter Combination ===")
try:
    conflicting_params = (cd
        .catalog("data")
        .activity_id("LOCA2")
        .table_id("mon")
        .grid_label("d01")  # No monthly data available for this grid resolution
        .variable("tasmax")
        .get()
    )
    print(f"Result: {conflicting_params}")
    if conflicting_params is not None:
        print(f"Dataset shape: {conflicting_params.dims}")
        print("This should return an empty dataset with a warning about the conflict")
except Exception as e:
    print(f"Exception caught: {str(e)}")

In [None]:
# Errors 4 and 5: Faulty processor entries
print("=== Error Example 4: Invalid Processor ===")
try:
    invalid_processor = (cd
        .catalog("renewables")
        .installation("pv_utility")
        .variable("cf")
        .processes({
            "invalid_processor": "some_value"  # This processor doesn't exist
        })
        .get()
    )
    print(f"Result: {invalid_processor}")
except Exception as e:
    print(f"Exception caught: {str(e)}")

print("\n=== Error Example 5: Invalid Processor Value ===")
try:
    invalid_value = (cd
        .catalog("renewables")
        .installation("pv_utility")
        .variable("cf")
        .processes({
            "filter_unbiased_models": "invalid_value"  # Should be "yes" or "no"
        })
        .get()
    )
    print(f"Result: {invalid_value}")
except Exception as e:
    print(f"Exception caught: {str(e)}")
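
Mistakes like these can also be caught before calling `.get()`. The sketch below checks a `processes` dictionary against a small table of rules. The table is an assumption inferred from this notebook (processor names, and "yes"/"no" for `filter_unbiased_models`), not the library's own validation logic, and `check_processes` is a hypothetical helper.

```python
# Assumed rules, inferred from this notebook rather than from the library itself.
ALLOWED_PROCESSES = {
    "concatenate": None,  # None means any value is accepted in this sketch
    "filter_unbiased_models": {"yes", "no"},
}


def check_processes(processes):
    """Return a list of readable problems; an empty list means nothing was flagged."""
    problems = []
    for name, value in processes.items():
        if name not in ALLOWED_PROCESSES:
            problems.append(f"unknown processor: {name!r}")
            continue
        allowed = ALLOWED_PROCESSES[name]
        if allowed is not None and value not in allowed:
            problems.append(
                f"{name!r} expects one of {sorted(allowed)}, got {value!r}"
            )
    return problems


print(check_processes({"invalid_processor": "some_value"}))
print(check_processes({"filter_unbiased_models": "invalid_value"}))
print(check_processes({"concatenate": "sim"}))
```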

In [None]:
# Error 6: Incomplete query
print("=== Error Example 6: Incomplete Query ===")
try:
    incomplete_query = (cd
        .catalog("renewables")
        .installation("pv_utility")
        # Missing required parameters like variable, table_id, etc.
        .get()
    )
    print(f"Result: {incomplete_query}")
    if incomplete_query is not None:
        print(f"Dataset shape: {incomplete_query.dims}")
        print("This may return an empty dataset or error depending on implementation")
except Exception as e:
    print(f"Exception caught: {str(e)}")

In [None]:
# Debugging strategies
print("=== Debugging Strategies ===")

# 1. Check your current query state
print("1. Check current query state:")
partial_query = (cd
    .catalog("renewables")
    .installation("pv_utility")
    .experiment_id("historical")
)
partial_query.show_query()

print("\n2. Check available options at current state:")
partial_query.show_variable_options()

print("\n3. Reset and start over if needed:")
cd.reset()
print("Interface reset - ready for new query")

# 4. Build queries incrementally and validate
print("\n4. Build incrementally:")
step1 = cd.catalog("renewables")
print(f"Step 1 complete: {step1 is not None}")

step2 = step1.installation("pv_utility")
print(f"Step 2 complete: {step2 is not None}")

step3 = step2.variable("cf")
print(f"Step 3 complete: {step3 is not None}")

# Check the final state before getting data
step3.show_query()

## 5. Best Practices and Tips

Based on the demonstrations above, here are some best practices for using the new_core interface effectively:

In [None]:
# Reusable query configurations
print("=== Reusable Query Configurations ===")

# Define base configurations
base_renewables_config = {
    "catalog": "renewables",
    "installation": "pv_utility",
    "experiment_id": "historical",
    "table_id": "day",
    "grid_label": "d03"
}

base_climate_config = {
    "catalog": "data",
    "activity_id": "WRF",
    "experiment_id": "historical",
    "table_id": "mon",
    "grid_label": "d01"
}

# Create fresh instances and load configurations
print("Creating multiple queries from base configurations...")

# Query 1: Renewables capacity factor
renewables_query = ClimateData().load_query(base_renewables_config)
cf_data = renewables_query.variable("cf").get()
print(f"CF data retrieved: {cf_data is not None}")

# Query 2: Renewables generation
renewables_query2 = ClimateData().load_query(base_renewables_config)
gen_data = renewables_query2.variable("gen").get()
print(f"Generation data retrieved: {gen_data is not None}")

# Query 3: Climate temperature
climate_query = ClimateData().load_query(base_climate_config)
temp_data = climate_query.variable("t2max").get()
print(f"Temperature data retrieved: {temp_data is not None}")

print("\n✓ Reusable configurations enable efficient query management")
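
Variants of a base configuration can be derived without editing the original dictionary. The sketch below is plain Python; `with_overrides` is a hypothetical helper, and the override values are illustrative only (whether a given combination exists in the catalog is not checked here).

```python
base_config = {
    "catalog": "renewables",
    "installation": "pv_utility",
    "experiment_id": "historical",
    "table_id": "day",
    "grid_label": "d03",
}


def with_overrides(base, **overrides):
    """Return a new dict combining base with overrides; base is left untouched."""
    return {**base, **overrides}


# Illustrative override; the combination is not validated against any catalog.
monthly_config = with_overrides(base_config, table_id="mon")
print(monthly_config)
print(base_config["table_id"])  # the base configuration still says "day"
```

The resulting dictionary has the same shape as the ones passed to `load_query` above.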

## Summary

This notebook has demonstrated the key features of the climakitae `new_core` user interface:

### ✅ **What's Included:**
- **Fluent Interface**: Chain methods for readable query building
- **Data Exploration**: Progressive discovery of available options
- **Three Processors**: 
  - `concatenate`: Merge datasets along dimensions
  - `filter_unbiased_models`: Remove or include unbiased models
  - `update_attributes`: Record how the data was manipulated (runs in the background, not called by users)
- **Error Handling**: Clear feedback for invalid queries
- **Multiple Catalogs**: Access to both `renewables` and `data` catalogs

### ✅ **Key Benefits:**
- **Intuitive API**: Natural language-like query building
- **Robust Error Handling**: Helpful suggestions when queries fail
- **Flexible Processing**: Combine multiple processors in a single query
- **Efficient Data Access**: Server-side processing reduces data transfer
- **Reusable Configurations**: Save and reuse query patterns

### ✅ **Demonstrated Capabilities:**
- Explore available data options systematically
- Retrieve data from multiple catalogs (`renewables`, `data`)
- Apply the `concatenate` and `filter_unbiased_models` processors
- Handle errors gracefully with informative feedback
- Build reusable query configurations
- Debug queries step-by-step

### 🔧 **Best Practices:**
1. **Explore First**: Use `show_*_options()` methods before building queries
2. **Build Incrementally**: Add parameters step-by-step and validate
3. **Handle Errors**: Always check if data is None before processing
4. **Use Safe Retrieval**: Wrap queries in error handling
5. **Reuse Configurations**: Define base configurations for common patterns
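
Practices 3 and 4 can be combined in one small wrapper. The sketch below assumes a hypothetical `safe_get` helper (not part of climakitae) that takes a zero-argument callable, so it works with any query; the stand-in callables let it run without data access.

```python
def safe_get(build_and_get, label="query"):
    """Run a zero-argument callable and return its result, or None on failure."""
    try:
        result = build_and_get()
    except Exception as exc:  # broad on purpose: exception types are not documented here
        print(f"{label} failed: {exc}")
        return None
    if result is None:
        print(f"{label} returned no data")
    return result


# Stand-in callable so the sketch runs without any data access.
def failing_query():
    raise ValueError("no such catalog")


ok = safe_get(lambda: {"cf": [0.1, 0.2]}, label="good query")
bad = safe_get(failing_query, label="bad query")
```

With the interface, the callable would wrap a full chain, for example `lambda: cd.catalog("renewables").installation("pv_utility").variable("cf").get()`.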

The `new_core` interface provides a solid foundation for climate data access while maintaining simplicity and providing clear feedback when issues arise. The three available processors (`concatenate`, `filter_unbiased_models`, `update_attributes`) cover the most common data processing needs while keeping the interface focused and maintainable.

---

**Ready to explore climate data with the new_core interface!** 🌍📊