# Hierarchical Configuration Inheritance Pattern: A Complete Guide

## The Problem: Configuration Duplication

### Why Traditional Configuration Management Is Painful

Imagine you're building a microservice that needs to run in multiple environments. Without inheritance patterns, your configuration might look like this:

In [1]:
# ❌ Traditional approach - lots of duplication
traditional_config = {
    "dev": {
        "database": {
            "host": "dev-db.example.com",
            "port": 5432,
            "pool_size": 10,
            "timeout": 30,
            "ssl_mode": "prefer",
            "retry_attempts": 3
        },
        "redis": {
            "host": "dev-redis.example.com", 
            "port": 6379,
            "timeout": 10,
            "pool_size": 20
        },
        "logging": {
            "level": "DEBUG",
            "format": "detailed"
        }
    },
    "staging": {
        "database": {
            "host": "staging-db.example.com",
            "port": 5432,           # 🔄 DUPLICATE
            "pool_size": 10,        # 🔄 DUPLICATE
            "timeout": 30,          # 🔄 DUPLICATE
            "ssl_mode": "prefer",   # 🔄 DUPLICATE
            "retry_attempts": 3     # 🔄 DUPLICATE
        },
        "redis": {
            "host": "staging-redis.example.com",
            "port": 6379,           # 🔄 DUPLICATE
            "timeout": 10,          # 🔄 DUPLICATE
            "pool_size": 20         # 🔄 DUPLICATE
        },
        "logging": {
            "level": "INFO",
            "format": "detailed"    # 🔄 DUPLICATE
        }
    },
    "prod": {
        "database": {
            "host": "prod-db.example.com",
            "port": 5432,           # 🔄 DUPLICATE
            "pool_size": 50,        # Different value, but pattern repeats
            "timeout": 30,          # 🔄 DUPLICATE
            "ssl_mode": "require",  # Different value
            "retry_attempts": 3     # 🔄 DUPLICATE
        },
        "redis": {
            "host": "prod-redis.example.com",
            "port": 6379,           # 🔄 DUPLICATE
            "timeout": 10,          # 🔄 DUPLICATE
            "pool_size": 50         # Different value
        },
        "logging": {
            "level": "ERROR",
            "format": "detailed"    # 🔄 DUPLICATE
        }
    }
}

### The Pain Points

1. **🔄 Massive Duplication**: Most configuration values are repeated across environments (above, `staging` repeats 9 of its 12 values from `dev`)
2. **🐛 Error Prone**: Change a default port? You must remember to update it in 3+ places
3. **📈 Scales Poorly**: Adding a new environment means copying and modifying everything
4. **🔍 Hard to Understand**: Which values are defaults and which are environment-specific overrides?
5. **🚀 Maintenance Nightmare**: Updating shared settings requires touching multiple sections

---

## The Solution: Hierarchical Inheritance

### The DRY (Don't Repeat Yourself) Approach with `_defaults`

The hierarchical pattern solves this by introducing a **`_defaults` section** that defines default values once. Every environment inherits them automatically unless it explicitly overrides them:

In [2]:
# ✅ Hierarchical approach - DRY and maintainable
hierarchical_config = {
    "_defaults": {
        # 🎯 Define defaults ONCE
        "*.database.port": 5432,
        "*.database.pool_size": 10,
        "*.database.timeout": 30,
        "*.database.ssl_mode": "prefer",
        "*.database.retry_attempts": 3,
        "*.redis.port": 6379,
        "*.redis.timeout": 10,
        "*.redis.pool_size": 20,
        "*.logging.format": "detailed"
    },
    "dev": {
        "database": {
            "host": "dev-db.example.com"
            # 👆 All other database settings inherited from _defaults
        },
        "redis": {
            "host": "dev-redis.example.com"
            # 👆 All other redis settings inherited from _defaults
        },
        "logging": {
            "level": "DEBUG"
            # 👆 format inherited from _defaults
        }
    },
    "staging": {
        "database": {"host": "staging-db.example.com"},
        "redis": {"host": "staging-redis.example.com"},
        "logging": {"level": "INFO"}
    },
    "prod": {
        "database": {
            "host": "prod-db.example.com",
            "pool_size": 50,        # 🎯 Override default for production
            "ssl_mode": "require"   # 🎯 Override default for production
        },
        "redis": {
            "host": "prod-redis.example.com",
            "pool_size": 50         # 🎯 Override default for production
        },
        "logging": {"level": "ERROR"}
    }
}

---

## Core Concepts

### Setup

In [3]:
import json
from rich import print as rprint
from configcraft.api import DEFAULTS, apply_inheritance, inherit_value

def jprint(data: dict):
    """Pretty print JSON data"""
    rprint(json.dumps(data, indent=2))

### 1. The `_defaults` Section

The `_defaults` section is a **meta-configuration** that defines inheritable defaults:

In [4]:
config = {
    "_defaults": {
        "*.timeout": 30,           # Apply to all environments
        "*.retry_attempts": 3      # Apply to all environments  
    },
    "dev": {"host": "dev.com"},
    "prod": {"host": "prod.com"}
}

### 2. JSON Path Patterns

JSON paths specify **where** default values should be applied:

| Pattern | Meaning | Example |
|---------|---------|---------|
| `*.field` | All top-level keys | `*.timeout` → applies to dev.timeout, prod.timeout |
| `env.field` | Specific environment | `dev.timeout` → applies only to dev.timeout |
| `*.service.field` | Nested paths | `*.db.port` → applies to dev.db.port, prod.db.port |
| `*.services.*.field` | Multiple wildcards | `*.apps.*.memory` → all apps in all environments |
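
To make the table concrete, here is a minimal pure-Python sketch of how such a pattern could be resolved against a nested dict. The helper `set_default_by_path` is hypothetical and is not configcraft's implementation; in particular, it silently skips branches that are missing an intermediate key, which may differ from the library's behavior.

```python
from typing import Any


def set_default_by_path(data: dict, path: str, value: Any) -> None:
    """Hypothetical helper: set `value` at `path` unless the final key already exists."""
    *parents, leaf = path.split(".")
    targets = [data]
    for part in parents:
        next_targets = []
        for node in targets:
            if part == "*":
                # Wildcard: descend into every child dict except `_defaults`.
                next_targets.extend(
                    child for key, child in node.items()
                    if key != "_defaults" and isinstance(child, dict)
                )
            elif isinstance(node.get(part), dict):
                # Literal segment: descend only into the named child.
                next_targets.append(node[part])
        targets = next_targets
    for node in targets:
        node.setdefault(leaf, value)  # never overwrites an existing key


config = {
    "dev": {"db": {"host": "dev-db"}},
    "prod": {"db": {"host": "prod-db", "port": 6543}},
}
set_default_by_path(config, "*.db.port", 5432)  # `*.service.field` pattern
set_default_by_path(config, "dev.timeout", 30)  # `env.field` pattern
print(config)
```

Here `dev.db.port` is filled with 5432, `prod.db.port` keeps its existing 6543, and only `dev` receives `timeout`.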

### 3. Non-Destructive Inheritance

**Key Principle**: Default values are only applied when the target key **doesn't already exist**.

In [5]:
config = {
    "_defaults": {"*.memory": 2},
    "dev": {},                    # ✅ Will get memory: 2
    "prod": {"memory": 8}         # ✅ Keeps existing memory: 8
}
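
The principle is the same as Python's built-in `dict.setdefault`, which writes a key only when it is absent. The snippet below is a plain-Python illustration of that rule, not a call into the library:

```python
defaults = {"memory": 2}
environments = {
    "dev": {},               # no memory key yet
    "prod": {"memory": 8},   # memory already set
}

for env_config in environments.values():
    for key, value in defaults.items():
        # setdefault leaves existing keys untouched
        env_config.setdefault(key, value)

print(environments)
# {'dev': {'memory': 2}, 'prod': {'memory': 8}}
```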

---

## Basic Usage

### Example 1: Simple Environment Defaults

In [6]:
# Define configuration with default values
config_data = {
    "_defaults": {
        "*.memory": 2,           # Default memory for all environments
        "*.cpu": 1               # Default CPU for all environments
    },
    "dev": {},                   # Empty - will inherit all defaults
    "staging": {
        "memory": 4              # Override memory, inherit CPU
    },
    "prod": {
        "memory": 8,             # Override memory
        "cpu": 4                 # Override CPU
    }
}

# Apply inheritance
apply_inheritance(config_data)
jprint(config_data)

### Example 2: Nested Configuration Inheritance

In [7]:
config_data = {
    "_defaults": {
        "*.database.port": 5432,
        "*.database.pool_size": 10,
        "*.cache.ttl": 3600
    },
    "dev": {
        "database": {
            "host": "localhost"
            # port and pool_size will be inherited
        },
        "cache": {
            "host": "localhost"
            # ttl will be inherited
        }
    },
    "prod": {
        "database": {
            "host": "prod-db.com",
            "pool_size": 50        # Override default
            # port will be inherited
        },
        "cache": {
            "host": "prod-cache.com",
            "ttl": 7200           # Override default
        }
    }
}

apply_inheritance(config_data)
jprint(config_data)

### Path Execution Order: Exception-Then-Default Pattern

🚨 **Critical Behavior**: Within a single `_defaults` section, paths are processed **from top to bottom**. If multiple paths affect the same node, the **earlier path wins** due to `setdefault` behavior.

This enables powerful **exception-then-default** patterns:

In [8]:
# Example: CPU allocation with exceptions
config_data = {
    "_defaults": {
        # ⚠️ ORDER MATTERS! Exception MUST come first
        "*.servers.high_memory.cpu": 8,     # Exception: high_memory gets 8 CPU
        "*.servers.*.cpu": 2                # Default: all other servers get 2 CPU
    },
    "dev": {
        "servers": {
            "web": {},                      # Gets cpu=2 (default rule)
            "high_memory": {},              # Gets cpu=8 (exception rule)
            "worker": {}                    # Gets cpu=2 (default rule)
        }
    },
    "prod": {
        "servers": {
            "web": {},                      # Gets cpu=2 (default rule)  
            "high_memory": {},              # Gets cpu=8 (exception rule)
            "database": {"cpu": 16}         # Keeps cpu=16 (existing value)
        }
    }
}

apply_inheritance(config_data)
jprint(config_data)

**❌ Wrong Order Example:**

In [9]:
# This WON'T work as expected - wrong order!
config_data = {
    "_defaults": {
        "*.servers.*.cpu": 2,               # 🚫 Default comes first
        "*.servers.high_memory.cpu": 8      # 🚫 Exception comes second - TOO LATE!
    },
    "dev": {
        "servers": {
            "high_memory": {}               # Gets cpu=2 (not 8!) because default ran first
        }
    }
}
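
The effect of ordering can be reproduced with nothing but `dict.setdefault`. The sketch below uses a hypothetical `apply_rules` helper (a simplified stand-in, not the library's code) that applies `(server, cpu)` rules in order, where `"*"` means every server:

```python
import copy

servers = {"web": {}, "high_memory": {}}


def apply_rules(servers, rules):
    """Hypothetical stand-in: apply (server_name_or_*, cpu) rules from first to last."""
    result = copy.deepcopy(servers)
    for target, cpu in rules:
        names = list(result) if target == "*" else [target]
        for name in names:
            # The first rule to reach a server sets cpu; later rules are ignored.
            result[name].setdefault("cpu", cpu)
    return result


exception_first = apply_rules(servers, [("high_memory", 8), ("*", 2)])
default_first = apply_rules(servers, [("*", 2), ("high_memory", 8)])

print(exception_first)  # {'web': {'cpu': 2}, 'high_memory': {'cpu': 8}}
print(default_first)    # {'web': {'cpu': 2}, 'high_memory': {'cpu': 2}}
```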

The same first-write-wins rule applies across nesting levels. A `_defaults` section placed inside a child node is processed before the `_defaults` section of its parent, so child defaults take precedence and parent defaults only fill the remaining gaps:

In [10]:
# Example: Nested inheritance hierarchy
config_data = {
    "_defaults": {
        "*.services.*.memory": 1024,        # Parent default: 1GB for all services
        "*.services.*.timeout": 30          # Parent default: 30s timeout
    },
    "dev": {
        "services": {
            "_defaults": {
                "*.memory": 2048,           # Child override: dev services get 2GB  
                "*.log_level": "DEBUG"      # Child addition: dev-specific setting
            },
            "web": {},                      # Gets memory=2048 (child), timeout=30 (parent), log_level=DEBUG (child)
            "worker": {"memory": 4096}      # Gets memory=4096 (explicit), timeout=30 (parent), log_level=DEBUG (child)
        }
    },
    "prod": {
        "services": {
            "web": {},                      # Gets memory=1024 (parent), timeout=30 (parent)
            "worker": {}                    # Gets memory=1024 (parent), timeout=30 (parent)
        }
    }
}

apply_inheritance(config_data)
jprint(config_data)

**✅ Design Logic:**

The child-override-parent behavior follows the inheritance processing order:
1. **Recursive Processing**: Children are processed before parents
2. **Child `_defaults`** runs first → sets `dev.services.web.memory = 2048`
3. **Parent `_defaults`** runs later → tries to set `dev.services.web.memory = 1024`, but key exists → **ignored**
4. **Result**: Child settings take precedence, parent fills gaps

**🎯 Real-World Use Cases:**

1. **Exception Handling**: Set specific values before wildcards
2. **Environment Overrides**: Child environments override global defaults  
3. **Service Specialization**: Specific services override category defaults
4. **Progressive Refinement**: Broad defaults → environment defaults → service specifics
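
The processing order described above can be sketched as a depth-first traversal. The functions `set_default` and `apply_defaults_child_first` are hypothetical names for illustration; this is an assumption about the mechanism, not the library's actual source.

```python
from typing import Any

DEFAULTS_KEY = "_defaults"


def set_default(node: dict, parts: list, value: Any) -> None:
    """Walk `parts` from `node`, expanding "*", and set the last key if it is absent."""
    if len(parts) == 1:
        node.setdefault(parts[0], value)
        return
    head, rest = parts[0], parts[1:]
    if head == "*":
        children = [v for k, v in node.items() if k != DEFAULTS_KEY]
    else:
        children = [node.get(head)]
    for child in children:
        if isinstance(child, dict):
            set_default(child, rest, value)


def apply_defaults_child_first(node: dict) -> None:
    """Process nested `_defaults` sections before the one on `node` itself."""
    for key, child in node.items():
        if key != DEFAULTS_KEY and isinstance(child, dict):
            apply_defaults_child_first(child)  # children first
    for path, value in node.pop(DEFAULTS_KEY, {}).items():
        set_default(node, path.split("."), value)  # then this level


config = {
    "_defaults": {"*.services.*.memory": 1024, "*.services.*.timeout": 30},
    "dev": {"services": {"_defaults": {"*.memory": 2048}, "web": {}}},
    "prod": {"services": {"web": {}}},
}
apply_defaults_child_first(config)
print(config)
```

Because the nested section runs first, `dev.services.web.memory` is already 2048 when the root rule arrives, so the root value 1024 only lands on `prod.services.web`.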

### Working with Lists of Objects

The inheritance pattern works seamlessly with **lists of dictionaries**:

In [11]:
config_data = {
    "_defaults": {
        "*.databases.port": 5432,      # Apply to ALL database objects
        "*.databases.timeout": 30      # Apply to ALL database objects
    },
    "dev": {
        "databases": [
            {"host": "dev-primary.com", "type": "primary"},
            {"host": "dev-replica.com", "type": "replica"}
            # Both will inherit port and timeout
        ]
    },
    "prod": {
        "databases": [
            {"host": "prod-primary.com", "type": "primary"},
            {"host": "prod-replica.com", "type": "replica", "port": 5433}
            # First inherits port, second keeps override
        ]
    }
}

apply_inheritance(config_data)
jprint(config_data)
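
How the library walks lists internally is not shown here. One plausible approach, sketched below with a hypothetical `targets_of` helper, is to treat a list as "every dict it contains" when resolving a path segment:

```python
def targets_of(node):
    """Yield the dicts a default applies to: the node itself, or each dict inside a list."""
    if isinstance(node, dict):
        yield node
    elif isinstance(node, list):
        for item in node:
            if isinstance(item, dict):
                yield item


databases = [
    {"host": "prod-primary.com", "type": "primary"},
    {"host": "prod-replica.com", "type": "replica", "port": 5433},
]

for db in targets_of(databases):
    db.setdefault("port", 5432)  # the replica keeps its explicit 5433

print(databases)
```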

### Specific Environment Targeting

Sometimes you want to set defaults for **specific environments only**:

In [12]:
config_data = {
    "_defaults": {
        "dev.*.memory": 4,          # Only dev services get 4GB
        "prod.*.memory": 16,        # Only prod services get 16GB
        "*.*.log_level": "INFO"     # All services in all environments get INFO logging
    },
    "dev": {
        "web": {},                  # Will get memory: 4, log_level: "INFO"
        "worker": {}                # Will get memory: 4, log_level: "INFO"
    },
    "staging": {
        "web": {},                  # Will get log_level: "INFO" only
        "worker": {"memory": 8}     # Custom memory, inherits log_level
    },
    "prod": {
        "web": {},                  # Will get memory: 16, log_level: "INFO"
        "worker": {"memory": 32}    # Custom memory, inherits log_level
    }
}

apply_inheritance(config_data)
jprint(config_data)

### Nested `_defaults` Sections (Advanced Override)

You can have **multiple levels** of `_defaults` sections for fine-grained control:

In [13]:
config_data = {
    "_defaults": {
        "*.*.memory": 2,            # Global default: 2GB for all services
        "*.*.log_level": "INFO"     # Global default: INFO logging
    },
    "dev": {
        "_defaults": {
            "*.memory": 4,          # Dev-specific: Override memory to 4GB
            "*.debug": True         # Dev-specific: Enable debug mode
        },
        "web": {},                  # Gets: memory=4, log_level="INFO", debug=True
        "worker": {"memory": 8}     # Gets: memory=8 (override), log_level="INFO", debug=True
    },
    "prod": {
        "_defaults": {
            "*.log_level": "ERROR"  # Prod-specific: Only log errors
        },
        "web": {},                  # Gets: memory=2, log_level="ERROR"
        "worker": {}                # Gets: memory=2, log_level="ERROR"
    }
}

apply_inheritance(config_data)
jprint(config_data)

---

## Understanding the API

The hierarchical configuration pattern provides two main functions with different purposes:

### `inherit_value()` - Low-Level Inheritance

This is the **core building block** that applies a single shared value to its target location(s):

In [14]:
from configcraft.api import inherit_value

# Example: Set a default value only where it doesn't exist
data = {
    "dev": {"host": "dev.com"},
    "prod": {"host": "prod.com", "port": 8080}  # Already has port
}

# Apply default port to all environments
inherit_value(path="*.port", value=3000, data=data)

jprint(data)

### `apply_inheritance()` - High-Level Configuration Processing

This is the **main entry point** that processes entire configuration structures with `_defaults` sections:

In [15]:
# Example: Process a complete configuration
config = {
    "_defaults": {
        "*.port": 3000,
        "*.timeout": 30
    },
    "dev": {"host": "dev.com"},
    "prod": {"host": "prod.com", "port": 8080}
}

apply_inheritance(config)

jprint(config)

**When to use `apply_inheritance()`:**
- ✅ Processing complete configuration files
- ✅ Standard use case with `_defaults` sections
- ✅ Production configuration management
- ✅ Most common use case - start here!
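
One way to picture how the two functions relate: the high-level call can be thought of as one low-level call per `_defaults` entry, followed by removal of the `_defaults` key. The stand-ins below (`inherit_value_sketch`, `apply_inheritance_sketch`) are hypothetical, handle only `*.<key>` paths, and are not the library's implementation.

```python
def inherit_value_sketch(path, value, data):
    """Stand-in for a single-value default; supports only "*.<key>" paths."""
    prefix, _, key = path.partition(".")
    if prefix != "*" or not key or "." in key:
        raise ValueError(f"sketch only supports '*.<key>' paths: {path}")
    for child in data.values():
        if isinstance(child, dict):
            child.setdefault(key, value)


def apply_inheritance_sketch(data):
    """Stand-in for whole-config processing: one call per `_defaults` entry."""
    defaults = data.pop("_defaults", {})
    for path, value in defaults.items():
        inherit_value_sketch(path, value, data)


config = {
    "_defaults": {"*.port": 3000, "*.timeout": 30},
    "dev": {"host": "dev.com"},
    "prod": {"host": "prod.com", "port": 8080},
}
apply_inheritance_sketch(config)
print(config)
```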

---

## Real-World Examples

### Example 1: Microservice Configuration

In [16]:
# Real-world microservice configuration
microservice_config = {
    "_defaults": {
        # Database defaults
        "*.database.pool_size": 10,
        "*.database.timeout": 30,
        "*.database.retry_attempts": 3,
        
        # Redis defaults  
        "*.redis.timeout": 5,
        "*.redis.pool_size": 20,
        
        # Logging defaults
        "*.logging.format": "json",
        "*.logging.level": "INFO",
        
        # HTTP defaults
        "*.http.timeout": 10,
        "*.http.retry_attempts": 3
    },
    "local": {
        "database": {
            "host": "localhost",
            "port": 5432,
            "name": "myapp_dev"
        },
        "redis": {
            "host": "localhost", 
            "port": 6379
        },
        "logging": {
            "level": "DEBUG"  # Override for local development
        },
        "http": {
            "base_url": "http://localhost:8000"
        }
    },
    "staging": {
        "database": {
            "host": "staging-db.company.com",
            "port": 5432,
            "name": "myapp_staging",
            "pool_size": 20  # Override for staging load
        },
        "redis": {
            "host": "staging-redis.company.com",
            "port": 6379
        },
        "logging": {},
        "http": {
            "base_url": "https://staging-api.company.com"
        }
    },
    "production": {
        "database": {
            "host": "prod-db.company.com",
            "port": 5432,
            "name": "myapp_prod",
            "pool_size": 50,      # Production needs more connections
            "timeout": 60         # Production can wait longer
        },
        "redis": {
            "host": "prod-redis.company.com",
            "port": 6379,
            "pool_size": 100      # Production needs larger pool
        },
        "logging": {
            "level": "ERROR"      # Production only logs errors
        },
        "http": {
            "base_url": "https://api.company.com",
            "timeout": 30         # Production can wait longer
        }
    }
}

apply_inheritance(microservice_config)

After processing, each environment gets:
- ✅ All the appropriate defaults from `_defaults`
- ✅ Environment-specific host/URL configurations  
- ✅ Performance tuning overrides where needed
- ✅ No duplication of common settings

In [17]:
jprint(microservice_config)

### Example 2: Multi-Tenant SaaS Configuration

In [18]:
# SaaS application with multiple tenants
saas_config = {
    "_defaults": {
        # Default resource limits
        "*.tenants.*.cpu_limit": 1,
        "*.tenants.*.memory_limit": 2, 
        "*.tenants.*.storage_limit": 10,
        
        # Default feature flags
        "*.tenants.*.features.analytics": True,
        "*.tenants.*.features.api_access": True,
        "*.tenants.*.features.custom_domain": False,
        
        # Default billing
        "*.tenants.*.billing.plan": "basic",
        "*.tenants.*.billing.trial_days": 14
    },
    "dev": {
        "tenants": {
            "test_tenant": {
                "name": "Test Company",
                # Gets all defaults
                "features": {},
                "billing": {}
            }
        }
    },
    "prod": {
        "tenants": {
            "startup_co": {
                "name": "Startup Co",
                "billing": {"plan": "startup"},  # Override plan; trial_days is inherited
                # Other defaults inherited
                "features": {}
            },
            "enterprise_corp": {
                "name": "Enterprise Corp", 
                "cpu_limit": 8,           # Enterprise gets more resources
                "memory_limit": 16,
                "storage_limit": 1000,
                "features": {
                    "custom_domain": True,  # Enterprise feature
                    "sso": True            # Additional enterprise feature
                },
                "billing": {
                    "plan": "enterprise",
                    "trial_days": 30       # Longer trial
                }
            }
        }
    }
}

apply_inheritance(saas_config)

This pattern allows you to:
- 🎯 Set sensible defaults for all tenants
- 🚀 Quickly onboard new tenants with minimal configuration
- 💰 Easily implement tiered pricing with resource overrides
- 🔧 Maintain consistent feature flags across environments

In [19]:
jprint(saas_config)

---

## Best Practices

### 1. Design Patterns

#### ✅ DO: Start with Broad Defaults, Then Specialize

In [20]:
# Good: Broad defaults with specific overrides
config = {
    "_defaults": {
        "prod.*.memory": 8,      # Environment-specific override (listed first so it wins)
        "*.*.memory": 2          # Broad default for every service
    },
    "dev": {"api": {}, "worker": {}},
    "prod": {"api": {}, "worker": {"memory": 16}}  # Service-specific override
}

#### ❌ DON'T: Over-specify in Defaults

In [21]:
# Bad: Too specific in _defaults
config = {
    "_defaults": {
        "dev.api.memory": 2,
        "dev.worker.memory": 2,
        "prod.api.memory": 8,
        "prod.worker.memory": 8   # This defeats the purpose!
    }
}

#### ✅ DO: Use Nested `_defaults` for Logical Grouping

In [22]:
# Good: Logical grouping with nested _defaults
config = {
    "_defaults": {
        "*.log_level": "INFO"     # Global setting
    },
    "dev": {
        "_defaults": {
            "*.debug": True,       # Dev-specific settings
            "*.hot_reload": True
        },
        "api": {},
        "worker": {}
    }
}

### 2. Path Pattern Guidelines

#### Use Wildcards Strategically

In [23]:
# ✅ Good patterns
"*.timeout"              # All environments
"*.database.port"        # All database configs
"prod.*.memory"          # All prod services
"*.services.*.cpu"       # All services in all environments

# ❌ Avoid these patterns  
"*.*.*.*"               # Too generic
"very.specific.deep.path.field"  # Too specific

#### Establish Naming Conventions

In [24]:
# ✅ Consistent naming helps pattern matching
config = {
    "_defaults": {
        "*.database_primary.port": 5432,
        "*.database_replica.port": 5433,
        "*.cache_redis.port": 6379
    }
}

### 3. Configuration Organization

#### Group Related Settings

In [25]:
# ✅ Well-organized configuration
config = {
    "_defaults": {
        # Database cluster
        "*.database.port": 5432,
        "*.database.pool_size": 10,
        "*.database.timeout": 30,
        
        # Caching layer
        "*.cache.ttl": 3600,
        "*.cache.max_size": 1000,
        
        # Monitoring
        "*.monitoring.enabled": True,
        "*.monitoring.interval": 60
    }
}

#### Document Your Patterns

In [26]:
config = {
    "_defaults": {
        # Resource defaults - production overrides these
        "*.memory": 2,           # GB
        "*.cpu": 1,              # cores
        
        # Network timeouts - keep aggressive for responsiveness
        "*.timeout": 30,         # seconds
        "*.retry_attempts": 3,   # count
        
        # Feature flags - enable by default, disable selectively  
        "*.features.metrics": True,
        "*.features.tracing": True
    }
}

### 4. Test Your Configuration Processing

In [27]:
def test_config_inheritance():
    config = {
        "_defaults": {"*.port": 3000},
        "dev": {"host": "localhost"},
        "prod": {"host": "prod.com", "port": 8080}
    }
    
    apply_inheritance(config)
    
    # Validate inheritance worked
    assert config["dev"]["port"] == 3000      # Inherited
    assert config["prod"]["port"] == 8080     # Preserved override
    assert "_defaults" not in config            # Cleaned up

test_config_inheritance()

### 5. Error Prevention

#### Validate Paths Before Processing

In [28]:
def validate_defaults_paths(defaults_config):
    """Validate _defaults path patterns"""
    for path in defaults_config.keys():
        if path.endswith("*"):
            raise ValueError(f"Path cannot end with '*': {path}")
        
        if ".." in path:
            raise ValueError(f"Path cannot contain '..': {path}")

#### Handle Missing Intermediate Keys

In [29]:
# ❌ This will raise KeyError if 'database' doesn't exist
config = {
    "_defaults": {"*.database.port": 5432},
    "dev": {}  # No 'database' key
}

# ✅ Better: Ensure intermediate structures exist
config = {
    "_defaults": {"*.database.port": 5432},
    "dev": {"database": {}}  # Provide empty database config
}
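
If hand-writing empty sections for every environment is tedious, a small pre-processing step can create them before inheritance runs. The helper `ensure_sections` below is hypothetical and independent of the library:

```python
def ensure_sections(config: dict, sections: list) -> None:
    """Give every environment an empty dict for each listed section, if it is missing."""
    for env_name, env_config in config.items():
        if env_name == "_defaults":
            continue  # the defaults section is not an environment
        for section in sections:
            env_config.setdefault(section, {})


config = {
    "_defaults": {"*.database.port": 5432},
    "dev": {},
    "prod": {"database": {"host": "prod-db.com"}},
}
ensure_sections(config, ["database"])
print(config["dev"])   # {'database': {}}
print(config["prod"])  # {'database': {'host': 'prod-db.com'}}
```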

## Summary

The Hierarchical Configuration Inheritance Pattern solves the fundamental problem of **configuration duplication** in multi-environment applications. By using `_defaults` sections and JSON path patterns, you can:

### 🎯 **Key Benefits**
- **Eliminate Duplication**: Define common settings once
- **Reduce Errors**: Single source of truth for defaults
- **Scale Easily**: Add new environments with minimal config
- **Override Flexibly**: Keep environment-specific customizations
- **Maintain Simply**: Change defaults in one place

### 🚀 **When to Use This Pattern**
- ✅ Multi-environment deployments (dev/staging/prod)
- ✅ Microservice configurations with shared defaults
- ✅ Multi-tenant applications with tiered features
- ✅ Configuration templates with customization points
- ✅ Any scenario with repetitive configuration data

### 🛠️ **Getting Started**
1. Identify duplicated configuration values
2. Extract them to a `_defaults` section  
3. Use `*.field` patterns for broad defaults
4. Use `env.field` patterns for specific overrides
5. Call `apply_inheritance()` to process your config
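
Steps 1 and 2 can be partly automated. The sketch below uses hypothetical helpers `flatten` and `shared_defaults` (not part of the library) to find leaf values that are identical in every environment and report them as `*.path` candidates for a `_defaults` section. It only reports candidates and does not remove them from the environments.

```python
def flatten(node: dict, prefix: str = "") -> dict:
    """Map dotted leaf paths to values, e.g. {"database.port": 5432}."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def shared_defaults(config: dict) -> dict:
    """Return "*.path" entries for leaves with the same value in every environment."""
    flats = [flatten(env) for env in config.values()]
    if not flats:
        return {}
    first, rest = flats[0], flats[1:]
    return {
        f"*.{path}": value
        for path, value in first.items()
        if all(path in other and other[path] == value for other in rest)
    }


traditional = {
    "dev": {"database": {"host": "dev-db.example.com", "port": 5432, "pool_size": 10}},
    "prod": {"database": {"host": "prod-db.example.com", "port": 5432, "pool_size": 50}},
}
print(shared_defaults(traditional))
# {'*.database.port': 5432}
```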

The pattern transforms configuration management from a maintenance burden into a powerful tool for organizing and scaling your application configurations.