# 01_BuckinghamPi - Physics-SR Framework v4.1

## Stage 1.1: Buckingham Pi Dimensional Analysis

**Author:** Zhengze Zhang  
**Affiliation:** Department of Statistics, Columbia University  
**Date:** January 2026  
**Version:** 4.1 (Structure-Guided Feature Library Enhancement + Computational Optimization)

---

### Mathematical Foundation

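The Buckingham Pi theorem states that a physical equation involving $n$ variables whose units are built from $k$ independent fundamental dimensions can be rewritten in terms of $(n-k)$ dimensionless groups. More precisely, the number of groups is $n - \operatorname{rank}(D)$ for the dimensional matrix $D$ defined below, which equals $n-k$ whenever all $k$ dimensions are independent across the variables.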

**Key Insight:** If a physical law relates $n$ dimensional variables:
$$f(q_1, q_2, \ldots, q_n) = 0$$

and these variables span $k$ independent dimensions (e.g., M, L, T, $\Theta$), then the law can be expressed as:
$$F(\pi_1, \pi_2, \ldots, \pi_{n-k}) = 0$$

where each $\pi_i$ is a dimensionless group formed from products of powers of the original variables.
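
As a worked illustration, consider the simple pendulum that also appears in the tests of this notebook. The variables are length $L$, mass $m$, gravitational acceleration $g$, and period $T$; the rows of the matrix are the base dimensions $\mathrm{M}$, $\mathrm{L}$, $\mathrm{T}$ (temperature is omitted here because no variable uses it):

$$
D =
\begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
0 & 0 & -2 & 1
\end{pmatrix},
\qquad \operatorname{rank}(D) = 3, \qquad n - \operatorname{rank}(D) = 4 - 3 = 1 .
$$

Solving $D\mathbf{e} = 0$ with $\mathbf{e} = (e_L, e_m, e_g, e_T)$ gives $e_m = 0$, $e_L + e_g = 0$, and $-2e_g + e_T = 0$, so every solution is a multiple of $(-1, 0, 1, 2)$:

$$
\pi_1 = \frac{g\,T^2}{L}, \qquad F(\pi_1) = 0 \;\Rightarrow\; \pi_1 = \text{const} \;\Rightarrow\; T \propto \sqrt{L/g} .
$$

The mass drops out on dimensional grounds alone, and the reciprocal $L/(g\,T^2)$ is an equally valid choice of group.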

### Algorithm Overview

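1. Construct the dimensional matrix $D \in \mathbb{R}^{k \times n}$ (one row per base dimension, one column per variable)
2. Find the null space: vectors $\mathbf{e}$ such that $D\mathbf{e} = 0$; its dimension is $n - \operatorname{rank}(D)$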
3. Select simplest linearly independent integer solutions
4. Form $\pi$-groups: $\pi_j = \prod_i q_i^{e_{ij}}$
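
The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the class defined later in this notebook: the helper names (`is_null_vector`, `canonical`, `complexity`, `pi_candidates`) are hypothetical, the search bound `max_exponent=2` is chosen only to keep the example small, and the scoring weights are the ones documented in this notebook. The linear-independence step is omitted because the pendulum null space is one-dimensional.

```python
import itertools
from functools import reduce
from math import gcd

# Pendulum variables in column order: L, m, g, T.
# Rows are the base dimensions M, L, T (Theta is unused here).
D = [
    [0, 1, 0, 0],    # mass exponents
    [1, 0, 1, 0],    # length exponents
    [0, 0, -2, 1],   # time exponents
]

def is_null_vector(D, e):
    """True when D @ e == 0, i.e. the product of powers is dimensionless."""
    return all(sum(d * x for d, x in zip(row, e)) == 0 for row in D)

def canonical(e):
    """Divide by the GCD and make the first nonzero exponent positive."""
    g = reduce(gcd, (abs(x) for x in e if x != 0))
    e = tuple(x // g for x in e)
    first = next(x for x in e if x != 0)
    return e if first > 0 else tuple(-x for x in e)

def complexity(e):
    """Score: 10 * (# nonzero) + sum|e| + 0.1 * max|e| (lower is simpler)."""
    return (10 * sum(x != 0 for x in e)
            + sum(abs(x) for x in e)
            + 0.1 * max(abs(x) for x in e))

def pi_candidates(D, max_exponent=2):
    """Enumerate canonical integer null vectors, simplest first."""
    n_vars = len(D[0])
    exponent_range = range(-max_exponent, max_exponent + 1)
    found = set()
    for e in itertools.product(exponent_range, repeat=n_vars):
        if any(e) and is_null_vector(D, e):
            found.add(canonical(e))
    return sorted(found, key=complexity)

candidates = pi_candidates(D)
print(candidates)
```

With this sign convention the single surviving candidate has exponents `(1, 0, -1, -2)`, which corresponds to $L / (g\,T^2)$, the reciprocal of the more familiar $g\,T^2 / L$.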

### Reference

Buckingham, E. (1914). On physically similar systems; illustrations of the use of dimensional equations. *Physical Review*, 4(4), 345.

---
## Section 1: Header and Imports

In [None]:
"""
01_BuckinghamPi.ipynb - Buckingham Pi Dimensional Analysis
===========================================================

Three-Stage Physics-Informed Symbolic Regression Framework v4.1

This module provides:
- BuckinghamPiAnalyzer: Complete implementation of Buckingham Pi theorem
- Automatic selection of simplest dimensionless groups
- Data transformation to dimensionless space

Algorithm:
    1. Construct dimensional matrix D in R^(k x n)
    2. Compute null space dimension: n_pi = n - rank(D)
    3. Enumerate all integer null vectors with |exp| <= max_exponent
    4. Score by complexity: 10*(nonzero count) + 1*(sum|exp|) + 0.1*max(|exp|)
    5. Select n_pi linearly independent vectors with lowest complexity

Output Format (expected):
    === Buckingham Pi Analysis Results ===
    Variables: 4, Rank: 2, Pi-groups needed: 2
    
    All candidates (sorted by complexity):
      [0] q_c                    complexity=11.1  [AUTO-SELECTED]
      [1] N_d * r_eff^3          complexity=24.3  [AUTO-SELECTED]
      ...
    
    Selected Pi-groups:
      Pi_1 = q_c
      Pi_2 = N_d * r_eff^3

Author: Zhengze Zhang
Affiliation: Department of Statistics, Columbia University
Contact: zz3239@columbia.edu
"""

# Import core module
%run 00_Core.ipynb

In [None]:
# Additional imports for Buckingham Pi
import itertools
from math import gcd
from functools import reduce
from typing import Dict, List, Tuple, Optional, Any, Callable

print("01_BuckinghamPi v4.1: Additional imports successful.")

---
## Section 2: Class Definition
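
The class in this section picks pi-groups greedily: candidates are visited from simplest to most complex, and a candidate is kept only if it raises the rank of the set selected so far. The sketch below illustrates that idea on the warm rain variables (`q_c`, `N_d`, `r_eff`, `LWC`) used in the tests. It is an assumption-laden stand-in rather than the class's own code: the helpers `rank` and `select_independent` are hypothetical, and the rank is computed with exact `Fraction` arithmetic instead of `np.linalg.matrix_rank`.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of integer vectors via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def select_independent(sorted_candidates, n_groups):
    """Greedy pick: keep a candidate only if it increases the rank."""
    selected = []
    for vec in sorted_candidates:
        if len(selected) == n_groups:
            break
        if rank(selected + [vec]) == len(selected) + 1:
            selected.append(vec)
    return selected

# Exponent vectors in variable order (q_c, N_d, r_eff, LWC), simplest first.
candidates = [
    (1, 0, 0, 0),   # q_c
    (0, 1, 3, 0),   # N_d * r_eff^3
    (1, 1, 3, 0),   # q_c * N_d * r_eff^3, the product of the first two
]

selected = select_independent(candidates, n_groups=2)
print(selected)

# Adding the third candidate does not raise the rank, so it is redundant.
print(rank(candidates))
```

Because the third vector is the sum of the first two, it carries no new information: any function of it can already be written in terms of the two selected groups.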

In [None]:
# ==============================================================================
# BUCKINGHAM PI ANALYZER CLASS
# ==============================================================================

class BuckinghamPiAnalyzer:
    """
    Buckingham Pi Dimensional Analysis.
    
    Reduces n variables with k independent base units (k = rank of the
    dimensional matrix) to (n-k) dimensionless groups.
    Automatically selects simplest integer exponent combinations.
    
    The algorithm finds all integer solutions to the null space of the
    dimensional matrix, scores them by complexity, and selects the
    simplest linearly independent set.
    
    Reference:
        Buckingham, E. (1914). On physically similar systems; illustrations
        of the use of dimensional equations. Physical Review, 4(4), 345.
    
    Attributes
    ----------
    max_exponent : int
        Maximum absolute value of exponents to search (default: 4)
    require_confirmation : bool
        If True, pause for user approval of selected groups
    analysis_results : Optional[Dict]
        Stored results from analyze() call
    selected_groups : Optional[List[np.ndarray]]
        Auto-selected simplest groups
    
    Methods
    -------
    analyze(variable_dimensions: Dict[str, List[float]]) -> Dict
        Perform complete Buckingham pi analysis
    transform_data(X: np.ndarray, var_names: List[str]) -> Tuple[np.ndarray, List[str]]
        Transform raw data to dimensionless pi-groups
    get_pi_group_expressions() -> List[str]
        Get human-readable pi-group expressions
    print_analysis_report() -> None
        Print detailed analysis report in v4.1 format
    
    Examples
    --------
    >>> analyzer = BuckinghamPiAnalyzer(max_exponent=4)
    >>> var_dims = {
    ...     'L': [0, 1, 0, 0],   # Length: m
    ...     'm': [1, 0, 0, 0],   # Mass: kg
    ...     'g': [0, 1, -2, 0],  # Acceleration: m/s^2
    ...     'T': [0, 0, 1, 0],   # Time: s
    ... }
    >>> result = analyzer.analyze(var_dims)
    >>> print(result['pi_group_names'])
    ['L / (g * T^2)']  # Pendulum: normalized reciprocal of T^2 * g / L
    """
    
    def __init__(
        self, 
        max_exponent: int = DEFAULT_MAX_EXPONENT, 
        require_confirmation: bool = False
    ):
        """
        Initialize BuckinghamPiAnalyzer.
        
        Parameters
        ----------
        max_exponent : int
            Maximum absolute value of exponents to search.
            Larger values find more candidates but take longer.
            Default: 4 (searches -4 to +4)
        require_confirmation : bool
            If True, print candidates and wait for user confirmation.
            Default: False (automatic selection)
        """
        self.max_exponent = max_exponent
        self.require_confirmation = require_confirmation
        
        # Public state (v4.1 naming convention)
        self.analysis_results = None
        self.selected_groups = None
        
        # Internal state (private, prefixed with underscore)
        self._dim_matrix = None
        self._var_names = None
        self._n_vars = 0
        self._n_dims = 0
        self._rank = 0
        self._n_pi_groups = 0
        self._all_candidates = []
        self._complexity_scores = []
        self._analysis_complete = False
    
    def analyze(self, variable_dimensions: Dict[str, List[float]]) -> Dict[str, Any]:
        """
        Perform complete Buckingham pi analysis.
        
        Parameters
        ----------
        variable_dimensions : Dict[str, List[float]]
            Dictionary mapping variable names to their dimensional exponents
            [M, L, T, Theta] where M=mass, L=length, T=time, Theta=temperature.
            Example: {'velocity': [0, 1, -1, 0]}  # m/s
        
        Returns
        -------
        Dict[str, Any]
            Dictionary containing:
            - n_variables: Number of input variables
            - n_base_units: Rank of dimensional matrix (k)
            - n_pi_groups: Number of dimensionless groups (n-k)
            - all_candidates: All valid pi-group candidates with complexity scores
            - selected_groups: Auto-selected simplest groups (exponent vectors)
            - transformation_matrix: Matrix to transform to pi-groups (n_groups x n_vars)
            - pi_group_names: Human-readable names for each pi-group
            - complexity_scores: Complexity score for each candidate
            - variable_names: Ordered list of variable names
        """
        # Store variable names in order
        self._var_names = list(variable_dimensions.keys())
        self._n_vars = len(self._var_names)
        
        # Construct dimensional matrix
        self._dim_matrix = self._construct_dimension_matrix(variable_dimensions)
        self._n_dims = self._dim_matrix.shape[0]
        
        # Compute rank and number of pi-groups
        self._rank = np.linalg.matrix_rank(self._dim_matrix)
        self._n_pi_groups = self._n_vars - self._rank
        
        # Handle special cases
        if self._n_pi_groups <= 0:
            # All variables are dimensionally independent
            self._analysis_complete = True
            self.analysis_results = {
                'n_variables': self._n_vars,
                'n_base_units': self._rank,
                'n_pi_groups': 0,
                'all_candidates': [],
                'selected_groups': [],
                'transformation_matrix': np.array([]),
                'pi_group_names': [],
                'complexity_scores': [],
                'variable_names': self._var_names,
                'message': 'All variables are dimensionally independent.'
            }
            self.selected_groups = []
            return self.analysis_results
        
        # Find all candidate null vectors
        self._all_candidates = self._enumerate_null_vectors(self._dim_matrix)
        
        # Compute complexity scores
        self._complexity_scores = [
            self._compute_complexity(vec) for vec in self._all_candidates
        ]
        
        # Sort candidates by complexity
        sorted_indices = np.argsort(self._complexity_scores)
        sorted_candidates = [self._all_candidates[i] for i in sorted_indices]
        
        # Select linearly independent groups
        self.selected_groups = self._select_independent(
            sorted_candidates, self._n_pi_groups
        )
        
        self._analysis_complete = True
        
        # Build all_candidates with complexity info (for reporting)
        candidates_with_info = []
        for i, (vec, score) in enumerate(zip(self._all_candidates, self._complexity_scores)):
            is_selected = any(np.allclose(vec, sel) for sel in self.selected_groups)
            candidates_with_info.append({
                'exponents': vec,
                'complexity': score,
                'expression': self._format_pi_group(vec, self._var_names),
                'selected': is_selected
            })
        
        # Sort by complexity for output
        candidates_with_info.sort(key=lambda x: x['complexity'])
        
        # Build transformation matrix
        transformation_matrix = np.array(self.selected_groups) if self.selected_groups else np.array([])
        
        self.analysis_results = {
            'n_variables': self._n_vars,
            'n_base_units': self._rank,
            'n_pi_groups': self._n_pi_groups,
            'all_candidates': candidates_with_info,
            'selected_groups': self.selected_groups,
            'transformation_matrix': transformation_matrix,
            'pi_group_names': self.get_pi_group_expressions(),
            # Same order as 'all_candidates' (sorted by complexity)
            'complexity_scores': [c['complexity'] for c in candidates_with_info],
            'variable_names': self._var_names,
        }
        
        return self.analysis_results
    
    def _construct_dimension_matrix(self, var_dims: Dict[str, List[float]]) -> np.ndarray:
        """
        Construct the dimensional matrix D.
        
        D[i,j] = exponent of base unit i in variable j.
        
        Parameters
        ----------
        var_dims : Dict[str, List[float]]
            Variable dimensions dictionary
            
        Returns
        -------
        np.ndarray
            Dimensional matrix D (k x n)
        """
        n_vars = len(var_dims)
        n_dims = len(list(var_dims.values())[0])
        
        D = np.zeros((n_dims, n_vars))
        for j, var_name in enumerate(self._var_names):
            D[:, j] = var_dims[var_name]
        
        return D
    
    def _compute_complexity(self, exponents: np.ndarray) -> float:
        """
        Compute complexity score for ranking candidates.
        
        complexity(e) = 10 * (# nonzero) + 1 * (sum|e|) + 0.1 * max(|e|)
        
        Lower is simpler (preferred).
        
        Parameters
        ----------
        exponents : np.ndarray
            Exponent vector
            
        Returns
        -------
        float
            Complexity score
        """
        n_nonzero = np.count_nonzero(exponents)
        sum_abs = np.sum(np.abs(exponents))
        max_abs = np.max(np.abs(exponents)) if len(exponents) > 0 else 0
        
        return 10 * n_nonzero + 1 * sum_abs + 0.1 * max_abs
    
    def _enumerate_null_vectors(self, dim_matrix: np.ndarray) -> List[np.ndarray]:
        """
        Enumerate all integer null vectors within bounds.
        
        Parameters
        ----------
        dim_matrix : np.ndarray
            Dimensional matrix D
            
        Returns
        -------
        List[np.ndarray]
            List of valid null vectors (each is n_vars dimensional)
        """
        candidates = []
        
        # Generate all possible exponent combinations
        exponent_range = range(-self.max_exponent, self.max_exponent + 1)
        
        for exponents in itertools.product(exponent_range, repeat=self._n_vars):
            vec = np.array(exponents, dtype=float)
            
            # Skip zero vector
            if np.all(vec == 0):
                continue
            
            # Check if it is in the null space (D @ vec = 0)
            if np.allclose(dim_matrix @ vec, 0, atol=1e-10):
                # Normalize to avoid duplicates (GCD normalization)
                normalized = self._normalize_vector(vec)
                
                # Check if already in candidates
                is_duplicate = False
                for existing in candidates:
                    if np.allclose(normalized, existing) or np.allclose(-normalized, existing):
                        is_duplicate = True
                        break
                
                if not is_duplicate:
                    candidates.append(normalized)
        
        return candidates
    
    def _normalize_vector(self, vec: np.ndarray) -> np.ndarray:
        """
        Normalize vector by GCD and ensure first nonzero element is positive.
        
        Parameters
        ----------
        vec : np.ndarray
            Input vector
            
        Returns
        -------
        np.ndarray
            Normalized vector
        """
        # Convert to integers
        int_vec = vec.astype(int)
        
        # Compute GCD of all nonzero elements
        nonzero_vals = [abs(x) for x in int_vec if x != 0]
        if len(nonzero_vals) == 0:
            return vec
        
        g = reduce(gcd, nonzero_vals)
        normalized = int_vec // g
        
        # Make first nonzero element positive
        first_nonzero = next((i for i, x in enumerate(normalized) if x != 0), None)
        if first_nonzero is not None and normalized[first_nonzero] < 0:
            normalized = -normalized
        
        return normalized.astype(float)
    
    def _select_independent(
        self, 
        candidates: List[np.ndarray], 
        n_groups: int
    ) -> List[np.ndarray]:
        """
        Select n_groups linearly independent candidates.
        
        Candidates should already be sorted by complexity (simplest first).
        
        Parameters
        ----------
        candidates : List[np.ndarray]
            Sorted list of candidate vectors
        n_groups : int
            Number of groups to select
            
        Returns
        -------
        List[np.ndarray]
            Selected linearly independent vectors
        """
        selected = []
        
        for vec in candidates:
            if len(selected) >= n_groups:
                break
            
            if self._is_linearly_independent(vec, selected):
                selected.append(vec)
        
        return selected
    
    def _is_linearly_independent(
        self, 
        vec: np.ndarray, 
        existing: List[np.ndarray]
    ) -> bool:
        """
        Check if vec is linearly independent from existing vectors.
        
        Parameters
        ----------
        vec : np.ndarray
            Vector to test
        existing : List[np.ndarray]
            List of existing vectors
            
        Returns
        -------
        bool
            True if linearly independent
        """
        if len(existing) == 0:
            return True
        
        # Stack vectors and check rank
        matrix = np.column_stack(existing + [vec])
        rank = np.linalg.matrix_rank(matrix)
        
        return rank == len(existing) + 1
    
    def _format_pi_group(self, exponents: np.ndarray, var_names: List[str]) -> str:
        """
        Format pi-group as human-readable string.
        
        Parameters
        ----------
        exponents : np.ndarray
            Exponent vector
        var_names : List[str]
            Variable names
            
        Returns
        -------
        str
            Human-readable expression
        """
        numerator_parts = []
        denominator_parts = []
        
        for j, exp in enumerate(exponents):
            if exp == 0:
                continue
            
            var_name = var_names[j]
            abs_exp = abs(int(exp))
            
            if abs_exp == 1:
                term = var_name
            else:
                term = f"{var_name}^{abs_exp}"
            
            if exp > 0:
                numerator_parts.append(term)
            else:
                denominator_parts.append(term)
        
        # Construct expression
        if numerator_parts and denominator_parts:
            if len(denominator_parts) == 1:
                expr = " * ".join(numerator_parts) + " / " + denominator_parts[0]
            else:
                expr = " * ".join(numerator_parts) + " / (" + " * ".join(denominator_parts) + ")"
        elif numerator_parts:
            expr = " * ".join(numerator_parts)
        elif denominator_parts:
            if len(denominator_parts) == 1:
                expr = "1 / " + denominator_parts[0]
            else:
                expr = "1 / (" + " * ".join(denominator_parts) + ")"
        else:
            expr = "1"
        
        return expr
    
    def transform_data(
        self, 
        X: np.ndarray, 
        var_names: List[str]
    ) -> Tuple[np.ndarray, List[str]]:
        """
        Transform raw data to dimensionless pi-groups.
        
        Parameters
        ----------
        X : np.ndarray
            Raw data matrix (n_samples, n_variables)
        var_names : List[str]
            Variable names matching columns of X
            
        Returns
        -------
        X_transformed : np.ndarray
            Transformed data (n_samples, n_pi_groups)
        pi_names : List[str]
            Names of pi-groups
        """
        if not self._analysis_complete:
            raise RuntimeError("Must call analyze() before transform_data()")
        
        if self.selected_groups is None or len(self.selected_groups) == 0:
            return X, var_names
        
        # Build mapping from var_names to column indices
        name_to_idx = {name: i for i, name in enumerate(var_names)}
        
        # Compute pi-groups
        n_samples = X.shape[0]
        n_groups = len(self.selected_groups)
        X_transformed = np.ones((n_samples, n_groups))
        
        for g, exponents in enumerate(self.selected_groups):
            for j, var_name in enumerate(self._var_names):
                if var_name in name_to_idx:
                    col_idx = name_to_idx[var_name]
                    exp = exponents[j]
                    if exp != 0:
                        X_transformed[:, g] *= safe_power(X[:, col_idx], exp)
        
        pi_names = self.get_pi_group_expressions()
        
        return X_transformed, pi_names
    
    def get_pi_group_expressions(self) -> List[str]:
        """
        Get human-readable expressions for selected pi-groups.
        
        Returns
        -------
        List[str]
            List of pi-group expressions
        """
        if not self._analysis_complete or self.selected_groups is None:
            return []
        
        return [self._format_pi_group(group, self._var_names) for group in self.selected_groups]
    
    def print_analysis_report(self) -> None:
        """
        Print detailed analysis report in v4.1 format.
        
        Expected output format:
            === Buckingham Pi Analysis Results ===
            Variables: 4, Rank: 2, Pi-groups needed: 2
            
            All candidates (sorted by complexity):
              [0] q_c                    complexity=11.1  [AUTO-SELECTED]
              [1] N_d * r_eff^3          complexity=24.3  [AUTO-SELECTED]
              ...
            
            Selected Pi-groups:
              Pi_1 = q_c
              Pi_2 = N_d * r_eff^3
        """
        if not self._analysis_complete:
            print("Analysis not yet performed. Call analyze() first.")
            return
        
        print("=" * 50)
        print("=== Buckingham Pi Analysis Results ===")
        print("=" * 50)
        print(f"Variables: {self._n_vars}, Rank: {self._rank}, Pi-groups needed: {self._n_pi_groups}")
        print()
        
        # Print dimensional matrix info
        print("Dimensional Matrix D:")
        print(f"  Variables: {self._var_names}")
        print("  Units: [M, L, T, Theta]")
        for i, row in enumerate(self._dim_matrix):
            unit_name = ['M', 'L', 'T', 'Theta'][i] if i < 4 else f'U{i}'
            print(f"  {unit_name}: {row}")
        print()
        
        # Print all candidates sorted by complexity
        if self.analysis_results and 'all_candidates' in self.analysis_results:
            candidates = self.analysis_results['all_candidates']
            print(f"All candidates (sorted by complexity): {len(candidates)} found")
            for i, cand in enumerate(candidates[:10]):  # Show top 10
                selected_tag = "  [AUTO-SELECTED]" if cand['selected'] else ""
                print(f"  [{i}] {cand['expression']:30s} complexity={cand['complexity']:.1f}{selected_tag}")
            if len(candidates) > 10:
                print(f"  ... and {len(candidates) - 10} more candidates")
            print()
        
        # Print selected groups
        print("Selected Pi-groups:")
        pi_names = self.get_pi_group_expressions()
        for i, name in enumerate(pi_names):
            print(f"  Pi_{i+1} = {name}")
        print()

print("BuckinghamPiAnalyzer class defined.")

---
## Section 3: Internal Tests
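
Before the tests, it may help to see the invariant that Test 4 checks in isolation. For an ideal pendulum, $T = 2\pi\sqrt{L/g}$, so the group $L/(g\,T^2)$ should equal $1/(4\pi^2) \approx 0.0253$ for every sample regardless of mass. The sketch below verifies this with the standard library only; `pi_group` is a hypothetical helper that uses the plain `**` operator rather than the `safe_power` function from `00_Core.ipynb`, whose behavior is not shown here.

```python
import math
import random

random.seed(0)

def pi_group(values, exponents):
    """Product of powers: prod_i values[i] ** exponents[i]."""
    result = 1.0
    for v, e in zip(values, exponents):
        if e != 0:
            result *= v ** e
    return result

# Exponents for L / (g * T^2) in variable order (L, m, g, T).
exponents = (1, 0, -1, -2)

pi_values = []
for _ in range(100):
    L = random.uniform(0.1, 2.0)
    m = random.uniform(0.01, 1.0)       # has exponent 0, so it is ignored
    g = random.uniform(9.7, 10.0)
    T = 2 * math.pi * math.sqrt(L / g)  # ideal pendulum period
    pi_values.append(pi_group((L, m, g, T), exponents))

expected = 1 / (4 * math.pi ** 2)
max_rel_err = max(abs(p - expected) / expected for p in pi_values)
print(f"expected = {expected:.6f}, max relative error = {max_rel_err:.2e}")
```

If the reciprocal group $g\,T^2/L$ were selected instead, the same check would apply with the constant $4\pi^2$.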

In [None]:
# ==============================================================================
# INTERNAL TEST FLAG
# ==============================================================================

# Set to True to run the internal tests below when executing this notebook directly.
# Keep it False so the tests are skipped when the notebook is loaded via %run.
_RUN_TESTS = False

print(f"Internal tests enabled: {_RUN_TESTS}")

In [None]:
# ==============================================================================
# TEST 1: Basic Analysis (All Dimensionless Variables)
# ==============================================================================

if _RUN_TESTS:
    print_section_header("Test 1: All Dimensionless Variables")
    
    # All variables are dimensionless (e.g., mixing ratios)
    all_dimensionless = {
        'ratio1': [0, 0, 0, 0],
        'ratio2': [0, 0, 0, 0],
        'ratio3': [0, 0, 0, 0],
    }
    
    analyzer = BuckinghamPiAnalyzer(max_exponent=4)
    result = analyzer.analyze(all_dimensionless)
    
    print(f"n_variables: {result['n_variables']}")
    print(f"n_base_units: {result['n_base_units']}")
    print(f"n_pi_groups: {result['n_pi_groups']}")
    print(f"pi_group_names: {result['pi_group_names']}")

In [None]:
# ==============================================================================
# TEST 2: Classic Pendulum Example
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 2: Classic Pendulum Example")
    
    pendulum_dims = {
        'L': [0, 1, 0, 0],      # Length: m
        'm': [1, 0, 0, 0],      # Mass: kg
        'g': [0, 1, -2, 0],     # Acceleration: m/s^2
        'T': [0, 0, 1, 0],      # Time: s
    }
    
    analyzer = BuckinghamPiAnalyzer(max_exponent=4)
    result = analyzer.analyze(pendulum_dims)
    
    print(f"n_variables: {result['n_variables']}")
    print(f"n_base_units (rank): {result['n_base_units']}")
    print(f"n_pi_groups: {result['n_pi_groups']}")
    print()
    print("Selected pi-groups:")
    for i, (exp, name) in enumerate(zip(result['selected_groups'], result['pi_group_names'])):
        print(f"  pi_{i+1} = {name}")
        print(f"       Exponents: {exp}")
    print()
    print("Expected: L / (g * T^2), the normalized reciprocal of T^2 * g / L")

In [None]:
# ==============================================================================
# TEST 3: Warm Rain Microphysics
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 3: Warm Rain Microphysics")
    
    warm_rain_dims = {
        'q_c':   [0, 0, 0, 0],     # kg/kg (dimensionless)
        'N_d':   [0, -3, 0, 0],    # m^-3
        'r_eff': [0, 1, 0, 0],     # m
        'LWC':   [1, -3, 0, 0],    # kg/m^3
    }
    
    analyzer = BuckinghamPiAnalyzer(max_exponent=4)
    result = analyzer.analyze(warm_rain_dims)
    
    print(f"n_variables: {result['n_variables']}")
    print(f"n_base_units (rank): {result['n_base_units']}")
    print(f"n_pi_groups: {result['n_pi_groups']}")
    print()
    print(f"Found {len(result['all_candidates'])} candidate pi-groups")
    print()
    print("Selected pi-groups:")
    for i, name in enumerate(result['pi_group_names']):
        print(f"  pi_{i+1} = {name}")

In [None]:
# ==============================================================================
# TEST 4: Data Transformation
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 4: Data Transformation")
    
    # Generate pendulum data
    np.random.seed(42)
    n = 100
    L = np.random.uniform(0.1, 2.0, n)
    m = np.random.uniform(0.01, 1.0, n)  # Should not appear in result
    g = np.random.uniform(9.7, 10.0, n)
    T = 2 * np.pi * np.sqrt(L / g)  # True period
    
    X = np.column_stack([L, m, g, T])
    feature_names = ['L', 'm', 'g', 'T']
    
    print("Generated pendulum data:")
    print(f"  n_samples: {n}")
    print(f"  Features: {feature_names}")
    print(f"  True relation: T = 2*pi*sqrt(L/g)")
    print()
    
    # Analyze and transform
    pendulum_dims = {
        'L': [0, 1, 0, 0],
        'm': [1, 0, 0, 0],
        'g': [0, 1, -2, 0],
        'T': [0, 0, 1, 0],
    }
    
    analyzer = BuckinghamPiAnalyzer(max_exponent=4)
    result = analyzer.analyze(pendulum_dims)
    
    print("Selected pi-groups:")
    for i, name in enumerate(result['pi_group_names']):
        print(f"  pi_{i+1} = {name}")
    print()
    
    # Transform data
    X_transformed, pi_names = analyzer.transform_data(X, feature_names)
    
    print(f"Transformed data shape: {X_transformed.shape}")
    print(f"Pi-group names: {pi_names}")
    print()
    
    # Verify: the selected group is dimensionless, so it must be constant.
    # Sign normalization yields L / (g * T^2) = 1 / (4*pi^2); the reciprocal
    # form T^2 * g / L would instead equal 4*pi^2.
    print("Verification:")
    print("  pi should be constant: 1/(4*pi^2) for L / (g * T^2), or 4*pi^2 for T^2 * g / L")
    
    for i in range(X_transformed.shape[1]):
        pi_values = X_transformed[:, i]
        print(f"  pi_{i+1}: mean = {np.mean(pi_values):.6f}, std = {np.std(pi_values):.6f}")
        
        # Check if close to constant
        if np.std(pi_values) / np.abs(np.mean(pi_values)) < 1e-10:
            print(f"       -> Constant! Value = {np.mean(pi_values):.6f}")
            print(f"       -> 1/(4*pi^2) = {1 / (4 * np.pi**2):.6f}, 4*pi^2 = {4 * np.pi**2:.6f}")

In [None]:
# ==============================================================================
# TEST 5: Full Report Output
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 5: Full Analysis Report")
    
    # Use warm rain example for full report
    warm_rain_dims = {
        'q_c':   [0, 0, 0, 0],     # kg/kg (dimensionless)
        'N_d':   [0, -3, 0, 0],    # m^-3
        'r_eff': [0, 1, 0, 0],     # m
        'LWC':   [1, -3, 0, 0],    # kg/m^3
    }
    
    analyzer = BuckinghamPiAnalyzer(max_exponent=4)
    result = analyzer.analyze(warm_rain_dims)
    
    # Print full report
    analyzer.print_analysis_report()

In [None]:
# ==============================================================================
# TEST 6: Integration with UserInputs
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 6: Integration with UserInputs")
    
    # Generate warm rain data with UserInputs
    X, y, feature_names, user_inputs = generate_warm_rain_data(
        n_samples=500, noise_level=0.01, seed=42
    )
    
    print(f"Generated data: X.shape = {X.shape}")
    print(f"Feature names: {feature_names}")
    print(f"UserInputs variable dimensions:")
    for var, dims in user_inputs.variable_dimensions.items():
        print(f"  {var}: {dims}")
    print()
    
    # Run Buckingham Pi analysis using UserInputs
    analyzer = BuckinghamPiAnalyzer(max_exponent=4)
    result = analyzer.analyze(user_inputs.variable_dimensions)
    
    print(f"Pi-groups needed: {result['n_pi_groups']}")
    print("Selected pi-groups:")
    for i, name in enumerate(result['pi_group_names']):
        print(f"  pi_{i+1} = {name}")
    print()
    
    # Transform data
    X_dimless, pi_names = analyzer.transform_data(X, feature_names)
    print(f"Transformed data shape: {X_dimless.shape}")
    print(f"Sample pi-group values (first 5 rows):")
    for i in range(min(5, X_dimless.shape[0])):
        print(f"  {X_dimless[i, :]}")

---
## Section 4: Module Summary

In [None]:
# ==============================================================================
# MODULE SUMMARY
# ==============================================================================

print("=" * 70)
print(" 01_BuckinghamPi.ipynb v4.1 - Module Summary")
print("=" * 70)
print()
print("CLASS: BuckinghamPiAnalyzer")
print("-" * 70)
print()
print("Purpose:")
print("  Reduce n variables with k base units to (n-k) dimensionless groups.")
print("  Automatically selects the simplest linearly independent combinations.")
print()
print("Main Methods:")
print("  analyze(variable_dimensions)")
print("      Perform complete Buckingham Pi analysis")
print("      Returns: dict with pi-groups, exponents, and all candidates")
print()
print("  transform_data(X, feature_names)")
print("      Transform data from original variables to pi-groups")
print("      Returns: (X_transformed, pi_names)")
print()
print("  get_pi_group_expressions()")
print("      Get human-readable pi-group expressions")
print()
print("  print_analysis_report()")
print("      Print detailed analysis report")
print()
print("Usage Example:")
print("-" * 70)
print("""
# Define variable dimensions [M, L, T, Theta]
var_dims = {
    'L': [0, 1, 0, 0],      # Length: m
    'g': [0, 1, -2, 0],     # Acceleration: m/s^2
    'T': [0, 0, 1, 0],      # Time: s
}

# Create analyzer and run analysis
analyzer = BuckinghamPiAnalyzer(max_exponent=4)
result = analyzer.analyze(var_dims)

# Print selected pi-groups
for name in result['pi_group_names']:
    print(name)

# Transform data to dimensionless form
# (X is an (n_samples, 3) array whose columns match feature_names = ['L', 'g', 'T'])
X_dimless, pi_names = analyzer.transform_data(X, feature_names)
""")
print()
print("=" * 70)
print("Module loaded successfully. Import via: %run 01_BuckinghamPi.ipynb")
print("=" * 70)