# Replication of Tables 1A and 1B - Smets & Wouters (2007)

This notebook replicates **Tables 1A and 1B** from:

> Smets, F. & Wouters, R. (2007). "Shocks and Frictions in US Business Cycles: A Bayesian DSGE Approach"
> *American Economic Review*, 97(3), 586-606.

- **Table 1A** (p.593): Prior and Posterior Distribution of Structural Parameters
- **Table 1B** (p.594): Prior and Posterior Distribution of Shock Processes

**Estimation Period**: 1966Q1 - 2004Q4

**Note on Posterior Statistics**:
- **Mode**: Available with `mh_replic=0` (current setting, fast)
- **Mean & Intervals**: Require MCMC sampling with `mh_replic>0` (slower, ~250,000 draws)

## 1. Setup and Configuration

In [1]:
# Imports
import numpy as np
import pandas as pd
import os
import sys
from pathlib import Path

# Add parent directory to path
sys.path.append(str(Path.cwd().parent.parent))

from direct_replication import DynareInterface

pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

In [2]:
# Configure paths - MODIFY FOR YOUR INSTALLATION
import os
os.environ['OCTAVE_EXECUTABLE'] = r'C:\Program Files\GNU Octave\Octave-10.3.0\mingw64\bin\octave-cli.exe'

DYNARE_PATH = r'C:\dynare\6.5\matlab'
REPO_PATH = Path.cwd().parent.parent / 'repo'
MODEL_PATH = Path.cwd().parent / 'model'

print(f"Octave executable: {os.environ['OCTAVE_EXECUTABLE']}")
print(f"Dynare path: {DYNARE_PATH}")
print(f"Model path: {MODEL_PATH}")
print(f"Model exists: {MODEL_PATH.exists()}")

Octave executable: C:\Program Files\GNU Octave\Octave-10.3.0\mingw64\bin\octave-cli.exe
Dynare path: C:\dynare\6.5\matlab
Model path: c:\Users\HP\OneDrive\Escritorio\David Guzzi\Github\MECTMT11\direct_replication\model
Model exists: True


## 2. Reference Values from Paper

These are the exact values from Tables 1A and 1B of Smets & Wouters (2007).

**Note on Priors**: Prior distributions are inputs specified by the researcher, not estimated outputs.

In [3]:
# =============================================================================
# TABLE 1A - STRUCTURAL PARAMETERS
# Prior and Posterior Distribution of Structural Parameters
# =============================================================================

TABLE_1A_REFERENCE = {
    # Parameter: (Dynare name, Prior Distr, Prior Mean, Prior SD, Post Mode, Post Mean, Post 5%, Post 95%)
    'phi':       ('csadjcost', 'Normal',  4.00, 1.50, 5.48, 5.74, 3.97, 7.42),   # Investment adjustment cost
    'sigma_c':   ('csigma',    'Normal',  1.50, 0.37, 1.39, 1.38, 1.16, 1.59),   # Risk aversion
    'h':         ('chabb',     'Beta',    0.70, 0.10, 0.71, 0.71, 0.64, 0.78),   # Habit formation
    'xi_w':      ('cprobw',    'Beta',    0.50, 0.10, 0.73, 0.70, 0.60, 0.81),   # Wage Calvo probability
    'sigma_l':   ('csigl',     'Normal',  2.00, 0.75, 1.92, 1.83, 0.91, 2.78),   # Labor supply elasticity
    'xi_p':      ('cprobp',    'Beta',    0.50, 0.10, 0.65, 0.66, 0.56, 0.74),   # Price Calvo probability
    'iota_w':    ('cindw',     'Beta',    0.50, 0.15, 0.59, 0.58, 0.38, 0.78),   # Wage indexation
    'iota_p':    ('cindp',     'Beta',    0.50, 0.15, 0.22, 0.24, 0.10, 0.38),   # Price indexation
    'psi':       ('czcap',     'Beta',    0.50, 0.15, 0.54, 0.54, 0.36, 0.72),   # Capacity utilization
    'Phi':       ('cfc',       'Normal',  1.25, 0.12, 1.61, 1.60, 1.48, 1.73),   # Fixed cost
    'r_pi':      ('crpi',      'Normal',  1.50, 0.25, 2.03, 2.04, 1.74, 2.33),   # Taylor rule: inflation
    'rho':       ('crr',       'Beta',    0.75, 0.10, 0.81, 0.81, 0.77, 0.85),   # Taylor rule: smoothing
    'r_y':       ('cry',       'Normal',  0.12, 0.05, 0.08, 0.08, 0.05, 0.12),   # Taylor rule: output gap
    'r_Delta_y': ('crdy',      'Normal',  0.12, 0.05, 0.22, 0.22, 0.18, 0.27),   # Taylor rule: output growth
    'pi_bar':    ('constepinf','Gamma',   0.62, 0.10, 0.81, 0.78, 0.61, 0.96),   # SS quarterly inflation
    'beta_const':('constebeta','Gamma',   0.25, 0.10, 0.16, 0.16, 0.07, 0.26),   # 100(beta^-1 - 1)
    'l_bar':     ('constelab', 'Normal',  0.00, 2.00, -0.10, 0.53, -1.30, 2.32), # SS hours
    'gamma_bar': ('ctrend',    'Normal',  0.40, 0.10, 0.43, 0.43, 0.40, 0.45),   # Trend growth
    'alpha':     ('calfa',     'Normal',  0.30, 0.05, 0.19, 0.19, 0.16, 0.21),   # Capital share
}

print(f"Table 1A: {len(TABLE_1A_REFERENCE)} structural parameters defined")

Table 1A: 19 structural parameters defined


In [4]:
# =============================================================================
# TABLE 1B - SHOCK PROCESSES
# Prior and Posterior Distribution of Shock Processes
# =============================================================================

# Mapping from paper notation to Dynare shock names for std devs
# These are extracted from oo_.posterior_mode.shocks_std
SHOCK_STD_MAPPING = {
    'sigma_a': 'ea',      # Technology shock
    'sigma_b': 'eb',      # Preference shock  
    'sigma_g': 'eg',      # Government spending shock
    'sigma_I': 'eqs',     # Investment shock
    'sigma_r': 'em',      # Monetary policy shock
    'sigma_p': 'epinf',   # Price markup shock
    'sigma_w': 'ew',      # Wage markup shock
}

TABLE_1B_REFERENCE = {
    # Shock standard deviations
    # Parameter: (Dynare param name, Prior Distr, Prior Mean, Prior SD, Post Mode, Post Mean, Post 5%, Post 95%)
    'sigma_a':   ('ea',        'InvGamma', 0.10, 2.00, 0.45, 0.45, 0.41, 0.50),  # Technology shock
    'sigma_b':   ('eb',        'InvGamma', 0.10, 2.00, 0.24, 0.23, 0.19, 0.27),  # Preference shock
    'sigma_g':   ('eg',        'InvGamma', 0.10, 2.00, 0.52, 0.53, 0.48, 0.58),  # Government spending shock
    'sigma_I':   ('eqs',       'InvGamma', 0.10, 2.00, 0.45, 0.45, 0.37, 0.53),  # Investment shock
    'sigma_r':   ('em',        'InvGamma', 0.10, 2.00, 0.24, 0.24, 0.22, 0.27),  # Monetary policy shock
    'sigma_p':   ('epinf',     'InvGamma', 0.10, 2.00, 0.14, 0.14, 0.11, 0.16),  # Price markup shock
    'sigma_w':   ('ew',        'InvGamma', 0.10, 2.00, 0.24, 0.24, 0.20, 0.28),  # Wage markup shock
    
    # AR(1) persistence coefficients (these are regular parameters)
    'rho_a':     ('crhoa',     'Beta',     0.50, 0.20, 0.95, 0.95, 0.94, 0.97),  # Technology
    'rho_b':     ('crhob',     'Beta',     0.50, 0.20, 0.18, 0.22, 0.07, 0.36),  # Preference
    'rho_g':     ('crhog',     'Beta',     0.50, 0.20, 0.97, 0.97, 0.96, 0.99),  # Government spending
    'rho_I':     ('crhoqs',    'Beta',     0.50, 0.20, 0.71, 0.71, 0.61, 0.80),  # Investment
    'rho_r':     ('crhoms',    'Beta',     0.50, 0.20, 0.12, 0.15, 0.04, 0.24),  # Monetary policy
    'rho_p':     ('crhopinf',  'Beta',     0.50, 0.20, 0.90, 0.89, 0.80, 0.96),  # Price markup
    'rho_w':     ('crhow',     'Beta',     0.50, 0.20, 0.97, 0.96, 0.94, 0.99),  # Wage markup
    
    # MA coefficients
    'mu_p':      ('cmap',      'Beta',     0.50, 0.20, 0.74, 0.69, 0.54, 0.85),  # Price markup MA
    'mu_w':      ('cmaw',      'Beta',     0.50, 0.20, 0.88, 0.84, 0.75, 0.93),  # Wage markup MA
    
    # Cross-effect
    'rho_ga':    ('cgy',       'Beta',     0.50, 0.20, 0.52, 0.52, 0.37, 0.66),  # Govt spending on tech
}

print(f"Table 1B: {len(TABLE_1B_REFERENCE)} shock process parameters defined")

Table 1B: 17 shock process parameters defined


## 3. Execute Dynare and Extract Parameters

In [5]:
# Initialize Dynare interface
di = DynareInterface(DYNARE_PATH, str(MODEL_PATH))
print("Dynare interface initialized")

    _pyeval at line 57 column 10

    _pyeval at line 57 column 10

    _pyeval at line 57 column 10

Dynare interface initialized


In [None]:
# Run model
print("Running usmodel.mod...")
print("(This may take a few moments)\n")

di.run_model('usmodel.mod')

print("\nDynare execution completed")

Running usmodel.mod...
(This may take a few moments)


Step 1: Closing Octave session to release file locks...
Waiting for Windows to release file handles...

Step 2: Cleaning up directories...
Searching for directories to clean up...
No directories found to clean up.

Step 3: Starting fresh Octave session...
    _pyeval at line 57 column 10

    _pyeval at line 57 column 10

    _pyeval at line 57 column 10

✓ Octave session ready

Step 4: Running Dynare estimation...
Command: dynare usmodel nograph
(This may take several minutes...)

Starting Dynare (version 6.5).
Calling Dynare with arguments: nograph
Starting preprocessing of the model file ...
Found 40 equation(s).
Evaluating expressions...
Computing static model derivatives (order 1).
Normalizing the static model...
Finding the optimal block decomposition of the static model...
11 block(s) found:
  9 recursive block(s) and 2 simultaneous block(s).
  the largest simultaneous block has 13 equation(s)
                               

In [None]:
# Extract estimated parameters (posterior mode)
params_df = di.get_parameters()

# Convert to dictionary for easier access
estimated_params = dict(zip(params_df['parameter'], params_df['value']))

print(f"Extracted {len(estimated_params)} parameters from Dynare")
print("\nKey parameters:")
for param in ['csigma', 'chabb', 'cprobw', 'cprobp', 'crpi', 'crr', 'calfa']:
    if param in estimated_params:
        print(f"  {param}: {estimated_params[param]:.4f}")

Extracted 57 parameters from Dynare

Key parameters:
  csigma: 1.3456
  chabb: 0.7241
  cprobw: 0.6934
  cprobp: 0.6513
  crpi: 2.0318
  crr: 0.8167
  calfa: 0.2946


In [None]:
# =============================================================================
# EXTRACT SHOCK STANDARD DEVIATIONS
# These are stored separately in Dynare's estimation results
# =============================================================================

def extract_shock_std_devs(di):
    """
    Extract shock standard deviations from Dynare estimation results.
    
    These are stored in:
    - oo_.posterior_mode.shocks_std (after mode-finding)
    - Or from the diagonal of Sigma_e (shock covariance matrix)
    
    Returns:
        Dictionary mapping shock names to their estimated std devs
    """
    shock_stds = {}
    
    # Get shock names
    n_shocks = int(di.oc.eval('M_.exo_nbr', nout=1))
    shock_names = []
    for i in range(n_shocks):
        name = di.oc.eval(f'deblank(M_.exo_names{{{i+1}}})', nout=1)
        shock_names.append(str(name).strip())
    
    # Strategy 1: Try oo_.posterior_mode.shocks_std
    try:
        has_shocks_std = di.oc.eval(
            'isfield(oo_, "posterior_mode") && isfield(oo_.posterior_mode, "shocks_std")',
            nout=1
        )
        if has_shocks_std:
            # Use struct2array to convert Octave struct to array
            shocks_std = di.oc.eval('struct2array(oo_.posterior_mode.shocks_std)', nout=1)
            if hasattr(shocks_std, 'flatten'):
                shocks_std = shocks_std.flatten()
            for i, name in enumerate(shock_names):
                if i < len(shocks_std):
                    shock_stds[name] = float(shocks_std[i])
            print("Extracted shock std devs from oo_.posterior_mode.shocks_std")
            return shock_stds
    except Exception as e:
        print(f"Strategy 1 failed: {e}")
    
    # Strategy 2: Try sqrt of diagonal of Sigma_e
    try:
        has_sigma = di.oc.eval('isfield(M_, "Sigma_e")', nout=1)
        if has_sigma:
            sigma_e = di.oc.eval('M_.Sigma_e', nout=1)
            for i, name in enumerate(shock_names):
                shock_stds[name] = float(np.sqrt(sigma_e[i, i]))
            print("Extracted shock std devs from M_.Sigma_e diagonal")
            return shock_stds
    except Exception as e:
        print(f"Strategy 2 failed: {e}")
    
    # Strategy 3: Try estim_params_ structure
    try:
        # Check for estimated shock variances in estim_params_
        n_estim = int(di.oc.eval('size(estim_params_.var_exo, 1)', nout=1))
        if n_estim > 0:
            for i in range(n_estim):
                idx = int(di.oc.eval(f'estim_params_.var_exo({i+1}, 1)', nout=1))
                if idx <= len(shock_names):
                    shock_name = shock_names[idx - 1]
                    # Get the estimated value from bayestopt_
                    # The mode is stored in xparam1 after estimation
                    try:
                        mode_val = di.oc.eval(f'oo_.posterior_mode.parameters({i+1})', nout=1)
                        shock_stds[shock_name] = float(mode_val)
                    except:
                        pass
            if shock_stds:
                print("Extracted shock std devs from estim_params_")
                return shock_stds
    except Exception as e:
        print(f"Strategy 3 failed: {e}")
    
    print("Could not extract shock std devs - will use reference values")
    return shock_stds

# Extract shock standard deviations
shock_std_estimates = extract_shock_std_devs(di)

if shock_std_estimates:
    print("\nEstimated shock standard deviations:")
    for name, val in shock_std_estimates.items():
        print(f"  {name}: {val:.4f}")
else:
    print("\nNo shock std devs extracted - check Dynare output structure")

Extracted shock std devs from oo_.posterior_mode.shocks_std

Estimated shock standard deviations:
  ea: 0.4518
  eb: 0.2425
  eg: 0.5200
  eqs: 0.4501
  em: 0.2398
  epinf: 0.1411
  ew: 0.2444


In [None]:
# =============================================================================
# CHECK FOR MCMC RESULTS (for posterior mean and intervals)
# =============================================================================

def check_mcmc_results(di):
    """
    Check if MCMC results are available (posterior mean and intervals).
    These require running with mh_replic > 0.
    
    Returns:
        Tuple of (has_mcmc, mcmc_results_dict)
    """
    mcmc_results = {
        'available': False,
        'param_means': {},
        'param_intervals': {},
        'shock_means': {},
        'shock_intervals': {}
    }
    
    try:
        # Check if MCMC was run
        has_mcmc = di.oc.eval(
            'isfield(oo_, "posterior_mean") && isfield(oo_.posterior_mean, "parameters")',
            nout=1
        )
        
        if has_mcmc:
            mcmc_results['available'] = True
            print("MCMC results available - extracting posterior means and intervals")
            
            # Extract parameter means - use struct2array to convert Octave struct
            n_params = int(di.oc.eval('length(fieldnames(oo_.posterior_mean.parameters))', nout=1))
            param_names_est = []
            for i in range(n_params):
                name = di.oc.eval(f'deblank(M_.param_names{{estim_params_.param_vals({i+1},1)}})', nout=1)
                param_names_est.append(str(name).strip())
            
            means = di.oc.eval('struct2array(oo_.posterior_mean.parameters)', nout=1).flatten()
            for i, name in enumerate(param_names_est):
                if i < len(means):
                    mcmc_results['param_means'][name] = float(means[i])
            
            # Extract HPD intervals if available - use posterior_hpdinf/posterior_hpdsup (Dynare 6.x)
            try:
                has_hpd = di.oc.eval('isfield(oo_, "posterior_hpdinf") && isfield(oo_.posterior_hpdinf, "parameters")', nout=1)
                if has_hpd:
                    hpd_inf = di.oc.eval('struct2array(oo_.posterior_hpdinf.parameters)', nout=1).flatten()
                    hpd_sup = di.oc.eval('struct2array(oo_.posterior_hpdsup.parameters)', nout=1).flatten()
                    for i, name in enumerate(param_names_est):
                        if i < len(hpd_inf):
                            mcmc_results['param_intervals'][name] = (float(hpd_inf[i]), float(hpd_sup[i]))
                    print(f"Extracted HPD intervals for {len(mcmc_results['param_intervals'])} parameters")
            except Exception as e:
                print(f"Could not extract HPD intervals: {e}")
            
            # Extract shock std means if available - use struct2array
            try:
                has_shock_means = di.oc.eval('isfield(oo_.posterior_mean, "shocks_std")', nout=1)
                if has_shock_means:
                    shock_means = di.oc.eval('struct2array(oo_.posterior_mean.shocks_std)', nout=1).flatten()
                    n_shocks = int(di.oc.eval('M_.exo_nbr', nout=1))
                    for i in range(n_shocks):
                        name = di.oc.eval(f'deblank(M_.exo_names{{{i+1}}})', nout=1)
                        name = str(name).strip()
                        if i < len(shock_means):
                            mcmc_results['shock_means'][name] = float(shock_means[i])
            except Exception as e:
                print(f"Could not extract shock means: {e}")
            
            # Extract shock HPD intervals if available - use posterior_hpdinf/posterior_hpdsup
            try:
                has_shock_hpd = di.oc.eval('isfield(oo_.posterior_hpdinf, "shocks_std")', nout=1)
                if has_shock_hpd:
                    shock_hpd_inf = di.oc.eval('struct2array(oo_.posterior_hpdinf.shocks_std)', nout=1).flatten()
                    shock_hpd_sup = di.oc.eval('struct2array(oo_.posterior_hpdsup.shocks_std)', nout=1).flatten()
                    n_shocks = int(di.oc.eval('M_.exo_nbr', nout=1))
                    for i in range(n_shocks):
                        name = di.oc.eval(f'deblank(M_.exo_names{{{i+1}}})', nout=1)
                        name = str(name).strip()
                        if i < len(shock_hpd_inf):
                            mcmc_results['shock_intervals'][name] = (float(shock_hpd_inf[i]), float(shock_hpd_sup[i]))
            except Exception as e:
                print(f"Could not extract shock HPD intervals: {e}")
                
        else:
            print("MCMC results NOT available (mh_replic=0)")
            print("To get posterior mean and intervals, run with mh_replic=250000")
            
    except Exception as e:
        print(f"Error checking MCMC results: {e}")
    
    return mcmc_results

# Check for MCMC results
mcmc_results = check_mcmc_results(di)

MCMC results available - extracting posterior means and intervals
Extracted HPD intervals for 29 parameters


## 4. Generate Table 1A - Structural Parameters

In [None]:
def create_table_1A(reference_data, estimated_params, mcmc_results=None):
    """
    Create Table 1A comparing estimated vs reference values.
    
    Args:
        reference_data: Dictionary with reference values from paper
        estimated_params: Dictionary with estimated parameter values from Dynare
        mcmc_results: Optional MCMC results with means and intervals
    
    Returns:
        DataFrame with Table 1A structure
    """
    rows = []
    
    for param_symbol, values in reference_data.items():
        dynare_name, prior_distr, prior_mean, prior_sd, post_mode, post_mean, post_5, post_95 = values
        
        # Get estimated mode from Dynare
        estimated_mode = estimated_params.get(dynare_name, np.nan)
        
        # Get MCMC results if available
        estimated_mean = np.nan
        estimated_5 = np.nan
        estimated_95 = np.nan
        
        if mcmc_results and mcmc_results['available']:
            estimated_mean = mcmc_results['param_means'].get(dynare_name, np.nan)
            if dynare_name in mcmc_results['param_intervals']:
                estimated_5, estimated_95 = mcmc_results['param_intervals'][dynare_name]
        
        # Calculate difference from paper's posterior mode
        if not np.isnan(estimated_mode):
            diff_pct = ((estimated_mode - post_mode) / post_mode) * 100
        else:
            diff_pct = np.nan
        
        rows.append({
            'Parameter': param_symbol,
            'Dynare Name': dynare_name,
            'Prior Distr.': prior_distr,
            'Prior Mean': prior_mean,
            'Prior SD': prior_sd,
            'Post. Mode (Paper)': post_mode,
            'Post. Mode (Replicated)': estimated_mode,
            'Diff (%)': diff_pct,
            'Post. Mean (Paper)': post_mean,
            'Post. Mean (Replicated)': estimated_mean,
            'Post. 5% (Paper)': post_5,
            'Post. 5% (Replicated)': estimated_5,
            'Post. 95% (Paper)': post_95,
            'Post. 95% (Replicated)': estimated_95,
        })
    
    return pd.DataFrame(rows)

# Generate Table 1A
table_1A = create_table_1A(TABLE_1A_REFERENCE, estimated_params, mcmc_results)
print("Table 1A generated")

Table 1A generated


In [None]:
# Display Table 1A - Prior Distribution
print("="*90)
print("TABLE 1A - PRIOR AND POSTERIOR DISTRIBUTION OF STRUCTURAL PARAMETERS")
print("Smets & Wouters (2007), American Economic Review, Table 1A (p.593)")
print("="*90)
print("\n** PRIOR DISTRIBUTION **\n")

prior_cols = ['Parameter', 'Prior Distr.', 'Prior Mean', 'Prior SD']
print(table_1A[prior_cols].to_string(index=False))

TABLE 1A - PRIOR AND POSTERIOR DISTRIBUTION OF STRUCTURAL PARAMETERS
Smets & Wouters (2007), American Economic Review, Table 1A (p.593)

** PRIOR DISTRIBUTION **

 Parameter Prior Distr.  Prior Mean  Prior SD
       phi       Normal        4.00      1.50
   sigma_c       Normal        1.50      0.37
         h         Beta        0.70      0.10
      xi_w         Beta        0.50      0.10
   sigma_l       Normal        2.00      0.75
      xi_p         Beta        0.50      0.10
    iota_w         Beta        0.50      0.15
    iota_p         Beta        0.50      0.15
       psi         Beta        0.50      0.15
       Phi       Normal        1.25      0.12
      r_pi       Normal        1.50      0.25
       rho         Beta        0.75      0.10
       r_y       Normal        0.12      0.05
 r_Delta_y       Normal        0.12      0.05
    pi_bar        Gamma        0.62      0.10
beta_const        Gamma        0.25      0.10
     l_bar       Normal        0.00      2.00
 gamma_ba

In [None]:
# Display Table 1A - Posterior Distribution Comparison
print("\n** POSTERIOR DISTRIBUTION (Mode Comparison) **\n")

posterior_cols = ['Parameter', 'Post. Mode (Paper)', 'Post. Mode (Replicated)', 'Diff (%)']
print(table_1A[posterior_cols].to_string(index=False, float_format=lambda x: f'{x:.2f}' if pd.notna(x) else 'N/A'))


** POSTERIOR DISTRIBUTION (Mode Comparison) **

 Parameter  Post. Mode (Paper)  Post. Mode (Replicated)  Diff (%)
       phi                5.48                     5.47     -0.11
   sigma_c                1.39                     1.35     -3.19
         h                0.71                     0.72      1.99
      xi_w                0.73                     0.69     -5.01
   sigma_l                1.92                     1.58    -17.52
      xi_p                0.65                     0.65      0.19
    iota_w                0.59                     0.56     -5.19
    iota_p                0.22                     0.26     17.33
       psi                0.54                     0.46    -15.23
       Phi                1.61                     1.70      5.79
      r_pi                2.03                     2.03      0.09
       rho                0.81                     0.82      0.83
       r_y                0.08                     0.08      3.32
 r_Delta_y                0

In [None]:
# Display Table 1A - Full Posterior Comparison (if MCMC available)
if mcmc_results and mcmc_results['available']:
    print("\n** FULL POSTERIOR DISTRIBUTION (Mean and Intervals) **\n")
    full_cols = ['Parameter', 'Post. Mean (Paper)', 'Post. Mean (Replicated)', 
                 'Post. 5% (Paper)', 'Post. 5% (Replicated)',
                 'Post. 95% (Paper)', 'Post. 95% (Replicated)']
    print(table_1A[full_cols].to_string(index=False, float_format=lambda x: f'{x:.2f}' if pd.notna(x) else 'N/A'))
else:
    print("\n** POSTERIOR DISTRIBUTION (Full - from Paper only) **")
    print("Note: Run with mh_replic=250000 to replicate mean and intervals\n")
    full_posterior_cols = ['Parameter', 'Post. Mode (Paper)', 'Post. Mean (Paper)', 'Post. 5% (Paper)', 'Post. 95% (Paper)']
    print(table_1A[full_posterior_cols].to_string(index=False))


** FULL POSTERIOR DISTRIBUTION (Mean and Intervals) **

 Parameter  Post. Mean (Paper)  Post. Mean (Replicated)  Post. 5% (Paper)  Post. 5% (Replicated)  Post. 95% (Paper)  Post. 95% (Replicated)
       phi                5.74                     5.47              3.97                   3.76               7.42                    7.30
   sigma_c                1.38                     1.35              1.16                   1.16               1.59                    1.56
         h                0.71                     0.72              0.64                   0.65               0.78                    0.79
      xi_w                0.70                     0.69              0.60                   0.58               0.81                    0.80
   sigma_l                1.83                     1.58              0.91                   0.61               2.78                    2.52
      xi_p                0.66                     0.65              0.56                   0.57       

## 5. Generate Table 1B - Shock Processes

In [None]:
def create_table_1B(reference_data, estimated_params, shock_std_estimates, mcmc_results=None):
    """
    Create Table 1B comparing estimated vs reference shock process values.
    
    Args:
        reference_data: Dictionary with reference values from paper
        estimated_params: Dictionary with estimated parameter values from Dynare
        shock_std_estimates: Dictionary with shock std dev estimates
        mcmc_results: Optional MCMC results with means and intervals
    
    Returns:
        DataFrame with Table 1B structure
    """
    rows = []
    
    for param_symbol, values in reference_data.items():
        dynare_name, prior_distr, prior_mean, prior_sd, post_mode, post_mean, post_5, post_95 = values
        
        # Determine if this is a shock std dev or a regular parameter
        is_shock_std = param_symbol.startswith('sigma_')
        
        if is_shock_std:
            # Get estimated shock std from shock_std_estimates
            estimated_mode = shock_std_estimates.get(dynare_name, np.nan)
            
            # Get MCMC mean if available
            estimated_mean = np.nan
            if mcmc_results and mcmc_results['available']:
                estimated_mean = mcmc_results['shock_means'].get(dynare_name, np.nan)
        else:
            # Regular parameter
            estimated_mode = estimated_params.get(dynare_name, np.nan)
            
            # Get MCMC results if available
            estimated_mean = np.nan
            if mcmc_results and mcmc_results['available']:
                estimated_mean = mcmc_results['param_means'].get(dynare_name, np.nan)
        
        # Calculate difference from paper's posterior mode
        if not np.isnan(estimated_mode):
            diff_pct = ((estimated_mode - post_mode) / post_mode) * 100
        else:
            diff_pct = np.nan
        
        rows.append({
            'Parameter': param_symbol,
            'Dynare Name': dynare_name,
            'Type': 'Shock Std' if is_shock_std else 'Parameter',
            'Prior Distr.': prior_distr,
            'Prior Mean': prior_mean,
            'Prior SD': prior_sd,
            'Post. Mode (Paper)': post_mode,
            'Post. Mode (Replicated)': estimated_mode,
            'Diff (%)': diff_pct,
            'Post. Mean (Paper)': post_mean,
            'Post. Mean (Replicated)': estimated_mean,
            'Post. 5% (Paper)': post_5,
            'Post. 95% (Paper)': post_95,
        })
    
    return pd.DataFrame(rows)

# Generate Table 1B
table_1B = create_table_1B(TABLE_1B_REFERENCE, estimated_params, shock_std_estimates, mcmc_results)
print("Table 1B generated")

Table 1B generated


In [None]:
# Display Table 1B - Prior Distribution
print("="*90)
print("TABLE 1B - PRIOR AND POSTERIOR DISTRIBUTION OF SHOCK PROCESSES")
print("Smets & Wouters (2007), American Economic Review, Table 1B (p.594)")
print("="*90)
print("\n** PRIOR DISTRIBUTION **\n")

prior_cols = ['Parameter', 'Prior Distr.', 'Prior Mean', 'Prior SD']
print(table_1B[prior_cols].to_string(index=False))

TABLE 1B - PRIOR AND POSTERIOR DISTRIBUTION OF SHOCK PROCESSES
Smets & Wouters (2007), American Economic Review, Table 1B (p.594)

** PRIOR DISTRIBUTION **

Parameter Prior Distr.  Prior Mean  Prior SD
  sigma_a     InvGamma         0.1       2.0
  sigma_b     InvGamma         0.1       2.0
  sigma_g     InvGamma         0.1       2.0
  sigma_I     InvGamma         0.1       2.0
  sigma_r     InvGamma         0.1       2.0
  sigma_p     InvGamma         0.1       2.0
  sigma_w     InvGamma         0.1       2.0
    rho_a         Beta         0.5       0.2
    rho_b         Beta         0.5       0.2
    rho_g         Beta         0.5       0.2
    rho_I         Beta         0.5       0.2
    rho_r         Beta         0.5       0.2
    rho_p         Beta         0.5       0.2
    rho_w         Beta         0.5       0.2
     mu_p         Beta         0.5       0.2
     mu_w         Beta         0.5       0.2
   rho_ga         Beta         0.5       0.2


In [None]:
# Display Table 1B - Posterior Distribution Comparison
print("\n** POSTERIOR DISTRIBUTION (Mode Comparison) **\n")

posterior_cols = ['Parameter', 'Post. Mode (Paper)', 'Post. Mode (Replicated)', 'Diff (%)']

# Split into sections for cleaner display
print("--- Shock Standard Deviations ---")
sigma_df = table_1B[table_1B['Type'] == 'Shock Std']
print(sigma_df[posterior_cols].to_string(index=False, float_format=lambda x: f'{x:.2f}' if pd.notna(x) else 'N/A'))

print("\n--- AR(1) Persistence ---")
rho_params = ['rho_a', 'rho_b', 'rho_g', 'rho_I', 'rho_r', 'rho_p', 'rho_w', 'rho_ga']
rho_df = table_1B[table_1B['Parameter'].isin(rho_params)]
print(rho_df[posterior_cols].to_string(index=False, float_format=lambda x: f'{x:.2f}' if pd.notna(x) else 'N/A'))

print("\n--- MA Coefficients ---")
ma_params = ['mu_p', 'mu_w']
ma_df = table_1B[table_1B['Parameter'].isin(ma_params)]
print(ma_df[posterior_cols].to_string(index=False, float_format=lambda x: f'{x:.2f}' if pd.notna(x) else 'N/A'))


** POSTERIOR DISTRIBUTION (Mode Comparison) **

--- Shock Standard Deviations ---
Parameter  Post. Mode (Paper)  Post. Mode (Replicated)  Diff (%)
  sigma_a                0.45                     0.45      0.40
  sigma_b                0.24                     0.24      1.03
  sigma_g                0.52                     0.52      0.00
  sigma_I                0.45                     0.45      0.02
  sigma_r                0.24                     0.24     -0.07
  sigma_p                0.14                     0.14      0.80
  sigma_w                0.24                     0.24      1.83

--- AR(1) Persistence ---
Parameter  Post. Mode (Paper)  Post. Mode (Replicated)  Diff (%)
    rho_a                0.95                     0.95      0.25
    rho_b                0.18                     0.21     14.51
    rho_g                0.97                     0.97     -0.40
    rho_I                0.71                     0.75      5.94
    rho_r                0.12                

In [None]:
# Display Table 1B - Full Posterior from Paper
print("\n** POSTERIOR DISTRIBUTION (Full - from Paper) **\n")

full_posterior_cols = ['Parameter', 'Post. Mode (Paper)', 'Post. Mean (Paper)', 'Post. 5% (Paper)', 'Post. 95% (Paper)']
print(table_1B[full_posterior_cols].to_string(index=False))


** POSTERIOR DISTRIBUTION (Full - from Paper) **

Parameter  Post. Mode (Paper)  Post. Mean (Paper)  Post. 5% (Paper)  Post. 95% (Paper)
  sigma_a                0.45                0.45              0.41               0.50
  sigma_b                0.24                0.23              0.19               0.27
  sigma_g                0.52                0.53              0.48               0.58
  sigma_I                0.45                0.45              0.37               0.53
  sigma_r                0.24                0.24              0.22               0.27
  sigma_p                0.14                0.14              0.11               0.16
  sigma_w                0.24                0.24              0.20               0.28
    rho_a                0.95                0.95              0.94               0.97
    rho_b                0.18                0.22              0.07               0.36
    rho_g                0.97                0.97              0.96            

## 6. Verification Summary

In [None]:
# Compute verification statistics
def compute_verification_stats(table_df, tolerance=5.0):
    """
    Compute verification statistics for a table.
    
    Args:
        table_df: DataFrame with 'Diff (%)' column
        tolerance: Acceptable percentage difference
    
    Returns:
        Dictionary with verification statistics
    """
    valid_diffs = table_df['Diff (%)'].dropna()
    
    n_total = len(table_df)
    n_estimated = len(valid_diffs)
    n_within_tol = (valid_diffs.abs() <= tolerance).sum()
    
    return {
        'total_params': n_total,
        'estimated_params': n_estimated,
        'within_tolerance': n_within_tol,
        'pass_rate': (n_within_tol / n_estimated * 100) if n_estimated > 0 else 0,
        'mean_abs_diff': valid_diffs.abs().mean() if len(valid_diffs) > 0 else np.nan,
        'max_abs_diff': valid_diffs.abs().max() if len(valid_diffs) > 0 else np.nan,
    }

# Verification for Table 1A
stats_1A = compute_verification_stats(table_1A)

# Verification for Table 1B
stats_1B = compute_verification_stats(table_1B)

print("="*70)
print("VERIFICATION SUMMARY")
print("="*70)
print(f"\nTolerance: 5% difference from paper's posterior mode")
print("\n--- Table 1A (Structural Parameters) ---")
print(f"  Total parameters: {stats_1A['total_params']}")
print(f"  Successfully estimated: {stats_1A['estimated_params']}")
print(f"  Within tolerance: {stats_1A['within_tolerance']}/{stats_1A['estimated_params']}")
print(f"  Pass rate: {stats_1A['pass_rate']:.1f}%")
print(f"  Mean absolute difference: {stats_1A['mean_abs_diff']:.2f}%")
print(f"  Max absolute difference: {stats_1A['max_abs_diff']:.2f}%")

print("\n--- Table 1B (Shock Processes) ---")
print(f"  Total parameters: {stats_1B['total_params']}")
print(f"  Successfully estimated: {stats_1B['estimated_params']}")
print(f"  Within tolerance: {stats_1B['within_tolerance']}/{stats_1B['estimated_params']}")
print(f"  Pass rate: {stats_1B['pass_rate']:.1f}%")
print(f"  Mean absolute difference: {stats_1B['mean_abs_diff']:.2f}%")
print(f"  Max absolute difference: {stats_1B['max_abs_diff']:.2f}%")
print("="*70)

VERIFICATION SUMMARY

Tolerance: 5% difference from paper's posterior mode

--- Table 1A (Structural Parameters) ---
  Total parameters: 19
  Successfully estimated: 19
  Within tolerance: 9/19
  Pass rate: 47.4%
  Mean absolute difference: 33.03%
  Max absolute difference: 408.33%

--- Table 1B (Shock Processes) ---
  Total parameters: 17
  Successfully estimated: 17
  Within tolerance: 12/17
  Pass rate: 70.6%
  Mean absolute difference: 4.03%
  Max absolute difference: 16.99%


In [None]:
# Highlight parameters with largest differences
print("\nParameters with largest differences (Table 1A):")
print("-" * 60)

table_1A_sorted = table_1A.dropna(subset=['Diff (%)']).sort_values('Diff (%)', key=abs, ascending=False)
for _, row in table_1A_sorted.head(5).iterrows():
    print(f"  {row['Parameter']:12s}  Paper: {row['Post. Mode (Paper)']:6.2f}  Replicated: {row['Post. Mode (Replicated)']:6.2f}  Diff: {row['Diff (%)']:+5.1f}%")

print("\nParameters with largest differences (Table 1B):")
print("-" * 60)

table_1B_sorted = table_1B.dropna(subset=['Diff (%)']).sort_values('Diff (%)', key=abs, ascending=False)
for _, row in table_1B_sorted.head(5).iterrows():
    print(f"  {row['Parameter']:12s}  Paper: {row['Post. Mode (Paper)']:6.2f}  Replicated: {row['Post. Mode (Replicated)']:6.2f}  Diff: {row['Diff (%)']:+5.1f}%")


Parameters with largest differences (Table 1A):
------------------------------------------------------------
  l_bar         Paper:  -0.10  Replicated:   0.31  Diff: -408.3%
  beta_const    Paper:   0.16  Replicated:   0.26  Diff: +64.2%
  alpha         Paper:   0.19  Replicated:   0.29  Diff: +55.0%
  pi_bar        Paper:   0.81  Replicated:   0.65  Diff: -19.2%
  sigma_l       Paper:   1.92  Replicated:   1.58  Diff: -17.5%

Parameters with largest differences (Table 1B):
------------------------------------------------------------
  rho_r         Paper:   0.12  Replicated:   0.14  Diff: +17.0%
  rho_ga        Paper:   0.52  Replicated:   0.60  Diff: +15.4%
  rho_b         Paper:   0.18  Replicated:   0.21  Diff: +14.5%
  mu_w          Paper:   0.88  Replicated:   0.82  Diff:  -7.0%
  rho_I         Paper:   0.71  Replicated:   0.75  Diff:  +5.9%


## 7. Export Results

In [None]:
# Export tables to CSV
output_dir = Path.cwd()

# Table 1A
table_1A.to_csv(output_dir / 'table_1A_replication.csv', index=False)
print(f"Table 1A saved to: {output_dir / 'table_1A_replication.csv'}")

# Table 1B
table_1B.to_csv(output_dir / 'table_1B_replication.csv', index=False)
print(f"Table 1B saved to: {output_dir / 'table_1B_replication.csv'}")

In [None]:
# Create formatted tables for paper-style display
print("\n" + "="*95)
print("FORMATTED TABLE 1A - Prior and Posterior Distribution of Structural Parameters")
print("="*95)
print("\n{:<12} {:>10} {:>8} {:>8}   {:>8} {:>8} {:>10} {:>10}".format(
    '', 'Prior', '', '', 'Posterior', '', '', ''))
print("{:<12} {:>10} {:>8} {:>8}   {:>8} {:>8} {:>10} {:>10}".format(
    'Parameter', 'Distr.', 'Mean', 'SD', 'Mode', 'Mean', '5%', '95%'))
print("-"*95)

for _, row in table_1A.iterrows():
    print("{:<12} {:>10} {:>8.2f} {:>8.2f}   {:>8.2f} {:>8.2f} {:>10.2f} {:>10.2f}".format(
        row['Parameter'],
        row['Prior Distr.'],
        row['Prior Mean'],
        row['Prior SD'],
        row['Post. Mode (Paper)'],
        row['Post. Mean (Paper)'],
        row['Post. 5% (Paper)'],
        row['Post. 95% (Paper)']
    ))

In [None]:
print("\n" + "="*95)
print("FORMATTED TABLE 1B - Prior and Posterior Distribution of Shock Processes")
print("="*95)
print("\n{:<12} {:>10} {:>8} {:>8}   {:>8} {:>8} {:>10} {:>10}".format(
    '', 'Prior', '', '', 'Posterior', '', '', ''))
print("{:<12} {:>10} {:>8} {:>8}   {:>8} {:>8} {:>10} {:>10}".format(
    'Parameter', 'Distr.', 'Mean', 'SD', 'Mode', 'Mean', '5%', '95%'))
print("-"*95)

for _, row in table_1B.iterrows():
    print("{:<12} {:>10} {:>8.2f} {:>8.2f}   {:>8.2f} {:>8.2f} {:>10.2f} {:>10.2f}".format(
        row['Parameter'],
        row['Prior Distr.'],
        row['Prior Mean'],
        row['Prior SD'],
        row['Post. Mode (Paper)'],
        row['Post. Mean (Paper)'],
        row['Post. 5% (Paper)'],
        row['Post. 95% (Paper)']
    ))

## 8. Cleanup

In [None]:
# Close Dynare/Octave session
di.close()
print("Octave session closed")
print("\nReplication complete!")

---

## Notes on Replication

### Current Configuration
- **mh_replic=5000**: MCMC sampling enabled (adjust in usmodel.mod)
- Shock standard deviations extracted from `M_.Sigma_e` diagonal or `oo_.posterior_mean.shocks_std`

### To Get Full Posterior (Mean and Intervals)
Modify `usmodel.mod` line 207 to use:
```matlab
estimation(..., mh_replic=250000, nodiagnostic, ...);
```

This will run MCMC sampling and provide:
- Posterior mean
- 5% and 95% HPD intervals

**Warning**: MCMC with 250,000 draws takes significantly longer (~21 hours based on timing tests).

### HPD Intervals Note
Dynare uses 90% HPD intervals by default (`mh_conf_sig=0.9`). The paper reports 5%-95% intervals which corresponds to 90% coverage.

In [None]:
# -*- coding: utf-8 -*-
"""
Chequeo empalme Índice de Salarios Privado Registrado (mensual) 2004–2025
- Soporta dos formatos de periodo:
    1) "ene-04", "sept-15", etc. (español)
    2) "1/10/2015", "1/11/2015", etc. (día/mes/año)
- Maneja el solapamiento en oct-2015 (dos registros): uno “viejo” (base abr-2012)
  y otro “nuevo” (base oct-2016). Empalma por ratio en el mes de solapamiento.
- Construye:
    a) serie nominal unificada (en escala de la serie vieja)
    b) serie nominal rebasada a 2004=100 (promedio anual 2004)
Opcional: imprime checks y guarda CSV.

Uso:
1) Copiá el bloque de datos tal cual en la variable RAW (debajo).
2) Corré el script.
"""

import re
from datetime import date
import pandas as pd

RAW = r"""
periodo	índice de salarios privado registrado
ene-04	20,08
feb-04	20,58
mar-04	20,74
abr-04	20,79
may-04	20,88
jun-04	20,91
jul-04	20,98
ago-04	21,05
sept-04	21,11
oct-04	21,23
nov-04	21,28
dic-04	21,38
ene-05	22,53
feb-05	22,95
mar-05	23,15
abr-05	23,51
may-05	23,97
jun-05	24,29
jul-05	24,78
ago-05	25,34
sept-05	25,74
oct-05	26,27
nov-05	26,52
dic-05	26,93
ene-06	27,32
feb-06	27,69
mar-06	28,02
abr-06	28,58
may-06	29,04
jun-06	29,46
jul-06	30,17
ago-06	30,69
sept-06	31,02
oct-06	31,48
nov-06	31,73
dic-06	32,16
ene-07	32,46
feb-07	32,75
mar-07	33,07
abr-07	33,60
may-07	34,22
jun-07	34,99
jul-07	35,65
ago-07	36,33
sept-07	36,96
oct-07	37,64
nov-07	38,10
dic-07	38,61
ene-08	37,45
feb-08	37,78
mar-08	38,21
abr-08	39,45
may-08	40,66
jun-08	41,29
jul-08	42,52
ago-08	43,51
sept-08	44,26
oct-08	44,98
nov-08	45,42
dic-08	45,77
ene-09	46,16
feb-09	46,37
mar-09	46,65
abr-09	47,25
may-09	47,92
jun-09	48,85
jul-09	49,98
ago-09	50,82
sept-09	51,57
oct-09	52,55
nov-09	52,98
dic-09	53,68
ene-10	54,47
feb-10	55,29
mar-10	56,54
abr-10	57,76
may-10	59,02
jun-10	60,43
jul-10	62,83
ago-10	64,73
sept-10	66,15
oct-10	67,59
nov-10	68,27
dic-10	69,39
ene-11	70,51
feb-11	71,20
mar-11	72,36
abr-11	74,03
may-11	77,02
jun-11	80,34
jul-11	84,22
ago-11	86,67
sept-11	89,09
oct-11	90,09
nov-11	91,36
dic-11	94,22
ene-12	94,96
feb-12	96,21
mar-12	97,60
abr-12	100,00
may-12	103,80
jun-12	105,83
jul-12	108,78
ago-12	110,93
sept-12	112,22
oct-12	113,69
nov-12	116,43
dic-12	117,55
ene-13	118,46
feb-13	119,39
mar-13	120,97
abr-13	122,96
may-13	128,98
jun-13	132,43
jul-13	135,81
ago-13	138,38
sept-13	140,71
oct-13	142,50
nov-13	145,71
dic-13	147,14
ene-14	148,88
feb-14	151,17
mar-14	153,69
abr-14	163,76
may-14	167,58
jun-14	171,12
jul-14	178,19
ago-14	182,03
sept-14	186,50
oct-14	189,91
nov-14	192,22
dic-14	193,48
ene-15	195,83
feb-15	197,36
mar-15	200,60
abr-15	203,14
may-15	212,06
jun-15	219,59
jul-15	227,88
ago-15	233,82
sept-15	237,46
oct-15	240,61
1/10/2015	73,97
1/11/2015	75,95
1/12/2015	77,36
1/1/2016	78,93
1/2/2016	80,04
1/3/2016	80,8
1/4/2016	85,92
1/5/2016	88,74
1/6/2016	90,15
1/7/2016	93,35
1/8/2016	95,05
1/9/2016	96,63
1/10/2016	100
1/11/2016	101,8
1/12/2016	102,89
1/1/2017	106,34
1/2/2017	107,69
1/3/2017	109,26
1/4/2017	112,93
1/5/2017	115,18
1/6/2017	117,23
1/7/2017	122,8
1/8/2017	124,55
1/9/2017	126
1/10/2017	128,23
1/11/2017	130,04
1/12/2017	131,01
1/1/2018	133,25
1/2/2018	134,31
1/3/2018	136,23
1/4/2018	141,01
1/5/2018	144,47
1/6/2018	146,47
1/7/2018	149,94
1/8/2018	154,2
1/9/2018	157,82
1/10/2018	163,86
1/11/2018	168,06
1/12/2018	170,85
1/1/2019	176,91
1/2/2019	181,76
1/3/2019	188,26
1/4/2019	193,47
1/5/2019	201,66
1/6/2019	206,99
1/7/2019	215,86
1/8/2019	221,87
1/9/2019	227,72
1/10/2019	236,31
1/11/2019	242,85
1/12/2019	246,54
1/1/2020	267,6
1/2/2020	279,97
1/3/2020	286,4
1/4/2020	285,74
1/5/2020	285,06
1/6/2020	285,44
1/7/2020	289,19
1/8/2020	295,13
1/9/2020	301,54
1/10/2020	317,66
1/11/2020	326,04
1/12/2020	331,26
1/1/2021	344,61
1/2/2021	362,33
1/3/2021	376,68
1/4/2021	394,54
1/5/2021	408,38
1/6/2021	415,55
1/7/2021	437
1/8/2021	451,64
1/9/2021	467,91
1/10/2021	485,29
1/11/2021	507
1/12/2021	514,34
1/1/2022	538,17
1/2/2022	559,01
1/3/2022	589,31
1/4/2022	622,29
1/5/2022	665,66
1/6/2022	699,36
1/7/2022	737,01
1/8/2022	797,04
1/9/2022	843,4
1/10/2022	888,1
1/11/2022	953,34
1/12/2022	996,62
1/1/2023	1042,37
1/2/2023	1114,31
1/3/2023	1202,08
1/4/2023	1284,57
1/5/2023	1389,24
1/6/2023	1468,48
1/7/2023	1626,81
1/8/2023	1762,46
1/9/2023	1995,58
1/10/2023	2169,81
1/11/2023	2385,55
1/12/2023	2648,87
1/1/2024	3178,28
1/2/2024	3625,42
1/3/2024	3987,49
1/4/2024	4464,6
1/5/2024	4807,34
1/6/2024	5129,04
1/7/2024	5450,78
1/8/2024	5725,38
1/9/2024	5941,03
1/10/2024	6175,77
1/11/2024	6378,33
1/12/2024	6556,47
1/1/2025	6707,3
1/2/2025	6862,13
1/3/2025	7010,54
1/4/2025	7187,95
1/5/2025	7329,99
1/6/2025	7457,02
1/7/2025	7619,01
1/8/2025	7789,94
1/9/2025	7896,9
1/10/2025	8061,11
1/11/2025	8232,68
""".strip()


# -----------------------
# Helpers
# -----------------------
MONTHS_ES = {
    "ene": 1, "feb": 2, "mar": 3, "abr": 4, "may": 5, "jun": 6,
    "jul": 7, "ago": 8, "sept": 9, "sep": 9, "oct": 10, "nov": 11, "dic": 12
}

def parse_spanish_number(x: str) -> float:
    """Convierte '1.234,56' o '240,61' a float."""
    x = x.strip()
    x = x.replace(".", "").replace(",", ".")
    return float(x)

def parse_period_to_date(p: str) -> date:
    """Parsea 'ene-04' o '1/10/2015' a date (primer día del mes)."""
    p = p.strip().lower()
    # Formato dd/mm/yyyy
    if re.match(r"^\d{1,2}/\d{1,2}/\d{4}$", p):
        dd, mm, yyyy = p.split("/")
        return date(int(yyyy), int(mm), 1)
    # Formato 'ene-04'
    m = re.match(r"^([a-zñ]+)-(\d{2})$", p)
    if m:
        mon_str, yy = m.group(1), int(m.group(2))
        mon = MONTHS_ES.get(mon_str)
        if mon is None:
            raise ValueError(f"Mes no reconocido: {mon_str}")
        year = 2000 + yy  # tu muestra arranca en 2004, así que '04' -> 2004
        return date(year, mon, 1)
    raise ValueError(f"Formato de periodo no reconocido: {p}")

def yyyymm(d: date) -> int:
    return d.year * 100 + d.month


# -----------------------
# Parse raw table
# -----------------------
lines = [ln for ln in RAW.splitlines() if ln.strip()]
# saltear header si existe
if "periodo" in lines[0].lower():
    lines = lines[1:]

rows = []
for ln in lines:
    parts = re.split(r"\t+", ln.strip())
    if len(parts) != 2:
        # si pegaste con espacios, intentamos split por múltiples espacios
        parts = re.split(r"\s{2,}", ln.strip())
        if len(parts) != 2:
            raise ValueError(f"No pude separar en 2 columnas esta línea: {ln}")
    per, val = parts[0].strip(), parts[1].strip()
    d = parse_period_to_date(per)
    rows.append({"periodo_raw": per, "fecha": pd.Timestamp(d), "yyyymm": yyyymm(d), "indice": parse_spanish_number(val)})

df = pd.DataFrame(rows).sort_values(["fecha"]).reset_index(drop=True)

# -----------------------
# Empalme en 201510 (oct-2015)
# -----------------------
OVERLAP = 201510
over = df[df["yyyymm"] == OVERLAP].copy()
if len(over) < 2:
    raise RuntimeError(
        f"Esperaba 2 registros en {OVERLAP} (oct-2015) y encontré {len(over)}. "
        "Revisá que estén ambas bases en ese mes."
    )

# Por defecto:
# - 1ra ocurrencia de 201510 en el orden del df = 'vieja'
# - 2da ocurrencia = 'nueva'
over_idx = over.index.to_list()
old_idx, new_idx = over_idx[0], over_idx[1]
old_oct = df.loc[old_idx, "indice"]
new_oct = df.loc[new_idx, "indice"]

factor = old_oct / new_oct  # multiplica la nueva para llevarla a escala vieja

# Construir serie unificada:
# - para meses <= 201510: usar la serie "vieja" (y para 201510 usar old_idx)
# - para meses  > 201510: usar la serie nueva * factor
df["indice_unif"] = pd.NA

for i, row in df.iterrows():
    if row["yyyymm"] < OVERLAP:
        df.at[i, "indice_unif"] = row["indice"]
    elif row["yyyymm"] == OVERLAP:
        # tomamos la observación vieja como la referencia del mes
        df.at[i, "indice_unif"] = old_oct if i == old_idx else pd.NA
    else:
        df.at[i, "indice_unif"] = row["indice"] * factor

# df["indice_unif"] = df["indice_unif"].astype(float)
df_unif = df.dropna(subset=["indice_unif"]).copy()

# -----------------------
# Rebase a 2004=100 (promedio 2004)
# -----------------------
df_unif["anio"] = df_unif["fecha"].dt.year
base_2004 = df_unif.loc[df_unif["anio"] == 2004, "indice_unif"].mean()
if pd.isna(base_2004):
    raise RuntimeError("No encontré datos del año 2004 para calcular el promedio base.")

df_unif["indice_unif_base2004"] = df_unif["indice_unif"] / base_2004 * 100

# -----------------------
# Reportes de chequeo
# -----------------------
print("=== CHEQUEOS ===")
print(f"Registros totales (incluye duplicado oct-2015): {len(df)}")
print(f"Registros unificados (sin duplicados):         {len(df_unif)}")
print()
print(f"Oct-2015 viejo (escala vieja): {old_oct:.4f}")
print(f"Oct-2015 nuevo (escala nueva): {new_oct:.4f}")
print(f"Factor empalme (viejo/nuevo):  {factor:.8f}")
print(f"Oct-2015 nuevo * factor:       {(new_oct*factor):.4f}  (debería igualar al viejo)")
print()
print(f"Promedio 2004 (base):          {base_2004:.6f}")
print("Promedio de 'indice_base2004' en 2004 (debería ser ~100):",
      df_unif.loc[df_unif["anio"] == 2004, "indice_unif_base2004"].mean())
print()

# Mostrar 10 filas alrededor del empalme
mask = (df_unif["fecha"] >= "2015-07-01") & (df_unif["fecha"] <= "2016-01-01")
print("=== Ventana alrededor del empalme (jul-2015 a ene-2016) ===")
print(df_unif.loc[mask, ["fecha", "yyyymm", "indice_unif", "indice_unif_base2004"]].to_string(index=False))


=== CHEQUEOS ===
Registros totales (incluye duplicado oct-2015): 264
Registros unificados (sin duplicados):         263

Oct-2015 viejo (escala vieja): 240.6100
Oct-2015 nuevo (escala nueva): 73.9700
Factor empalme (viejo/nuevo):  3.25280519
Oct-2015 nuevo * factor:       240.6100  (debería igualar al viejo)

Promedio 2004 (base):          20.917500
Promedio de 'indice_base2004' en 2004 (debería ser ~100): 100.0

=== Ventana alrededor del empalme (jul-2015 a ene-2016) ===
     fecha  yyyymm indice_unif indice_unif_base2004
2015-07-01  201507      227.88          1089.422732
2015-08-01  201508      233.82          1117.820007
2015-09-01  201509      237.46          1135.221704
2015-10-01  201510      240.61          1150.280865
2015-11-01  201511  247.050554          1181.071133
2015-12-01  201512   251.63701          1202.997536
2016-01-01  201601  256.743914          1227.412041
