In [1]:
from ipywidgets import interact, widgets
interact(lambda x: x**2, x=widgets.IntSlider(min=0, max=10));


# 📉 Convergence Hypothesis: Do Poor Countries Catch Up?

A fundamental question in economic growth is whether poorer countries tend to grow faster than richer countries, leading to a narrowing of income gaps over time. This idea is known as the **convergence hypothesis**.

The standard **Solow growth model** predicts **conditional convergence** because of diminishing returns to capital: countries with similar underlying characteristics (saving rates, population growth rates, and technology levels), and therefore the same steady state, should converge toward one another. Among such countries, poorer ones (further below their steady state) grow faster than richer ones (closer to their steady state).
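
As a sketch of where this prediction comes from, assume a Cobb-Douglas production function with capital share $\alpha$, population growth $n$, technology growth $x$, and depreciation $\delta$ (these symbols are assumptions introduced for illustration; they are not used elsewhere in this notebook). Log-linearizing the Solow model around its steady state gives the standard approximation for output per effective worker $y$:

$$ \frac{d \log y(t)}{dt} \approx -\lambda \left[ \log y(t) - \log y^{*} \right], \qquad \lambda = (1-\alpha)(n + x + \delta) $$

With the illustrative values $\alpha = 1/3$, $n = 0.01$, $x = 0.02$, and $\delta = 0.05$, this gives $\lambda = \tfrac{2}{3} \times 0.08 \approx 0.053$, so roughly 5% of the remaining gap to the steady state closes each year. The further a country sits below $y^{*}$, the faster it grows.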

Testing **unconditional convergence** (simply checking whether initially poor countries grow faster than initially rich ones, without controlling for other factors) often yields mixed results globally. However, it is a useful starting point for examining the predictions of the basic model.

This simulation generates artificial data for a set of countries to visually test for unconditional convergence by plotting average growth rates against initial income levels.

# ⚙️ Simulation Setup

We simulate data for $N$ countries over $T$ years:

1.  **Initial Income ($Y_0$):** Each country $i$ starts with a randomly assigned initial log income per capita, $\log(Y_{i0})$, drawn from a uniform distribution. This creates a range of initially rich and poor countries.
    $$ \log(Y_{i0}) \sim U(\text{min\_logY, max\_logY}) $$

2.  **Generating Growth Rates ($g_i$):** The core idea is to generate average annual growth rates ($g_i$) that depend *negatively* on the initial income level, plus some random noise, consistent with the convergence hypothesis.
    $$ g_i = \text{constant} + \beta \log(Y_{i0}) + \epsilon_i $$
    - $\beta$: The "true" convergence parameter. If $\beta < 0$, poorer countries (lower $\log(Y_{i0})$) tend to have higher growth $g_i$. This is the parameter we control with the `True β` slider.
    - $\epsilon_i$: A random noise term ($\epsilon_i \sim N(0, \sigma^2_{\text{noise}})$) representing country-specific factors, shocks, or measurement error affecting growth. Its standard deviation $\sigma_{\text{noise}}$ is set by the `Growth Noise (σ)` slider.
    * *(Note: The simulation code centers initial income at its mean and adds a 2% base rate: `g = base_growth + beta_true * (log_Y0 - log_y0_mean) + noise`, with `base_growth = 0.02`. A country at the mean initial income therefore grows at 2% per year on average.)*

3.  **Final Income ($Y_T$):** Final income after $T$ years is calculated assuming growth was constant at the generated average rate $g_i$.
    $$ Y_{iT} = Y_{i0} e^{g_i T} $$

4.  **Realized Average Growth:** We calculate the *realized* average annual growth rate from the simulated $Y_0$ and $Y_T$.
    $$ g_{\text{avg}, i} = \frac{\log(Y_{iT}) - \log(Y_{i0})}{T} $$
    *(Note: Because step 3 compounds income at exactly the rate $g_i$, which already includes the noise term $\epsilon_i$, this $g_{\text{avg}, i}$ equals the $g_i$ generated in step 2 up to floating-point rounding.)*
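
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the notebook's actual implementation: the parameter values are assumptions chosen to mirror the slider defaults, and the random generator and seed are arbitrary.

```python
import numpy as np

# Illustrative parameter values (assumptions mirroring the slider defaults)
N, T = 30, 20
beta_true, noise_std = -0.02, 0.01
log_y0_mean, log_y0_range = 8.0, 4.0

rng = np.random.default_rng(0)

# Step 1: initial log incomes, uniform around the mean
log_Y0 = rng.uniform(log_y0_mean - log_y0_range / 2,
                     log_y0_mean + log_y0_range / 2, N)

# Step 2: growth falls with (centered) initial income, plus noise
g = 0.02 + beta_true * (log_Y0 - log_y0_mean) + rng.normal(0, noise_std, N)

# Step 3: final income under constant growth at rate g
Y_T = np.exp(log_Y0) * np.exp(g * T)

# Step 4: realized average growth recovered from the two income levels
g_avg = (np.log(Y_T) - log_Y0) / T

# The realized rate reproduces g up to floating-point rounding
print(np.allclose(g_avg, g))
```

Since step 3 compounds at exactly the rate `g`, step 4 simply inverts it, so the final check is expected to succeed.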

# 📊 Testing for Convergence: Regression Analysis

To test for convergence in our simulated data, we perform a simple linear regression:

$$g_{\text{avg}, i} = \text{Intercept} + \beta_{\text{estimated}} \log(Y_{i0}) + \text{error}_i$$

We plot the average growth rate ($g_{\text{avg}, i}$) against the log of initial income ($\log(Y_{i0})$) for all $N$ countries and fit a regression line.

**Interpretation:**

* **Estimated Slope ($\beta_{\text{estimated}}$):**
    * If $\beta_{\text{estimated}} < 0$ and is statistically significant (low p-value), it supports the unconditional convergence hypothesis within this simulated dataset. This means, on average, countries that started poorer grew faster.
    * If $\beta_{\text{estimated}} \ge 0$ or is not statistically significant, we find no evidence for unconditional convergence in the sample.
* **R-squared ($R^2$):** Measures the proportion of the variation in growth rates explained by the initial income level. A higher $R^2$ means initial income is a stronger predictor of growth in this sample.
* **Noise:** The `Growth Noise (σ)` slider controls the standard deviation of $\epsilon_i$. Higher noise weakens the relationship between initial income and growth and can obscure convergence even if the true $\beta$ is negative: $R^2$ falls, the standard error and p-value rise, and $\beta_{\text{estimated}}$ becomes a less precise (though still unbiased) estimate of the true $\beta$.

Experiment with the sliders (`True β`, `Growth Noise (σ)`, `Number of Countries`, `Time Horizon (years)`, `Mean Initial Income`, `Income Range`) to see how they affect the visual pattern and the regression results.
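
To see what noise does to the estimate, it can help to re-draw the sample many times instead of looking at a single draw. The sketch below is an illustration under assumed parameter values; `estimate_beta` is a hypothetical helper, and `np.polyfit` stands in for the `statsmodels` regression used in the notebook.

```python
import numpy as np

def estimate_beta(noise_std, beta_true=-0.02, N=30, seed=0):
    """Simulate one cross-section and return the OLS slope of growth on log initial income."""
    rng = np.random.default_rng(seed)
    log_Y0 = rng.uniform(6.0, 10.0, N)
    g = 0.02 + beta_true * (log_Y0 - 8.0) + rng.normal(0, noise_std, N)
    slope, intercept = np.polyfit(log_Y0, g, 1)  # degree-1 fit returns [slope, intercept]
    return slope

# Re-draw the sample 500 times at a low and a high noise level
low = np.array([estimate_beta(0.005, seed=s) for s in range(500)])
high = np.array([estimate_beta(0.05, seed=s) for s in range(500)])

print(f"low noise : mean slope {low.mean():.4f}, spread {low.std():.4f}")
print(f"high noise: mean slope {high.mean():.4f}, spread {high.std():.4f}")
```

Across replications the average slope should stay near the true value at both noise levels, while the spread of the estimates widens as noise grows. Any single sample, like the one shown by the sliders, can therefore land far from the true $\beta$ when noise is high.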

In [None]:
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import pandas as pd
from scipy import stats
from scipy.stats import linregress
from ipywidgets import interact, FloatSlider, IntSlider, Layout, Tab, VBox, HBox
from IPython.display import display, Markdown, HTML
import statsmodels.api as sm
import warnings
warnings.filterwarnings('ignore')

class ConvergenceAnalysis:
    def __init__(self, params):
        """Initialize convergence analysis with parameters"""
        self.params = params
        self.simulate_data()
        
    def simulate_data(self):
        """Generate simulated cross-country data"""
        p = self.params
        np.random.seed(42)
        
        # Generate initial incomes
        self.log_Y0 = np.random.uniform(
            p['log_y0_mean'] - p['log_y0_range']/2,
            p['log_y0_mean'] + p['log_y0_range']/2,
            p['N']
        )
        self.Y0 = np.exp(self.log_Y0)
        
        # Generate growth rates
        base_growth = 0.02
        growth_deterministic = p['beta_true'] * (self.log_Y0 - p['log_y0_mean'])
        growth_stochastic = np.random.normal(0, p['growth_noise_std'], p['N'])
        self.g_latent = base_growth + growth_deterministic + growth_stochastic
        
        # Calculate final incomes
        self.YT = self.Y0 * np.exp(self.g_latent * p['T'])
        self.log_YT = np.log(self.YT)
        
        # Calculate realized growth rates
        self.g_avg = (self.log_YT - self.log_Y0) / p['T']
        
        # Run regression analysis
        self.run_regression()
        
    def run_regression(self):
        """Perform regression analysis"""
        # Simple OLS
        X = sm.add_constant(self.log_Y0)
        model = sm.OLS(self.g_avg, X)
        self.results = model.fit()
        
        # Calculate additional statistics
        self.beta_est = self.results.params[1]
        self.std_err = self.results.bse[1]
        self.p_value = self.results.pvalues[1]
        self.r_squared = self.results.rsquared
        
        # Calculate confidence intervals
        self.ci_lower = self.results.conf_int()[1][0]
        self.ci_upper = self.results.conf_int()[1][1]
        
    def plot_analysis(self):
        """Create comprehensive visualization"""
        fig = make_subplots(
            rows=2, cols=2,
            subplot_titles=(
                'Growth vs Initial Income',
                'Income Distribution Over Time',
                'Growth Rate Distribution',
                'Parameter Uncertainty'
            )
        )
        
        # Plot 1: Growth vs Initial Income
        fig.add_trace(
            go.Scatter(
                x=self.log_Y0,
                y=self.g_avg * 100,
                mode='markers',
                name='Countries',
                marker=dict(
                    color=self.g_avg * 100,
                    colorscale='Viridis',
                    showscale=True,
                    colorbar=dict(title='Growth Rate (%)')
                )
            ),
            row=1, col=1
        )
        
        # Add regression line
        x_range = np.linspace(self.log_Y0.min(), self.log_Y0.max(), 100)
        y_pred = (self.results.params[0] + self.results.params[1] * x_range) * 100
        
        fig.add_trace(
            go.Scatter(
                x=x_range,
                y=y_pred,
                name='Regression Line',
                line=dict(color='red', dash='dash')
            ),
            row=1, col=1
        )
        
        # Plot 2: Income Distribution
        fig.add_trace(
            go.Histogram(
                x=self.log_Y0,
                name='Initial Income',
                opacity=0.75
            ),
            row=1, col=2
        )
        
        fig.add_trace(
            go.Histogram(
                x=self.log_YT,
                name='Final Income',
                opacity=0.75
            ),
            row=1, col=2
        )
        
        # Plot 3: Growth Rate Distribution
        fig.add_trace(
            go.Histogram(
                x=self.g_avg * 100,
                name='Growth Rates',
                nbinsx=20
            ),
            row=2, col=1
        )
        
        # Plot 4: Parameter Uncertainty
        bootstrap_betas = []
        for _ in range(1000):
            idx = np.random.choice(len(self.log_Y0), len(self.log_Y0), replace=True)
            X = sm.add_constant(self.log_Y0[idx])
            y = self.g_avg[idx]
            results = sm.OLS(y, X).fit()
            bootstrap_betas.append(results.params[1])
        
        fig.add_trace(
            go.Histogram(
                x=bootstrap_betas,
                name='Bootstrap β',
                nbinsx=30
            ),
            row=2, col=2
        )
        
        # Update layout
        fig.update_layout(
            height=800,
            width=1200,
            showlegend=True,
            barmode='overlay',  # draw the semi-transparent histograms on top of each other
            title_text=f"Convergence Analysis (N={self.params['N']}, T={self.params['T']} years)"
        )
        
        # Update axes labels
        fig.update_xaxes(title_text="Log Initial Income", row=1, col=1)
        fig.update_xaxes(title_text="Log Income", row=1, col=2)
        fig.update_xaxes(title_text="Growth Rate (%)", row=2, col=1)
        fig.update_xaxes(title_text="Estimated β", row=2, col=2)
        
        fig.update_yaxes(title_text="Growth Rate (%)", row=1, col=1)
        fig.update_yaxes(title_text="Count", row=1, col=2)
        fig.update_yaxes(title_text="Count", row=2, col=1)
        fig.update_yaxes(title_text="Count", row=2, col=2)
        
        fig.show()
        
        # Display analysis
        self.display_analysis()
        
    def display_analysis(self):
        """Display comprehensive analysis results"""
        # Calculate additional statistics
        mean_growth = np.mean(self.g_avg) * 100
        std_growth = np.std(self.g_avg) * 100
        income_dispersion_initial = np.std(self.log_Y0)
        income_dispersion_final = np.std(self.log_YT)
        
        # Generate analysis text
        analysis = f"""
        ### 📊 Convergence Analysis Results
        
        #### 1. Regression Results:
        - **Estimated β:** {self.beta_est:.4f} (Std. Error: {self.std_err:.4f})
        - **95% CI:** [{self.ci_lower:.4f}, {self.ci_upper:.4f}]
        - **P-value:** {self.p_value:.4f}
        - **R²:** {self.r_squared:.3f}
        
        #### 2. Growth Statistics:
        - **Mean Growth Rate:** {mean_growth:.2f}%
        - **Growth Rate Std Dev:** {std_growth:.2f}%
        - **Growth Rate Range:** [{min(self.g_avg*100):.2f}%, {max(self.g_avg*100):.2f}%]
        
        #### 3. Income Dispersion:
        - **Initial Income Dispersion:** {income_dispersion_initial:.3f}
        - **Final Income Dispersion:** {income_dispersion_final:.3f}
        - **Change in Dispersion:** {((income_dispersion_final/income_dispersion_initial - 1)*100):.1f}%
        
        #### 4. Convergence Assessment:
        {self.generate_convergence_assessment()}
        """
        
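        # Strip the f-string's source indentation; otherwise Markdown renders it as a code block
        display(Markdown("\n".join(line.strip() for line in analysis.splitlines())))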
        
    def generate_convergence_assessment(self):
        """Generate detailed convergence assessment"""
        assessment = []
        
        # Statistical significance
        if self.p_value < 0.05:
            if self.beta_est < 0:
                assessment.append("✅ Strong evidence of convergence (statistically significant negative β)")
            else:
                assessment.append("❌ Evidence of divergence (statistically significant positive β)")
        else:
            assessment.append("⚠️ No statistically significant evidence of convergence")
        
        # Effect size
        if abs(self.beta_est) > 0.02:
            assessment.append(f"Strong {'convergence' if self.beta_est < 0 else 'divergence'} effect")
        else:
            assessment.append("Weak convergence/divergence effect")
        
        # Model fit
        if self.r_squared > 0.3:
            assessment.append("Initial income explains substantial variation in growth rates")
        else:
            assessment.append("Initial income explains limited variation in growth rates")
        
        # Income dispersion
        if np.std(self.log_YT) < np.std(self.log_Y0):
            assessment.append("Income differences have decreased over time")
        else:
            assessment.append("Income differences have increased over time")
        
        return "- " + "\n- ".join(assessment)

def run_convergence_simulation(
    N=30, T=20, growth_noise_std=0.01, beta_true=-0.02,
    log_y0_mean=8.0, log_y0_range=4.0
):
    """Run interactive convergence simulation"""
    params = {
        'N': N,
        'T': T,
        'growth_noise_std': growth_noise_std,
        'beta_true': beta_true,
        'log_y0_mean': log_y0_mean,
        'log_y0_range': log_y0_range
    }
    
    analysis = ConvergenceAnalysis(params)
    analysis.plot_analysis()

# Create interactive widgets
interact(
    run_convergence_simulation,
    N=IntSlider(
        value=30, min=10, max=200, step=5,
        description='Number of Countries:',
        style={'description_width': 'initial'},
        layout=Layout(width='500px')
    ),
    T=IntSlider(
        value=20, min=5, max=50, step=1,
        description='Time Horizon (years):',
        style={'description_width': 'initial'},
        layout=Layout(width='500px')
    ),
    growth_noise_std=FloatSlider(
        value=0.01, min=0.001, max=0.05, step=0.001,
        description='Growth Noise (σ):',
        style={'description_width': 'initial'},
        layout=Layout(width='500px')
    ),
    beta_true=FloatSlider(
        value=-0.02, min=-0.05, max=0.01, step=0.001,
        description='True β:',
        style={'description_width': 'initial'},
        layout=Layout(width='500px')
    ),
    log_y0_mean=FloatSlider(
        value=8.0, min=6.0, max=10.0, step=0.1,
        description='Mean Initial Income:',
        style={'description_width': 'initial'},
        layout=Layout(width='500px')
    ),
    log_y0_range=FloatSlider(
        value=4.0, min=1.0, max=6.0, step=0.1,
        description='Income Range:',
        style={'description_width': 'initial'},
        layout=Layout(width='500px')
    )
)


In [None]:
class RealWorldConvergenceAnalysis:
    def __init__(self):
        """Initialize with simulated stand-in data (see load_country_data)"""
        self.data = self.load_country_data()
        
    def load_country_data(self):
        """Load and process cross-country economic data"""
        # Create sample data (replace with actual Penn World Table data)
        countries = {
            'Advanced': ['USA', 'JPN', 'DEU', 'GBR', 'FRA', 'ITA', 'CAN'],
            'Emerging Asia': ['CHN', 'IND', 'IDN', 'THA', 'MYS', 'PHL', 'VNM'],
            'Latin America': ['BRA', 'MEX', 'ARG', 'CHL', 'COL', 'PER'],
            'Africa': ['ZAF', 'NGA', 'KEN', 'ETH', 'GHA', 'SEN']
        }
        
        base_year = 1960
        end_year = 2020
        years = range(base_year, end_year + 1)
        
        data = []
        np.random.seed(42)
        
        for region, country_list in countries.items():
            for country in country_list:
                # Generate country-specific parameters
                if region == 'Advanced':
                    initial_gdp = np.exp(np.random.normal(9.5, 0.3))
                    growth_mean = 0.02
                    growth_std = 0.01
                elif region == 'Emerging Asia':
                    initial_gdp = np.exp(np.random.normal(7.5, 0.5))
                    growth_mean = 0.04
                    growth_std = 0.015
                else:
                    initial_gdp = np.exp(np.random.normal(7.0, 0.7))
                    growth_mean = 0.02
                    growth_std = 0.02
                
                gdp = initial_gdp
                for year in years:
                    growth = np.random.normal(growth_mean, growth_std)
                    data.append({
                        'Country': country,
                        'Region': region,
                        'Year': year,
                        'GDP_per_capita': gdp,
                        'Growth_Rate': growth
                    })
                    gdp *= (1 + growth)
        
        return pd.DataFrame(data)
    
    def analyze_convergence(self, start_year=1960, end_year=2020):
        """Analyze convergence patterns in the data"""
        # Calculate initial and final GDP
        initial_gdp = self.data[self.data['Year'] == start_year].set_index('Country')['GDP_per_capita']
        final_gdp = self.data[self.data['Year'] == end_year].set_index('Country')['GDP_per_capita']
        
        # Calculate average growth rates
        years = end_year - start_year
        growth_rates = (np.log(final_gdp) - np.log(initial_gdp)) / years
        
        # Combine data
        analysis_df = pd.DataFrame({
            'Initial_GDP': initial_gdp,
            'Final_GDP': final_gdp,
            'Growth_Rate': growth_rates
        })
        analysis_df['Log_Initial_GDP'] = np.log(analysis_df['Initial_GDP'])
        analysis_df['Region'] = self.data[self.data['Year'] == start_year].set_index('Country')['Region']
        
        # Run regression
        X = sm.add_constant(analysis_df['Log_Initial_GDP'])
        model = sm.OLS(analysis_df['Growth_Rate'], X)
        results = model.fit()
        
        self.analysis_df = analysis_df
        self.regression_results = results
        
        # Create visualization
        self.plot_convergence_analysis(start_year, end_year)
        
    def plot_convergence_analysis(self, start_year, end_year):
        """Create comprehensive convergence analysis plots"""
        fig = make_subplots(
            rows=2, cols=2,
            subplot_titles=(
                'Growth vs Initial Income',
                'Income Distribution Evolution',
                'Regional Growth Patterns',
                'Income Mobility'
            )
        )
        
        # Plot 1: Growth vs Initial Income with regression
        for region in self.analysis_df['Region'].unique():
            region_data = self.analysis_df[self.analysis_df['Region'] == region]
            fig.add_trace(
                go.Scatter(
                    x=region_data['Log_Initial_GDP'],
                    y=region_data['Growth_Rate'] * 100,
                    mode='markers',
                    name=region,
                    marker=dict(size=10)
                ),
                row=1, col=1
            )
        
        # Add regression line
        x_range = np.linspace(
            self.analysis_df['Log_Initial_GDP'].min(),
            self.analysis_df['Log_Initial_GDP'].max(),
            100
        )
        # params is a pandas Series indexed by name, so select by position with .iloc
        y_pred = (self.regression_results.params.iloc[0] +
                  self.regression_results.params.iloc[1] * x_range)
        
        fig.add_trace(
            go.Scatter(
                x=x_range,
                y=y_pred * 100,
                name='Regression Line',
                line=dict(color='black', dash='dash')
            ),
            row=1, col=1
        )
        
        # Plot 2: Income Distribution Evolution
        fig.add_trace(
            go.Histogram(
                x=np.log(self.analysis_df['Initial_GDP']),
                name=f'GDP {start_year}',
                opacity=0.7
            ),
            row=1, col=2
        )
        
        fig.add_trace(
            go.Histogram(
                x=np.log(self.analysis_df['Final_GDP']),
                name=f'GDP {end_year}',
                opacity=0.7
            ),
            row=1, col=2
        )
        
        # Plot 3: Regional Growth Patterns
        regional_growth = self.analysis_df.groupby('Region')['Growth_Rate'].mean() * 100
        fig.add_trace(
            go.Bar(
                x=regional_growth.index,
                y=regional_growth.values,
                name='Regional Growth'
            ),
            row=2, col=1
        )
        
        # Plot 4: Income Mobility
        fig.add_trace(
            go.Scatter(
                x=self.analysis_df['Log_Initial_GDP'],
                y=np.log(self.analysis_df['Final_GDP']),
                mode='markers',
                name='Income Mobility',
                marker=dict(
                    color=self.analysis_df['Growth_Rate'] * 100,
                    colorscale='Viridis',
                    showscale=True,
                    colorbar=dict(title='Growth Rate (%)')
                )
            ),
            row=2, col=2
        )
        
        # Add 45-degree line
        x_range = np.linspace(
            self.analysis_df['Log_Initial_GDP'].min(),
            self.analysis_df['Log_Initial_GDP'].max(),
            100
        )
        fig.add_trace(
            go.Scatter(
                x=x_range,
                y=x_range,
                name='No Change Line',
                line=dict(color='black', dash='dot')
            ),
            row=2, col=2
        )
        
        # Update layout
        fig.update_layout(
            height=800,
            width=1200,
            showlegend=True,
            barmode='overlay',  # draw the semi-transparent histograms on top of each other
            title_text=f"Cross-Country Convergence Analysis ({start_year}-{end_year})"
        )
        
        fig.update_xaxes(title_text="Log Initial GDP", row=1, col=1)
        fig.update_xaxes(title_text="Log GDP per Capita", row=1, col=2)
        fig.update_xaxes(title_text="Region", row=2, col=1)
        fig.update_xaxes(title_text="Log Initial GDP", row=2, col=2)
        
        fig.update_yaxes(title_text="Growth Rate (%)", row=1, col=1)
        fig.update_yaxes(title_text="Count", row=1, col=2)
        fig.update_yaxes(title_text="Average Growth Rate (%)", row=2, col=1)
        fig.update_yaxes(title_text="Log Final GDP", row=2, col=2)
        
        fig.show()
        
        # Display analysis
        self.display_convergence_analysis()
    
    def display_convergence_analysis(self):
        """Display comprehensive convergence analysis"""
        # Calculate key statistics
        beta = self.regression_results.params.iloc[1]      # slope on Log_Initial_GDP
        p_value = self.regression_results.pvalues.iloc[1]
        r_squared = self.regression_results.rsquared
        
        # Calculate additional metrics
        income_dispersion_initial = np.std(np.log(self.analysis_df['Initial_GDP']))
        income_dispersion_final = np.std(np.log(self.analysis_df['Final_GDP']))
        
        regional_growth = self.analysis_df.groupby('Region')['Growth_Rate'].agg(['mean', 'std']) * 100
        
        analysis = f"""
        ### 📊 Cross-Country Convergence Analysis
        
        #### 1. Convergence Test Results:
        - **β coefficient:** {beta:.4f} (p-value: {p_value:.4f})
        - **R-squared:** {r_squared:.3f}
        - **Interpretation:** {"Evidence of convergence" if beta < 0 and p_value < 0.05 else "No significant evidence of convergence"}
        
        #### 2. Income Dispersion:
        - **Initial dispersion:** {income_dispersion_initial:.3f}
        - **Final dispersion:** {income_dispersion_final:.3f}
        - **Change:** {((income_dispersion_final/income_dispersion_initial - 1)*100):.1f}%
        
        #### 3. Regional Patterns:
        """
        
        for region in regional_growth.index:
            analysis += f"""
        - **{region}:**
          * Average growth: {regional_growth.loc[region, 'mean']:.2f}%
          * Growth volatility: {regional_growth.loc[region, 'std']:.2f}%
            """
        
        analysis += """
        #### 4. Key Findings:
        """
        analysis += self.generate_key_findings()
        
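        # Drop the 8-space source indentation (keeping nested bullets) so Markdown
        # does not render the text as a code block
        display(Markdown("\n".join(
            line[8:] if line.startswith(" " * 8) else line
            for line in analysis.splitlines()
        )))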
    
    def generate_key_findings(self):
        """Generate key findings from the analysis"""
        findings = []
        
        # Convergence assessment
        beta = self.regression_results.params.iloc[1]      # slope on Log_Initial_GDP
        p_value = self.regression_results.pvalues.iloc[1]
        if beta < 0 and p_value < 0.05:
            findings.append("Strong evidence of unconditional convergence")
        elif beta < 0:
            findings.append("Weak evidence of convergence (not statistically significant)")
        else:
            findings.append("No evidence of convergence")
        
        # Income dispersion
        initial_dispersion = np.std(np.log(self.analysis_df['Initial_GDP']))
        final_dispersion = np.std(np.log(self.analysis_df['Final_GDP']))
        if final_dispersion < initial_dispersion:
            findings.append("Income differences have decreased over time")
        else:
            findings.append("Income differences have increased over time")
        
        # Regional patterns
        regional_growth = self.analysis_df.groupby('Region')['Growth_Rate'].mean()
        fastest_region = regional_growth.idxmax()
        slowest_region = regional_growth.idxmin()
        findings.append(f"Fastest growing region: {fastest_region}")
        findings.append(f"Slowest growing region: {slowest_region}")
        
        # Mobility
        mobility = (np.log(self.analysis_df['Final_GDP']) - 
                   np.log(self.analysis_df['Initial_GDP'])).std()
        if mobility > 0.5:
            findings.append("Substantial income mobility across countries")
        else:
            findings.append("Limited income mobility across countries")
        
        return "- " + "\n- ".join(findings)

# Run the analysis (on simulated stand-in data until real data is loaded)
real_world_analysis = RealWorldConvergenceAnalysis()
real_world_analysis.analyze_convergence()

## 📝 Exercises and Problems

### Exercise 1: Basic Convergence Analysis
Using the `RealWorldConvergenceAnalysis` class we just created:

1. Analyze convergence for different time periods:
   - Compare 1960-1990 vs 1990-2020
   - What differences do you observe in the convergence patterns?
   - How do you interpret these differences?

2. Regional Analysis:
   - Which regions show stronger evidence of convergence?
   - Calculate the speed of convergence for each region
   - What factors might explain the differences in convergence speeds?

### Exercise 2: Advanced Topics

1. Club Convergence:
   - Modify the code to test for convergence clubs
   - Try different methods of identifying clubs (e.g., initial income thresholds, clustering)
   - Compare the convergence patterns within and between clubs

2. Conditional Convergence:
   - Add control variables to the analysis (e.g., education, institutions)
   - Compare the results with unconditional convergence
   - Discuss the policy implications of your findings

### Exercise 3: Research Project

Choose one of the following research questions:

1. **The Asian Growth Miracle**
   - Analyze the convergence patterns of Asian economies
   - Compare with other regions
   - What factors contributed to their success?

2. **Middle Income Trap**
   - Identify countries that might be in a middle-income trap
   - Analyze their growth patterns
   - Propose potential policy solutions

3. **Impact of Global Events**
   - How did major events (e.g., financial crises, pandemic) affect convergence?
   - Are some regions more resilient than others?
   - What are the implications for convergence theory?

### Coding Challenge

1. Enhance the `RealWorldConvergenceAnalysis` class:
   - Add methods for conditional convergence analysis
   - Implement club convergence detection
   - Create new visualizations for your analysis

2. Create a dashboard:
   - Use interactive widgets to explore the data
   - Allow users to select different time periods and regions
   - Add dynamic updates to all plots

### Discussion Questions

1. What are the main drivers of economic convergence?
2. Why do some countries converge while others don't?
3. How do institutional factors affect convergence patterns?
4. What role does technology diffusion play in convergence?
5. How might future trends (e.g., AI, climate change) affect convergence?

### Data Project

1. Replace the simulated data with real data:
   - Download data from Penn World Table or World Bank
   - Clean and prepare the data
   - Replicate the analysis with real data
   - Compare your findings with the literature

### Policy Analysis

1. Based on the convergence patterns you observe:
   - What policies might promote faster convergence?
   - How do different policy choices affect convergence speed?
   - What are the trade-offs involved in these policies?

### Bonus Challenge

1. Create a convergence prediction model:
   - Use machine learning techniques
   - Make predictions about future convergence patterns
   - Evaluate the accuracy of your predictions

## 💡 Solution Approaches and Hints

### Exercise 1 Hints:

1. Time Period Analysis:
   ```python
   # Example approach:
   analysis = RealWorldConvergenceAnalysis()
   # For 1960-1990
   analysis.analyze_convergence(start_year=1960, end_year=1990)
   # For 1990-2020
   analysis.analyze_convergence(start_year=1990, end_year=2020)
   ```

2. Regional Analysis:
   - Consider extending the class with a new method for regional analysis
   - Use pandas groupby operations for regional comparisons
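   - Calculate the half-life of convergence as ln(2)/λ, where the speed λ is recovered from the regression slope via β = -(1 - e^(-λT))/T, i.e. λ = -ln(1 + βT)/T (for small |βT|, λ ≈ -β)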
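
A regression slope β is not itself the convergence speed. Assuming income gaps decay exponentially at an annual rate λ, the slope measured over a T-year window satisfies β = -(1 - e^(-λT))/T, so λ = -ln(1 + βT)/T and the half-life is ln(2)/λ. A minimal sketch of that conversion, with hypothetical helper names and illustrative inputs:

```python
import math

def convergence_speed(beta, T):
    """Implied annual convergence speed from a growth-regression slope over T years.

    Assumes the slope satisfies beta = -(1 - exp(-lambda * T)) / T.
    """
    if 1 + beta * T <= 0:
        raise ValueError("1 + beta*T must be positive for the speed to be defined")
    return -math.log(1 + beta * T) / T

def half_life(beta, T):
    """Years needed to close half of the gap to the steady state."""
    lam = convergence_speed(beta, T)
    if lam <= 0:
        raise ValueError("no convergence: the implied speed is not positive")
    return math.log(2) / lam

beta_hat, T = -0.02, 20  # illustrative values, not estimates from real data
lam = convergence_speed(beta_hat, T)
print(f"speed = {lam:.4f}, half-life = {half_life(beta_hat, T):.1f} years")
# prints: speed = 0.0255, half-life = 27.1 years
```

A non-negative slope implies no convergence, so `half_life` raises an error in that case instead of returning a negative or infinite number.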

### Exercise 2 Hints:

1. Club Convergence:
   - Consider using clustering algorithms (e.g., K-means)
   - Look for natural breaks in the income distribution
   - Test convergence within each identified club

2. Conditional Convergence:
   - Use multiple regression analysis
   - Consider adding control variables like:
     * Human capital indicators
     * Investment rates
     * Institutional quality measures
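
The following sketch illustrates why adding controls can change the estimated slope. It uses purely simulated data: the investment-rate control, its coefficient `gamma_true`, and the helper `ols` are all hypothetical, and `numpy.linalg.lstsq` stands in for a full regression package.

```python
import numpy as np

rng = np.random.default_rng(1)
N, beta_true, gamma_true = 500, -0.02, 0.10

# Hypothetical control: an investment rate that tends to be higher in richer countries
log_Y0 = rng.uniform(6.0, 10.0, N)
invest = 0.20 + 0.05 * (log_Y0 - 8.0) + rng.normal(0, 0.03, N)

# Growth depends on initial income AND on the control
g = (0.02 + beta_true * (log_Y0 - 8.0)
     + gamma_true * (invest - 0.20) + rng.normal(0, 0.005, N))

def ols(y, *regressors):
    """Least-squares coefficients [intercept, slopes...] via numpy.linalg.lstsq."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

b_uncond = ols(g, log_Y0)[1]        # omits the control
b_cond = ols(g, log_Y0, invest)[1]  # controls for investment

print(f"unconditional slope: {b_uncond:.4f}")
print(f"conditional slope:   {b_cond:.4f}")
```

Because the omitted control is positively correlated with initial income and raises growth, the unconditional slope is pulled toward zero, while the conditional slope should sit close to `beta_true`. Whether real data behave this way is an empirical question.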

### Research Project Tips:

1. Asian Growth Miracle:
   - Focus on specific time periods (e.g., 1960-1990)
   - Compare growth rates before and after major policy changes
   - Look for structural breaks in the data

2. Middle Income Trap:
   - Define income thresholds
   - Track transition probabilities
   - Identify "stuck" countries

3. Global Events Impact:
   - Use dummy variables for crisis periods
   - Consider difference-in-differences analysis
   - Look for structural breaks

### Coding Challenge Tips:

1. Class Enhancement:
   ```python
   def add_control_variables(self, controls_df):
       # Merge with existing data
       self.data = pd.merge(self.data, controls_df,
                           on=['Country', 'Year'])
   
   def detect_clubs(self, n_clubs=3):
       # Use clustering
       from sklearn.cluster import KMeans
       # Implementation details...
   ```

2. Dashboard Creation:
   ```python
   import ipywidgets as widgets
   
   # Example widget setup
   year_slider = widgets.IntRangeSlider(
       value=[1960, 2020],
       min=1960,
       max=2020,
       step=1
   )
   ```

### Data Project Guidance:

1. Data Sources:
   - Penn World Table: https://www.rug.nl/ggdc/productivity/pwt/
   - World Bank WDI: https://databank.worldbank.org/source/world-development-indicators
   - IMF WEO: https://www.imf.org/en/Publications/WEO

2. Data Cleaning Tips:
   - Handle missing values appropriately
   - Check for outliers
   - Ensure consistent units and base years

### Policy Analysis Framework:

1. Steps for Policy Analysis:
   - Identify policy variables
   - Create policy scenarios
   - Simulate outcomes
   - Compare results

2. Policy Evaluation:
   - Consider both short and long-run effects
   - Account for implementation challenges
   - Evaluate distributional impacts
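
As a rough illustration of the "create scenarios, simulate, compare" steps, the sketch below treats the saving rate as the policy variable in a basic discrete-time Solow model. The function name `simulate_solow` and every parameter value are assumptions made for this example.

```python
import numpy as np

def simulate_solow(s, k0=1.0, alpha=0.33, delta=0.05, n=0.01, periods=100):
    """Simulate output per worker y = k**alpha for a given saving rate s."""
    k = np.empty(periods + 1)
    k[0] = k0
    for t in range(periods):
        # Capital accumulation: investment minus break-even investment
        k[t + 1] = k[t] + s * k[t] ** alpha - (delta + n) * k[t]
    return k ** alpha

# Policy scenarios: the saving rate is the (assumed) policy variable
scenarios = {"baseline": 0.20, "reform": 0.30}
paths = {name: simulate_solow(s) for name, s in scenarios.items()}

# Compare short-run (10 periods) and long-run (100 periods) outcomes
for name, y in paths.items():
    print(f"{name}: y after 10 periods = {y[10]:.3f}, after 100 periods = {y[-1]:.3f}")

gain_short = paths["reform"][10] / paths["baseline"][10] - 1
gain_long = paths["reform"][-1] / paths["baseline"][-1] - 1
print(f"Reform gain: {gain_short:.1%} (short run), {gain_long:.1%} (long run)")
```

Because the higher saving rate raises the steady state rather than the long-run growth rate, the gain builds up gradually, which is why short- and long-run effects should be reported separately.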

### Machine Learning Approach:

1. Prediction Model:
   ```python
   from sklearn.model_selection import train_test_split
   from sklearn.ensemble import RandomForestRegressor
   
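   # Example approach: `df` is assumed to be a country-level DataFrame;
   # the column names below are illustrative
   X = df[['Log_Initial_GDP', 'Education', 'Investment_Rate']]  # Feature matrix
   y = df['Growth_Rate']  # Target variable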
   
   X_train, X_test, y_train, y_test = train_test_split(
       X, y, test_size=0.2
   )
   
   model = RandomForestRegressor()
   model.fit(X_train, y_train)
   ```

2. Model Evaluation:
   - Use cross-validation
   - Consider different time horizons
   - Evaluate prediction uncertainty
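
A minimal cross-validation sketch for the evaluation step is shown below. It runs on simulated data, so the feature names and coefficients are assumptions. The spread of the fold scores gives a crude sense of how stable the predictions are.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n = 200

# Simulated features and target (all values are assumptions)
log_y0 = rng.uniform(6, 11, n)
investment = rng.normal(0.2, 0.05, n)
growth = 0.15 - 0.015 * log_y0 + 0.1 * investment + rng.normal(0, 0.005, n)

X = np.column_stack([log_y0, investment])
model = RandomForestRegressor(n_estimators=100, random_state=0)

# 5-fold cross-validated R^2: each fold is held out once for testing
scores = cross_val_score(model, X, growth, cv=5, scoring="r2")

print(f"R^2 per fold: {np.round(scores, 3)}")
print(f"Mean R^2: {scores.mean():.3f} (std across folds: {scores.std():.3f})")
```

With real panel data, a time-based split (train on early years, test on later years) is usually a more honest test than random folds.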

### Key Concepts to Remember:

1. β-convergence vs σ-convergence
2. Absolute vs conditional convergence
3. Club convergence vs global convergence
4. Growth accounting fundamentals
5. Role of initial conditions
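
To make the first distinction concrete, the sketch below simulates countries moving toward a common steady state and computes both measures: the slope of growth on initial income (β-convergence) and the cross-country dispersion of log income over time (σ-convergence). The adjustment speed `lam`, the steady state `log_y_star`, and the shock size are assumed values.

```python
import numpy as np

rng = np.random.default_rng(1)
n_countries, n_years = 50, 40
lam, log_y_star, shock_sd = 0.05, 10.0, 0.02  # assumed parameters

# Each country closes a fraction lam of its gap to log_y_star every year
log_y = np.empty((n_years + 1, n_countries))
log_y[0] = rng.uniform(7, 10, n_countries)
for t in range(n_years):
    shocks = rng.normal(0, shock_sd, n_countries)
    log_y[t + 1] = log_y[t] + lam * (log_y_star - log_y[t]) + shocks

# beta-convergence: slope of average growth on initial log income
avg_growth = (log_y[-1] - log_y[0]) / n_years
beta_hat, _ = np.polyfit(log_y[0], avg_growth, 1)

# sigma-convergence: cross-country standard deviation of log income by year
sigma = log_y.std(axis=1)

print(f"Estimated beta: {beta_hat:.4f}")
print(f"Dispersion: {sigma[0]:.3f} (first year) -> {sigma[-1]:.3f} (last year)")
```

Here both measures point to convergence. With larger shocks, dispersion can stop falling even though the slope stays negative, so β-convergence does not by itself guarantee σ-convergence.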

### Common Pitfalls to Avoid:

1. Not accounting for:
   - Heteroskedasticity
   - Serial correlation
   - Endogeneity

2. Overlooking:
   - Data quality issues
   - Structural breaks
   - Non-linear relationships

In [None]:
class AdvancedConvergenceAnalysis(RealWorldConvergenceAnalysis):
    def __init__(self):
        super().__init__()
        self.add_simulated_controls()
    
    def add_simulated_controls(self):
        """Add simulated control variables"""
        np.random.seed(42)
        
        # Add control variables
        countries = self.data['Country'].unique()
        years = self.data['Year'].unique()
        
        # Create control variables
        controls = []
        for country in countries:
            # Country-specific base values
            education_base = np.random.normal(8, 2)  # years of schooling
            investment_base = np.random.normal(0.2, 0.05)  # investment rate
            institutions_base = np.random.normal(0.6, 0.2)  # institutional quality
            
            for year in years:
                # Add time trends and random variations
                education = education_base + 0.05 * (year - 1960) + np.random.normal(0, 0.2)
                investment = investment_base + np.random.normal(0, 0.02)
                institutions = institutions_base + 0.001 * (year - 1960) + np.random.normal(0, 0.05)
                
                controls.append({
                    'Country': country,
                    'Year': year,
                    'Education': education,
                    'Investment_Rate': investment,
                    'Institutional_Quality': institutions
                })
        
        controls_df = pd.DataFrame(controls)
        self.data = pd.merge(self.data, controls_df, on=['Country', 'Year'])
    
    def analyze_conditional_convergence(self, start_year=1960, end_year=2020):
        """Analyze conditional convergence with control variables"""
        # Prepare data
        initial_data = self.data[self.data['Year'] == start_year].set_index('Country')
        final_data = self.data[self.data['Year'] == end_year].set_index('Country')
        
        # Calculate growth rates
        years = end_year - start_year
        growth_rates = (np.log(final_data['GDP_per_capita']) - 
                       np.log(initial_data['GDP_per_capita'])) / years
        
        # Prepare control variables (average over the analysis period only)
        in_period = self.data['Year'].between(start_year, end_year)
        controls = self.data[in_period].groupby('Country')[
            ['Education', 'Investment_Rate', 'Institutional_Quality']
        ].mean()
        
        # Combine data; drop countries missing a start- or end-year observation,
        # since the OLS fits below cannot handle NaN values
        analysis_df = pd.DataFrame({
            'Growth_Rate': growth_rates,
            'Initial_GDP': initial_data['GDP_per_capita'],
            'Log_Initial_GDP': np.log(initial_data['GDP_per_capita']),
            'Education': controls['Education'],
            'Investment_Rate': controls['Investment_Rate'],
            'Institutional_Quality': controls['Institutional_Quality'],
            'Region': initial_data['Region']
        }).dropna()
        
        # Run regressions
        # 1. Unconditional convergence
        X1 = sm.add_constant(analysis_df['Log_Initial_GDP'])
        model1 = sm.OLS(analysis_df['Growth_Rate'], X1)
        results1 = model1.fit()
        
        # 2. Conditional convergence
        X2 = sm.add_constant(analysis_df[['Log_Initial_GDP', 'Education', 
                                        'Investment_Rate', 'Institutional_Quality']])
        model2 = sm.OLS(analysis_df['Growth_Rate'], X2)
        results2 = model2.fit()
        
        self.analysis_df = analysis_df
        self.unconditional_results = results1
        self.conditional_results = results2
        
        # Create visualizations
        self.plot_conditional_convergence(start_year, end_year)
    
    def plot_conditional_convergence(self, start_year, end_year):
        """Create comprehensive conditional convergence analysis plots"""
        fig = make_subplots(
            rows=2, cols=2,
            subplot_titles=(
                'Unconditional vs Conditional Convergence',
                'Control Variables Impact',
                'Regional Patterns with Controls',
                'Convergence Speed Analysis'
            )
        )
        
        # Plot 1: Unconditional vs Conditional Convergence
        for region in self.analysis_df['Region'].unique():
            region_data = self.analysis_df[self.analysis_df['Region'] == region]
            fig.add_trace(
                go.Scatter(
                    x=region_data['Log_Initial_GDP'],
                    y=region_data['Growth_Rate'] * 100,
                    mode='markers',
                    name=region,
                    marker=dict(size=10)
                ),
                row=1, col=1
            )
        
        # Add regression lines
        x_range = np.linspace(
            self.analysis_df['Log_Initial_GDP'].min(),
            self.analysis_df['Log_Initial_GDP'].max(),
            100
        )
        
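        # Unconditional (coefficients are selected by label; positional []
        # indexing on a labelled Series is deprecated in recent pandas)
        y_pred1 = (self.unconditional_results.params['const'] +
                   self.unconditional_results.params['Log_Initial_GDP'] * x_range)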
        fig.add_trace(
            go.Scatter(
                x=x_range,
                y=y_pred1 * 100,
                name='Unconditional',
                line=dict(color='black', dash='dash')
            ),
            row=1, col=1
        )
        
        # Plot 2: Control Variables Impact
        control_vars = ['Education', 'Investment_Rate', 'Institutional_Quality']
        coefficients = self.conditional_results.params[2:]
        p_values = self.conditional_results.pvalues[2:]
        
        fig.add_trace(
            go.Bar(
                x=control_vars,
                y=coefficients * 100,
                name='Effect Size',
                error_y=dict(
                    type='data',
                    array=self.conditional_results.bse[2:] * 100,
                    visible=True
                )
            ),
            row=1, col=2
        )
        
        # Plot 3: Regional Patterns with Controls
        residuals = self.conditional_results.resid
        regional_effects = self.analysis_df.groupby('Region').agg({
            'Growth_Rate': 'mean',
            'Log_Initial_GDP': 'mean'
        })
        
        fig.add_trace(
            go.Scatter(
                x=regional_effects['Log_Initial_GDP'],
                y=regional_effects['Growth_Rate'] * 100,
                mode='markers+text',
                text=regional_effects.index,
                textposition='top center',
                name='Regional Effects'
            ),
            row=2, col=1
        )
        
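        # Plot 4: Convergence Speed Analysis
        # The slope b of average growth on log initial income over T years
        # implies a convergence rate lambda = -ln(1 + b*T) / T and a half-life
        # of ln(2) / lambda. Using -b directly is only a good approximation
        # for short horizons. The half-life is defined only for -1/T < b < 0,
        # so other cases are reported as NaN.
        T = end_year - start_year
        
        def half_life(beta):
            if not (-1 / T < beta < 0):
                return np.nan
            return np.log(2) / (-np.log(1 + beta * T) / T)
        
        half_life_uncond = half_life(
            self.unconditional_results.params['Log_Initial_GDP'])
        half_life_cond = half_life(
            self.conditional_results.params['Log_Initial_GDP'])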
        
        fig.add_trace(
            go.Bar(
                x=['Unconditional', 'Conditional'],
                y=[half_life_uncond, half_life_cond],
                name='Convergence Half-Life'
            ),
            row=2, col=2
        )
        
        # Update layout
        fig.update_layout(
            height=800,
            width=1200,
            showlegend=True,
            title_text=f"Conditional Convergence Analysis ({start_year}-{end_year})"
        )
        
        fig.show()
        
        # Display analysis
        self.display_conditional_analysis()
    
    def display_conditional_analysis(self):
        """Display comprehensive conditional convergence analysis"""
        # Calculate key statistics (coefficients are selected by label)
        uncond = self.unconditional_results
        cond = self.conditional_results
        uncond_beta = uncond.params['Log_Initial_GDP']
        cond_beta = cond.params['Log_Initial_GDP']
        
        uncond_r2 = uncond.rsquared
        cond_r2 = cond.rsquared
        
        # Build the report line by line without leading indentation:
        # Markdown renders lines indented by four or more spaces as code
        lines = [
            "### 📊 Conditional vs Unconditional Convergence Analysis",
            "",
            "#### 1. Convergence Coefficients:",
            f"- **Unconditional β:** {uncond_beta:.4f} "
            f"(p-value: {uncond.pvalues['Log_Initial_GDP']:.4f})",
            f"- **Conditional β:** {cond_beta:.4f} "
            f"(p-value: {cond.pvalues['Log_Initial_GDP']:.4f})",
            "",
            "#### 2. Model Fit:",
            f"- **Unconditional R²:** {uncond_r2:.3f}",
            f"- **Conditional R²:** {cond_r2:.3f}",
            f"- **Improvement:** {((cond_r2 / uncond_r2 - 1) * 100):.1f}%",
            "",
            "#### 3. Control Variables Impact:",
        ]
        
        controls = ['Education', 'Investment_Rate', 'Institutional_Quality']
        for control in controls:
            coef = cond.params[control]
            p_val = cond.pvalues[control]
            stars = ("***" if p_val < 0.01 else "**" if p_val < 0.05
                     else "*" if p_val < 0.1 else "Not significant")
            lines += [
                f"- **{control}:**",
                f"  * Coefficient: {coef:.4f}",
                f"  * P-value: {p_val:.4f}",
                f"  * Significance: {stars}",
            ]
        
        lines += ["", "#### 4. Key Findings:",
                  self.generate_conditional_findings()]
        
        display(Markdown("\n".join(lines)))
    
    def generate_conditional_findings(self):
        """Generate key findings from the conditional analysis"""
        findings = []
        
        # Compare models
        if self.conditional_results.rsquared > self.unconditional_results.rsquared:
            improvement = ((self.conditional_results.rsquared / 
                          self.unconditional_results.rsquared - 1) * 100)
            findings.append(f"Controlling for additional variables improves model fit by {improvement:.1f}%")
        
        # Analyze convergence speed (coefficient on log initial income, by label)
        uncond_speed = -self.unconditional_results.params['Log_Initial_GDP']
        cond_speed = -self.conditional_results.params['Log_Initial_GDP']
        
        if cond_speed > uncond_speed:
            findings.append("Conditional convergence is faster than unconditional convergence")
        else:
            findings.append("Controlling for additional variables reduces the estimated convergence speed")
        
        # Analyze control variables
        controls = ['Education', 'Investment_Rate', 'Institutional_Quality']
        for control in controls:
            coef = self.conditional_results.params[control]
            p_val = self.conditional_results.pvalues[control]
            if p_val < 0.05:
                direction = "positive" if coef > 0 else "negative"
                findings.append(f"{control} has a significant {direction} effect on growth")
        
        # Regional patterns: cross-country dispersion of growth rates by region
        # (these are raw growth rates, not regression residuals)
        regional_dispersion = self.analysis_df.groupby('Region')['Growth_Rate'].std()
        most_dispersed = regional_dispersion.idxmax()
        findings.append(f"Largest cross-country dispersion in growth rates observed in {most_dispersed}")
        
        return "- " + "\n- ".join(findings)

# Example usage
advanced_analysis = AdvancedConvergenceAnalysis()
advanced_analysis.analyze_conditional_convergence()

# 🏁 Conclusion

This simulation illustrates the concept of economic convergence.

* When the **True Slope $\beta$** is negative, the underlying tendency is for initially poorer countries (low $\log Y_0$) to grow faster.
* The **regression analysis** attempts to recover this underlying relationship from the simulated data.
* **Noise** in growth rates can obscure the true relationship, making it harder to detect convergence statistically (i.e., leading to a higher p-value or an estimated slope closer to zero), especially with a small number of countries ($N$).
* Real-world tests of unconditional convergence often find weak or no evidence, suggesting that factors *other* than just the initial income level (differences in saving, education, institutions, etc., leading to different steady states) are crucial determinants of growth - motivating the concept of **conditional convergence**.
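
The point about noise can be checked with a small Monte Carlo sketch. It follows the setup equation $g_i = \text{constant} + \beta \log(Y_{i0}) + \epsilon_i$ and re-estimates the slope many times for each noise level. The helper name `estimate_slope`, the income range, and the constant are assumptions made for this illustration.

```python
import numpy as np

def estimate_slope(beta, noise_sd, n_countries, seed=0):
    """Simulate one cross-section of countries and return the fitted slope."""
    rng = np.random.default_rng(seed)
    log_y0 = rng.uniform(6.0, 11.0, n_countries)     # assumed income range
    eps = rng.normal(0, noise_sd, n_countries)
    growth = 0.3 + beta * log_y0 + eps               # assumed constant of 0.3
    slope, _ = np.polyfit(log_y0, growth, 1)
    return slope

true_beta, n_countries = -0.03, 30
mean_slope, spread = {}, {}
for noise_sd in (0.005, 0.02, 0.08):
    slopes = [estimate_slope(true_beta, noise_sd, n_countries, seed=s)
              for s in range(200)]
    mean_slope[noise_sd] = np.mean(slopes)
    spread[noise_sd] = np.std(slopes)
    print(f"noise={noise_sd}: mean slope={mean_slope[noise_sd]:.4f}, "
          f"spread across samples={spread[noise_sd]:.4f}")
```

The average estimate stays close to the true slope at every noise level, but the spread across samples grows with the noise, so any single sample becomes less informative.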

# 📘 Convergence Hypothesis

This model tests whether poorer countries grow faster, a key prediction of the Solow model with diminishing returns.

We simulate:
$$
g = \frac{1}{T} \log\left(\frac{Y_T}{Y_0}\right)
\quad \text{vs.} \quad \log(Y_0)
$$

- A **negative slope** in the regression suggests **convergence**
- A **flat slope** suggests no systematic convergence, and a **positive slope** suggests **divergence**
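
As a quick numerical check of the growth formula, the sketch below computes $g$ for a single hypothetical country whose income quadruples over 40 years. The income figures are made up.

```python
import numpy as np

# Hypothetical country: income per capita rises from 2,000 to 8,000 in 40 years
y0, yT, T = 2000.0, 8000.0, 40

g = np.log(yT / y0) / T   # average annual log growth rate
log_y0 = np.log(y0)       # the regressor used on the x-axis

print(f"g = {g:.4f} ({g:.2%} per year), log(Y0) = {log_y0:.3f}")
```

Each country contributes one such pair $(\log Y_0, g)$ to the scatter plot and regression.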

**Sources**:  
- GrowthEcon [Ch. 6](https://growthecon.com/StudyGuide/convergence.html)  
- Charles Jones, *Macroeconomics*, Ch. 6

# 📝 Guided Student Exercise: Convergence in Practice
Apply your understanding!

1. **Suppose you set the true slope $\beta$ to -0.03 and noise to 0.01.**
    - What do you expect the regression slope to be?
    - Use the interactive plot above to check your answer. Does the estimated slope match the true value?

2. **Experiment:**
    - Increase the noise level. How does this affect the clarity of the convergence pattern and the regression results?
    - Try increasing the number of countries. Does the regression become more reliable?

3. **Challenge:**
    - Set $\beta$ to zero. What do the plot and regression show?

---
# 🌍 Real-World Data Extension: Global Growth and Convergence
Let's see how convergence looks in real data. We'll plot average growth rates vs. initial income for a sample of countries using World Bank data.

```python
import numpy as np
import pandas as pd
import plotly.express as px  # trendline='ols' also requires statsmodels

# Example URL for World Bank GDP per capita (NY.GDP.PCAP.KD); adjust if the file has moved
wb_url = 'https://databankfiles.worldbank.org/public/ddpext_download/ISG/INT/CSV/NY.GDP.PCAP.KD.csv'
df = pd.read_csv(wb_url, skiprows=4)
# Keep the endpoint years; the file may also contain regional aggregates to filter out
df = df[['Country Name', '1960', '2020']].dropna().copy()
# Average annual log growth rate and log initial income
df['g_avg'] = (np.log(df['2020']) - np.log(df['1960'])) / (2020 - 1960)
df['log_y0'] = np.log(df['1960'])
fig = px.scatter(
    df, x='log_y0', y='g_avg', hover_name='Country Name', trendline='ols',
    labels={'log_y0': 'Log Initial GDP per Capita (1960)',
            'g_avg': 'Avg. Annual Growth Rate (1960-2020)'},
)
fig.update_layout(title='Global Convergence: Growth vs. Initial Income (1960-2020)')
fig.show()
```

- Is the slope negative?
- Are there outliers or regional patterns?

---
# 📚 Further Reading & Resources
- [Mankiw, N. G. (2021). *Macroeconomics* (11th Edition), Chapter 8: Economic Growth II.](https://www.macmillanlearning.com/college/us/product/Macroeconomics/p/1319243584)
- [World Bank: World Development Indicators](https://databank.worldbank.org/source/world-development-indicators)
- [GrowthEcon Study Guide: Convergence](https://growthecon.com/StudyGuide/convergence.html)

---
# 🎨 Tips for Visual Exploration
- Use the interactive plot to try different values of $\beta$, noise, and sample size.
- Discuss with classmates: What real-world factors might explain why some countries do not converge?
- How does conditional convergence differ from unconditional convergence?

---
# 🚀 Next Steps
- Continue to the next notebook to explore the Solow model, growth accounting, and the role of technology!