# Inflation

### 'Trust your friends' strategy, slightly modified

By default, when the World Bank (WB) and IMF values are close, I keep the WB value.

When they differ by more than the threshold, I prefer IMF because:

- WB can be slow to reflect abrupt economic events because it waits for official confirmation, which makes it more conservative.
- IMF publishes more up-to-date and responsive economic estimates.
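The rule above can be sketched for a single pair of values. This is only an illustration: `choose_inflation` is a hypothetical helper that is not part of the notebook, and the default threshold of 20 is taken from the function call used later in this section.

```python
import math


def choose_inflation(wb, imf, threshold=20.0):
    """Hypothetical scalar version of the WB-first rule (illustration only)."""
    wb_missing = wb is None or math.isnan(wb)
    imf_missing = imf is None or math.isnan(imf)
    if wb_missing and imf_missing:
        return math.nan   # nothing to choose from
    if wb_missing:
        return imf        # fall back to IMF when WB has no value
    if imf_missing:
        return wb         # nothing to compare against, keep WB
    if abs(wb - imf) > threshold:
        return imf        # large disagreement: prefer IMF
    return wb             # small disagreement: trust WB


print(choose_inflation(4.2, 5.0))       # small gap, keeps the WB value
print(choose_inflation(35.0, 90.0))     # gap above the threshold, switches to IMF
print(choose_inflation(math.nan, 7.5))  # WB missing, falls back to IMF
```

The comparison is strict, so a gap exactly equal to the threshold still keeps the WB value.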

In [1]:
import pandas as pd
import numpy as np

In [2]:
inflacion_wb = pd.read_csv('../Inflacion/inflacion_wb_clean.csv', encoding='latin1')
inflacion_imf = pd.read_csv('../Inflacion/inflacion_imf_clean.csv')

In [3]:
def fusion_prioriza_wb_con_tolerancia(wb_df, imf_df, threshold=20):
    """Merge WB and IMF inflation tables, keeping WB unless the gap exceeds `threshold`."""
    wb = wb_df.set_index('Country')
    imf = imf_df.set_index('Country')

    # Keep only the columns present in both sources whose names are years
    year_cols = [col for col in wb.columns if col in imf.columns and str(col).isdigit()]

    # Combined country index; Israel is forced in even if neither source lists it
    all_countries = wb.index.union(imf.index).union(['Israel'])

    fusionada = pd.DataFrame(index=all_countries)

    for col in year_cols:
        # Align both sources on the combined country index
        wb_val = wb[col].reindex(all_countries)
        imf_val = imf[col].reindex(all_countries)

        # Absolute difference between the two sources
        diferencia = (wb_val - imf_val).abs()

        # Use IMF where the difference exceeds the threshold
        # (the comparison is False wherever either value is NaN)
        usar_imf = diferencia > threshold

        # Start from WB, overwrite the flagged cells with IMF
        fusion_col = wb_val.copy()
        fusion_col[usar_imf] = imf_val[usar_imf]
        fusion_col = fusion_col.combine_first(imf_val)  # if WB is NaN, use IMF

        fusionada[col] = fusion_col

    return fusionada.reset_index()


inflacion_fusionada = fusion_prioriza_wb_con_tolerancia(inflacion_wb, inflacion_imf, threshold=20)

inflacion_fusionada

| | Country | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | ... | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ASEAN-5 | 16.600000 | 12.000000 | 7.800000 | 7.300000 | 14.000000 | 6.800000 | 2.900000 | 5.300000 | 6.500000 | ... | 3.000000 | 2.000000 | 2.800000 | 2.600000 | 1.900000 | 0.900000 | 2.000000 | 4.800000 | 3.500000 | 2.000000 |
| 1 | Advanced economies | 13.600000 | 11.100000 | 8.400000 | 6.100000 | 6.500000 | 5.500000 | 2.900000 | 3.200000 | 3.600000 | ... | 0.300000 | 0.700000 | 1.700000 | 2.000000 | 1.400000 | 0.700000 | 3.100000 | 7.300000 | 4.600000 | 2.600000 |
| 2 | Afghanistan | 5.253530 | 5.253530 | 5.253530 | 5.253530 | 5.253530 | 5.253530 | 5.253530 | 5.253530 | 5.253530 | ... | -0.661709 | 4.383892 | 4.975952 | 0.626149 | 2.302373 | 5.601888 | 5.133203 | 13.712102 | -4.644709 | -6.601186 |
| 3 | Africa (Region) | 16.900000 | 15.900000 | 14.800000 | 16.200000 | 14.000000 | 13.900000 | 17.500000 | 17.000000 | 14.900000 | ... | 7.200000 | 9.600000 | 12.600000 | 11.400000 | 9.400000 | 11.100000 | 12.300000 | 14.200000 | 18.200000 | 20.100000 |
| 4 | Africa Eastern and Southern | 15.066512 | 14.461591 | 12.139918 | 11.567524 | 10.983863 | 13.006566 | 13.891972 | 12.563443 | 12.522258 | ... | 5.245878 | 6.596505 | 6.399343 | 4.720805 | 4.644967 | 5.405162 | 7.240978 | 10.773751 | 7.126975 | 10.100749 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 251 | Western Hemisphere (Region) | 22.900000 | 21.200000 | 20.900000 | 23.000000 | 24.200000 | 26.100000 | 21.500000 | 29.400000 | 41.900000 | ... | 1.600000 | 2.400000 | 3.400000 | 3.700000 | 3.500000 | 2.700000 | 6.200000 | 9.800000 | 7.200000 | 6.900000 |
| 252 | World | 5.365736 | 12.442437 | 10.221727 | 8.669272 | 8.080320 | 6.807567 | 5.778113 | 5.710119 | 7.113407 | ... | 1.443857 | 1.605539 | 2.254277 | 2.442583 | 2.206073 | 1.905664 | 3.475403 | 7.930929 | 5.733163 | 5.365736 |
| 253 | Yemen | 21.079394 | 17.254706 | 16.787135 | 18.415789 | 25.305263 | 89.905263 | 23.697076 | 97.301734 | 53.413295 | ... | 22.000000 | 21.300000 | 30.400000 | 33.600000 | 15.700000 | 21.700000 | 31.500000 | 29.500000 | 0.900000 | 33.900000 |
| 254 | Yemen, Rep. | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | ... | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 | 17.495677 |


In [4]:
inflacion_fusionada.to_csv('../Inflacion/inflacion_fusionada.csv', index=False, encoding='latin1')

# Life expectancy

### 'Trust your friends' strategy, but average the two when they differ a lot

Trust the United Nations (UN) data by default, but when it differs from the WB value by more than a threshold (for example, more than 1 year), take the mean of the two.
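A few concrete pairs make the behavior at the boundary easier to see. The sketch below is illustrative only: `pick_life_expectancy` is a hypothetical name, the sample values are made up rather than taken from either dataset, and the 1-year threshold is the example figure mentioned above.

```python
import math


def pick_life_expectancy(un, wb, max_gap=1.0):
    """Hypothetical scalar version of the UN-first rule (illustration only)."""
    if math.isnan(un):
        return wb                 # UN missing: use WB (may itself be NaN)
    if math.isnan(wb):
        return un                 # WB missing: keep UN
    if abs(un - wb) > max_gap:
        return (un + wb) / 2      # large disagreement: take the mean
    return un                     # otherwise trust UN


# Made-up sample pairs (UN value, WB value), in years
cases = [(70.0, 72.0), (70.0, 70.5), (70.0, 71.0), (math.nan, 68.0)]
for un, wb in cases:
    print(un, wb, "->", pick_life_expectancy(un, wb))
```

Because the comparison is strict, a gap of exactly 1 year, as in the pair (70.0, 71.0), still returns the UN value.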

In [5]:
life_wb = pd.read_csv('../Life_expectancy/life_expectancy_wb_clean.csv', encoding='latin1')
life_un = pd.read_csv('../Life_expectancy/life_expectancy_un_clean.csv')

In [6]:
# Years shared by both sources
common_years = [str(year) for year in range(1990, 2025)]

life_un_filtered = life_un[['Country', 'ISO3'] + common_years]
life_wb_filtered = life_wb[['Country', 'ISO3'] + common_years]

# Merge on country name and ISO3 code
vida_fusionada = pd.merge(
    life_un_filtered,
    life_wb_filtered,
    on=['Country', 'ISO3'],
    how='outer',
    suffixes=('_UN', '_WB')
)

umbral_diferencia = 1.0  # years
final_vida = vida_fusionada[['Country', 'ISO3']].copy()


def elegir_valor(un, wb):
    """Return the UN value by default, or the UN/WB mean when they differ too much."""
    if pd.notna(un) and pd.notna(wb):
        if abs(un - wb) > umbral_diferencia:  # gap above the threshold: take the mean
            return (un + wb) / 2
        return un  # otherwise keep the UN value
    if pd.notna(un):
        return un
    if pd.notna(wb):
        return wb
    return np.nan


for year in common_years:
    col_un = f"{year}_UN"
    col_wb = f"{year}_WB"

    final_vida[year] = [
        elegir_valor(un_val, wb_val)
        for un_val, wb_val in zip(vida_fusionada[col_un], vida_fusionada[col_wb])
    ]

final_vida = final_vida.drop_duplicates(subset=['Country', 'ISO3']).reset_index(drop=True)
print("Final DataFrame shape:", final_vida.shape)
print(final_vida.head())


Final DataFrame shape: (268, 37)
          Country ISO3     1990     1991     1992     1993     1994     1995  \
0     Afghanistan  AFG  45.1183  45.5207  46.5691  51.0212  50.9689  52.1032   
1         Albania  ALB  72.7096  73.0011  73.3030  73.6377  73.8367  74.0218   
2         Algeria  DZA  67.6584  67.6919  67.7245  67.7974  67.2844  67.6914   
3  American Samoa  ASM  71.1074  71.3120  71.5106  71.5399  71.5688  71.5652   
4         Andorra  AND  78.9608  79.3458  79.7941  80.1860  80.5868  80.8037   

      1996     1997  ...     2015     2016     2017     2018     2019  \
0  52.8302  53.2123  ...  62.2695  62.6459  62.4062  62.4434  62.9411   
1  74.1134  73.3832  ...  78.3580  78.6433  78.9003  79.2377  79.4669   
2  68.2186  68.8594  ...  75.1589  75.3099  75.4313  75.5554  75.6818   
3  71.5763  71.6065  ...  72.6543  72.6399  72.8009  72.7940  72.7511   
4  81.0090  81.1446  ...  84.5316  84.4885  84.3595  84.2416  84.0980   

      2020     2021     2022     2023

In [7]:
final_vida.to_csv('../Life_expectancy/life_expectancy_fusionada.csv', index=False, encoding='latin1')