**Table of contents**<a id='toc0_'></a>    
- [Consumption of Banxico API and creation of a dataset](#toc1_)    
  - [Libraries](#toc1_1_)    
  - [API call and save of the results](#toc1_2_)    
  - [Creation of the dataset from the json file](#toc1_3_)    


# <a id='toc1_'></a>[Consumption of Banxico API and creation of a dataset](#toc0_)

## <a id='toc1_1_'></a>[Libraries](#toc0_)

Shout-out to @EliasManJ, who created the Python library for the Banxico API.

In [6]:
import warnings
warnings.filterwarnings("ignore")

In [7]:
import pandas as pd
import os
import json
from banxicoapi import banxico_api

## <a id='toc1_2_'></a>[API call and save of the results](#toc0_)

**Note**: All series are monthly. Monetary amounts are expressed in millions of pesos or millions of US dollars, depending on the series.

Variables in millions of pesos:
1. Public Sector Balance
2. Monetary Base

Variables in millions of dollars:
1. International Reserves

In [8]:
api_token = os.environ['BANXICO_TOKEN']
api = banxico_api.BanxicoApi(api_token)

start_date = "1995-12-01"
end_date = "2024-11-01"

series = [
    "SP30577",  # National Consumer Price Index (monthly change)
    "SF29652",  # Monetary base
    "SF283",    # 28-day TIIE interest rate (monthly average, annual percent)
    "SF31991",  # International reserves of the Bank of Mexico (USD)
    "SG41",     # Public sector balance
]
data = api.get(series, start_date=start_date, end_date=end_date)

# Save the API response as JSON
with open("api_call.json", "w") as f:
	json.dump(data, f)
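
The parsing code in the next section reads `api_call.json` as a list of series, each with an `idSerie`, a `titulo`, and a `datos` list of `fecha`/`dato` pairs. The sketch below illustrates that assumed layout and the per-value conversion. The sample values are placeholders rather than real Banxico data, `parse_dato` is a hypothetical helper, and `None` stands in for `pd.NA` so the example needs only the standard library.

```python
import json
from datetime import datetime

# Hypothetical sample mirroring the assumed layout of api_call.json.
# The numbers are placeholders, not real Banxico data.
sample = [
    {
        "idSerie": "SF29652",
        "titulo": "Base Monetaria",
        "datos": [
            {"fecha": "01/12/1995", "dato": "1,234.5"},
            {"fecha": "01/01/1996", "dato": "N/E"},
        ],
    }
]


def parse_dato(raw):
    """Convert a raw 'dato' string to float, or None for the 'N/E' marker."""
    if raw == "N/E":
        return None
    # Values use a comma as the thousands separator
    return float(raw.replace(",", ""))


# Round-trip through JSON text, as the notebook does with the file on disk
loaded = json.loads(json.dumps(sample))
for entry in loaded[0]["datos"]:
    date = datetime.strptime(entry["fecha"], "%d/%m/%Y").date()
    print(date, parse_dato(entry["dato"]))
```

Inspecting one series this way before running the full conversion makes it easier to spot a date format or missing-value marker that differs from what the parser expects.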

## <a id='toc1_3_'></a>[Creation of the dataset from the json file](#toc0_)

In [9]:
def clean_column_name(series_id, title):
    """Create a column name from series ID and title, replacing lowercase accented vowels."""
    clean_title = title.replace('í', 'i').replace('é', 'e').replace('á', 'a').replace('ó', 'o').replace('ú', 'u')
    return f"{series_id}_{clean_title}"

def convert_json_to_csv(json_data, output_dir='output'):
    """Combine all series in a JSON string into one date-indexed CSV file."""
    # Create output directory if it doesn't exist
    os.makedirs(output_dir, exist_ok=True)
    
    # Parse JSON data
    data = json.loads(json_data)
    
    # Dictionary to store all series data
    all_dates = set()
    series_data = {}
    
    # First pass: collect all dates and prepare series data
    for series in data:
        series_id = series['idSerie']
        series_title = series['titulo']
        column_name = clean_column_name(series_id, series_title)
        
        # Create a dictionary for this series
        date_value_dict = {}
        
        for entry in series['datos']:
            try:
                date = pd.to_datetime(entry['fecha'], format='%d/%m/%Y')
                value = entry['dato']
                
                # Handle numeric values with comma as thousand separator
                if isinstance(value, str):
                    if value == 'N/E':
                        value = pd.NA
                    else:
                        # Remove commas and convert to float
                        value = float(value.replace(',', ''))
                
                date_value_dict[date] = value
                all_dates.add(date)
                
            except Exception as e:
                print(f"Error processing entry {entry}: {str(e)}")
        
        series_data[column_name] = date_value_dict
            
        print(f"Processed values for {column_name}: {len(date_value_dict)}")
    
    # Convert to DataFrame
    all_dates = sorted(all_dates)
    df_dict = {}
    
    for column_name, date_value_dict in series_data.items():
        # Create a series with all dates, filling missing values with NA
        series_values = [date_value_dict.get(date, pd.NA) for date in all_dates]
        df_dict[column_name] = series_values
    
    # Create final DataFrame
    combined_df = pd.DataFrame(df_dict, index=all_dates)
    
    # Sort by date
    combined_df.sort_index(inplace=True)
    
    # Save to CSV
    output_path = os.path.join(output_dir, 'Inflation_Rate_Variables_Dataset.csv')
    combined_df.to_csv(output_path, encoding='utf-8')
    print(f"\nCreated: {output_path}")

# Load JSON data
with open("api_call.json", "r") as f:
    json_data = f.read()

# Convert JSON to CSV
convert_json_to_csv(json_data)

Processed values for SP30577_Índice Nacional de Precios al consumidor Variacion mensual: 348
Processed values for SF29652_Base Monetaria: 348
Processed values for SF31991_Banco de Mexico, Recursos en moneda extranjera, Reserva Internacional (Definida de acuerdo con la Ley del Banco de Mexico de Abril de 1994): 348
Processed values for SF283_TIIE a 28 dias Tasa de interes promedio mensual, en por ciento anual: 348
Processed values for SG41_Ingresos y Gastos Presupuestales del Sector Publico Medicion por Ingreso-Gasto, Flujos de Caja Balance publico Balance presupuestario: 348

Created: output\Inflation_Rate_Variables_Dataset.csv
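
The first column name in the output above still contains `Í`, because `clean_column_name` only replaces lowercase accented vowels. One possible alternative, shown here as a sketch rather than as part of the original notebook, is to strip accents with the standard library's `unicodedata` module. `strip_accents` and `clean_column_name_ascii` are hypothetical names introduced for illustration.

```python
import unicodedata


def strip_accents(text):
    """Remove combining accent marks, for both upper- and lowercase letters."""
    # NFKD splits an accented letter into its base letter plus a combining mark
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_column_name_ascii(series_id, title):
    """Hypothetical variant of clean_column_name that handles any accented letter."""
    return f"{series_id}_{strip_accents(title)}"


print(clean_column_name_ascii("SP30577", "Índice Nacional de Precios al consumidor"))
```

Note that this approach also turns `ñ` into `n`, which may or may not be desirable for the column names.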
