In [17]:
# Reload modules to pick up code changes
%load_ext autoreload
%autoreload 2

# Generate Variables and Estimates with OpenRouter

This notebook shows how to decompose a forecasting question into relevant variables with `VariableGenerator`, then produce probability and expectation estimates over those variables with `EstimateGenerator` (or its `NaturalEstimateGenerator` variant), using a model served through OpenRouter.

In [1]:
import os
from dotenv import load_dotenv

# Load API key from .env file
load_dotenv()

# Verify API key is available
api_key = os.getenv("OPENROUTER_API_KEY")
print(f"API key loaded: {'Yes' if api_key else 'No - please set OPENROUTER_API_KEY'}")

API key loaded: Yes
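
The check above only prints a message, so a missing key would surface later as a failed API call. A minimal sketch of a fail-fast alternative follows; `require_env` is a hypothetical helper written for this notebook, not part of `calibrated_response` or `python-dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Hypothetical helper: raises instead of printing, so a missing key
    stops the notebook at this cell rather than at the first API call.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it"
        )
    return value
```

Calling `require_env("OPENROUTER_API_KEY")` after `load_dotenv()` would then either return the key or stop execution with a clear error.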


In [2]:
from calibrated_response.generation.prompts import PROMPTS
from calibrated_response.llm.openrouter import OpenRouterClient
from calibrated_response.generation.variable_generator import VariableGenerator
from calibrated_response.generation.estimate_generator import EstimateGenerator

# Initialize the OpenRouter client.
# The model can be changed to any model ID supported by OpenRouter.
client = OpenRouterClient(
    model="openai/gpt-4o-mini",  # or "anthropic/claude-3.5-sonnet", etc.
    # Alternative configuration:
    # model="openai/gpt-oss-120b",
    # providers=["cerebras"],
)

print(f"Using model: {client.model_name}")

Using model: openai/gpt-4o-mini


## Define a Forecasting Question

Let's use an example forecasting question to demonstrate the pipeline.

In [3]:
# Example forecasting question
question = "What will be the total number of electric vehicles sold in the United States in 2026?"
print(f"Forecasting Question: {question}")

Forecasting Question: What will be the total number of electric vehicles sold in the United States in 2026?


## Step 1: Generate Relevant Variables

Use the `VariableGenerator` to identify variables that could influence the answer.

In [4]:
# Create the variable generator
var_generator = VariableGenerator(llm_client=client)

# Generate variables
variables = var_generator.generate(
    question=question,
    n_variables=5,
)

print(f"Generated {len(variables)} variables:\n")
for i, var in enumerate(variables, 1):
    print(f"{i}. {var.name} ({var.type.value})")
    print(f"   Description: {var.description}")
    if hasattr(var, 'lower_bound') and hasattr(var, 'upper_bound'):
        unit = getattr(var, 'unit', '') or ''
        print(f"   Range: [{var.lower_bound}, {var.upper_bound}] {unit}")
    print()

Generated 5 variables:

1. total_ev_sales_2026 (continuous)
   Description: Total number of electric vehicles sold in the US in 2026
   Range: [1000000.0, 5000000.0] vehicles

2. government_incentives (binary)
   Description: Presence of government incentives for electric vehicles

3. charging_stations_count (continuous)
   Description: Number of public charging stations available in the US
   Range: [20000.0, 100000.0] stations

4. average_ev_price (continuous)
   Description: Average price of electric vehicles in the US
   Range: [25000.0, 80000.0] dollars

5. consumer_interest (continuous)
   Description: Level of consumer interest in electric vehicles
   Range: [0.0, 100.0] percent
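
The printing loop above relies on continuous variables carrying `lower_bound` and `upper_bound` while binary ones do not. As a rough sketch of that shape, the block below validates a list of variables before they are passed on. `VarSpec` and `validate` are hypothetical stand-ins introduced for illustration; they are not the classes returned by `VariableGenerator`, and the values are copied from the output above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VarSpec:
    """Hypothetical stand-in for a generated variable (not the library's class)."""
    name: str
    kind: str  # "continuous" or "binary"
    lower_bound: Optional[float] = None
    upper_bound: Optional[float] = None
    unit: str = ""


def validate(spec: VarSpec) -> List[str]:
    """Return a list of problems found in one variable; empty means it looks fine."""
    problems = []
    if spec.kind == "continuous":
        if spec.lower_bound is None or spec.upper_bound is None:
            problems.append("continuous variable is missing a bound")
        elif spec.lower_bound >= spec.upper_bound:
            problems.append("lower bound is not below upper bound")
    elif spec.kind == "binary":
        if spec.lower_bound is not None or spec.upper_bound is not None:
            problems.append("binary variable should not carry bounds")
    else:
        problems.append(f"unknown kind {spec.kind!r}")
    return problems


# Values copied from the variable listing printed above.
specs = [
    VarSpec("total_ev_sales_2026", "continuous", 1_000_000.0, 5_000_000.0, "vehicles"),
    VarSpec("government_incentives", "binary"),
    VarSpec("charging_stations_count", "continuous", 20_000.0, 100_000.0, "stations"),
    VarSpec("average_ev_price", "continuous", 25_000.0, 80_000.0, "dollars"),
    VarSpec("consumer_interest", "continuous", 0.0, 100.0, "percent"),
]

for spec in specs:
    print(spec.name, validate(spec) or "ok")
```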



## Step 2: Generate Estimates

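Use the `EstimateGenerator` to generate probability and expectation estimates for these variables. The call is left commented out in the cell below; the next section generates the estimates with `NaturalEstimateGenerator` instead.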

In [6]:
from calibrated_response.generation.prompts import format_variables_for_prompt, PROMPTS
# print(PROMPTS['natural_estimate_generation'])

# # Create the estimate generator
# est_generator = EstimateGenerator(llm_client=client)
# # Generate estimates
# estimates = est_generator.generate(
#     question=question,
#     variables=variables,
#     num_estimates=25,
# )
# print(f"Generated {len(estimates)} estimates:\n")
# for estimate in estimates:
#     # print(type(estimate).__name__)
#     print(estimate.to_query_estimate())

## Alternative: Natural Language Estimates

Use the `NaturalEstimateGenerator` to request estimates in a compact natural-language format. This format is more token-efficient and tends to work better with simpler models.

In [7]:
from calibrated_response.generation.natural_estimate_generator import NaturalEstimateGenerator

# Create the natural estimate generator
natural_gen = NaturalEstimateGenerator(llm_client=client)

# Generate estimates using natural language format
natural_estimates = natural_gen.generate(
    question=question,
    variables=variables,
    num_estimates=15,
)

print(f"Generated {len(natural_estimates)} natural estimates:\n")
for est in natural_estimates:
    print(est.to_query_estimate())

Generated 15 natural estimates:

P(total_ev_sales_2026 > 3000000.0) = 0.6
E[total_ev_sales_2026] = 4000000.0
P(total_ev_sales_2026 > 2000000.0 | consumer_interest > 70.0) = 0.8
E[total_ev_sales_2026 | government_incentives = True] = 4500000.0
P(total_ev_sales_2026 > 2500000.0 | charging_stations_count > 50000.0) = 0.7
P(total_ev_sales_2026 > 1500000.0 | average_ev_price < 40000.0) = 0.65
E[consumer_interest] = 85.0
E[consumer_interest | government_incentives = True] = 90.0
E[charging_stations_count] = 70000.0
P(total_ev_sales_2026 < 2000000.0 | average_ev_price > 60000.0) = 0.4
E[total_ev_sales_2026 | charging_stations_count > 30000.0] = 3500000.0
P(total_ev_sales_2026 > 4000000.0 | government_incentives = True) = 0.5
E[average_ev_price] = 50000.0
P(total_ev_sales_2026 > 1000000.0 | consumer_interest > 50.0) = 0.75
E[total_ev_sales_2026 | charging_stations_count > 80000.0] = 4800000.0
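
The printed estimates follow a regular `P(event | condition) = p` and `E[variable | condition] = v` pattern, so they can be sanity-checked against the variable ranges from Step 1. The block below is a minimal sketch that works on the printed strings only; `parse_estimate` and `check_estimate` are hypothetical helpers, not part of `calibrated_response`, and the bounds are copied by hand from the earlier output. It assumes the stated ranges are hard limits, which the generator may not intend.

```python
import re

# Bounds copied by hand from the variable listing in Step 1.
BOUNDS = {
    "total_ev_sales_2026": (1_000_000.0, 5_000_000.0),
    "charging_stations_count": (20_000.0, 100_000.0),
    "average_ev_price": (25_000.0, 80_000.0),
    "consumer_interest": (0.0, 100.0),
}

ESTIMATE = re.compile(
    r"^(?P<kind>[PE])[\(\[]"             # "P(" or "E["
    r"(?P<target>[^|\)\]]+?)"            # event or variable name
    r"(?:\s*\|\s*(?P<cond>[^\)\]]+?))?"  # optional "| condition"
    r"[\)\]]\s*=\s*"
    r"(?P<value>[-+0-9.eE]+)$"
)
COMPARISON = re.compile(r"^(?P<name>\w+)\s*(?P<op>[<>])\s*(?P<threshold>[-+0-9.eE]+)$")


def parse_estimate(line):
    """Split one printed estimate into kind, target, condition, and value."""
    m = ESTIMATE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized estimate: {line!r}")
    return {
        "kind": "probability" if m["kind"] == "P" else "expectation",
        "target": m["target"].strip(),
        "condition": m["cond"].strip() if m["cond"] else None,
        "value": float(m["value"]),
    }


def check_estimate(est, bounds=BOUNDS):
    """Return warnings for one parsed estimate, treating bounds as hard limits."""
    warnings = []
    if est["kind"] == "probability":
        if not 0.0 <= est["value"] <= 1.0:
            warnings.append("probability outside [0, 1]")
        comp = COMPARISON.match(est["target"])
        if comp and comp["name"] in bounds:
            lo, hi = bounds[comp["name"]]
            t = float(comp["threshold"])
            certain = (comp["op"] == ">" and t <= lo) or (comp["op"] == "<" and t >= hi)
            if certain and est["value"] < 1.0:
                warnings.append("event is certain given the stated range, but probability < 1")
    elif est["target"] in bounds:
        lo, hi = bounds[est["target"]]
        if not lo <= est["value"] <= hi:
            warnings.append("expectation outside the stated range")
    return warnings


# A few lines copied from the output above.
samples = [
    "P(total_ev_sales_2026 > 3000000.0) = 0.6",
    "E[total_ev_sales_2026 | government_incentives = True] = 4500000.0",
    "E[consumer_interest] = 85.0",
    "P(total_ev_sales_2026 > 1000000.0 | consumer_interest > 50.0) = 0.75",
]
for line in samples:
    for warning in check_estimate(parse_estimate(line)):
        print(f"{line}\n  warning: {warning}")
```

Of the four sample lines, only the last is flagged: its threshold of 1000000.0 equals the lower bound of `total_ev_sales_2026`, so under hard bounds the event is certain, yet the model assigned it 0.75. Whether that is a model error or a sign that the ranges are soft is a judgment call this check cannot make.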


In [8]:
for r in natural_gen.last_result.estimates:
    print(r.logic)
    print(r.expression)

Based on projected growth in the EV market and increased consumer adoption.
P(total_ev_sales_2026 > 3000000.0) = 0.6
Expected shift in consumer preferences towards electric vehicles.
E[total_ev_sales_2026] = 4000000.0
Higher consumer interest typically correlates with increased sales.
P(total_ev_sales_2026 > 2000000.0 | consumer_interest > 70.0) = 0.8
Government incentives significantly boost EV sales figures.
E[total_ev_sales_2026 | government_incentives = 1] = 4500000.0
The number of charging stations influences buyer confidence and sales.
P(total_ev_sales_2026 > 2500000.0 | charging_stations_count > 50000.0) = 0.7
Average EV price affects affordability and thus sales volume.
P(total_ev_sales_2026 > 1500000.0 | average_ev_price < 40000.0) = 0.65
Consumer interest in EVs is expected to rise over the years.
E[consumer_interest] = 85.0
Presence of government incentives correlates with higher consumer interest.
E[consumer_interest | government_incentives = 1] = 90.0
The number of chargin