# Continuous Answer Type

This notebook demonstrates how to generate forecasting questions with numeric (continuous) answers. Continuous questions suit quantitative predictions such as sports statistics, financial metrics, or any other numeric outcome.

In [None]:
%pip install -e ..
%pip install python-dotenv

from IPython.display import clear_output
clear_output()

In [None]:
import os
from dotenv import load_dotenv
from lightningrod import LightningRod

load_dotenv()

api_key = os.getenv("LIGHTNINGROD_API_KEY")
base_url = os.getenv("LIGHTNINGROD_BASE_URL", "https://api.lightningrod.ai/api/public/v1")

if not api_key:
    raise ValueError("LIGHTNINGROD_API_KEY is not set")

# Note: the base_url argument is optional and can be omitted
lr = LightningRod(api_key=api_key, base_url=base_url)

## Continuous Answer Type Overview

Continuous questions return numeric values. They're ideal for:
- Sports statistics (points, rebounds, assists)
- Financial predictions (stock prices, revenue)
- Metrics forecasting (user growth, sales numbers)
- Any quantitative outcome

Continuous questions work with any data source: News Search, Top Aggregated News, or Custom Documents.

In [None]:
from datetime import datetime, timedelta
from lightningrod import (
    AnswerType,
    AnswerTypeEnum,
    NewsSeedGenerator,
    QuestionGenerator,
    QuestionPipeline,
    QuestionRenderer,
    WebSearchLabeler,
)

answer_type = AnswerType(answer_type=AnswerTypeEnum.CONTINUOUS)

seed_generator = NewsSeedGenerator(
    start_date=datetime.now() - timedelta(days=30),
    end_date=datetime.now(),
    interval_duration_days=7,
    search_query=[
        "NBA player statistics",
        "NBA player performance",
        "NBA game statistics",
    ],
    articles_per_search=15,
)

question_generator = QuestionGenerator(
    instructions=(
        "Write forecasting questions that require a single, specific numeric answer. "
        "Avoid questions that could be answered with a range or category; ask about concrete statistics or totals. "
        "Make sure the question is clear, unambiguous, and focused on a quantifiable outcome occurring in the future."
    ),
    examples=[
        "How many points will LeBron James score in his next game?",
        "How many rebounds will the Lakers grab in their next game?",
        "How many points will the Bulls score in their next game?",
        "How many three-pointers will Steph Curry make in his next game?",
    ],
    bad_examples=[
        "Will LeBron James score more than 30 points in his next game?",         # Binary, not a continuous answer
        "Which team will win the next Lakers game?",                             # Categorical, not numeric
        "How likely is it that the Bulls will score over 100 points?",           # Probability, not a specific number
        "How many rebounds did the Lakers grab in their last game?",             # Backward-looking, not a forecast
        "How has LeBron's scoring average changed this season?",                 # Vague, not a concrete forecast
    ],
    answer_type=answer_type,
)

# Labeler automatically finds answers to questions using web search
labeler = WebSearchLabeler(
    answer_type=answer_type,
    confidence_threshold=0.5,
)

pipeline_config = QuestionPipeline(
    seed_generator=seed_generator,
    question_generator=question_generator,
    labeler=labeler,
    renderer=QuestionRenderer(answer_type=answer_type),
)
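
The seed generator above covers a 30-day window with `interval_duration_days=7`. One plausible reading, which is an assumption rather than documented behavior, is that the window is divided into consecutive 7-day intervals and each interval is searched separately. The sketch below shows what that split would look like; `split_window` is a hypothetical helper, not part of `lightningrod`, and the fixed dates stand in for `datetime.now()` so the result is reproducible.

```python
from datetime import datetime, timedelta


def split_window(start, end, interval_days):
    """Split [start, end) into consecutive intervals of at most interval_days.

    Hypothetical helper for illustration; the library may split differently.
    """
    intervals = []
    cursor = start
    step = timedelta(days=interval_days)
    while cursor < end:
        # The last interval is clipped so it never runs past `end`.
        intervals.append((cursor, min(cursor + step, end)))
        cursor += step
    return intervals


# Fixed dates instead of datetime.now() to keep the example deterministic.
end = datetime(2025, 1, 31)
start = end - timedelta(days=30)

intervals = split_window(start, end, 7)
for lo, hi in intervals:
    print(lo.date(), "->", hi.date(), f"({(hi - lo).days} days)")
```

Under this assumption a 30-day window yields four full weeks plus one shorter trailing interval.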

## Run the Pipeline

Generate continuous questions from the news articles.

In [None]:
dataset = lr.transforms.run(pipeline_config, max_questions=10) # keep max questions low when testing

> Note: Processing can take a few minutes to complete.

## View Results

Inspect the generated questions and answers. Each sample contains `seed`, `question`, `label`, `prompt`, and optional `context` and `meta` fields. See [API.md](../API.md) for the complete sample structure.
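
The flattened table in this notebook uses dotted column names such as `question.question_text` and `label.label`, which suggests that nested sample fields are flattened by joining keys with a dot. The sketch below reproduces that shape with `pandas.json_normalize` on a made-up sample. Only field names that appear in this notebook's output are used; the values, and the idea that a sample is a plain nested dictionary, are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample: field names are taken from the notebook output,
# values are invented for illustration.
sample = {
    "question": {
        "question_text": "How many points will Player X score in his next game?",
    },
    "label": {
        "label": "27",
        "label_confidence": 0.9,
        "reasoning": "The box score lists 27 points.",
    },
}

# json_normalize flattens nested dictionaries into dot-separated columns.
flat = pd.json_normalize([sample])
columns = sorted(flat.columns)
print(columns)
```

This is only a way to picture the structure; `dataset.flattened()` remains the call to use on real results.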

In [None]:
%pip install pandas

from IPython.display import clear_output
clear_output()

In [None]:
import pandas as pd

# Download samples to memory
samples = dataset.download()
print(f"Generated {dataset.num_rows} samples\n")

# Convert cached samples to a list of dictionaries
rows = dataset.flattened()

df = pd.DataFrame(rows)
df

Generated 30 samples

                              question.question_text   label.label  \
0  What will be the closing price of the S&P 500 ...  undetermined   
1  How many combined points will be scored by the...  undetermined   
2  What will be the total number of regular-seaso...  undetermined   
3  What will be the final sale price (in US dolla...  undetermined   
4  How many total points will LaMelo Ball score i...  undetermined   

   label.label_confidence                                    label.reasoning  \
0                     1.0  The closing price of the S&P 500 index on Dece...   
1                     1.0  The game between the Toronto Raptors and Los A...   
2                     1.0  The question asks for a specific statistical t...   
3                     1.0  The question asks for the final sale price of ...   
4                     1.0  The question asks for the exact number of poin...   

                                label.answer_sources  \
0  https://vertexais
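
In the output above, every visible row has `label.label` set to `undetermined` even though `label.label_confidence` is 1.0, so filtering on confidence alone would not isolate rows with a numeric answer. A minimal sketch of one way to separate them follows. It assumes resolved labels are stored as values that `pandas.to_numeric` can parse, which this notebook does not confirm; `df_demo` and the `label_value` column are hypothetical stand-ins, and the 0.5 cutoff simply mirrors the `confidence_threshold` passed to the labeler.

```python
import pandas as pd

# Stand-in for the DataFrame built from dataset.flattened(); values are invented.
df_demo = pd.DataFrame(
    {
        "question.question_text": ["Q1", "Q2", "Q3"],
        "label.label": ["undetermined", "27", "112.5"],
        "label.label_confidence": [1.0, 0.9, 0.4],
    }
)

# Non-numeric labels such as "undetermined" become NaN instead of raising.
df_demo["label_value"] = pd.to_numeric(df_demo["label.label"], errors="coerce")

# Keep rows that have a numeric label and meet the confidence cutoff.
resolved = df_demo[
    df_demo["label_value"].notna() & (df_demo["label.label_confidence"] >= 0.5)
]
print(resolved[["question.question_text", "label_value"]])
```

Applied to the real `df`, the same two steps would leave only questions whose answers were resolved to a number.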

## Use Cases for Continuous Questions

**Sports Statistics:**
- "How many points will Player X score?"
- "How many rebounds will Team Y grab?"

**Financial Predictions:**
- "What will Stock X's price be?"
- "What will Company Y's revenue be?"

**Metrics Forecasting:**
- "How many users will the app have?"
- "What will the conversion rate be?"