Merged
15 changes: 13 additions & 2 deletions docs/api.md
@@ -1,11 +1,11 @@
---
title: API Reference
description: Complete API reference for everyrow — screen, rank, dedupe, merge, and research operations powered by LLM web research agents.
description: Complete API reference for everyrow — screen, rank, dedupe, merge, forecast, and research operations powered by LLM web research agents.
---

# API Reference

Five operations for processing data with LLM-powered web research agents. Each takes a DataFrame and a natural-language instruction.
Six operations for processing data with LLM-powered web research agents. Each takes a DataFrame; most also take a natural-language instruction.

## screen

@@ -55,6 +55,17 @@ result = await merge(task=..., left_table=df1, right_table=df2)
Guides: [Fuzzy Join Without Matching Keys](/docs/fuzzy-join-without-keys)
Case Studies: [LLM Merging at Scale](/docs/case-studies/llm-powered-merging-at-scale), [Match Software Vendors to Requirements](/docs/case-studies/match-software-vendors-to-requirements)

## forecast

```python
result = await forecast(input=questions_df)
```

`forecast` takes a DataFrame of binary questions and produces a calibrated probability estimate (0–100) and rationale for each row. Each question is researched across six dimensions in parallel, then synthesized by an ensemble of forecasters. Validated against 1500 hard forecasting questions and 15M research documents.

[Full reference →](/docs/reference/FORECAST)
Blog posts: [Automating Forecasting Questions](https://futuresearch.ai/automating-forecasting-questions/), [arXiv paper](https://arxiv.org/abs/2506.21558)

## agent_map / single_agent

```python
86 changes: 86 additions & 0 deletions docs/reference/FORECAST.md
@@ -0,0 +1,86 @@
---
title: forecast
description: API reference for the EveryRow forecast tool, which produces calibrated probability estimates for binary questions using web research and an ensemble of forecasters.
---

# Forecast

`forecast` takes a DataFrame of binary questions and produces a calibrated probability estimate (0–100) and rationale for each row. The approach is validated against FutureSearch's past-casting environment of 1500 hard forecasting questions and 15M research documents. See more at [Automating Forecasting Questions](https://futuresearch.ai/automating-forecasting-questions/) and [arXiv:2506.21558](https://arxiv.org/abs/2506.21558).

## Examples

```python
from pandas import DataFrame
from everyrow.ops import forecast

questions = DataFrame([
    {
        "question": "Will the US Federal Reserve cut rates by at least 25bp before July 1, 2027?",
        "resolution_criteria": "Resolves YES if the Fed announces at least one rate cut of 25bp or more at any FOMC meeting between now and June 30, 2027.",
    },
])

result = await forecast(input=questions)
print(result.data[["question", "probability", "rationale"]])
```

The output DataFrame contains the original columns plus `probability` (int, 0–100) and `rationale` (str).
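Since the result is an ordinary DataFrame, downstream filtering works as usual. A minimal sketch (the rows below are fabricated to mimic the output shape, not real forecasts):

```python
from pandas import DataFrame

# Hypothetical rows mimicking the shape of forecast()'s output.
output = DataFrame([
    {"question": "Will X happen?", "probability": 72, "rationale": "..."},
    {"question": "Will Y happen?", "probability": 12, "rationale": "..."},
])

# Keep only the questions judged more likely than not to resolve YES.
likely = output[output["probability"] > 50]
print(likely["question"].tolist())  # → ['Will X happen?']
```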

### Batch context

When all rows share common framing, pass it via `context` instead of repeating it in every row:

```python
result = await forecast(
    input=geopolitics_questions,
    context="Focus on EU regulatory and diplomatic sources. Assume all questions resolve by end of 2027.",
)
```

Leave `context` empty when rows are self-contained—a well-specified question with resolution criteria needs no additional instruction.

## Input columns

The input DataFrame should contain at minimum a `question` column. All columns are passed to the research agents and forecasters.

| Column | Required | Purpose |
|--------|----------|---------|
| `question` | Yes | The binary question to forecast |
| `resolution_criteria` | Recommended | Exactly how YES/NO is determined—the "contract" |
| `resolution_date` | Optional | When the question closes |
| `background` | Optional | Additional context the forecasters should know |

Column names are not enforced—research agents infer meaning from content. A column named `scenario` instead of `question` works fine.
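Putting the table together, a fully specified row might look like this (the question and its details are fabricated for illustration):

```python
from pandas import DataFrame

# One row using every column from the table above; values are hypothetical.
questions = DataFrame([
    {
        "question": "Will Brent crude settle above $100/bbl on any trading day in 2026?",
        "resolution_criteria": "Resolves YES if the ICE Brent front-month settlement price exceeds $100 on at least one trading day in 2026.",
        "resolution_date": "2026-12-31",
        "background": "Brent traded mostly in the $70-90 range through 2025.",
    },
])

# result = await forecast(input=questions)
print(list(questions.columns))
```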

## Parameters

| Name | Type | Description |
|------|------|-------------|
| `input` | DataFrame | Rows to forecast, one question per row |
| `context` | str \| None | Optional batch-level instructions that apply to every row |
| `session` | Session | Optional, auto-created if omitted |

## Output

Two columns are added to each input row:

| Column | Type | Description |
|--------|------|-------------|
| `probability` | int | 0–100, calibrated probability of YES resolution |
| `rationale` | str | Detailed reasoning with citations from web research |

Probabilities are clamped to [3, 97]—even near-certain outcomes retain residual uncertainty.
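The clamping behavior can be pictured as follows — a sketch of the rule described above, not the library's actual implementation:

```python
def clamp_probability(p: int) -> int:
    """Clamp a raw estimate into [3, 97] so that even near-certain
    outcomes retain residual uncertainty (illustrative sketch)."""
    return max(3, min(97, p))

print(clamp_probability(100))  # → 97
print(clamp_probability(0))    # → 3
print(clamp_probability(50))   # → 50
```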

## Performance

| Rows | Time | Cost |
|------|------|------|
| 1 | ~5 min | ~$0.60 |
| 5 | ~6 min | ~$3 |
| 20 | ~10 min | ~$12 |
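The table suggests cost scales roughly linearly while time grows sublinearly thanks to per-row parallelism. A rough model fitted to the figures above — an approximation for planning, not an official pricing formula:

```python
def estimate_forecast_run(rows: int) -> tuple[float, float]:
    """Rough (minutes, dollars) estimate fitted to the table above.

    Assumes ~$0.60/row of cost and ~5 min of fixed overhead plus
    ~0.25 min/row of wall-clock time. Approximation only.
    """
    minutes = 5.0 + 0.25 * rows
    cost_usd = 0.60 * rows
    return minutes, cost_usd

print(estimate_forecast_run(20))  # → (10.0, 12.0)
```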

## Related docs

### Blog posts
- [Automating Forecasting Questions](https://futuresearch.ai/automating-forecasting-questions/)
- [arXiv paper: Automated Forecasting](https://arxiv.org/abs/2506.21558)
4 changes: 4 additions & 0 deletions everyrow-mcp/manifest.json
@@ -49,6 +49,10 @@
      "name": "everyrow_agent",
      "description": "Run web research agents on each row of a CSV file."
    },
    {
      "name": "everyrow_forecast",
      "description": "Forecast the probability of binary questions from a CSV file."
    },
    {
      "name": "everyrow_single_agent",
      "description": "Run a single web research agent on a task, optionally with context data."
24 changes: 24 additions & 0 deletions everyrow-mcp/src/everyrow_mcp/models.py
@@ -259,6 +259,30 @@ def validate_csv_paths(cls, v: str) -> str:
        return v


class ForecastInput(BaseModel):
    """Input for the forecast operation."""

    model_config = ConfigDict(str_strip_whitespace=True, extra="forbid")

    input_csv: str = Field(
        ...,
        description="Absolute path to the input CSV file containing a binary "
        "question and optional resolution criteria on each row.",
    )
    context: str | None = Field(
        default=None,
        description="Optional batch-level context or instructions that apply to every row "
        "(e.g. 'Focus on EU regulatory sources' or 'Assume resolution by end of 2027'). "
        "Leave empty when the rows are self-contained.",
    )

    @field_validator("input_csv")
    @classmethod
    def validate_input_csv(cls, v: str) -> str:
        validate_csv_path(v)
        return v


class SingleAgentInput(BaseModel):
    """Input for a single agent operation (no CSV)."""

74 changes: 74 additions & 0 deletions everyrow-mcp/src/everyrow_mcp/tools.py
@@ -21,6 +21,7 @@
from everyrow.ops import (
    agent_map_async,
    dedupe_async,
    forecast_async,
    merge_async,
    rank_async,
    screen_async,
@@ -40,6 +41,7 @@
from everyrow_mcp.models import (
    AgentInput,
    DedupeInput,
    ForecastInput,
    MergeInput,
    ProgressInput,
    RankInput,
@@ -533,6 +535,78 @@ async def everyrow_merge(params: MergeInput) -> list[TextContent]:
    ]


@mcp.tool(
    name="everyrow_forecast",
    structured_output=False,
    annotations=ToolAnnotations(
        title="Probability Forecast",
        readOnlyHint=False,
        destructiveHint=False,
        idempotentHint=False,
        openWorldHint=True,
    ),
)
async def everyrow_forecast(params: ForecastInput) -> list[TextContent]:
    """Forecast the probability of binary questions from a CSV file.

    Each row is forecast using an approach validated against FutureSearch's
    past-casting environment of 1500 hard forecasting questions and 15M research
    documents; see more at https://futuresearch.ai/automating-forecasting-questions/
    and https://arxiv.org/abs/2506.21558.

    The CSV should contain at minimum a ``question`` column. Recommended additional
    columns: ``resolution_criteria``, ``resolution_date``, ``background``. All
    columns are passed to the research agents and forecasters.

    The optional ``context`` parameter provides batch-level instructions that apply
    to every row (e.g. "Focus on EU regulatory sources"). Leave it empty when the
    rows are self-contained.

    Output columns added: ``rationale`` (str) and ``probability`` (int, 0-100).

    This function submits the task and returns immediately with a task_id and session_url.
    After receiving a result from this tool, share the session_url with the user.
    Then immediately call everyrow_progress(task_id) to monitor.
    Once the task is completed, call everyrow_results to save the output.
    """
    client = _get_client()

    _clear_task_state()
    df = pd.read_csv(params.input_csv)

    async with create_session(client=client) as session:
        session_url = session.get_url()
        cohort_task = await forecast_async(
            task=params.context or "",
            session=session,
            input=df,
        )
        task_id = str(cohort_task.task_id)
        _write_task_state(
            task_id,
            task_type=PublicTaskType.FORECAST,
            session_url=session_url,
            total=len(df),
            completed=0,
            failed=0,
            running=0,
            status=TaskStatus.RUNNING,
            started_at=datetime.now(UTC),
        )

    return [
        TextContent(
            type="text",
            text=(
                f"Submitted: {len(df)} rows for forecasting (6 research dimensions + dual forecaster per row).\n"
                f"Session: {session_url}\n"
                f"Task ID: {task_id}\n\n"
                f"Share the session_url with the user, then immediately call everyrow_progress(task_id='{task_id}')."
            ),
        )
    ]


@mcp.tool(
name="everyrow_progress",
structured_output=False,