Add Ibis Hotdata backend examples and Arrow result path #2

Merged
eddietejeda merged 2 commits into main from docs/readme-ibis-examples
May 8, 2026

@eddietejeda
Contributor

Summary

  • Add runnable Ibis Hotdata examples and README guidance for TPC-H-backed workspaces.
  • Refactor the HTTP layer to use the official Hotdata Python SDK.
  • Switch query execution to async-only result polling with Arrow IPC materialization.

Test plan

  • uv run ruff check src tests examples
  • uv run ruff format --check src tests examples
  • uv run pytest tests -q

- backend: information_schema page size constant, dedupe HTTP→Ibis errors,
  parse_qsl import, version from importlib.metadata
- http: poll sleep respects deadline, guard missing columns, typed _safe_call
- types: trim trailing blank lines
- ruff format/import order
Always submit Hotdata queries asynchronously and materialize successful results from the Arrow IPC result endpoint so the backend has one typed execution path.
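
The async-only flow described above, together with the "poll sleep respects deadline" note, can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `sleep_until`, `poll_result`, and the injectable `now`/`sleep` callables are hypothetical names, and `fetch` stands in for whatever issues the GET against the result endpoint.

```python
import time
from typing import Callable, Tuple


def sleep_until(
    deadline: float,
    interval_s: float,
    *,
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Sleep for interval_s, but never past the deadline (hypothetical helper)."""
    remaining = deadline - now()
    if remaining <= 0:
        raise TimeoutError("result polling deadline exceeded")
    # Clamp the final sleep so the loop wakes up exactly at the deadline.
    sleep(min(interval_s, remaining))


def poll_result(
    fetch: Callable[[], Tuple[int, bytes]],
    *,
    timeout_s: float,
    poll_interval_s: float,
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Poll until fetch() reports 200; treat 202 as "still running"."""
    deadline = now() + timeout_s
    while True:
        status, body = fetch()
        if status == 200:
            return body  # assumed to be the Arrow IPC stream bytes
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        sleep_until(deadline, poll_interval_s, now=now, sleep=sleep)
```

Passing the clock and sleep function in as parameters is only a convenience for testing the loop without real delays; the actual helper in `http.py` may be shaped differently.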
Comment thread src/ibis_hotdata/http.py
Comment on lines +237 to 257
  def _arrow_payload_from_table(
      self,
      table: pa.Table,
      *,
      result_id: str,
  ) -> dict[str, Any]:
      sch = table.schema
      columns = sch.names
      nullable = [sch.field(i).nullable for i in range(len(columns))]
      return {
          "format": "arrow",
          "pa_table": table,
          "columns": columns,
          "nullable": nullable,
-         "rows": list(data["rows"]) if data.get("rows") is not None else [],
-         "row_count": data.get("row_count"),
-         "execution_time_ms": data.get("execution_time_ms"),
-         "query_run_id": data.get("query_run_id"),
-         "result_id": data.get("result_id"),
-         "warning": data.get("warning"),
+         "rows": [],
+         "result_id": result_id,
+         "row_count": table.num_rows,
+         "execution_time_ms": None,
+         "query_run_id": None,
+         "warning": None,
      }

nit: (not blocking) the backend now consumes only pa_table from this payload — _get_schema_using_query reads data["pa_table"].schema and _safe_raw_sql yields payload["pa_table"]. The columns, nullable, rows, result_id, row_count, execution_time_ms, query_run_id, and warning fields are dead now that the JSON-row code path is gone. Worth shrinking to just {"pa_table": table} (or whatever subset is actually read elsewhere) to avoid the impression that they carry meaningful data.
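
One way to read this suggestion, as a rough sketch only: keep just the table in the payload and derive schema metadata from it when a caller actually needs it. `arrow_payload_from_table` (as a free function) and `describe_payload` are hypothetical names for illustration; the only table attributes used are the ones the diff already touches (`schema.names`, `schema.field(i).nullable`, `num_rows`).

```python
from typing import Any


def arrow_payload_from_table(table: Any) -> dict[str, Any]:
    """Hypothetical trimmed payload: carry only the Arrow table."""
    return {"pa_table": table}


def describe_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Derive column metadata on demand instead of storing it in the payload."""
    table = payload["pa_table"]
    schema = table.schema
    names = list(schema.names)
    return {
        "columns": names,
        "nullable": [schema.field(i).nullable for i in range(len(names))],
        "row_count": table.num_rows,
    }
```

Since everything in `describe_payload` is computed from the table itself, dropping the stored copies would not lose information, assuming no other caller reads those keys.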

Comment thread src/ibis_hotdata/http.py
Comment on lines +205 to +233
if status == 200 and ctype == APPLICATION_ARROW_STREAM.lower():
    table = _ipc_stream_bytes_to_table(body)
    return self._arrow_payload_from_table(table, result_id=result_id)

if status == 202:
    _sleep_until(deadline, poll_interval_s)
    continue

if status == 409:
    d = _json_utf8(body) if body else {}
    raise HotdataAPIError(
        d.get("error_message") or "Result failed",
        status_code=409,
        body=d,
    )

if status == 404:
    d = _json_utf8(body) if body else {}
    raise HotdataAPIError(
        d.get("detail") or f"Result {result_id!r} not found",
        status_code=404,
        body=d,
    )

raise HotdataAPIError(
    f"Unexpected GET /v1/results/{result_id} status {status}",
    status_code=status,
    body=body,
)

nit: (not blocking) if the server returns 200 with a non-Arrow content-type (e.g. accidental JSON), the first branch is skipped and execution falls all the way through to Unexpected GET /v1/results/{id} status 200. The "status 200" wording is misleading because the real problem is the content-type mismatch. Consider adding an explicit branch like:

if status == 200:
    raise HotdataAPIError(
        f"Unexpected Content-Type {ctype!r} for /v1/results/{result_id} (expected {APPLICATION_ARROW_STREAM})",
        status_code=200,
        body=body,
    )

so the failure mode is diagnosable.
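
To show where such a branch would sit in the dispatch, here is a self-contained sketch under stated assumptions. `HotdataAPIError` below is a stand-in whose constructor shape is inferred from the diff, `check_result_response` is a hypothetical helper, the value of `APPLICATION_ARROW_STREAM` is assumed to be the registered Arrow IPC stream media type, and `ctype` is assumed to arrive already lowercased with parameters stripped. The 409 and 404 branches are omitted for brevity.

```python
from typing import Any

# Assumed value; the PR's constant is not shown in this thread.
APPLICATION_ARROW_STREAM = "application/vnd.apache.arrow.stream"


class HotdataAPIError(Exception):
    """Stand-in for the PR's exception type."""

    def __init__(self, message: str, *, status_code: int, body: Any = None) -> None:
        super().__init__(message)
        self.status_code = status_code
        self.body = body


def check_result_response(status: int, ctype: str, body: bytes, result_id: str) -> str:
    """Return "ready" or "pending"; raise HotdataAPIError for anything else."""
    if status == 200 and ctype == APPLICATION_ARROW_STREAM.lower():
        return "ready"
    if status == 200:
        # Explicit branch: a 200 with the wrong media type names the real problem.
        raise HotdataAPIError(
            f"Unexpected Content-Type {ctype!r} for /v1/results/{result_id} "
            f"(expected {APPLICATION_ARROW_STREAM})",
            status_code=200,
            body=body,
        )
    if status == 202:
        return "pending"
    raise HotdataAPIError(
        f"Unexpected GET /v1/results/{result_id} status {status}",
        status_code=status,
        body=body,
    )
```

With this ordering, a JSON body returned with status 200 surfaces as a Content-Type error rather than as an "unexpected status 200" message.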

@eddietejeda eddietejeda merged commit 11f733b into main May 8, 2026
1 check passed