Distributed Execution
github-actions[bot] edited this page Jun 25, 2026 · 1 revision
execute_on_ray runs a LazyFrame across an already-initialised
Ray cluster by splitting the work into calendar periods. Each
period becomes a Ray task that filters the source to its own time window and collects
independently; the results are streamed back and concatenated. Because the split is
driven by a predicate on the date column, each task only reads the data for its slice.
import polars as pl
import ray
import polars_io_tools  # registers the .piot namespace

ray.init()  # connect to or start a Ray cluster first

lf = scan_db(
    "SELECT date, symbol, price FROM trades",
    connection="...",
)
result = (
    lf.filter(pl.col("date").is_between(pl.date(2024, 1, 1), pl.date(2024, 3, 31)))
    .piot.execute_on_ray(date_column="date", time_unit="monthly")
    .collect()
)
- date_column names the Date/Datetime column that defines the partitioning axis.
- time_unit chooses the granularity of each task: "daily", "monthly", or "yearly".
- The query above splits into three monthly tasks (January, February, March 2024), each fetching only its month.
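To make the monthly split concrete, here is a minimal standard-library sketch of how an inclusive date range could be cut into calendar-month windows. The helper name monthly_windows is hypothetical and is not part of polars_io_tools; the library's actual splitting logic may differ.

```python
from datetime import date, timedelta


def monthly_windows(start: date, end: date) -> list[tuple[date, date]]:
    """Split the inclusive range [start, end] into calendar-month windows.

    Hypothetical helper for illustration only.
    """
    if start > end:
        raise ValueError("start must not be after end")
    windows = []
    cursor = start
    while cursor <= end:
        # First day of the month after `cursor`.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        # The window ends on the month's last day or at `end`, whichever is earlier.
        window_end = min(next_month - timedelta(days=1), end)
        windows.append((cursor, window_end))
        cursor = next_month
    return windows


windows = monthly_windows(date(2024, 1, 1), date(2024, 3, 31))
for lo, hi in windows:
    print(lo, hi)
# 2024-01-01 2024-01-31
# 2024-02-01 2024-02-29
# 2024-03-01 2024-03-31
```

Each window corresponds to one task's filter on the date column. A range that starts or ends mid-month produces a shorter first or last window.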
- Call ray.init() first. execute_on_ray raises if Ray is not initialised.
- Provide a bounded predicate on date_column. The function derives each task's time range from the pushed-down filter, so a two-sided bound (such as is_between(...), or a pair of >= / <= filters) is required. Without it the call raises.
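The two-sided requirement can be illustrated with a small standard-library sketch: without both a lower and an upper bound there is no finite range to cut into periods. The helper derive_bounds and its input format are assumptions made for this example; they do not describe how polars_io_tools inspects the pushed-down filter.

```python
from datetime import date


def derive_bounds(comparisons: list[tuple[str, date]]) -> tuple[date, date]:
    """Return (lower, upper) from comparisons such as (">=", d) and ("<=", d).

    Hypothetical helper: handles inclusive operators only, to keep the sketch short.
    """
    lowers = [value for op, value in comparisons if op == ">="]
    uppers = [value for op, value in comparisons if op == "<="]
    if not lowers or not uppers:
        raise ValueError("a two-sided bound on the date column is required")
    # Several filters on the same side combine to the tightest bound.
    return max(lowers), min(uppers)


bounds = derive_bounds([(">=", date(2024, 1, 1)), ("<=", date(2024, 3, 31))])

try:
    derive_bounds([(">=", date(2024, 1, 1))])  # no upper bound
    one_sided_error = None
except ValueError as exc:
    one_sided_error = str(exc)
```

Here bounds holds the pair of dates, while the one-sided call ends in a ValueError, mirroring the behaviour described above.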
- return_as controls how each worker ships its result: "arrow" (zero-copy Arrow buffers, the default), "ipc", or "parquet".
- remote_options is passed through to each Ray task's .options(...). Use it to request num_cpus, num_gpus, or a runtime environment (for example to set POLARS_MAX_THREADS).
- max_concurrency caps how many tasks run at once (default 100), following Ray's limit-pending-tasks pattern.
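As a sketch of the runtime-environment case, a remote_options dictionary that sets POLARS_MAX_THREADS might look like the following. It relies on Ray's runtime_env accepting environment variables under the env_vars key. Matching the thread count to num_cpus is an assumption made for this example, not a documented recommendation.

```python
# Keyword arguments forwarded to each Ray task's .options(...).
remote_options = {
    "num_cpus": 4,
    # Ray runtime environments take environment variables under "env_vars";
    # the values must be strings.
    "runtime_env": {"env_vars": {"POLARS_MAX_THREADS": "4"}},
}
```

The dictionary would then be passed as remote_options=remote_options in the execute_on_ray call.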
lf.piot.execute_on_ray(
    date_column="date",
    time_unit="daily",
    remote_options={"num_cpus": 4},
    max_concurrency=50,
).collect()
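The idea behind max_concurrency can be sketched without a cluster. The example below uses the standard library's concurrent.futures as a stand-in for Ray: it keeps at most max_concurrency tasks in flight and waits for one to finish before submitting the next, which is the role ray.wait plays in Ray's limit-pending-tasks pattern. The function run_bounded is hypothetical and does not reproduce the library's implementation.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_bounded(func, items, max_concurrency):
    """Apply func to items, keeping at most max_concurrency calls in flight."""
    results = {}
    pending = {}  # future -> index of the item it is processing
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        for index, item in enumerate(items):
            if len(pending) >= max_concurrency:
                # Block until at least one task finishes before submitting more.
                done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
                for future in done:
                    results[pending.pop(future)] = future.result()
            pending[pool.submit(func, item)] = index
        # Drain whatever is still running.
        for future, index in pending.items():
            results[index] = future.result()
    # Return results in the order of the input items.
    return [results[i] for i in range(len(results))]


squares = run_bounded(lambda x: x * x, range(10), max_concurrency=3)
print(squares)
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Capping the number of submitted tasks, not only the number of running workers, bounds the memory held by queued work and unfetched results.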
- Query Optimization: build the LazyFrame you distribute.
- API Reference: full signature.
This wiki is autogenerated. To make updates, open a PR against the original source file in docs/wiki.