v0.3.0
ExArrow 0.3.0 — Release Notes
Released: 2026-03-10
Hex: https://hex.pm/packages/ex_arrow/0.3.0
Docs: https://hexdocs.pm/ex_arrow/0.3.0
Changelog: https://github.com/thanos/ex_arrow/blob/main/CHANGELOG.md#030---2026-03-10
What's new
Arrow compute kernels (ExArrow.Compute)
Three native operations on RecordBatch values — entirely in Rust Arrow
memory, no BEAM-side data copy:
| Function | Description |
|---|---|
| filter/2 | Keep rows where the first column of a boolean batch is true |
| project/2 | Select and reorder columns by name |
| sort/3 | Sort a batch by a named column, ascending or descending |
Results are new ExArrow.RecordBatch handles that compose with IPC writers,
Flight do_put, further compute calls, or the Parquet writer.
{:ok, filtered} = ExArrow.Compute.filter(batch, mask_batch)
{:ok, projected} = ExArrow.Compute.project(filtered, ["id", "score"])
{:ok, sorted} = ExArrow.Compute.sort(projected, "score", direction: :desc)
Backed by the arrow-select and arrow-ord Rust crates (both part of the
arrow-rs 56 release family, so no extra dependency resolution).
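Because each call returns a tagged tuple, the three kernels chain naturally in a with expression. The following is a minimal sketch, assuming failures come back as {:error, reason} (the notes only show the {:ok, _} shape); the module name ScoreReport and the function top_scores/2 are hypothetical and not part of ExArrow:

```elixir
defmodule ScoreReport do
  # Hypothetical helper, not part of ExArrow.
  # Assumes each Compute function returns {:ok, batch} on success and
  # {:error, reason} on failure; only the {:ok, _} shape appears in the notes.
  def top_scores(batch, mask_batch) do
    with {:ok, filtered} <- ExArrow.Compute.filter(batch, mask_batch),
         {:ok, projected} <- ExArrow.Compute.project(filtered, ["id", "score"]),
         {:ok, sorted} <- ExArrow.Compute.sort(projected, "score", direction: :desc) do
      {:ok, sorted}
    end
  end
end
```

Any step that does not match {:ok, _} short-circuits and is returned unchanged, so the caller sees the first failure rather than a MatchError.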
Parquet support (ExArrow.Parquet.Reader / ExArrow.Parquet.Writer)
Read and write the Apache Parquet format using the parquet Rust crate.
Both file paths and in-memory binaries are supported on both sides.
# Read
{:ok, stream} = ExArrow.Parquet.Reader.from_file("/data/events.parquet")
{:ok, schema} = ExArrow.Stream.schema(stream)
batches = ExArrow.Stream.to_list(stream)
# Write
:ok = ExArrow.Parquet.Writer.to_file("/out/result.parquet", schema, batches)
# In-memory round-trip
{:ok, bytes} = ExArrow.Parquet.Writer.to_binary(schema, batches)
{:ok, stream2} = ExArrow.Parquet.Reader.from_binary(bytes)
Parquet streams use the same ExArrow.Stream interface as IPC and ADBC
streams — schema/1, next/1, and to_list/1 work identically across all
three backends. No new consumption patterns to learn.
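One practical consequence of the shared interface is that format conversion needs no backend-specific code. A sketch of an IPC-to-Parquet converter using only calls shown in these notes; the module name ArrowToParquet and the function convert/2 are hypothetical, and the {:error, reason} failure shape is an assumption:

```elixir
defmodule ArrowToParquet do
  # Hypothetical helper, not part of ExArrow.
  # Relies only on schema/1 and to_list/1, so the same body would work for a
  # stream opened by any backend. Assumes failures are {:error, reason} tuples.
  def convert(ipc_path, parquet_path) do
    with {:ok, stream} <- ExArrow.IPC.Reader.from_file(ipc_path),
         {:ok, schema} <- ExArrow.Stream.schema(stream) do
      # to_list/1 collects every batch into a list before the write starts.
      batches = ExArrow.Stream.to_list(stream)
      ExArrow.Parquet.Writer.to_file(parquet_path, schema, batches)
    end
  end
end
```

Swapping the first line for ExArrow.Parquet.Reader.from_file/1 would give a Parquet-to-Parquet rewrite with the rest of the function unchanged.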
Explorer bridge (ExArrow.Explorer)
One-call conversion between ExArrow.Stream / ExArrow.RecordBatch and
Explorer.DataFrame:
# ExArrow → Explorer
{:ok, stream} = ExArrow.IPC.Reader.from_file("/data/events.arrow")
{:ok, df} = ExArrow.Explorer.from_stream(stream)
# Explorer → ExArrow (e.g. write to Parquet or send via Flight)
df = Explorer.DataFrame.new(x: [1, 2, 3], y: ["a", "b", "c"])
{:ok, stream} = ExArrow.Explorer.to_stream(df)
The bridge exposes four functions: from_stream/1, from_record_batch/1,
to_stream/1, and to_record_batches/1.
Opt in by adding {:explorer, "~> 0.8"} to your mix.exs. When Explorer is
absent, every function returns {:error, "Explorer is not available…"} rather
than failing to compile.
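For readers unfamiliar with how a library can degrade at runtime instead of failing at compile time, the usual Elixir approach is to check whether the optional module is loaded before calling it. The sketch below illustrates that general pattern with the standard library only; it is not ExArrow's actual source, and the module name OptionalBridge, the function call_if_available/3, and the error text are all hypothetical:

```elixir
defmodule OptionalBridge do
  # Illustrative only: a common pattern for optional dependencies,
  # not ExArrow's implementation. Names and error text are hypothetical.
  def call_if_available(mod, fun, args) do
    if Code.ensure_loaded?(mod) and function_exported?(mod, fun, length(args)) do
      {:ok, apply(mod, fun, args)}
    else
      {:error, "#{inspect(mod)} is not available"}
    end
  end
end

# A module that exists is called normally.
OptionalBridge.call_if_available(String, :upcase, ["arrow"])
# => {:ok, "ARROW"}

# A module that is not installed yields an error tuple instead of crashing.
OptionalBridge.call_if_available(NotInstalled.Library, :run, [])
# => {:error, "NotInstalled.Library is not available"}
```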
Nx bridge (ExArrow.Nx)
Zero-copy column extraction to Nx.Tensor by sharing the raw byte buffer:
{:ok, stream} = ExArrow.Parquet.Reader.from_file("/data/features.parquet")
batch = ExArrow.Stream.next(stream)
{:ok, tensor} = ExArrow.Nx.column_to_tensor(batch, "price")
# Or extract all numeric columns at once
{:ok, tensors} = ExArrow.Nx.to_tensors(batch)
# %{"price" => #Nx.Tensor<f64[1000]>, "qty" => #Nx.Tensor<s32[1000]>, …}
Reverse path: from_tensor/2 writes an Nx.Tensor into a
single-column RecordBatch without going through an Elixir list.
Supported Arrow types: Int8/16/32/64, UInt8/16/32/64, Float32/64.
Non-numeric columns return {:error, …} from column_to_tensor/2 and are
silently skipped by to_tensors/1.
Opt in by adding {:nx, "~> 0.9"} to your mix.exs.
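Once a column is a tensor, ordinary Nx functions apply to it directly. A small sketch that computes the mean of the price column from the example above; the module name PriceStats and the function mean_price/1 are hypothetical, and both :ex_arrow and :nx are assumed to be in your deps:

```elixir
defmodule PriceStats do
  # Hypothetical helper, not part of ExArrow. Requires :ex_arrow and :nx.
  def mean_price(batch) do
    # A non-numeric column returns {:error, _} here, per the notes above,
    # and that tuple is passed straight back to the caller.
    with {:ok, tensor} <- ExArrow.Nx.column_to_tensor(batch, "price") do
      # Nx.mean/1 reduces over all elements; Nx.to_number/1 unwraps the scalar.
      {:ok, tensor |> Nx.mean() |> Nx.to_number()}
    end
  end
end
```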
New public API surface
| Module | New functions |
|---|---|
| ExArrow.Compute | filter/2, project/2, sort/3 |
| ExArrow.Parquet.Reader | from_file/1, from_binary/1 |
| ExArrow.Parquet.Writer | to_file/3, to_binary/2 |
| ExArrow.Explorer | from_stream/1, from_record_batch/1, to_stream/1, to_record_batches/1 |
| ExArrow.Nx | column_to_tensor/2, to_tensors/1, from_tensor/2 |
Optional dependencies
| Add to mix.exs | Unlocks |
|---|---|
| {:explorer, "~> 0.8"} | ExArrow.Explorer bridge |
| {:nx, "~> 0.9"} | ExArrow.Nx bridge |
Both are optional. ExArrow compiles and works without them; the bridge modules
gracefully degrade to {:error, "… is not available…"} at runtime.
Upgrade guide from 0.2.0
# mix.exs — bump the version pin
{:ex_arrow, "~> 0.3.0"}
# Optional: add if you use the new bridges
{:explorer, "~> 0.8", optional: true}
{:nx, "~> 0.9", optional: true}
No breaking changes to existing IPC, Flight, or ADBC APIs. All existing calls
continue to work without modification.
Fixes
- Dialyzer call_without_opaque errors in ExArrow.Explorer and ExArrow.Nx:
  function heads no longer pattern-match on the concrete struct behind
  @opaque types; resource extraction uses the dedicated resource_ref/1
  helpers instead.
- Credo warnings in new test files (alias ordering, pipe chains, length/1
  vs != []).
What's next (v0.4.0)
v0.4.0 is already released and ships:
- Arrow C Data Interface (ExArrow.CDI): zero-copy batch transfer via
  raw C-struct pointers; foundation for a future zero-copy Explorer bridge.
- ExArrow.Nx.from_tensors/1: multi-column RecordBatch from a tensor
  map in one call.
- Lazy Parquet streaming: row groups decoded on demand via
  Stream.next/1; peak memory proportional to the largest row group, not
  the full file.