Releases · ryan-evans-git/ematix-flow

21 May 23:39

v0.7.0

f925d82

v0.7.0 — Workflow trigger model + AllOf/AnyOf composite triggers Latest

Latest

Workflow trigger model. v0.6.0's centralised depends_on={dict} is replaced
by a richer trigger surface on the workflow plus per-job depends_on for the
within-workflow DAG. Hard break, no backwards compat (we're in alpha).

Highlights

triggered_by= accepts a workflow/job name, a list (AND-conjoined), or a
boolean expression built from AllOf(...) / AnyOf(...) for arbitrarily
nested AND/OR composition. Example:

ematix.workflow(
    name="combined_report",
    triggered_by=AllOf("workflow_A", AnyOf("workflow_B", "workflow_C")),
    schedule="0 21 * * *",
    timezone="America/New_York",
    jobs=[...],
)

schedule= + timezone= on the workflow (not per-job). Cron tick is one
trigger condition; AND-conjoined with triggered_by / on_message.
on_message=<source> for per-message firing (mutually exclusive with
triggered_by / schedule).
Per-job depends_on=[…] declares the within-workflow DAG.

Web UI

Per-element trigger pills on every Workflows card — colored dot per leaf
AND per composite (ALL / ANY) showing the rolled-up state
(🟢 ready · 🟡 pending · 🔴 failed).
▶ Run now button on every workflow card (with job-subset checkboxes)
and every job card (with optional cascade-downstream toggle).
Endpoints: POST /api/workflows/{name}/run-now, POST /api/jobs/{name}/run-now.

Migration from v0.6.0

# old
ematix.workflow(name="W", jobs=["a","b","c"], depends_on={"b":["a"], "c":["b"]})

# new
@ematix.job(name="b", depends_on=["a"], ...)
@ematix.job(name="c", depends_on=["b"], ...)
ematix.workflow(name="W", jobs=["a","b","c"], schedule="0 * * * *")

register_workflow raises with a pointer to the new model if it sees the
old shape.

Docs

USER_GUIDE: trigger surface + AllOf/AnyOf composite section.
ematix.dev: scheduling page rewritten with 6 real-world scenarios + honest
competitive framing (dbt / Airflow Datasets / Prefect Automations / Dagster
sensors).

Full changelog: https://github.com/ryan-evans-git/ematix-flow/blob/v0.7.0/CHANGELOG.md

Assets 2

21 May 20:07

ryan-evans-git

v0.6.0

942da4d

v0.6.0 — Workflow + Job model

What's new

The previous flat Pipelines page becomes Workflows (user-named groupings) on top of Jobs (individual tasks). The DAG between jobs lives on the workflow declaration, not on individual jobs. Existing @ematix.pipeline code keeps working as single-job workflows-of-one, so this is a non-breaking model upgrade.

Added

ematix.workflow(name=..., jobs=[...], depends_on={...}) — new declaration. depends_on reads as {downstream: [upstream, ...]}. Edges are mirrored into the existing per-job depends-on table so the scheduler's freshness gating keeps working.
@ematix.job — alias for @ematix.pipeline. New code should prefer .job.
/api/workflows — endpoint returning declared workflows + their jobs/edges, plus synthetic single-job workflows for any job not in a declared workflow.
flow web --module <name> — pre-imports a pipelines module so the UI can render schedule, next-run, and DAG before any scheduler tick.
Pipelines API now forecasts next_run_at for batch jobs from the registered cron + timezone when no scheduler record exists yet.

Changed

Web UI restructured to Workflows | Jobs | Runs | DAG tabs.
DAG view is now an SVG flowchart with cubic-Bézier arrows replacing the rank-as-column layout.
New shared DagFlowchart.svelte used by both the Workflows card preview and the full DAG view.
Loopback bind no longer requires a bearer token by default.

See CHANGELOG.md for the full set of additions + migration notes, and docs/USER_GUIDE.md for the new Workflows section.

Install

pip install ematix-flow==0.6.0

Wheels published: linux-x86_64 (py3.11/3.12/3.13/3.14), macos-arm64 (py3.11/3.12/3.13/3.14), plus sdist.

Assets 2

21 May 15:47

ryan-evans-git

v0.5.0

e021e2d

v0.5.0 — operational milestone

[0.5.0] — 2026-05-21

Operational milestone — adds the user-facing surface (CLIs, Web UI,
alerters, observability) on top of v0.4.0's backend matrix. Same
query-execution surface as v0.4.0; per-query TPC-H times unchanged.
Highlights: four new CLI subcommands (flow doctor / init / logs
/ secrets test), bearer-token Web UI auth + cross-pipeline DAG
view, email + PagerDuty alerters, OTEL trace spans + a starter
Grafana dashboard, AWS Glue Schema Registry end-to-end Kafka
dispatch, Arrow-native warehouse adapters, streaming pipeline live
throughput in the Web UI, and the Rust executor for
@ematix.warehouse_pipeline via the new PyO3 callback bridge.

Added

@ematix.warehouse_pipeline decorator (Phase 2d slice 1, #125).
Wires WarehouseSource / WarehouseTarget into the
scheduler-registered pipeline registry so warehouse-shaped pipelines
participate in cron scheduling, retries, depends_on DAG, and
flow run-due the same way DB-backed @ematix.pipeline pipelines
do. The wrapped function is zero-arg; returning a str forwards
it as transform_sql= to run_warehouse_pipeline (DuckDB transform
in-flight on the Arrow table). Slice 2 ships in v0.5.0 across
#135 (PyO3 callback bridge) + #136 (Rust invoke_warehouse_pipeline
executor); slice 3 adds warehouse-side watermark cursors.
AWS Glue Schema Registry — end-to-end (#126 + #135). Slice 1
(#126) shipped the typed GlueSchemaRegistryConnection (kind
glue_schema_registry, fields registry_name / region / auth
via aws_profile= / explicit static creds / boto3 default chain),
the Rust glue_schema_registry module with the Glue wire format
(0x03 header + 16-byte UUID + 1-byte compression byte) exposed as
parse_glue_frame / build_glue_frame / GlueFrame / GlueCodec,
and the [schema-registry-glue] extra (boto3 +
aws-glue-schema-registry). Slice 2 (#135) wired the Rust Kafka
backend to dispatch on registry kind for both consumer and
producer paths via a SchemaRegistryKind::{Confluent, Glue {…}}
enum, added per-backend schema caches (one boto3 round-trip per
UUID / per schema name), zlib codec (GlueCodec::Zlib,
byte 0x05) using flate2, producer-side encode
(encode_batch_as_glue_avro) via Arrow → JSONL → Avro, kafka
connection-time validation (Glue + non-Avro fails at construction),
and a LocalStack integration suite at
tests/python/integration/test_glue_localstack.py (gated on
EMATIX_FLOW_LOCALSTACK_ENDPOINT). Confluent path
(kind = "schema_registry") unchanged.
PyO3 callback bridge (#135, task #559 slice 2). Process-global
registry of named Rust callbacks at
ematix_flow_core::py_callbacks (register / unregister /
is_registered / invoke, JSON-in / JSON-out adapter). Concrete
Python wiring in ematix-flow-py::py_callbacks exposes
register_python_callback / unregister_python_callback /
is_python_callback_registered / invoke_python_callback. The
Glue Kafka backend routes schema lookups (by-UUID for consumers,
by-name for producers) through this primitive; same primitive
carries the warehouse-pipeline executor in #136.
Rust executor for @ematix.warehouse_pipeline (#136,
task #559 slice 2 final piece). New
ematix_flow_core::warehouse_executor::invoke_warehouse_pipeline(name)
dispatches a registered warehouse pipeline by name through
py_callbacks, no subprocess. Python side: the
@ematix.warehouse_pipeline decorator now registers every wrapped
function as a callback at
ematix_flow.warehouse_pipeline:<pipeline_name> so the Rust
scheduler / worker can drive it directly. Three error variants
surface common failure modes (NotRegistered /
CallbackFailed / BadResponseShape); response shape is the
same dict the in-process scheduler builds (status / pipeline /
rows_read / rows_written / duration_ms / watermark).
flow init project scaffold (#136). flow init <dir> writes
a runnable starter project: pipelines.py, connections.toml,
Dockerfile, flow.service (systemd unit), .gitignore,
README.md. Refuses to overwrite without --force. Maven-archetype
shape — one command to a working flow run-due loop.
flow logs <run_id> (#136). Tails the captured stdout / stderr
for a given run. Capture is opt-in via the
EMATIX_FLOW_CAPTURE_LOGS=1 env var (so existing deployments see
no disk / latency change). Logs land at
$EMATIX_FLOW_LOGS_DIR/<run_id>.log
(default ~/.ematix-flow/logs/); run_id is pinned to
<pipeline>-<UTC ts>-<attempt> so the same record matches the
RunLog. Tee-based capture (original stream still gets the bytes);
atomic write (tmp + rename) + 30-day prune helper.
flow doctor (#136). Connection health probes by kind:
postgres (SELECT 1), kafka (TCP bootstrap probe), glue
(list_registries), pubsub (get_topic), rabbitmq (AMQP
handshake), s3 (head_bucket), snowflake / bigquery (SELECT 1).
Renders a one-pass status table; non-zero exit on any failure so it
fits CI / pre-deploy checks.
flow secrets test (#136). Resolves every ${…} placeholder
in connections.toml (or whichever path is passed) and reports
per-secret outcome (provider, key, OK / missing / error) without
printing the resolved value. Useful for validating Vault / AWS /
GCP secret-store wiring before a deploy.
Web UI bearer-token auth (#136). New --token <value> flag on
flow web (and bearer_token= kwarg on create_app /
run_server) gates every /api/* route except /api/health
behind Authorization: Bearer <token>, compared with
hmac.compare_digest. /api/health stays open for load-balancer
probes. When a token is set, the "non-loopback bind without auth"
warning at startup is suppressed.
Cross-pipeline DAG view in the Web UI (#136). New /api/dag
endpoint returns {nodes, edges} from the depends_on registry
(each node carries name / schedule / timezone); new Svelte
route #/dag lays nodes out in topological-rank columns
(upstreams always left-of downstreams) with fan-out counts.
Email + PagerDuty alerters (#136). Two new alerter URL
schemes register through the same --alerter <url> flag the
Slack / webhook / stdout alerters use:
- email://user:pass@host:port?from=...&to=...&starttls=1 — stdlib
  smtplib; default port 587 (STARTTLS) / 465 (implicit SSL).
  Errors are caught + logged (an alerter outage never breaks the
  pipeline run).
- pagerduty://<routing_key>?service=...&severity=... — Events
  API v2 trigger / resolve, with dedup_key = "<service>:<pipeline>"
  so a recovered pipeline auto-resolves its open incident. Maps
  failed → trigger(error), gave_up → trigger(critical),
  recovered → resolve.
OpenTelemetry trace spans for pipeline runs (#136). New
ematix_flow.tracing module: pipeline_run_span(name, attempt)
context manager wraps every @ematix.pipeline /
@ematix.warehouse_pipeline execution; configure once via
configure_tracer_from_url(...) with otel://stdout,
otel+otlp+grpc://collector:4317, or
otel+otlp+http://collector:4318. Span attributes include
pipeline name, attempt number, run_id, status, and exception
on failure. Sits alongside the existing OTel-metrics export — same
collector can receive both.
Streaming-pipeline live stats in the Web UI (#136). Streaming
pipelines used to show a useless "Median duration: —" on the
Pipelines view (one open-ended record, no duration). The
flow consume daemon now opens an optional RunLog
(--run-log-url / $EMATIX_FLOW_RUN_LOG_URL) and spawns a
background thread that scrapes its own /metrics endpoint every
~30s, computing rolling 1m + 5m windows from
ematix_streaming_rows_consumed_total /
ematix_streaming_rows_written_total /
ematix_streaming_batches_total /
ematix_streaming_errors_total and writing the result into the
running record's extras. /api/pipelines surfaces these
fields; Pipelines.svelte renders "Throughput: X rps in (1m) /
Y rps in (5m) · Batch cycle: A ms avg (1m)" in place of the median
footer when kind == "streaming". SqliteRunLog now also
implements the rich-history protocol
(record_run_record / list_runs / get_run against a
new run_records table), so the same SQLite file backs both the
flow consume daemon and flow web.
Grafana dashboard JSON (#136). New
examples/grafana/ematix-flow-dashboard.json — 6-panel starter
board driven by the Prometheus metrics ematix-flow already
exports: runs/min by outcome, success-rate stat, in-flight retries,
p50 / p95 / p99 duration, per-pipeline run rate, top-20 failure
counts. $pipeline templated variable. Import in Grafana via the
JSON Model field.
Cron schedule timezone support (#127, task #558 slice 1).
is_due() now accepts a keyword-only tz= argument that, when
set, interprets the cron expression in that timezone instead of
whatever timezone now carries (effectively UTC for
flow run-due today). Accepts a zoneinfo.ZoneInfo instance or a
tz name string. DST transitions are honored via croniter's
zoneinfo-aware path. tz=None (the default) preserves today's
behavior bit-for-bit. Slice 2 wires timezone= into
@ematix.pipeline / @ematix.warehouse_pipeline; slice 3 surfaces
the configured tz in the Web UI's "Next: …" rendering.

Design

Arrow-native warehouse adapters (#128, task #557). Three-slice
plan to drop pandas from the Snowflake / BigQuery / Redshift
adapter paths. Slice 1 = Snowflake PUT + COPY INTO via parquet
staging; slice 2 = Redshift S3 + COPY (extend merge-mode path to
append); slice 3 = BigQuery GCS + load_table_from_uri. Open
questions on type fidelity, Storage Write API vs GCS staging, and
the backward-compat shim policy captured for review before slice...

Assets 2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Highlights

Web UI

Migration from v0.6.0

Docs

Uh oh!

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

What's new

Added

Changed

Install

Uh oh!

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

[0.5.0] — 2026-05-21

Added

Design

Uh oh!

Releases: ryan-evans-git/ematix-flow

v0.7.0 — Workflow trigger model + AllOf/AnyOf composite triggers

Highlights

Web UI

Migration from v0.6.0

Docs

Uh oh!

v0.6.0 — Workflow + Job model

What's new

Added

Changed

Install

Uh oh!

v0.5.0 — operational milestone

[0.5.0] — 2026-05-21

Added

Design

Uh oh!