Speed up Dag serialization by skipping redundant asset roundtrip#67702

Open
shahar1 wants to merge 1 commit into apache:main from shahar1:fix/dag-serialization-dep-single-pass

Conversation

@shahar1
Contributor

@shahar1 shahar1 commented May 29, 2026

Problem

_DependencyDetector.detect_task_dependencies builds the asset dependency_id
for every asset outlet of every task via:

serialized_asset = ensure_serialized_asset(obj)
dependency_id = SerializedAssetUniqueKey.from_asset(serialized_asset).to_str()

ensure_serialized_asset() runs a full decode_asset_like(encode_asset_like(obj))
encode→decode roundtrip — rebuilding the entire SerializedAsset (group, extra,
watchers, access_control) — purely to read back .name/.uri for the unique key.

Fix

encode_asset_like copies name/uri verbatim and _decode_asset reconstructs
them verbatim, so the roundtrip cannot change either field. The key is built
directly from the object instead:

dependency_id = SerializedAssetUniqueKey(name=obj.name, uri=obj.uri).to_str()

One full encode+decode per asset outlet is removed. The asset-alias branch is
left unchanged (rare path that genuinely needs the serialized object).
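
A minimal sketch of why the two paths agree, using simplified stand-ins rather than the real Airflow classes. ToyAsset, UniqueKey, encode, and decode are hypothetical names, and the JSON shape returned by to_str() is an assumption made for illustration only. The point is that when encode and decode copy name/uri verbatim, the key read back after the roundtrip is the same as the key read from the original object:

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UniqueKey:
    """Hypothetical stand-in for SerializedAssetUniqueKey."""

    name: str
    uri: str

    @classmethod
    def from_asset(cls, asset):
        return cls(name=asset.name, uri=asset.uri)

    def to_str(self) -> str:
        # Assumed string form; the real format may differ.
        return json.dumps({"name": self.name, "uri": self.uri})


@dataclass
class ToyAsset:
    """Hypothetical stand-in for an asset with extra fields the key ignores."""

    name: str
    uri: str
    group: str = "asset"
    extra: dict = field(default_factory=dict)


def encode(asset: ToyAsset) -> dict:
    # Copies every field verbatim, including name and uri.
    return {
        "name": asset.name,
        "uri": asset.uri,
        "group": asset.group,
        "extra": dict(asset.extra),
    }


def decode(data: dict) -> ToyAsset:
    # Rebuilds the whole object from the encoded fields.
    return ToyAsset(**data)


def key_via_roundtrip(asset: ToyAsset) -> str:
    # Old path: rebuild the full object just to read name/uri back.
    return UniqueKey.from_asset(decode(encode(asset))).to_str()


def key_direct(asset: ToyAsset) -> str:
    # New path: read name/uri straight from the object.
    return UniqueKey(name=asset.name, uri=asset.uri).to_str()
```

The equivalence holds only as long as neither encode nor decode transforms name or uri, which is the property the Correctness section relies on.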

Correctness

  • Output is byte-identical — URI normalization happens once at
    Asset.__init__, never inside the roundtrip, so obj.name/obj.uri already
    hold the normalized values.
  • All 194 tests in
    airflow-core/tests/unit/serialization/test_dag_serialization.py pass.
  • The benchmark's [0] section asserts identical keys across asset shapes
    (name+uri, uri-only, name-only, special characters, with group/extra).

Benchmark

Comparing two separate process runs proved unreliable on a loaded machine
(absolute timings swung ~2× from background load alone). The credible
measurement is a single-process interleaved A/B that patches in both
implementations and times them round-by-round under identical instantaneous
load (bench_asset_roundtrip_ab.py):

Scenario                   OLD (roundtrip) min   NEW (direct) min   Speedup
100 tasks × 1 outlet       10.73 ms              10.46 ms           1.03×
100 tasks × 5 outlets      15.28 ms              14.52 ms           1.05×
500 tasks × 5 outlets      77.24 ms              70.82 ms           1.09×
1000 tasks × 5 outlets     143.46 ms             137.79 ms          1.04×
200 tasks × 20 outlets     59.53 ms              54.79 ms           1.09×

Consistent ~3–9% end-to-end serialization speedup (largest on asset-heavy
Dags), and ~1.5–2× on the changed line in isolation.

Benchmark scripts: https://gist.github.com/shahar1/841592531adfd66def64fc67fcc3ea6c
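
For readers who do not open the gist, the interleaved A/B idea can be sketched roughly as follows. This is not the actual bench_asset_roundtrip_ab.py; the function name interleaved_ab and its parameters are hypothetical. Both implementations run back to back inside each round so they see the same background load, and the minimum per implementation is reported, matching the "min" columns in the table:

```python
import time


def interleaved_ab(old_fn, new_fn, rounds=20):
    """Time old_fn and new_fn alternately and return (min_old, min_new) in seconds."""
    old_times, new_times = [], []
    for _ in range(rounds):
        # Run both variants within the same round so that any load spike
        # affects them at roughly the same moment.
        for fn, bucket in ((old_fn, old_times), (new_fn, new_times)):
            start = time.perf_counter()
            fn()
            bucket.append(time.perf_counter() - start)
    # The minimum is the sample least disturbed by background noise.
    return min(old_times), min(new_times)
```

A caller would pass two zero-argument callables that serialize the same Dag with the old and new implementation patched in, then compare the two minima.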


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (Opus 4.8)

Generated-by: Claude Code (Opus 4.8) following the guidelines

@shahar1 shahar1 requested review from ashb and bolkedebruin as code owners May 29, 2026 09:49
@shahar1 shahar1 changed the title from "Perf: fold dep detection into task-serialization loop in serialize_dag" to "Fold dep detection into task-serialization loop in serialize_dag" May 29, 2026
@shahar1 shahar1 requested a review from kaxil May 29, 2026 09:50
@shahar1 shahar1 marked this pull request as draft May 29, 2026 09:51
@shahar1 shahar1 closed this May 29, 2026
detect_task_dependencies built the asset dependency id via
ensure_serialized_asset(), which runs a full encode→decode roundtrip
(rebuilding group/extra/watchers/access_control) on every asset outlet of
every task, only to read back name/uri for the unique key. Asset
encode/decode copies name/uri verbatim, so the key can be built directly
from the object. This removes the redundant roundtrip per outlet, cutting
end-to-end serialization time by ~3-9% on asset-heavy Dags (larger gains
the more outlets per task) with byte-identical output.
@shahar1 shahar1 reopened this May 29, 2026
@shahar1 shahar1 force-pushed the fix/dag-serialization-dep-single-pass branch from 341bc5c to 3fd4dcb Compare May 29, 2026 12:41
@shahar1 shahar1 changed the title from "Fold dep detection into task-serialization loop in serialize_dag" to "Speed up Dag serialization by skipping redundant asset roundtrip" May 29, 2026
@shahar1 shahar1 marked this pull request as ready for review May 29, 2026 12:43