perf(is-seg): swap pydantic response for dataclass twins on workflow path by aseembits93 · Pull Request #28 · aseembits93/inference

aseembits93 · 2026-04-30T21:04:16Z

Summary

InferenceModelsInstanceSegmentationAdapter.postprocess built a full pydantic tree per frame — Point × V per polygon vertex, InstanceSegmentationPrediction × N, then InstanceSegmentationInferenceResponse. The workflow block then called response.model_dump(by_alias=True, exclude_none=True) to get a plain dict for sv.Detections.from_inference. Neither the validation nor the serializer is needed on that path — the block only consumes the dict.

This change adds slotted dataclass twins (PointDC, InferenceResponseImageDC, InstanceSegmentationPredictionDC, InstanceSegmentationInferenceResponseDC) plus _is_pred_dc_to_dict / _is_response_dc_to_dict helpers that emit the exact dict model_dump(by_alias=True, exclude_none=True) produces (same keys, same alias "class", same exclude_none behavior, same mask_format="polygon" constant).

The adapter gates on kwargs.get("source") == "workflow-execution" (and not return_in_rle) and returns the dataclass response on that path. Every other caller (HTTP response_model at http_api.py:1640, isinstance-based cache dispatch at cache/serializers.py:71, draw_predictions visualization, RLE response mode) keeps the pydantic path untouched.
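The gate condition can be sketched as a small predicate. The function name `use_dataclass_path` and the constant are hypothetical; the condition itself is the one quoted above.

```python
from typing import Any, Mapping

WORKFLOW_SOURCE = "workflow-execution"  # value quoted in the description


def use_dataclass_path(kwargs: Mapping[str, Any], return_in_rle: bool) -> bool:
    """True only for workflow-execution requests that do not ask for RLE masks."""
    return kwargs.get("source") == WORKFLOW_SOURCE and not return_in_rle
```

A request with no `source` key, or with `return_in_rle=True`, fails the predicate and so stays on the pydantic branch.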

The v3 workflow block detects the dataclass via isinstance and calls _is_response_dc_to_dict; it falls back to model_dump for any other response type (e.g. if a non-rfdetr backend is ever bound to the same block).

Why bother

We already tried model_construct on this branch's ancestor; it was 2× slower than pydantic's Rust-validated __init__. Swapping to dataclasses works because the saving isn't in construction alone: it's construction + model_dump combined. Pydantic v2's model_dump still does substantial per-field work for nested types with aliases + exclude_none, whereas the hand-rolled dict walk is a dozen dict literals.

Numbers

Microbench (4 dets × 6-vertex polygon, construct + dump):

path	µs/frame
pydantic	81
dataclass	33

Δ = −48 µs/frame (2.43× faster).
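For anyone reproducing a per-frame figure of this kind, a harness along these lines would do. This is a sketch built on the standard-library `timeit` module, not the script used for the numbers above; `us_per_call` is a hypothetical helper name.

```python
import timeit
from typing import Callable


def us_per_call(fn: Callable[[], object], number: int = 10_000, repeat: int = 5) -> float:
    """Best-of-`repeat` wall time per call of `fn`, in microseconds."""
    best_total = min(timeit.repeat(fn, number=number, repeat=repeat))
    return best_total / number * 1e6
```

Pass a zero-argument callable that performs one full construct + dump for a frame, once for the pydantic path and once for the dataclass path, and compare the two results.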

End-to-end (rfdetr-seg-nano TRT + Triton preproc + Triton fullpost + CUDA graphs, vehicles_312px.mp4, 538 frames, 4 runs each):

FPS	run 1	run 2	run 3	run 4	mean
baseline	153.72	154.27	153.11	153.63	153.68
this change	159.00	157.62	157.18	157.03	157.71

Δ = +4.03 FPS (+2.6%).
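The means and the relative gain can be re-derived from the four runs in the table; a quick check using only those figures:

```python
from statistics import mean

baseline = [153.72, 154.27, 153.11, 153.63]  # FPS, runs 1-4
changed = [159.00, 157.62, 157.18, 157.03]   # FPS, runs 1-4

baseline_mean = mean(baseline)
changed_mean = mean(changed)
delta = changed_mean - baseline_mean
pct = 100 * delta / baseline_mean

print(f"{baseline_mean:.2f} -> {changed_mean:.2f} FPS (+{pct:.1f}%)")
# prints: 153.68 -> 157.71 FPS (+2.6%)
```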

Correctness

Parity tested on real inputs: _is_response_dc_to_dict(dc) byte-equals pyd.model_dump(by_alias=True, exclude_none=True) (modulo detection_id UUIDs from default_factory, which both paths generate). Tests cover:

Mixed detection set with varying polygon lengths (0, 3, 6 vertices)
Post-construction mutation of .time / .inference_id by Model.infer_from_request
Empty-predictions edge case
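Because both paths generate fresh `detection_id` UUIDs, a parity check has to compare the two dicts with those ids removed. One way to sketch that comparison (the helper names are hypothetical, and this is not the test code in the PR):

```python
from typing import Any


def without_detection_ids(obj: Any) -> Any:
    """Recursively copy dicts/lists, dropping every "detection_id" key."""
    if isinstance(obj, dict):
        return {
            key: without_detection_ids(value)
            for key, value in obj.items()
            if key != "detection_id"
        }
    if isinstance(obj, list):
        return [without_detection_ids(item) for item in obj]
    return obj


def assert_parity(dc_dict: dict, pyd_dict: dict) -> None:
    """Fail unless the two dumps match once detection ids are ignored."""
    assert without_detection_ids(dc_dict) == without_detection_ids(pyd_dict)
```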

Model.infer_from_request assigns response.time and response.inference_id at inference/core/models/base.py:154,157. Those two fields are declared in InstanceSegmentationInferenceResponseDC, so the slotted dataclass permits the reassignment.

Test plan

pytest tests/workflows/unit_tests/core_steps/models/roboflow/instance_segmentation/test_v3.py — 23/23 pass
Parity test (_is_response_dc_to_dict(dc) == pyd.model_dump(by_alias=True, exclude_none=True))
Microbench
End-to-end FPS benchmark
HTTP regression: hit /infer/instance_segmentation against a local RF-DETR seg model, confirm the JSON response is byte-identical to pre-change (should be: the adapter falls through to the pydantic branch because the HTTP request doesn't set source="workflow-execution")
RLE path: send a request with response_mask_format="rle" via workflows, confirm it still goes through the pydantic InstanceSegmentationRLEPrediction branch (gate excludes RLE)

🤖 Generated with Claude Code

…path

`InferenceModelsInstanceSegmentationAdapter.postprocess` built a full pydantic tree per frame:

* Point × V (per polygon vertex)
* InstanceSegmentationPrediction × N
* InstanceSegmentationInferenceResponse

The workflow block then called `response.model_dump(by_alias=True, exclude_none=True)` to get a plain dict for `sv.Detections.from_inference`. Neither the pydantic validation nor the serializer machinery is needed on that path; the block only consumes the dict.

This change adds slotted dataclass twins in `inference.py`:

* PointDC
* InferenceResponseImageDC
* InstanceSegmentationPredictionDC
* InstanceSegmentationInferenceResponseDC

plus `_is_pred_dc_to_dict` / `_is_response_dc_to_dict` helpers that emit the exact dict `model_dump(by_alias=True, exclude_none=True)` would produce (same keys, same alias `"class"`, same `exclude_none` behavior, same `mask_format="polygon"` constant).

The adapter gates on `kwargs.get("source") == "workflow-execution"` (and `not return_in_rle`) and returns the dataclass response on that path. Every other caller (HTTP `response_model`, `isinstance`-based cache dispatch, `draw_predictions` visualization, RLE response mode) keeps the pydantic path untouched.

The v3 workflow block detects the dataclass via isinstance and calls `_is_response_dc_to_dict` instead of `model_dump`; it falls back to `model_dump` for any other response type.

Microbench (4 dets × 6-vertex polygon, construct + dump):

* pydantic: 81 us/frame
* dataclass: 33 us/frame (2.43x faster)

End-to-end benchmark (rfdetr-seg-nano TRT + Triton preproc + Triton fullpost + CUDA graphs, vehicles_312px.mp4, 538 frames, 4 runs each; measured on branch optimize-rfdetr-seg but the gain composes identically on main):

* baseline: 153.68 FPS mean
* this change: 157.71 FPS mean (+4.0 FPS, +2.6%)

Bit-exact parity verified: `_is_response_dc_to_dict(dc)` equals `pyd.model_dump(by_alias=True, exclude_none=True)` for all test inputs; mutation of `.time` / `.inference_id` by `Model.infer_from_request` works on the dataclass because those fields are declared and slotted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

aseembits93 mentioned this pull request Apr 30, 2026

perf(is-seg): swap pydantic response for dataclass twins on workflow path (stacked on #22) #29

aseembits93 wants to merge 1 commit into main from perf/is-seg-workflow-dataclasses
