Handle TransferEncodingError as graceful network disconnect in media input loop #807
livepeer-tessa wants to merge 2 commits into main
- graph_executor.py: deduplicate fan-in stream edges before queue construction, preferring pipeline-node sources over source-node sources for the same input port; raise a clearer error (with edge details) when two pipeline nodes both target the same port
- graph_executor.py: in _validate_edge_ports, include VACE ports for VACEEnabledPipeline instances regardless of static config_class.inputs; gracefully handle PipelineNotAvailableException (pipeline reloading) by logging a warning and skipping port checks for that node
- frame_processor.py: in _setup_graph_from_pipeline_ids, only add the last VACEEnabledPipeline to vace_input_video_ids (not all of them), preventing fan-in when a preprocessor like yolo_mask is also a VACEEnabledPipeline

Fixes #804

Signed-off-by: livepeer-robot <robot@livepeer.org>
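The edge deduplication rule in the first bullet can be illustrated with a minimal sketch. It is not the code in graph_executor.py: the `Edge` dataclass, the `source_kind` labels, the `dedupe_fan_in` helper, and the node and port names are all hypothetical, chosen only to show the "prefer pipeline-node sources, reject two pipeline nodes on one port" behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical stream edge; field names are assumptions for this sketch."""
    source: str        # id of the producing node
    source_kind: str   # "source" or "pipeline" (assumed labels)
    target: str        # id of the consuming node
    port: str          # input port name on the target


def dedupe_fan_in(edges):
    """Keep one edge per (target, port), preferring pipeline-node sources."""
    chosen = {}
    for edge in edges:
        key = (edge.target, edge.port)
        current = chosen.get(key)
        if current is None or current == edge:
            chosen[key] = edge  # first edge for this port, or an exact duplicate
        elif current.source_kind == "pipeline" and edge.source_kind == "pipeline":
            # Two pipeline nodes feeding one port is ambiguous: report both edges.
            raise ValueError(
                f"fan-in on {edge.target}.{edge.port}: "
                f"{current.source} and {edge.source} are both pipeline nodes"
            )
        elif edge.source_kind == "pipeline":
            chosen[key] = edge  # a pipeline source displaces a source-node one
        # otherwise keep the edge that was already chosen
    return list(chosen.values())


edges = [
    Edge("camera", "source", "vace", "video"),
    Edge("yolo_mask", "pipeline", "vace", "video"),
]
deduped = dedupe_fan_in(edges)
```

With the two edges above, only the `yolo_mask` edge survives, because it comes from a pipeline node.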
…input loop

When an orchestrator truncates the trickle connection mid-stream, aiohttp raises ClientPayloadError (wrapping a TransferEncodingError). Previously this was caught by the broad 'except Exception' handler and logged at ERROR level, causing noisy logs and unclean teardown.

- Catch aiohttp.ClientPayloadError before the generic handler; log at WARNING and let the finally block run the normal media_output.close() path
- Suppress ClientConnectorError during media_output.close() (logged at DEBUG) when the orchestrator is already unreachable at teardown time

The deeper fix (in livepeer-python-gateway channel_reader.py / trickle_publisher.py) ensures the control channel subscription also terminates cleanly without raising. See: livepeer/livepeer-python-gateway#2

Fixes: #805
Related: #771

Signed-off-by: livepeer-robot <robot@livepeer.org>
🚀 fal.ai Preview Deployment
Livepeer Runner
Testing
Connect to this preview deployment by running this on your branch:
Livepeer mode:
🧪 E2E tests will run automatically against this deployment.
❌ E2E Tests failed
Test Artifacts
Check the workflow run for screenshots, traces, and failure details.
Fixes #805
What
When an orchestrator goes down or is restarted mid-session, aiohttp raises `ClientPayloadError` (specifically `TransferEncodingError: 400`) on open trickle connections. Previously:

- `_media_input_loop` caught this in the generic `except Exception` and logged `ERROR - Media input loop failed: ...`
- The control channel subscription (`JSONLReader` in `livepeer-python-gateway`) also errored, logging `ERROR - Control channel subscription error: ...`
- Additional `ERROR - Trickle DELETE exception` logs

This is pure network-level disconnect noise: the session ends regardless, but the disconnect was logged at the wrong level, and the stack trace made it look like a bug.
Changes
`src/scope/cloud/livepeer_app.py`

- Catch `aiohttp.ClientPayloadError` in `_media_input_loop` before the generic handler → log at `WARNING` instead of `ERROR`
- Suppress `aiohttp.ClientConnectorError` during `media_output.close()` → log at `DEBUG` (orchestrator already gone)

Companion PR in livepeer-python-gateway: livepeer/livepeer-python-gateway#2

- `channel_reader.py`: Same treatment for `ChannelReader` and `JSONLReader`: clean return instead of `LivepeerGatewayError`
- `trickle_publisher.py`: Demote `ClientConnectorError` in `_run_delete` from `ERROR` to `DEBUG`

Before / After
Before:
After:
Related: #771 (same pattern, EOFError on clean disconnect)
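
For reviewers, the handler ordering listed under Changes can be sketched as follows. This is a minimal illustration, not the code in `livepeer_app.py`: `PayloadError` and `ConnectorError` are local stand-ins for `aiohttp.ClientPayloadError` and `aiohttp.ClientConnectorError` so the snippet runs without aiohttp, and `media_input_loop`, `truncated_stream`, and `UnreachableOutput` are hypothetical names.

```python
import asyncio
import logging

logger = logging.getLogger("media_input_sketch")


class PayloadError(Exception):
    """Stand-in for aiohttp.ClientPayloadError (response body cut short)."""


class ConnectorError(OSError):
    """Stand-in for aiohttp.ClientConnectorError (peer unreachable)."""


async def media_input_loop(read_segments, media_output):
    """Consume segments until the stream ends; return how the loop ended."""
    outcome = "completed"
    try:
        async for _segment in read_segments():
            pass  # decoding and processing would happen here
    except PayloadError as exc:
        # Specific handler first: a truncated stream is an expected disconnect.
        logger.warning("Media input ended by network disconnect: %s", exc)
        outcome = "disconnected"
    except Exception:
        # Anything else is still reported as a real failure.
        logger.exception("Media input loop failed")
        outcome = "failed"
    finally:
        try:
            await media_output.close()
        except ConnectorError as exc:
            # Orchestrator already gone at teardown; nothing left to clean up.
            logger.debug("media_output.close() skipped: %s", exc)
    return outcome


class UnreachableOutput:
    """Hypothetical output whose close() fails because the peer is gone."""

    async def close(self):
        raise ConnectorError("orchestrator unreachable")


async def truncated_stream():
    """Hypothetical input that yields one segment, then is cut off."""
    yield b"segment-0"
    raise PayloadError("Response payload is not completed")


result = asyncio.run(media_input_loop(truncated_stream, UnreachableOutput()))
```

Here `result` is `"disconnected"`: the truncated stream is logged at `WARNING`, the failed `close()` at `DEBUG`, and nothing reaches the generic `except Exception` branch. The specific handler has to come before the generic one, since Python uses the first matching `except` clause.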