feat(server,api): auto-download remote video URLs into the asset pipeline #2912

Merged
vpetersson merged 5 commits into master from worktree-quizzical-roaming-badger on May 19, 2026

@vpetersson
Contributor

Issues Fixed

Closes #2894.

Description

When an operator POSTs an asset with mimetype='video' and a remote https://…/clip.mp4 URI, Anthias used to store the URL verbatim and stream it at play time — bypassing the per-board HW-codec gate, leaving metadata empty, and letting origin downtime turn into black-screen rotation slots.

This mirrors the YouTube lifecycle for any http(s) single-file video URL: download once, then chain into normalize_video_asset so the same codec gate / metadata pass that protects local uploads protects remote URLs too.

  • New anthias_common.remote_video classifier (extension-first, HEAD-probe fallback) and download_remote_video_asset Celery task (streams via requests, 5 GiB hard cap, chains into normalize_video_asset).
  • Wired into CreateAssetSerializerMixin (v1.2 + v2 only — matches the image-normalization rollout). v1 / v1.1 keep literal-URL semantics by design.
  • Live streams (RTSP / RTMP / HLS / DASH / SmoothStreaming) are filtered out and stay as literal URIs.
  • Failures land as metadata.error_message + is_processing=False, same operator-visible contract as YouTube download and HEIC/video normalize failures.
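
The classification order described above (extension first, HEAD probe as fallback, live streams excluded) might look roughly like the sketch below. Only the name `is_downloadable_remote_video` appears in this PR; its signature, the extension sets, the `.ism/` path check, and the injectable `head_content_type` callable are assumptions made so the example runs without network access, not the actual `anthias_common.remote_video` code.

```python
import posixpath
from typing import Callable, Optional
from urllib.parse import urlparse

# Assumed extension sets; the real lists are not shown in this PR.
VIDEO_EXTENSIONS = {'.mp4', '.mov', '.mkv', '.webm', '.avi'}
STREAM_EXTENSIONS = {'.m3u8', '.mpd'}  # HLS and DASH manifests


def is_downloadable_remote_video(
    uri: str,
    head_content_type: Optional[Callable[[str], Optional[str]]] = None,
) -> bool:
    """Return True when the URI looks like a single-file http(s) video."""
    parsed = urlparse(uri)
    # RTSP, RTMP, udp and other non-http(s) schemes stay literal URIs.
    if parsed.scheme not in ('http', 'https'):
        return False
    path = parsed.path.lower()
    ext = posixpath.splitext(path)[1]
    # Manifest-based streams are never downloaded.
    if ext in STREAM_EXTENSIONS or '.ism/' in path:
        return False
    # Extension first: a known video suffix needs no network round trip.
    if ext in VIDEO_EXTENSIONS:
        return True
    # Fallback: ask the caller-supplied HEAD probe for the Content-Type.
    if head_content_type is None:
        return False
    content_type = head_content_type(uri)
    if not content_type:
        return False
    media_type = content_type.split(';', 1)[0].strip().lower()
    return media_type.startswith('video/')
```

In a real deployment the probe would presumably wrap an HTTP HEAD request; passing it in as a callable keeps the classifier itself easy to unit test.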

54 new tests cover the classifier (32), Celery task (11), and view-level dispatch (8 — both positive and the negative legacy/stream paths).

Checklist

  • I have performed a self-review of my own code.
  • New and existing unit tests pass locally and on CI with my changes.
  • I have done an end-to-end test for Raspberry Pi devices.
  • I have tested my changes for x86 devices.
  • I have added documentation for the changes I have made (when necessary).

- Detect http(s) single-file video URLs in CreateAssetSerializerMixin
  (ext-first, HEAD-probe fallback) and rewrite the row to a local
  destination + is_processing=True (v1.2 / v2 only).
- New download_remote_video_asset Celery task streams via requests
  with a 5 GiB cap and chains into normalize_video_asset for the
  per-board HW-codec gate.
- Live streams (RTSP / RTMP / HLS / DASH / SmoothStreaming) stay as
  literal URIs the viewer plays directly.
- Failures land as metadata.error_message + is_processing=False via a
  copy-paste of the YouTube download task's on_failure contract.

Closes #2894
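
As an illustration of the capped streaming download described in this commit, here is a minimal sketch that writes an iterable of byte chunks to disk and aborts once a byte limit is exceeded. The 5 GiB figure comes from the PR; the function name, the `DownloadTooLarge` exception, and the `.part` temporary file are hypothetical choices, not the task's actual implementation.

```python
import os
from typing import Iterable

MAX_BYTES = 5 * 1024**3  # 5 GiB hard cap named in the PR


class DownloadTooLarge(OSError):
    """Raised when the body exceeds the cap (hypothetical name)."""


def stream_to_file(
    chunks: Iterable[bytes], dest: str, max_bytes: int = MAX_BYTES
) -> int:
    """Write chunks to dest, refusing to store more than max_bytes."""
    written = 0
    tmp = dest + '.part'  # assumed convention: publish only complete files
    try:
        with open(tmp, 'wb') as fh:
            for chunk in chunks:
                written += len(chunk)
                # Check before writing so the file never exceeds the cap.
                if written > max_bytes:
                    raise DownloadTooLarge(f'exceeded {max_bytes} bytes')
                fh.write(chunk)
        os.replace(tmp, dest)  # atomic rename on the same filesystem
    except BaseException:
        # Leave no partial file behind on any failure.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    return written
```

With `requests`, the chunks could come from `response.iter_content(chunk_size=...)` on a response opened with `stream=True`, which avoids loading the whole body into memory.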
@vpetersson vpetersson requested a review from a team as a code owner May 19, 2026 11:22
@vpetersson vpetersson self-assigned this May 19, 2026
- Route HEAD probe + streaming GET through ``AnthiasSession`` so
  origins see the project-wide ``Anthias/<release>`` UA (#2897).
- Drop the ``urllib3`` logger level side effect at import time.
- Trust the serializer-stamped ``Asset.uri`` exclusively; refuse the
  task on an empty uri rather than guessing the extension (which
  could diverge from the HEAD-probed choice).
CI's mypy step rejected ``test_remote_video_destination_path_*`` —
``tmp_path`` was unannotated and the literal ``{'assetdir': str}``
arg failed against ``AnthiasSettings | None``. Cast through
``AnthiasSettings`` (a ``UserDict[str, Any]``) so mypy is happy
without spinning up the real config layer that needs ``HOME`` set.
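
The typing fix described in this commit can be sketched as follows. The `AnthiasSettings` class here is a stand-in (the commit only states that it is a `UserDict[str, Any]`), `destination_path` is a hypothetical helper, and the use of `typing.cast` is one plausible reading of "cast through"; the real test code is not shown in this PR.

```python
from collections import UserDict
from pathlib import Path
from typing import Any, Optional, cast


class AnthiasSettings(UserDict[str, Any]):
    """Stand-in for the real settings class (assumption)."""


def destination_path(
    asset_id: str, ext: str, settings: Optional[AnthiasSettings] = None
) -> str:
    """Hypothetical helper that builds a path under settings['assetdir']."""
    if settings is None:
        raise ValueError('this sketch requires explicit settings')
    return str(Path(settings['assetdir']) / f'{asset_id}{ext}')


def check_destination_path(tmp_path: Path) -> None:
    # Annotating tmp_path and casting the literal dict satisfies mypy
    # without constructing the real, HOME-dependent config object.
    fake = cast(AnthiasSettings, {'assetdir': str(tmp_path)})
    assert destination_path('abc', '.mp4', fake) == str(tmp_path / 'abc.mp4')
```

`cast` has no runtime effect, so the helper still receives a plain `dict`; that is acceptable here only because the helper does nothing beyond key lookup.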
- Extract ``_stream_remote_video_to_file`` +
  ``_validate_remote_video_response`` helpers so
  ``download_remote_video_asset`` lands under SonarCloud's cognitive-
  complexity ceiling.
- Drop redundant ``requests.RequestException`` from the except clause
  — it's a subclass of ``OSError`` so ``except OSError`` already
  covers it (S5713).
- Drop the redundant ``startswith(('http://', 'https://'))`` in the
  serializer; ``is_downloadable_remote_video`` already rejects every
  non-http(s) scheme. Removes the literal ``http://`` hotspot in
  mixins.py.
- Replace ``udp://239.0.0.1:1234`` with ``udp://stream.example.test:1234``
  in the test fixture (S1313 hardcoded IP).
- Annotate the deliberate ``http://`` test case with ``# NOSONAR``
  and a comment explaining the LAN-without-TLS use case.
- Extract ``_DownloadAssetTask`` base for the YouTube and remote-
  video download tasks. Subclasses override ``_failure_log_prefix``
  only — the metadata-error / notify body lives in one place.
- Merge ``test_create_remote_hls_manifest_stays_as_stream_url`` and
  ``test_create_rtsp_url_stays_as_stream_url`` into a single
  parametrized test that asserts both shapes through the same path.

Brings new-code duplication below SonarCloud's 3% gate.
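
The shared-base refactor above can be pictured with plain Python classes. This is a sketch only: the real `_DownloadAssetTask` is a Celery task base whose `on_failure` hook takes `(exc, task_id, args, kwargs, einfo)`, whereas the simplified signature, the dict-based asset, the subclass names, and treating `_failure_log_prefix` as a class attribute are all assumptions for illustration.

```python
from typing import Any, Dict


class _DownloadAssetTask:
    """Plain-Python stand-in for the shared Celery task base."""

    _failure_log_prefix = 'download'

    def on_failure(self, exc: BaseException, asset: Dict[str, Any]) -> str:
        # The operator-visible contract lives in one place:
        # record the error and clear the processing flag.
        asset.setdefault('metadata', {})['error_message'] = str(exc)
        asset['is_processing'] = False
        return f'{self._failure_log_prefix} failed: {exc}'


class YouTubeDownloadTask(_DownloadAssetTask):
    # Subclasses override only the log prefix.
    _failure_log_prefix = 'YouTube download'


class RemoteVideoDownloadTask(_DownloadAssetTask):
    _failure_log_prefix = 'Remote video download'
```

Keeping the failure body in the base class is what removes the duplicated block that the duplication gate would otherwise flag.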

@vpetersson vpetersson merged commit 70a57bb into master May 19, 2026
9 checks passed
vpetersson added a commit that referenced this pull request May 19, 2026
A 200 OK with no Content-Type is a stronger signal of a misbehaving
origin than evidence of a real video — and accepting it would let an
HTML error page land on disk as a multi-GB asset that the cleanup
sweep can't recover (the row still references the file, so it isn't
an orphan).

Reject with the same ``unexpected Content-Type`` error path so the
operator sees an explicit Failed pill and a re-upload affordance.

Follow-up to #2912.
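
A check in the spirit of this follow-up might look like the sketch below: a response with no Content-Type is rejected on the same path as one with a non-video type. The function name and the rule that only `video/*` types pass are assumptions; the actual `_validate_remote_video_response` helper may accept other types.

```python
from typing import Mapping


def validate_content_type(headers: Mapping[str, str]) -> None:
    """Raise OSError unless the response declares a video media type."""
    raw = headers.get('Content-Type')
    # A missing header becomes the empty string and fails the same check.
    media_type = (raw or '').split(';', 1)[0].strip().lower()
    if not media_type.startswith('video/'):
        raise OSError(f'unexpected Content-Type: {raw!r}')
```

Note that a plain `dict` lookup is case-sensitive, while the headers object on a `requests` response is case-insensitive; the sketch assumes the canonical `Content-Type` spelling.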

Development

Successfully merging this pull request may close these issues.

Asset add: auto-download remote videos to local + normalize
