[Others] Decord remove#7383

Merged: yongqiangma merged 6 commits into PaddlePaddle:develop from BingooYang:decord_remove on May 18, 2026

@BingooYang (Contributor) commented Apr 14, 2026

Motivation

Fix the missing ARM package for decord in FastDeploy.

Modifications

Replace decord with paddlecodec, since decord ships no ARM package.

Usage or Command

no

Accuracy Tests

no

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot (Bot) commented Apr 14, 2026

Thanks for your contribution!

@PaddlePaddle-bot

🤖 AI Code Review | 2026-04-14T14:39:00Z

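📋 Review summary

PR overview: replaces decord with paddlecodec to support ARM environments

Scope of changes: fastdeploy/input/, requirements/

Impact tag: [DataProcessor]

📝 PR convention check

The PR title and description meet the conventions.

Issues found

| Level | File | Summary |
|---|---|---|
| 🔴 Bug | fastdeploy/input/ernie4_5_vl_processor/utils/video_utils.py:61 | original_file is wrongly set to the original path, so the user's file gets deleted |
| 🟡 Suggestion | fastdeploy/input/ernie4_5_vl_processor/utils/video_utils.py:129 | The __del__ method lacks exception protection |

Overall assessment

The PR has a clear goal and implements the migration from decord to paddlecodec. However, ernie4_5_vl_processor/utils/video_utils.py contains a serious bug: for non-GIF string paths, self.original_file is wrongly set to the original video path, so the user's file is deleted when the object is destroyed. Fix this before merging.


🔴 Bug details

File: fastdeploy/input/ernie4_5_vl_processor/utils/video_utils.py:61

Problem: for non-GIF string paths, self.original_file is wrongly set to the original video path.

The current code unconditionally executes self.original_file = video_path, so for an ordinary video file (e.g. /path/to/video.mp4) self.original_file is set to that path. The __del__ method then tries to delete that file, which removes the user's original video.

Suggested fix: follow the correct implementation in fastdeploy/input/video_utils.py:75-77 and delete the self.original_file = video_path assignment on line 61: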

if isinstance(video_path, str):
    if video_path.lower().endswith(".gif"):
        gif_input = video_path
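
The ownership rule behind this fix can be sketched as follows. This is a minimal illustration, not the FastDeploy code: `VideoSource`, its `path` attribute, and its `close` method are hypothetical names, and the GIF conversion is replaced by creating an empty temporary file. The point is that a path is registered for cleanup only when the object created that file itself.

```python
import os
import tempfile


class VideoSource:
    """Hypothetical stand-in: registers for cleanup only files it created itself."""

    def __init__(self, video_path):
        self.original_file = None  # nothing to clean up by default
        if isinstance(video_path, str) and video_path.lower().endswith(".gif"):
            # Stand-in for the GIF -> MP4 conversion: create a temp file we own.
            fd, tmp_path = tempfile.mkstemp(suffix=".mp4")
            os.close(fd)
            self.original_file = tmp_path
            self.path = tmp_path
        else:
            # Caller-owned file: use it, but never register it for deletion.
            self.path = video_path

    def close(self):
        """Remove the temporary file, if this object created one."""
        if self.original_file:
            try:
                os.remove(self.original_file)
            except OSError:
                pass
            self.original_file = None
```

With this shape, closing a source built from a plain `.mp4` path leaves the caller's file in place, while a source built from a `.gif` path removes only the temporary file it created.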

🟡 Suggestion details

File: fastdeploy/input/ernie4_5_vl_processor/utils/video_utils.py:129

Problem: the __del__ method lacks exception protection.

Accessing self.original_file directly can raise AttributeError if initialization failed.

Suggested fix: follow fastdeploy/input/video_utils.py:147-153 and guard with getattr and try-except:

def __del__(self):
    original_file = getattr(self, "original_file", None)
    if original_file:
        try:
            os.remove(original_file)
        except OSError:
            pass
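
The guard matters because Python still runs `__del__` on an object whose `__init__` raised before every attribute was assigned. A small sketch, assuming a hypothetical `Reader` class and `try_open` helper (neither is from the PR), shows the failure path the `getattr` call protects:

```python
import os


class Reader:
    """Hypothetical class whose __init__ can fail before original_file exists."""

    def __init__(self, path):
        if not path:
            raise ValueError("empty path")
        self.original_file = None

    def __del__(self):
        # getattr avoids AttributeError on a partially initialised object.
        original_file = getattr(self, "original_file", None)
        if original_file:
            try:
                os.remove(original_file)
            except OSError:
                pass


def try_open(path):
    """Return a Reader, or None when construction fails."""
    try:
        return Reader(path)
    except ValueError:
        return None
```

Without the `getattr`, the failed construction in `try_open("")` would make the finalizer touch a missing attribute and the interpreter would report an ignored exception on stderr.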

codecov-commenter commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 94.87179% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@12c6ae0). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| fastdeploy/input/utils/video.py | 93.75% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7383   +/-   ##
==========================================
  Coverage           ?   63.63%           
==========================================
  Files              ?      462           
  Lines              ?    64293           
  Branches           ?     9853           
==========================================
  Hits               ?    40911           
  Misses             ?    20592           
  Partials           ?     2790           
| Flag | Coverage Δ |
|---|---|
| GPU | 72.79% <94.87%> (?) |
| XPU | 7.12% <25.64%> (?) |
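
The reported figures can be cross-checked with a few lines of arithmetic. The project numbers below are taken from the report; the patch size of 39 coverable lines is an assumption inferred from "2 lines missing" at 94.87179%, not a number the report states.

```python
def pct(hits: int, total: int) -> float:
    """Coverage percentage rounded to two decimals."""
    return round(100.0 * hits / total, 2)


# Project totals as listed in the coverage diff.
hits, misses, partials, lines = 40911, 20592, 2790, 64293
totals_consistent = hits + misses + partials == lines
project = pct(hits, lines)

# Assumption: 39 coverable lines in the patch, 2 of them not fully covered.
patch = pct(39 - 2, 39)
```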

@PaddlePaddle-bot

🔴 Bug fastdeploy/input/ernie4_5_vl_processor/utils/video_utils.py:61

original_file is set to the original path unconditionally, so the user's file may be wrongly deleted in __del__.

When an ordinary string path (e.g. /path/to/video.mp4) is passed in, self.original_file = video_path records the original file path. __del__ (lines 129-130) then calls os.remove(self.original_file), which deletes the user's original file!

Compare the correct implementation in fastdeploy/input/video_utils.py (lines 73-74):

if isinstance(video_path, str):
    if video_path.lower().endswith('.gif'):  # GIF case only
        gif_input = video_path

Suggested fix:

if isinstance(video_path, str):
    if video_path.lower().endswith('.gif'):
        gif_input = video_path
    # Do not set self.original_file = video_path unconditionally

@PaddlePaddle-bot

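🟡 Suggestion fastdeploy/input/video_utils.py:100

The temporary file object mp4_file is not closed explicitly.

Although the file created with ntf(delete=False) is deleted manually in __del__, consider adding mp4_file.close() after clip.close() to close the file object explicitly and avoid a potential resource leak.

See the implementation in fastdeploy/input/ernie4_5_vl_processor/utils/video_utils.py (lines 79-80):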

mp4_file.close() # close before moviepy writes
clip.write_videofile(mp4_path, verbose=False, logger=None)
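
The same close-before-write pattern can be sketched with the standard library alone. In this illustration the helper name `write_via_path` is hypothetical, and a plain `open(...)` call stands in for `clip.write_videofile`, which is not invoked here.

```python
import os
import tempfile


def write_via_path(data: bytes) -> str:
    """Reserve a temp path, close our handle, then let another writer use the path."""
    tmp = tempfile.NamedTemporaryFile(suffix=".mp4", delete=False)
    tmp_path = tmp.name
    tmp.close()  # release our handle before anything else writes to the path
    with open(tmp_path, "wb") as out:  # stand-in for clip.write_videofile(tmp_path)
        out.write(data)
    return tmp_path
```

Because the file was created with `delete=False`, the caller remains responsible for removing the returned path once it is no longer needed.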

@BingooYang (Contributor, Author)

/re-run all-failed

PaddlePaddle-bot commented May 11, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-18 15:54:40

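The CI report is generated from the following code (refreshed every 30 minutes):


1 Task overview

All Required tasks have passed ✅; the PR can be merged.

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Pending | Skipped |
|---|---|---|---|---|---|---|
| 45 (0) | 45 | 43 | 1 | 0 | 0 | 1 |

2 Task status summary

2.1 Required tasks: 10/10 passed

Required tasks block merging; failures must be handled first.

| Status | Task | Duration | Root cause | Suggested fix | Log | Rerun |
|---|---|---|---|---|---|---|
| | The remaining 10 required tasks passed | - | - | - | - | - |

2.2 Optional tasks: 33/35 passed

Optional tasks do not block merging; failures are for reference only.

| Status | Task | Duration | Log | Rerun |
|---|---|---|---|---|
| | CI_HPU | 1h12m | Job | - |
| | The remaining 33 optional tasks passed | - | - | - |

3 Failure details (required only)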

CLAassistant commented May 11, 2026

CLA assistant check
All committers have signed the CLA.

@EmmonsCurse (Collaborator) left a comment

LGTM~ ffmpeg is already included in the CI test image.

@PaddlePaddle-bot left a comment

🤖 Paddle-CI-Agent | pr_review | 2026-05-18 10:20:53

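📋 Review summary

PR overview: replaces the video decoding dependency decord (no ARM package) with paddlecodec (based on torchcodec/VideoDecoder), fixing unavailability on ARM platforms.
Scope of changes: the fastdeploy/input/ data processing layer, requirements files, related tests
Impact tags: [DataProcessor] [BugFix]

Issues

| Level | File | Summary |
|---|---|---|
| 🟡 Suggestion | video.py VideoReaderWrapper.__init__ | sys.modules["torchcodec"] = None has a global side effect and pollutes the process-wide sys.modules |
| 🟡 Suggestion | video.py VideoReaderWrapper.__init__ | self._decoder = VideoDecoder(...) is constructed outside the try block, so a construction failure does not trigger the friendly error message |
| 📝 PR convention | Checklist | Add unit tests is unchecked, although this PR does add tests |
| 📝 PR convention | Title | The [Others] tag is too broad; the more precise [BugFix] is suggested |

🟡 Suggestion 1: global side effect of sys.modules["torchcodec"] = None

Location: fastdeploy/input/utils/video.py, inside the try block in VideoReaderWrapper.__init__.

with paddle.use_compat_guard(enable=True, scope={"torchcodec"}):
    try:
        import sys
        from torchcodec.decoders import VideoDecoder
        sys.modules["torchcodec"] = None   # ← the problem
    except (ImportError, RuntimeError) as e:
        ...

Right after VideoDecoder is imported successfully, sys.modules["torchcodec"] is set to None, which is a process-wide global modification. Afterwards, any code in the same process that runs import torchcodec or from torchcodec.xxx import yyy triggers ImportError: import of torchcodec halted; use of sys.modules['torchcodec'] = None, which may affect other multimedia processing logic in the service.

Suggested fix: if the goal is to prevent repeated imports later, cache the VideoDecoder class in a module-level variable instead of polluting sys.modules: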

_VideoDecoder = None

def _get_video_decoder_cls():
    global _VideoDecoder
    if _VideoDecoder is None:
        with paddle.use_compat_guard(enable=True, scope={"torchcodec"}):
            from torchcodec.decoders import VideoDecoder as _VD
            _VideoDecoder = _VD
    return _VideoDecoder
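
The side effect is easy to reproduce without torchcodec installed. The sketch below uses the standard-library module `colorsys` purely as a stand-in: it first shows that a `None` entry in `sys.modules` blocks later imports, restores the previous state, and then applies the module-level caching approach. The names `blocked` and `get_rgb_to_hsv` are illustrative only.

```python
import importlib
import sys

# Part 1: a None entry in sys.modules makes every later import of that name fail.
saved = sys.modules.get("colorsys")
sys.modules["colorsys"] = None  # same pattern the review flags
try:
    importlib.import_module("colorsys")
    blocked = False
except ImportError:
    blocked = True
finally:
    # Restore the previous state so this demo leaves no lasting side effect.
    if saved is None:
        del sys.modules["colorsys"]
    else:
        sys.modules["colorsys"] = saved

# Part 2: cache the imported object in a module-level variable instead.
_cached = None


def get_rgb_to_hsv():
    """Import lazily on first use and reuse the cached function afterwards."""
    global _cached
    if _cached is None:
        from colorsys import rgb_to_hsv
        _cached = rgb_to_hsv
    return _cached
```

The cached variant avoids repeated import work while leaving `sys.modules` usable for every other importer in the process.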

🟡 Suggestion 2: VideoDecoder(...) is constructed outside the try block

Location: fastdeploy/input/utils/video.py, VideoReaderWrapper.__init__

with paddle.use_compat_guard(enable=True, scope={"torchcodec"}):
    try:
        ...
        from torchcodec.decoders import VideoDecoder
        sys.modules["torchcodec"] = None
    except (ImportError, RuntimeError) as e:
        logger.error(...)   # the friendly error only covers the import stage
        raise
    # ← VideoDecoder is constructed outside the try
    PADDLECODEC_NUM_THREADS = int(os.environ.get("PADDLECODEC_NUM_THREADS", 0))
    self._decoder = VideoDecoder(
        video_path,
        seek_mode="exact",
        ...
    )

When VideoDecoder(video_path, ...) fails to construct (for example, an unsupported file format or an FFmpeg decoding error), it bypasses the except block: no friendly diagnostic is logged, no fix is suggested, and the raw exception is raised directly.

Suggested fix: move the PADDLECODEC_NUM_THREADS assignment and self._decoder = VideoDecoder(...) into the try block as well.
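
One possible shape for this is sketched below. It is a generic illustration rather than the FastDeploy implementation: `open_decoder` and `DecoderOpenError` are hypothetical names, the decoder class is passed in as a parameter so the sketch runs without torchcodec, and wrapping the failure in a dedicated exception is a design variation on the plain log-and-re-raise used in the PR.

```python
import logging

logger = logging.getLogger(__name__)


class DecoderOpenError(RuntimeError):
    """Hypothetical error type that carries a user-facing hint."""


def open_decoder(decoder_cls, video_path, **kwargs):
    """Construct decoder_cls inside the try so failures get the same diagnostics."""
    try:
        return decoder_cls(video_path, **kwargs)
    except (ImportError, RuntimeError, ValueError) as exc:
        logger.error("Failed to open %r: %s", video_path, exc)
        raise DecoderOpenError(
            f"Could not decode {video_path!r}; check the file format and the FFmpeg install."
        ) from exc
```

Chaining with `from exc` keeps the original traceback available, so the friendlier message does not hide the underlying decoder error.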


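📝 PR convention check

There are two convention issues: (1) the title tag is imprecise; (2) the checklist item Add unit tests is unchecked although tests were added.

Suggested title (can be copied directly):

  • [BugFix] Replace decord with paddlecodec to fix missing ARM package

Suggested PR description (can be copied directly; fills in the unchecked items):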

## Motivation
Fix the missing ARM package for decord in FastDeploy so that video input processing works on ARM platforms.

## Modifications
- `fastdeploy/input/utils/video.py`: replace decord with paddlecodec (VideoDecoder); add the `_NumpyFrame` adapter layer and the `read_video_paddlecodec`/`read_frames_paddlecodec` functions
- `fastdeploy/input/encodings/`: update the call sites in ernie_encoding.py / paddleocr_encoding.py / qwen_encoding.py accordingly
- `fastdeploy/input/utils/__init__.py`: update the exported symbols
- `requirements.txt` / `requirements_dcu.txt` / `requirements_iluvatar.txt` / `requirements_metaxgpu.txt`: `decord` → `paddlecodec`
- `tests/input/`: update mocks and test cases in test_encodings.py / test_process_video.py / test_video_utils.py accordingly

## Usage or Command
N/A

## Accuracy Tests
N/A

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [x] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

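Overall assessment

The approach in this PR is clear: it replaces decord with paddlecodec throughout and updates all call sites and tests accordingly, which resolves the ARM platform compatibility problem. Two points need attention: the global side effect of sys.modules["torchcodec"] = None, and the VideoDecoder construction that sits outside the protection of the try block. Fix these before merging.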

@yongqiangma yongqiangma merged commit 9d3dc0e into PaddlePaddle:develop May 18, 2026
42 of 44 checks passed
8 participants