v3.4.0

@Bobholamovic Bobholamovic released this 29 Jan 11:19
· 45 commits to release/3.4 since this release
b1bfbc6

2026.1.29 v3.4.0 released

  • Release the PaddleOCR-VL-1.5 complex document parsing solution.

    PaddleOCR-VL-1.5 is the next iteration of the PaddleOCR-VL series. Building on a comprehensive optimization of the core capabilities of version 1.0, the model achieves 94.5% accuracy on OmniDocBench v1.5, an authoritative document parsing benchmark, surpassing leading general-purpose large models and specialized document parsing models.

    PaddleOCR-VL-1.5 adds support for localizing document elements with irregular-shaped bounding boxes, which lets it perform well in real-world scenarios such as scanned, skewed, or warped documents, photographs of screens, and complex lighting, achieving SOTA performance across these scenarios. The model also integrates seal recognition and text spotting (text detection and recognition) tasks, and its key metrics continue to lead mainstream models.

    You can use it online on the PaddleOCR official website or call the model API.

  • Add support for calling MLX-VLM inference services.

  • The PP-StructureV3 service supports the prettifyMarkdown and showFormulaNumber parameters, with functionality aligned with local inference.

  • Upgrade the PaddleOCR-VL concatenate-pages method to restructure-pages, which can reorganize multi-page results without changing the total number of pages, making usage more flexible.

  • Fix potential memory issues caused by the non-thread-safe PDF rendering library in multi-threaded invocation scenarios.

  • Optimize the parameter validation logic of pipeline services such as general OCR, so that more cases of invalid input parameters return a 422 status code instead of 500.

  • For GenAIClient, implement graceful exit of the asynchronous event loop to improve system stability and the reliability of resource release.
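These notes do not describe how the GenAIClient shutdown is implemented. As a rough illustration of the general pattern only, the sketch below shows one common way to exit a background asyncio event loop gracefully: cancel the pending tasks, wait for them to finish, stop the loop, join its thread, and then close the loop. The `BackgroundLoop` class and its methods are hypothetical names invented for this example and are not part of PaddleOCR.

```python
import asyncio
import threading


class BackgroundLoop:
    """Hypothetical helper (not a PaddleOCR API): an asyncio loop in a worker thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Bind the loop to this thread and serve callbacks until stop() is called.
        asyncio.set_event_loop(self._loop)
        self._loop.run_forever()

    def submit(self, coro):
        """Schedule a coroutine on the loop; returns a concurrent.futures.Future."""
        return asyncio.run_coroutine_threadsafe(coro, self._loop)

    async def _cancel_pending(self) -> None:
        # Cancel every task except this one, then wait until all have finished.
        current = asyncio.current_task()
        pending = [t for t in asyncio.all_tasks() if t is not current]
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)

    def close(self, timeout: float = 5.0) -> None:
        """Shut down in order: cancel tasks, stop the loop, join the thread, close."""
        if self._loop.is_closed():
            return  # Already shut down; calling close() again is harmless.
        asyncio.run_coroutine_threadsafe(
            self._cancel_pending(), self._loop
        ).result(timeout)
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join(timeout)
        # Only close the loop once its thread has really stopped running it.
        if not self._thread.is_alive():
            self._loop.close()
```

Calling `close()` before interpreter exit, for example from a `try`/`finally` block or an `atexit` handler, avoids leaving tasks pending on a loop that is torn down abruptly, which is the kind of resource-release problem a graceful exit is meant to prevent.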

Full Changelog: v3.3.13...v3.4.0