feat: add VLM support for text and table recognition
Update 2025.04.15: V1.1.3 Released
Major Changes:
- Support for VlmTableOCR and VlmTextFormulaOCR models based on the VLM interface (see LiteLLM documentation), allowing the use of closed-source VLM models. Installation command: pip install pix2text[vlm]
- Usage examples can be found in tests/test_vlm.py and tests/test_pix2text.py.