[New Feature] Enhance OCR recognition for PPT and DOC knowledge base files #2013

Merged
merged 6 commits into from Jan 12, 2024


596192804
Contributor

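When PPT or DOC files are added to a knowledge base from the knowledge base management page, langchain's UnstructuredFileLoader is used by default. That loader only loads the plain text in the file; it cannot load text that appears inside images embedded in the document.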
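To illustrate the gap described above, here is a minimal sketch of how text inside embedded images could be reached. It is not the code from this PR. It assumes the files are in the zip-based OOXML formats (`.docx`, `.pptx`), where embedded pictures are stored under `word/media/` or `ppt/media/`; legacy binary `.doc`/`.ppt` files are not covered. The function names `iter_embedded_images` and `ocr_embedded_images` and the `ocr` callable are hypothetical; any OCR engine that maps image bytes to a string could be plugged in.

```python
import zipfile
from typing import Callable, Iterator, List, Tuple

# Assumption: only these image types are worth sending to OCR.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp", ".gif", ".tiff")
# OOXML packages keep embedded pictures in these folders.
MEDIA_PREFIXES = ("word/media/", "ppt/media/")


def iter_embedded_images(path_or_file) -> Iterator[Tuple[str, bytes]]:
    """Yield (archive member name, raw bytes) for each embedded image."""
    with zipfile.ZipFile(path_or_file) as archive:
        for name in sorted(archive.namelist()):
            is_media = name.startswith(MEDIA_PREFIXES)
            is_image = name.lower().endswith(IMAGE_EXTENSIONS)
            if is_media and is_image:
                yield name, archive.read(name)


def ocr_embedded_images(path_or_file, ocr: Callable[[bytes], str]) -> List[str]:
    """Run the supplied (hypothetical) OCR callable on every embedded image.

    Images for which OCR returns only whitespace are skipped.
    """
    texts = []
    for _name, data in iter_embedded_images(path_or_file):
        text = ocr(data).strip()
        if text:
            texts.append(text)
    return texts
```

The returned strings could then be appended to the text produced by the regular loader so that both end up in the knowledge base. Passing the OCR engine in as a callable keeps the sketch independent of any particular OCR library.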

@596192804 596192804 changed the title from "[New Feature] Enhance OCR recognition for PPT and DOC files" to "[New Feature] Enhance OCR recognition for PPT and DOC knowledge base files" Nov 10, 2023
@zRzRzRzRzRzRzR
Collaborator

Marking this; we'll take a look.

@zRzRzRzRzRzRzR zRzRzRzRzRzRzR self-assigned this Jan 11, 2024
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jan 12, 2024
@zRzRzRzRzRzRzR zRzRzRzRzRzRzR merged commit 75ff268 into chatchat-space:dev Jan 12, 2024
@liunux4odoo liunux4odoo mentioned this pull request Jan 25, 2024
liunux4odoo added a commit that referenced this pull request Jan 25, 2024
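New features:
- Optimize OCR for PDF files by filtering out meaningless small images by @liunux4odoo #2525
- Support the Gemini online model by @yhfgyyf #2630
- Support the GLM4 online model by @zRzRzRzRzRzRzR
- Update elasticsearch to use https connections by @xldistance #2390
- Enhance OCR recognition for PPT and DOC knowledge base files by @596192804 #2013
- Update the Agent chat feature by @zRzRzRzRzRzRzR
- Get a connection from the connection pool each time an object is created, instead of opening a new connection on every method call by @Lijia0 #2480
- Make ChatOpenAI check whether the token count exceeds the model's context length by @glide-the
- Update database runtime errors and project milestones by @zRzRzRzRzRzRzR #2659
- Update config files, docs, and dependencies by @imClumsyPanda @zRzRzRzRzRzRzR
- Add a Japanese README by @eltociear #2787

Fixes:
- PGVector vector store connection error after the langchain update by @HALIndex #2591
- Minimax's model worker error by @xyhshen
- ES store could not perform vector retrieval; add mappings to create the vector index by MSZheng20 #2688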