
[Bug] Academic PDFs, and PDFs converted to Word, are parsed incorrectly because each page is split into two columns: the reading order should be left column, then right column, then the next page. #2903

@35plus

Description


Contact Information

No response

MaxKB Version

v1.10.4

Problem Description

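Academic PDFs, and the same PDFs converted to Word, are parsed incorrectly. Each page of an academic document is split into a left and a right column, and the reading order is the left column first, then the right column, and only then the next page.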

Steps to Reproduce

NGS用于食品微生物研究fmicb-08-01829.pdf
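I have recently seen this problem at several customers, all of them academic institutions.

  1. The attachment is above; please open it and take a look. A person reads the left column to the end, then the right column, and then moves on to the next page.
  2. Documents of this type need to be parsed in that same order.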

The expected correct result

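For now, it looks like this kind of academic document needs dedicated handling, similar to the special handling that Excel files already get.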
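
To illustrate the requested order, here is a minimal sketch, not MaxKB's actual parser. It assumes the text blocks have already been extracted with bounding boxes by some PDF library, and that every page has exactly two columns split at the horizontal midpoint. The `Block` type and the `reading_order` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical text block with a bounding box (y grows downward)."""
    page: int    # zero-based page index
    x0: float    # left edge of the block
    y0: float    # top edge of the block
    text: str


def reading_order(blocks: List[Block], page_width: float) -> List[Block]:
    """Sort blocks as: page, then left column before right column, then top to bottom.

    Assumption: a block belongs to the left column if its left edge is left of
    the page midpoint. Full-width elements (titles, wide tables, footers) are
    not treated specially and end up sorted with the left column.
    """
    mid = page_width / 2

    def sort_key(b: Block):
        column = 0 if b.x0 < mid else 1
        return (b.page, column, b.y0)

    return sorted(blocks, key=sort_key)


if __name__ == "__main__":
    # Blocks listed in a row-by-row order, which interleaves the two columns.
    sample = [
        Block(page=0, x0=40, y0=50, text="left column, top"),
        Block(page=0, x0=310, y0=50, text="right column, top"),
        Block(page=0, x0=40, y0=400, text="left column, bottom"),
        Block(page=0, x0=310, y0=400, text="right column, bottom"),
        Block(page=1, x0=40, y0=50, text="next page, left column"),
    ]
    for block in reading_order(sample, page_width=600):
        print(block.text)
```

A real implementation would also need to detect whether a page is actually two-column, and to handle full-width titles, figures, and tables, which this sketch deliberately ignores.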

Related log output

Additional Information

No response
