Missing separators when converting pdf to docx #257

Open
wwaguai opened this issue Jan 22, 2024 · 2 comments
Labels
feature to be considered question discussion

Comments


wwaguai commented Jan 22, 2024

Hello,

When converting PDF files to DOCX with the pdf2docx library, the resulting DOCX file is missing the separators. Specifically, the horizontal lines that separate sections or paragraphs in the PDF are not preserved in the converted document.

Is there a way to retain these separators during conversion? I have attached a sample PDF file where the problem occurs.

Any guidance or assistance on resolving this matter would be greatly appreciated.

Thank you!
test_0122.pdf

dothinking added feature to be considered question discussion labels Jan 22, 2024
@dothinking
Collaborator

Thanks for providing the test file.
Drawing straight lines is a planned feature, but unfortunately it is not supported yet and might take some time.
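
Until straight lines are supported, one possible stopgap is to find the separator candidates yourself and re-insert them into the DOCX by hand. Below is a minimal sketch of the detection step only. It assumes the line segments have already been extracted from the page by some other tool as `(x0, y0, x1, y1)` tuples. `find_separators` and its thresholds are hypothetical names and values chosen for illustration; they are not part of pdf2docx.

```python
from typing import List, Tuple

# One line segment in page units: (x0, y0, x1, y1). Hypothetical input format.
Segment = Tuple[float, float, float, float]


def find_separators(
    segments: List[Segment],
    page_width: float,
    min_width_ratio: float = 0.5,  # assumed threshold, tune per document
    max_slope: float = 0.01,       # assumed tolerance for "nearly horizontal"
) -> List[Segment]:
    """Return near-horizontal segments that span a large part of the page width."""
    separators = []
    for x0, y0, x1, y1 in segments:
        dx, dy = abs(x1 - x0), abs(y1 - y0)
        if dx == 0:
            continue  # vertical or zero-length segment, not a separator
        if dy / dx <= max_slope and dx >= min_width_ratio * page_width:
            separators.append((x0, y0, x1, y1))
    # Order by ascending y coordinate so results follow the page layout.
    return sorted(separators, key=lambda s: min(s[1], s[3]))
```

The width ratio filters out short strokes such as underlines or table borders, which is a design guess rather than a rule; real documents may need different thresholds.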

@richa27gpt

I have the same request.


3 participants