How do I extract a watermark? #20

Open
XinwenXiang opened this issue Apr 24, 2023 · 4 comments

Comments

@XinwenXiang

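Hello, I looked at your source code for embedding watermarks in PDF files. Is there a corresponding method for extracting the watermark?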

@ekoz
Owner

ekoz commented Apr 25, 2023

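@XinwenXiang Hello. Do you want to extract the watermark or remove it? If you want to remove it, I did a quick Google search and pdfbox should be able to remove image or text watermarks. If you want to extract it, should the result be an image or text?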

@XinwenXiang
Author

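Hello, I mean extracting the watermark: I want to recover its text content.
As I understand the project, addWaterMark lets you set the watermark information to embed (I am only considering text for now), but there is no corresponding extractWaterMark() step. I have seen some repositories that implement this in Python, but I have not found a suitable one in Java yet.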

@ekoz
Owner

ekoz commented Apr 26, 2023

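@XinwenXiang Adding a text watermark to a PDF actually works by rendering the text into an image and then stamping that image onto the pages with itextpdf's PdfStamper. I don't have an approach for extracting the text yet, and itextpdf doesn't seem to provide a method for it. If this were implemented in Python, would it use OCR to extract the text? I found a related patent: https://patents.google.com/patent/CN107194390A/zh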
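
To illustrate why there is no direct way to read the text back, here is a minimal sketch of the first half of the approach described above (text rendered into an image), using only the JDK's `java.awt` classes. The class name `WatermarkImageSketch`, the methods `renderText` and `countInkPixels`, and the font, size, and colour are all hypothetical and chosen for illustration; this is not the project's actual code, and the PdfStamper step is left out. It needs at least one font installed on the machine.

```java
import java.awt.Color;
import java.awt.Font;
import java.awt.FontMetrics;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper (not from the project): turns watermark text into pixels. */
class WatermarkImageSketch {

    /** Renders the text onto a transparent canvas sized to fit it. */
    static BufferedImage renderText(String text, int fontSize) {
        Font font = new Font(Font.SANS_SERIF, Font.PLAIN, fontSize);

        // Measure the text on a throwaway 1x1 image so the real canvas fits it.
        BufferedImage probe = new BufferedImage(1, 1, BufferedImage.TYPE_INT_ARGB);
        Graphics2D probeGraphics = probe.createGraphics();
        FontMetrics metrics = probeGraphics.getFontMetrics(font);
        int width = Math.max(1, metrics.stringWidth(text));
        int height = Math.max(1, metrics.getHeight());
        int baseline = metrics.getAscent();
        probeGraphics.dispose();

        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = image.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING,
                RenderingHints.VALUE_TEXT_ANTIALIAS_ON);
        g.setFont(font);
        g.setColor(new Color(128, 128, 128, 96)); // semi-transparent grey (assumed style)
        g.drawString(text, 0, baseline);
        g.dispose();
        return image;
    }

    /** Counts pixels with non-zero alpha, i.e. pixels touched by the drawn text. */
    static int countInkPixels(BufferedImage image) {
        int count = 0;
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if ((image.getRGB(x, y) >>> 24) != 0) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        BufferedImage image = renderText("CONFIDENTIAL", 32);
        System.out.println(image.getWidth() + "x" + image.getHeight()
                + " image, ink pixels: " + countInkPixels(image));
    }
}
```

Once the text has gone through a step like this, the image holds only pixels and no character data. That is why recovering the string from a PDF stamped this way would most likely mean pulling the image back out and running OCR on it, rather than calling a text-extraction API.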

@XinwenXiang
Author

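Hello, I looked through Java open-source repositories such as iText. They let you set the text when embedding a watermark but offer no corresponding extraction (I still need to study this part further). The patent doesn't appear to include any code, so there isn't much to reference. Thank you for your reply and help!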
