[Discussion] How to get PaddleOCR better maintained. #12257

jzhang533 · 2024-04-01T10:44:55Z

jzhang533
Apr 1, 2024
Maintainer

I'd like to start a discussion about the maintenance of the PaddleOCR project.

Current Status

PaddleOCR is an open-source OCR toolkit based on PaddlePaddle, known for its high-performance, practical PP-OCR models. Many development activities are conducted by individuals from Baidu: the contributors ranked at the top of git shortlog -s -n are or were Baidu employees. There are also numerous valuable contributions from contributors around the world.

Since its inception in 2020, PaddleOCR has become a vital upstream project in the OCR domain: there are 2,328 dependents, and an average of 120 new issues are reported every month. Additionally, I have heard of some commercial success cases using PaddleOCR.

However, since last year, development activity has slowed significantly, with PP-OCR models halted at version 4 and many issues lacking triage, reproduction, and fixes. To maintain PaddleOCR sustainably, I would like to propose the following actions that the community can take.

Short term actions the community can take

Help to solve long-standing issues: there is a fundable project in a recent PaddlePaddle event; please check fundable project No. 6 in this issue: 【Hackathon 6th】Fundable Projects Paddle#62908, and the recent progress: 【疑难解决】解决PaddleOCR历史存在的疑难Issue #11906 (Resolving PaddleOCR's long-standing difficult issues).
Improve dev infrastructure:
- Set up a linting workflow in GitHub Actions: there is a pre-commit config in the repository, but it is never enforced by a workflow.
- Migrate CI-PaddleOCR-Py37-LinuxUbuntu-Cuda102-PR-Release to GitHub Actions: I don't think we need to migrate this workflow identically. Setting up a new CI pipeline in GitHub Actions that covers test cases and several typical mini-size models running on CPU is enough, since PaddlePaddle/Paddle has sophisticated CI checks.
- Modernize Python packaging: use pyproject.toml, as recommended here: https://packaging.python.org/en/latest/guides/writing-pyproject-toml/
Comply with OSS best practices: set the default branch to a real develop branch, use semantic versioning, etc. I will take this item.
Improve documentation: dev docs, contribution docs, user-facing docs, etc.
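
For the semantic versioning item above, here is a minimal Python sketch of how MAJOR.MINOR.PATCH numbers compare and get bumped. The version strings are made up for illustration, the helper names are hypothetical, and pre-release or build suffixes are not handled.

```python
def parse_version(text):
    """Split a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def bump(text, part):
    """Return the next version string after a change of the given kind."""
    major, minor, patch = parse_version(text)
    if part == "major":  # incompatible API change
        return f"{major + 1}.0.0"
    if part == "minor":  # backward-compatible feature
        return f"{major}.{minor + 1}.0"
    if part == "patch":  # backward-compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")


# Tuples compare numerically, so 2.10.0 sorts after 2.9.5,
# which a plain string comparison would get wrong.
print(parse_version("2.10.0") > parse_version("2.9.5"))
print(bump("2.7.3", "minor"))
```

Comparing parsed tuples rather than raw strings is the point of the sketch: it keeps release ordering correct once a component reaches two digits.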

Long term actions the community can take

Establish a Project Management Committee (PMC) to contribute to and provide oversight for the project. Anyone who is skilled and interested in investing in this project, please reply or send an email to: ext_paddle_oss@baidu.com.
Develop new PP-OCR models: we have organized a competition with the OpenAtom Foundation, aimed at accelerating innovation in OCR technologies based on PaddleOCR. However, developing new models is challenging and requires expertise and resources.
Design and implement new features.

Although I am not an OCR expert, I do love open source. Please reply to discuss and help improve the maintenance of PaddleOCR.

Let me ping some people who I think could share ideas:

PaddleOCR key contributors I know: @dyning @tink2123 @Sunting78
PaddlePaddle OpenSource Development Working Group members: @luotao1 @Aurelius84 @Ligoml @Liyulingyue @Zheng-Bicheng @GreatV @liyongchao911 @jinyouzhi @jzhang533

This post is written in English so that members of the broader community can know what's going on, but please feel free to reply in Chinese.

Harryoung · 2024-04-01T11:17:46Z

Harryoung
Apr 1, 2024
Maintainer

As a member of the PaddlePaddle team, I am very happy to assist everyone who plans to contribute to the PaddleOCR project.
Make PaddleOCR great again!🥳

GreatV · 2024-04-01T11:31:40Z

GreatV
Apr 1, 2024
Maintainer

Make PaddleOCR great again!🥳

GreatV · 2024-04-01T11:50:30Z

GreatV
Apr 1, 2024
Maintainer

We need to add more cutting-edge OCR models, but this requires computing resources.

Liyulingyue · 2024-04-01T11:54:36Z

Liyulingyue
Apr 1, 2024
Collaborator

Make PaddleOCR great again!🥳

jinyouzhi · 2024-04-01T14:34:47Z

jinyouzhi
Apr 1, 2024

I'm willing to do what I can to make PP-OCR great again.

GreatV · 2024-04-03T04:48:08Z

GreatV
Apr 3, 2024
Maintainer

We may organize a PaddleOCR issue-solving contest to help clear the backlog of issues in the community.

asif-ca · 2024-04-03T11:30:59Z

asif-ca
Apr 3, 2024

I personally trained one rec-model on multiple languages. I will add more language dictionaries and corpora that I've collected over the past few months.

SWHL · 2024-04-19T08:57:12Z

SWHL
Apr 19, 2024
Collaborator

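Forgive me for writing in Chinese here; expressing this in English is a bit of a struggle.
I have been thinking about one question: what is the positioning of the PaddleOCR project? Is it to become the most widely used open-source OCR toolkit, or to become the foundation that academia uses as a baseline?

This question is critical, because it determines how PaddleOCR develops from here.

From what I have seen, most academic researchers prefer mmocr whenever their work involves OCR. So my stereotype is: use mmocr for research, use PaddleOCR for applications.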

SWHL · 2024-04-19T09:00:23Z

SWHL
Apr 19, 2024
Collaborator

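Another question: should we consider splitting PPOCRLabel, StyleText, and ppstructure out of the current project?
Many of the existing issues are tangled together; the coupling is too strong.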

Zheng-Bicheng · 2024-04-19T09:02:41Z

Zheng-Bicheng
Apr 19, 2024

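> Forgive me for writing in Chinese here; expressing this in English is a bit of a struggle. I have been thinking about one question: what is the positioning of the PaddleOCR project? Is it to become the most widely used open-source OCR toolkit, or to become the foundation that academia uses as a baseline?
>
> This question is critical, because it determines how PaddleOCR develops from here.
>
> From what I have seen, most academic researchers prefer mmocr whenever their work involves OCR. So my stereotype is: use mmocr for research, use PaddleOCR for applications.

I don't really understand what research and applications each emphasize. As I understand it, mmocr and ppocr don't seem to differ all that much.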

jzhang533 · 2024-04-19T09:05:06Z

jzhang533
Apr 19, 2024
Maintainer Author

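I think it should still be positioned as a widely used open-source OCR toolkit; that is where its unique value lies. If it aims to be an academic baseline, the practical problem is that Paddle is still used far less than PyTorch in academia.

The code repository does need a fairly major restructuring. I find it somewhat confusing myself at the moment, and there are quite a few historical reasons for this.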

SWHL · 2024-04-19T09:05:11Z

SWHL
Apr 19, 2024
Collaborator

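> > Forgive me for writing in Chinese here; expressing this in English is a bit of a struggle. I have been thinking about one question: what is the positioning of the PaddleOCR project? Is it to become the most widely used open-source OCR toolkit, or to become the foundation that academia uses as a baseline?
> > This question is critical, because it determines how PaddleOCR develops from here.
> > From what I have seen, most academic researchers prefer mmocr whenever their work involves OCR. So my stereotype is: use mmocr for research, use PaddleOCR for applications.
>
> I don't really understand what research and applications each emphasize. As I understand it, mmocr and ppocr don't seem to differ all that much.

Research emphasizes being able to reproduce the results an algorithm reports in its paper, and providing algorithm baselines.
Applications emphasize easy deployment, few dependencies, strong generalization, and good results. You can refer to our project RapidOCR for this: whatever the model is, we only want the best one.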

SWHL · 2024-04-19T09:07:05Z

SWHL
Apr 19, 2024
Collaborator

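At present, PaddleOCR's strength is that its lightweight models deliver the best results; you could say it has taken this to the extreme. But because it is tied to the Paddle inference framework, it lags somewhat in academia.

The Paddle framework has some serious flaws that indirectly affect PaddleOCR, such as memory leaks, compatibility problems, and an underdeveloped ecosystem.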

Zheng-Bicheng · 2024-04-19T09:09:07Z

Zheng-Bicheng
Apr 19, 2024

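> At present, PaddleOCR's strength is that its lightweight models deliver the best results; you could say it has taken this to the extreme. But because it is tied to the Paddle inference framework, it lags somewhat in academia.
>
> The Paddle framework has some serious flaws that indirectly affect PaddleOCR, such as memory leaks, compatibility problems, and an underdeveloped ecosystem.

torch seems to have memory leaks too. The latter two really cannot be solved in the short term.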

SWHL · 2024-04-19T09:14:03Z

SWHL
Apr 19, 2024
Collaborator

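Setting aside the Paddle framework, back to this project.
I think we can start optimizing from the following points:

Split out the coupled sub-projects.
Improve the Actions workflows to automate and standardize code formatting and releases.
Make use of GitHub Issues, Discussions, and Projects, and categorize every issue: bugs go in Issues, feature discussions go in Discussions, and fix progress is linked to a Project. I have the impression that most of the 1.2k issues are invalid questions, so we could sort them out first.

Just some simple thoughts, with no ill intent.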

SWHL · 2024-04-20T01:40:39Z

SWHL
Apr 20, 2024
Collaborator

Sure. If there's anything you'd like to work on together later, give me a shout. Just dreaming about it won't get us far.

jzhang533 · 2024-04-22T04:48:54Z

jzhang533
Apr 22, 2024
Maintainer Author

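> Sure. If there's anything you'd like to work on together later, give me a shout. Just dreaming about it won't get us far.

Yes. So far four people have said they want to put effort into maintaining PaddleOCR: @GreatV @Topdu @SWHL @Liyulingyue.
After the May Day holiday we can formally establish a PMC and then start community-driven development of PaddleOCR.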

Gmgge · 2024-04-23T02:23:51Z

Gmgge
Apr 23, 2024

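Let me add one point: I would advise against moving away from the research direction entirely. The successes in standardization, industrialization, and so on came about because OCR research made real progress. The number of people currently maintaining the PaddleOCR community may fluctuate a lot, so focusing first on ease of use to keep the community active is absolutely the right choice. But as a leading OCR community in China, it does have a responsibility to maintain research-oriented work, and I'd advise not overlooking this in the design.

Of course, if it's application-oriented development and maintenance work, count me in.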

SWHL · 2024-04-23T02:34:53Z

SWHL
Apr 23, 2024
Collaborator

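Both research and applications can be considered.
The research side explores the latest algorithms and provides baseline implementations so that researchers can quickly run experiments on top of them.
The most recent effective algorithms are then selected and integrated into the application branch to build the strongest models that can be deployed in industry.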

abbydev · 2024-04-25T06:12:41Z

abbydev
Apr 25, 2024

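1. Everyone is actually very willing to support the development of domestic frameworks and projects. Building a good ecosystem is really hard; thank you to Baidu for its efforts.
2. I have just one suggestion: don't run out of steam (which is the general public perception of Baidu products and open-source projects). Either abandon the project completely and let open-source users look for alternatives, or keep maintaining it properly. Even if you opened an account to crowdfund, I believe many people would still be willing to support it.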

Zheng-Bicheng · 2024-04-25T06:18:36Z

Zheng-Bicheng
Apr 25, 2024

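> 1. Everyone is actually very willing to support the development of domestic frameworks and projects. Building a good ecosystem is really hard; thank you to Baidu for its efforts. 2. I have just one suggestion: don't run out of steam (which is the general public perception of Baidu products and open-source projects). Either abandon the project completely and let open-source users look for alternatives, or keep maintaining it properly. Even if you opened an account to crowdfund, I believe many people would still be willing to support it.

This is hard to judge. As I see it, Paddle's open-source work has always been led and resourced by Paddle itself (and in fact it still relies heavily on Paddle today). A project started by a company will inevitably be tied to performance targets, so when new projects arrive, older ones with little payoff get dropped. That is pretty normal.

As for staying power, preparations are already under way to gradually make some of the open-source projects fully community-driven. It is no longer just a matter of money; the bigger problem is that it is hard to find enough developers willing to do this together.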

tink2123 · 2024-04-25T10:36:11Z

tink2123
Apr 25, 2024
Maintainer

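Hi everyone, I'd like to start a small discussion here: PaddleOCR currently has a great many issues, and some that are hard to locate and hard to fix are causing trouble for developers. I hope we can rely on everyone's efforts to solve such problems together and improve the experience of using PaddleOCR. I see that @Liyulingyue has already put together a very good template: https://github.com/PaddlePaddle/PaddleOCR/issues/11906. But how should we record and handle these issues efficiently?
Here is one approach I've thought of; let me know whether it seems feasible:
When developers are on duty, they tag issues that they think cannot be solved in the short term (for example HardCase, or the existing triaged). Our PMC members periodically filter by that label, record what they find in #11906, and rally more people to sign up to solve them.

Are there any better approaches besides this?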
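
The tag-and-filter step described above could be sketched roughly as follows in Python. This is only an illustration: the issue records are made-up dictionaries rather than real tracker data, the function names are hypothetical, and the label names are simply the two mentioned in the proposal.

```python
# Label names taken from the proposal above; the labels actually
# used in the repository may differ.
HARD_LABELS = {"HardCase", "triaged"}


def select_hard_issues(issues, hard_labels=HARD_LABELS):
    """Return open issues that carry at least one hard-case label."""
    return [
        issue for issue in issues
        if issue["state"] == "open" and hard_labels & set(issue["labels"])
    ]


def tracking_lines(issues):
    """Format one line per issue, sorted by number, for a tracking issue."""
    ordered = sorted(issues, key=lambda issue: issue["number"])
    return [f"- #{issue['number']} {issue['title']}" for issue in ordered]


# Made-up sample data standing in for whatever the issue tracker returns.
sample = [
    {"number": 3, "title": "Crash on startup", "state": "open", "labels": ["HardCase"]},
    {"number": 1, "title": "How do I install?", "state": "open", "labels": ["question"]},
    {"number": 2, "title": "Old leak", "state": "closed", "labels": ["triaged"]},
]
hard = select_hard_issues(sample)
print("\n".join(tracking_lines(hard)))
```

Keeping the selection a pure function of labels and state means the periodic filtering can be rerun at any time without extra bookkeeping.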

jzhang533 · 2024-04-25T11:14:01Z

jzhang533
Apr 25, 2024
Maintainer Author

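> Hi everyone, I'd like to start a small discussion here: PaddleOCR currently has a great many issues, and some that are hard to locate and hard to fix are causing trouble for developers. I hope we can rely on everyone's efforts to solve such problems together and improve the experience of using PaddleOCR.

@GreatV, @Liyulingyue, and I have selected some issues that need to be resolved and specifically labeled them triaged: long standing issues.

At least the issues here are very hard for the community to solve, for example failing to run on M2 chips, memory leaks, and so on.

I actually think that a clearer product positioning for PaddleOCR, solid infrastructure, and clear goals for the project's future should be sorted out first.

BTW: the PMC is still being prepared; it hasn't been established yet.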

SWHL · 2024-04-25T11:14:14Z

SWHL
Apr 25, 2024
Collaborator

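My idea is: create a separate repository for each project in the current repository other than OCR, and move the existing issues to the corresponding project according to where they belong. Among the issues under OCR, those that are bugs stay where they are and get a tag; those that are discussions or ideas are moved to Discussions.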
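
As a rough illustration of the routing rule described above, here is a small Python sketch. The issue fields ("project", "kind") and the function name are hypothetical, and the sub-project names are the ones mentioned earlier in this thread as candidates for splitting out.

```python
# Sub-projects named earlier in the thread as candidates for their own repositories.
SUBPROJECTS = {"PPOCRLabel", "StyleText", "ppstructure"}


def route_issue(issue):
    """Decide where an issue should go under the proposed split."""
    project = issue.get("project", "ocr")
    if project in SUBPROJECTS:
        # Anything belonging to a split-out project follows that project.
        return f"move to the {project} repository"
    if issue.get("kind") == "bug":
        # OCR bugs stay in place and only receive a tag.
        return "keep in Issues and add a tag"
    # Everything else under OCR (questions, ideas) goes to Discussions.
    return "move to Discussions"


print(route_issue({"project": "PPOCRLabel", "kind": "bug"}))
print(route_issue({"kind": "bug"}))
print(route_issue({"kind": "idea"}))
```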

tink2123 · 2024-04-26T08:54:50Z

tink2123
Apr 26, 2024
Maintainer

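> > Hi everyone, I'd like to start a small discussion here: PaddleOCR currently has a great many issues, and some that are hard to locate and hard to fix are causing trouble for developers. I hope we can rely on everyone's efforts to solve such problems together and improve the experience of using PaddleOCR.
>
> @GreatV, @Liyulingyue, and I have selected some issues that need to be resolved and specifically labeled them triaged: long standing issues.
>
> At least the issues here are very hard for the community to solve, for example failing to run on M2 chips, memory leaks, and so on.
>
> I actually think that a clearer product positioning for PaddleOCR, solid infrastructure, and clear goals for the project's future should be sorted out first.
>
> BTW: the PMC is still being prepared; it hasn't been established yet.

Ideally, this tag would contain only issues that come up frequently and that the community can solve, such as the pyinstaller packaging failures many people have mentioned. Issues like failing to run on M2 chips or GPU memory leaks might be better placed under a different tag, to push Paddle to adapt.

For historical reasons, many algorithms in PaddleOCR are no longer maintained; we could indeed do a cleanup and remove unnecessary projects.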

jzhang533 · 2024-04-30T07:47:23Z

jzhang533
Apr 30, 2024
Maintainer Author

I am working on a meeting agenda for the 2024 Q2 PaddlePaddle community meeting, see PaddlePaddle/community#889 or the rendered version.

One of the topics in the meeting will be the establishment of the PaddleOCR PMC.

To @Sunting78 @tink2123 @GreatV @Topdu @SWHL @Liyulingyue (PMC candidates): I will create a WeChat group and coordinate the meeting schedule with you.

To those involved in this discussion, if you are interested, please send an email to ext_paddle_oss@baidu.com. I will send you the meeting schedule and agenda once they are finalized.

sisrfeng · 2024-05-10T02:11:20Z

sisrfeng
May 10, 2024

In my opinion, this repo contains things outside the concept of OCR (optical character recognition), e.g. information extraction, which should be split out.

Some of my notes, for your reference; feedback is welcome: https://cmb-d3-ocr.feishu.cn/docx/TrohdxrDCoQw9zxGCpLcNuZ3nFb

alexisdrakopoulos · 2024-05-11T07:19:51Z

alexisdrakopoulos
May 11, 2024

I really think packaging should be the priority; getting PaddleOCR working without random segfaults due to some missing dependency is so difficult. I just started trying to use PPStructure and I do not understand why it randomly crashes.

jasondalycanpk · 2024-05-11T17:57:28Z

jasondalycanpk
May 11, 2024

Fixing the memory leak in #11639 should also be a priority.

crackso · 2024-06-26T05:58:51Z

crackso
Jun 26, 2024

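Seeing everyone's enthusiasm for the community has fired up my own motivation. I'm glad to have met you all and to join in maintaining the community together.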
Make PaddleOCR great again!🥳
