🪜 Crop tables out of web pages with Python and extract information from them, built on OpenCV + PaddleOCR

bishbishup/information-extraction-project

Table cropping and information extraction project 📈📉

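1. Locate each table in the large image and crop it out.

2. For each table, use the PaddleOCR model to recognize the vendor and product strings, and use `matchTemplate` from cv2 to match the price-trend symbols and convert them.

3. Extract the relevant information for every vendor in the tables and store the results in Result.xlsx.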
