cuppersd/Fast-Chinese-OCR

Fast-Chinese-OCR

This project applies state-of-the-art AI algorithms to recognize and extract text from scanned contract documents.

Introduction

  • Text detection uses a detector based on Mask R-CNN, with methods mainly inspired by fully convolutional networks. First, a CNN detects text blocks, from which character candidates are extracted. Then an FPN predicts the corresponding segmentation masks. Finally, each segmentation mask is used to find a suitable rectangular bounding box for its text instance.

  • A pre-trained model is provided for the ICDAR 2017 Incidental Scene Text Detection Challenge. It was trained using only training images from ICDAR 2017 and 2019.

  • This is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.

  • Text recognition uses the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks such as scene text recognition and OCR. For details, see the paper at http://arxiv.org/abs/1507.05717.
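
A network trained with CTC loss emits one score vector per horizontal time step, and those steps have to be collapsed into a string at inference time. The repository does not show its decoding step, so the following is only a minimal sketch of generic greedy CTC decoding. It assumes that class 0 is the blank symbol and that class i maps to `alphabet[i - 1]`; the function name `ctc_greedy_decode` and the toy alphabet are hypothetical.

```python
def ctc_greedy_decode(scores, alphabet, blank=0):
    """Greedy CTC decoding.

    scores:   list of per-time-step score lists, one score per class.
    alphabet: string of characters; class i (i >= 1) maps to alphabet[i - 1].
    blank:    index of the CTC blank class (assumed to be 0 here).
    """
    # Pick the highest-scoring class at every time step.
    best_path = [max(range(len(step)), key=step.__getitem__) for step in scores]

    decoded = []
    previous = None
    for index in best_path:
        # Collapse consecutive repeats, then drop blanks.
        if index != previous and index != blank:
            decoded.append(alphabet[index - 1])
        previous = index
    return "".join(decoded)


if __name__ == "__main__":
    toy_alphabet = "合同书"  # hypothetical three-character alphabet
    toy_scores = [
        [0.1, 0.8, 0.05, 0.05],   # class 1
        [0.1, 0.8, 0.05, 0.05],   # class 1 again (collapsed)
        [0.9, 0.05, 0.03, 0.02],  # blank
        [0.1, 0.1, 0.7, 0.1],     # class 2
        [0.2, 0.1, 0.1, 0.6],     # class 3
    ]
    print(ctc_greedy_decode(toy_scores, toy_alphabet))
```

Greedy decoding is the simplest choice; a beam search over the same scores may give better results at higher cost.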

Instance Segmentation Sample
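
Each instance mask produced by the detector can be reduced to a rectangular box around its non-zero pixels. The sketch below illustrates that idea for an axis-aligned box using plain Python, so it runs without dependencies. It is not the repository's code: `mask_to_box` is a hypothetical helper, and the mask is assumed to be a 2D grid of 0/1 values (nested lists or a 2D array).

```python
def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) enclosing the non-zero pixels.

    mask is a 2D grid (rows of 0/1 values). Returns None for an empty mask.
    """
    # Rows that contain at least one foreground pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Columns that contain at least one foreground pixel.
    width = len(mask[0])
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    return cols[0], rows[0], cols[-1], rows[-1]


if __name__ == "__main__":
    toy_mask = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    print(mask_to_box(toy_mask))
```

For slanted text lines a rotated rectangle usually fits better; OpenCV's `cv2.minAreaRect` is one option for that case.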

Contents

  1. Installation
  2. Download
  3. Demo
  4. Test
  5. Train
  6. Examples
  7. Result

Installation

  • Python 3.6+
  • TensorFlow 2.0.0+
  • opencv-python 3.4+
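
One way to confirm that an environment meets these minimums is a small check script. The sketch below is an assumption rather than part of the repository: `REQUIREMENTS`, `parse_version`, and `check_environment` are hypothetical names. It relies on the `tensorflow` and `cv2` modules (the latter installed by opencv-python) exposing a `__version__` attribute.

```python
import importlib
import re
import sys

# Minimum versions taken from the requirements list above.
# Keys are import names: opencv-python is imported as cv2.
REQUIREMENTS = {"tensorflow": (2, 0, 0), "cv2": (3, 4)}


def parse_version(text):
    """Turn a version string such as '2.4.1' into a tuple of up to three ints."""
    public_part = text.split("+")[0]  # drop local suffixes like '+cu110'
    return tuple(int(part) for part in re.findall(r"\d+", public_part)[:3])


def check_environment():
    """Map each requirement to 'ok', 'too old', or 'missing'."""
    status = {"python": "ok" if sys.version_info >= (3, 6) else "too old"}
    for module_name, minimum in REQUIREMENTS.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            status[module_name] = "missing"
            continue
        installed = parse_version(getattr(module, "__version__", "0"))
        status[module_name] = "ok" if installed >= minimum else "too old"
    return status
```

Calling `check_environment()` returns a dictionary that can be printed or asserted on before running the models.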

Download

Models trained on ICDAR 2017 (training set) + ICDAR 2019 (training set): Download link

Test

Visit the website [http://werfwef.qicp.vip/ocr/file/]

Train

Result

Mask R-CNN results on ICDAR MLT 2017 Challenge 1 (text detection), trained using only the ICDAR 2017 MLT training set and the ICDAR 2019 training set:

| Method | Precision (%) | Recall (%) | F-measure (%) |
| --- | --- | --- | --- |
| Mask R-CNN-resnet101 | 83.52 | 76.58 | 79.89 |
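
The F-measure is the harmonic mean of precision and recall. As a quick sanity check, the sketch below recomputes it from the precision and recall reported above. The helper name `f_measure` is hypothetical, and the inputs are the rounded published values, so a difference of about 0.01 from the reported 79.89 is expected.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall (%) as reported for Mask R-CNN-resnet101.
    print(round(f_measure(83.52, 76.58), 2))
```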
