williamberrios/Automatic-Review-Report-Telco-Industry

Objective

The challenge involves developing an object detection and OCR model to automate the review of documents that a technician collects during each installation visit. These documents are currently reviewed manually once they are delivered to the technician's base, which can lead to human error.

Description

For this challenge, we are only asked to locate 3 fields on the form (2 signatures and 1 date) and to extract the handwritten date, separated into day, month, and year.

Solution

Our solution is completely free and does not depend on any APIs that might incur a cost in the future. It is divided into 3 main parts:

01. Pre-processing images

Image alignment and standardization of orientation, size, proportions, and JPG format.

02. Signature detection model

CNN architecture for binary classification, used to determine whether the signature is present or not.

03. Date recognition model

Faster R-CNN based object detection approach for recognizing the date characters.
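As a rough illustration of how an object detection approach can yield a date string, the sketch below filters character detections by confidence and reads the remaining boxes from left to right. The `Detection` structure, the `read_field` helper, and the 0.5 threshold are assumptions made for this example; they are not taken from the repository's code.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """One detected character (hypothetical structure, for illustration only)."""
    label: str    # predicted character, e.g. "7"
    score: float  # detector confidence in [0, 1]
    x_min: float  # left edge of the bounding box, in pixels


def read_field(detections: List[Detection], min_score: float = 0.5) -> str:
    """Drop low-confidence boxes, then read the rest from left to right."""
    kept = [d for d in detections if d.score >= min_score]
    kept.sort(key=lambda d: d.x_min)
    return "".join(d.label for d in kept)


# Example: detections for the "day" part of a date, in arbitrary order.
day_boxes = [
    Detection("5", 0.91, 48.0),
    Detection("1", 0.88, 12.0),
    Detection("7", 0.20, 30.0),  # below the threshold, so it is ignored
]
print(read_field(day_boxes))  # prints 15
```

Sorting by the left edge is enough when each crop contains a single line of characters; overlapping duplicate boxes would need an extra suppression step that is not shown here.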

Getting Started

1. Clone our repository

$ git clone https://github.com/williamberrios/Datathon-Entel-Object-Dectection.git

2. Download the dataset

https://www.kaggle.com/c/datathon-entel-2021-reto1/data

3. Prepare your environment

$ pip install -r requirements.txt

You can follow the whole methodology to reproduce our results, or you can go directly to step 9 to process the test data with the trained models. Download the PyTorch date recognition model and save it in the 03.SavedModels folder.

4. Pre-processing the dataset

$ cd 02.Codes
$ cd 01.PeprocesingImages
$ python ImagePreprocessing.py

After pre-processing, we obtain this structure:

    ├processed        
    ├── images_test
    │   ├── aligned
    │   ├── fechas
    │   └── firmas
    └── images_train
        ├── aligned
        ├── fechas
        ├── firmas
        └── firmas_modelo
            ├── 1
            ├── 0
            └── modelamiento
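The `1` and `0` subfolders of `firmas_modelo` look like a class-per-folder layout for the binary signature classifier. A minimal sketch for checking class balance before training follows; the helper name `count_images_per_class` and the assumption that images are stored with a `.jpg` extension are ours, not confirmed by the repository.

```python
from pathlib import Path
from typing import Dict


def count_images_per_class(root: Path) -> Dict[str, int]:
    """Count .jpg files in the "0" and "1" class subfolders of root.

    Assumes one subfolder per class, as in the tree above.
    """
    return {
        name: sum(1 for _ in (root / name).glob("*.jpg"))
        for name in ("0", "1")
    }
```

For example, `count_images_per_class(Path("processed/images_train/firmas_modelo"))` would return a dictionary with one count per class, assuming the script is run from the directory that contains `processed`.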

5. Labeling the training dataset for object detection

Once we have obtained the aligned image of the date and split the date, we can label each character using the free project labelImg.

After that, we obtain this structure:

    ├labeling        
    ├── train
    └── classes.txt
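The presence of `classes.txt` suggests the labels were exported in labelImg's YOLO format, although the README does not say so explicitly. Under that assumption, each label line holds a class index followed by a normalized box (`x_center y_center width height`), and the sketch below converts such lines to pixel coordinates. The function name and the sample values are hypothetical.

```python
from typing import List, Tuple

# (class name, x_min, y_min, x_max, y_max), all in pixels
Box = Tuple[str, float, float, float, float]


def parse_yolo_labels(text: str, class_names: List[str],
                      img_w: int, img_h: int) -> List[Box]:
    """Convert YOLO-format label lines to pixel-space corner boxes."""
    boxes: List[Box] = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        xc, yc, w, h = (float(p) for p in parts[1:])
        boxes.append((
            class_names[class_id],
            (xc - w / 2) * img_w,
            (yc - h / 2) * img_h,
            (xc + w / 2) * img_w,
            (yc + h / 2) * img_h,
        ))
    return boxes


# Assumed class list: one class per digit, in the order given by classes.txt.
class_names = [str(d) for d in range(10)]
sample = "3 0.5 0.5 0.25 0.5\n"
boxes = parse_yolo_labels(sample, class_names, img_w=200, img_h=100)
print(boxes)  # prints [('3', 75.0, 25.0, 125.0, 75.0)]
```

Corner-format boxes like these are what torchvision-style detection models typically expect as training targets, which is why the conversion is shown.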

6. Training & evaluating the signature detection model

$ cd 02.Codes
$ cd 02.ModeloFirmas
$ python training.py
$ python evaluate.py

[Figure: Training and Validation Loss]
[Figure: Confusion Matrix]
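For readers unfamiliar with the confusion matrix of a binary classifier, the following sketch computes its four cells from true and predicted labels. It assumes label 1 means "signature present" and 0 means "absent"; the sample labels are made up for illustration and are not results from this project.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 1 and pred == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Illustrative labels only.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
print(counts)  # prints {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 2}
```

In this task a false positive (a missing signature accepted as present) is arguably the costlier error, so looking at the individual cells is more informative than accuracy alone.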

8. Training the date recognition model

$ cd 02.Codes
$ cd 03.ModeloFechas
$ jupyter notebook 01.GenerateDataset-Fechas.ipynb
$ python train.py

9. Main process

$ cd 02.Codes
$ cd 04.Main
$ jupyter notebook 01-Main.ipynb

We obtain the final submission:

    ├── 01-Main.ipynb
    ├── config.py
    └── submissions
        └── final_submission.csv

10. Deployment

See 04.Resources/Videos for a video demonstration of the deployment, built with Vue, JavaScript, and Python.

Team: Insight_ML

Members:

About

Top 5 Solution to Datathon Entel - Object Detection
