Skip to content
Indonesia Presidential Election tally form (form C1) reader using deep learning (CNN) and computer vision
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
classifiers
static
test_images
README.md
pyimagesearch.py
read-C1-form.py

README.md

C1 Form Reader

Application of computer vision and convolutional neural network (CNN) for automatically reading hand written numbers on C1 form (Indonesia election's tally form).

Prerequisites

You need to have python 3.x and the following libraries must be installed to run c1 form reader:

  • OpenCV
  • Keras
  • Tensorflow
  • Numpy
  • Matplotlib

Installing

Simplest way to install all the requirements is using anaconda.

$ conda create -n c1reader python=3.7 anaconda
$ source activate c1reader
(c1reader) $ pip install opencv-python
(c1reader) $ pip install keras
(c1reader) $ pip install tensorflow

Once all requirements are installed, clone this repository.

(c1reader) $ git clone https://github.com/rzwck/c1-form-reader.git
(c1reader) $ cd c1-form-reader

Running

From the c1-form-reader you can just run the script as the following example:

(c1reader) $ python read-C1-form.py
Using TensorFlow backend.
2019-05-05 11:19:51.089820: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
test_images/test1.jpg {'01': 12, '02': 115, 'valid': 122, 'invalid': 0, 'total': 127}
test_images/test10.jpg {'01': 136, '02': 11, 'valid': 147, 'invalid': 6, 'total': 153}
Form test_images/test11.jpg is unreadable: Unable to find digit positions for invalid ballots count
test_images/test12.jpg {'01': 190, '02': 415, 'valid': 205, 'invalid': 4, 'total': 209}
test_images/test13.jpg {'01': 44, '02': 61, 'valid': 109, 'invalid': 0, 'total': 150}
Form test_images/test14.jpg is unreadable: Unable to find digit positions for valid ballots count
test_images/test15.jpg {'01': 119, '02': 19, 'valid': 138, 'invalid': 0, 'total': 138}
Form test_images/test16.jpg is unreadable: Unable to find digit positions for votes #01
test_images/test17.jpg {'01': 69, '02': 160, 'valid': 229, 'invalid': 7, 'total': 231}
Form test_images/test18.jpg is unreadable: Unable to find digit positions for invalid ballots count
Form test_images/test19.jpg is unreadable: Unable to find digit positions for votes #02
test_images/test2.jpg {'01': 125, '02': 57, 'valid': 182, 'invalid': 3, 'total': 185}
Form test_images/test20.jpg is unreadable: Unable to find digits for total ballots count
test_images/test21.jpg {'01': 72, '02': 164, 'valid': 236, 'invalid': 4, 'total': 240}
test_images/test22.jpg {'01': 116, '02': 24, 'valid': 140, 'invalid': 3, 'total': 143}
test_images/test23.jpg {'01': 59, '02': 150, 'valid': 209, 'invalid': 1, 'total': 210}
test_images/test24.jpg {'01': 89, '02': 133, 'valid': 223, 'invalid': 5, 'total': 228}
Form test_images/test25.jpg is unreadable: Unable to find digit positions for valid ballots count
Form test_images/test3.jpg is unreadable: Unable to find digit positions for votes #01
test_images/test4.jpg {'01': 128, '02': 28, 'valid': 156, 'invalid': 5, 'total': 161}
test_images/test5.jpg {'01': 121, '02': 3, 'valid': 124, 'invalid': 5, 'total': 129}
Form test_images/test6.jpg is unreadable: Unable to find digit positions for votes #02
Form test_images/test7.jpg is unreadable: Unable to find digit positions for votes #02
test_images/test8.jpg {'01': 143, '02': 89, 'valid': 232, 'invalid': 7, 'total': 239}
test_images/test9.jpg {'01': 139, '02': 7, 'valid': 146, 'invalid': 0, 'total': 146}

The script saves the output file on the same folder (test_images). The following screenshot shows some of the outputs produced by c1 form reader. Green boxes are area containing hand written digits, small black boxes with white written numbers are 28x28 hand written digits that will be fed to CNN classifiers. The final number recognized by this program is the white number written on the blue rectangles.

Built With

This scripts utilizes codes from the following sites:

Training dataset

I trained two digits (0-9) classifiers from two different datasets:

  • MNIST database - samples of hand written digits
  • Pilkada DKI 2017 - extracted hand written digits from pilkada DKI's C1 form, since local people might have local hand written styles that is not contained in standard MNIST dataset.

I also trained "X" classifiers and "-" hyphen classifiers for recognizing "X" and "-" characters that is commonly used to denote empty digit box. The sample for this "X" and "-" classifier also came from Pilkada DKI 2017.

You can’t perform that action at this time.