End-to-end variable-length Captcha recognition using CNN+RNN+Attention/CTC (PyTorch implementation).


CaptchaRecognition

End-to-end variable-length Captcha recognition using CNN+RNN+Attention or CTC.

encoder: CNN+RNN or CNN

decoder: one of two types of attention, or no attention
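The encoder options above can be pictured with a minimal PyTorch sketch. This is not the code in model.py: the class name CaptchaEncoder, the layer sizes, the GRU choice, and the grayscale input are all assumptions made for illustration. It only shows how a CNN feature map can be read column by column as a sequence, optionally followed by an RNN.

```python
import torch
import torch.nn as nn


class CaptchaEncoder(nn.Module):
    """Hypothetical CNN or CNN+RNN encoder; all sizes are illustrative."""

    def __init__(self, img_height=32, hidden_size=64, use_rnn=True):
        super().__init__()
        # Two conv blocks, each halving height and width.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        feat_dim = 64 * (img_height // 4)  # channels * reduced height
        self.rnn = (
            nn.GRU(feat_dim, hidden_size, batch_first=True, bidirectional=True)
            if use_rnn
            else None
        )

    def forward(self, images):
        # images: (batch, 1, height, width)
        feats = self.cnn(images)  # (batch, 64, height/4, width/4)
        b, c, h, w = feats.shape
        # Treat each image column as one time step.
        seq = feats.permute(0, 3, 1, 2).reshape(b, w, c * h)
        if self.rnn is None:
            return seq  # CNN-only encoder
        out, _ = self.rnn(seq)  # (batch, width/4, 2 * hidden_size)
        return out


images = torch.zeros(2, 1, 32, 100)
print(tuple(CaptchaEncoder(use_rnn=True)(images).shape))   # (2, 25, 128)
print(tuple(CaptchaEncoder(use_rnn=False)(images).shape))  # (2, 25, 512)
```

Either output is a sequence of 25 feature vectors per image, which an attention decoder or a CTC layer could then consume.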

update(2017-10-27)

  • CNN+RNN+CTC captcha recognition added.

Usage

Put your font file in the fonts directory, and change the font file's name on line 173 of GenCaptcha.py.

To generate the training data, run python GenCaptcha.py. This writes the dataset captcha.npz and the vocabulary captcha.vocab_dict to data/.

(GenCaptcha.py also provides a function for generating tfrecord files.)

Run python main.py to train.
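The README does not describe the layout of captcha.npz or captcha.vocab_dict, so the following is only a sketch of how variable-length captcha strings are commonly turned into padded integer labels with a character vocabulary. The function names build_vocab and encode_labels, and the choice of index 0 for padding, are hypothetical and not taken from GenCaptcha.py.

```python
def build_vocab(charset):
    """Map each character to an integer id; id 0 is kept for padding (assumption)."""
    return {ch: i + 1 for i, ch in enumerate(sorted(set(charset)))}


def encode_labels(texts, vocab, pad_id=0):
    """Convert strings of different lengths to equal-length id lists plus true lengths."""
    max_len = max(len(t) for t in texts)
    ids = [[vocab[ch] for ch in t] + [pad_id] * (max_len - len(t)) for t in texts]
    lengths = [len(t) for t in texts]
    return ids, lengths


vocab = build_vocab("abc123")
ids, lengths = encode_labels(["a1", "cab"], vocab)
print(ids)      # [[4, 1, 0], [6, 4, 5]]
print(lengths)  # [2, 3]
```

Keeping the true lengths next to the padded ids lets a loss function ignore the padding positions.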

Using CTC

Install warp_ctc for PyTorch, generate the data as described in the previous section, and then run python ctcmain.py.
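For readers new to CTC, the sketch below shows the usual greedy (best-path) decoding rule applied to the per-frame predictions of a CTC-trained model: merge consecutive repeats, then drop blanks. It is a generic illustration rather than the code in ctcmain.py, and the function name ctc_greedy_decode and the blank index 0 are assumptions.

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse per-frame argmax ids into a label sequence (best-path decoding)."""
    decoded = []
    prev = None
    for idx in frame_ids:
        # Emit a label only when it differs from the previous frame and is not blank.
        if idx != prev and idx != blank:
            decoded.append(idx)
        prev = idx
    return decoded


# A blank between two identical ids keeps both; adjacent repeats are merged.
print(ctc_greedy_decode([0, 3, 3, 0, 3, 5, 5, 0]))  # [3, 3, 5]
```

The blank symbol is what allows the output to be shorter than the number of frames while still representing repeated characters.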

Results

Using CTC, accuracy reaches 0.95 on the test set.
