
Brazilian Identity Document Dataset (BID Dataset): The first public dataset of Brazilian identification documents.

ricardobnjunior/Brazilian-Identity-Document-Dataset

Brazilian-Identity-Document-Dataset

This repository introduces the Brazilian Identity Document Dataset (BID Dataset), the first public dataset of Brazilian identification documents.

The BID Dataset was presented in the paper "BID Dataset: a challenge dataset for document processing tasks" and targets three key challenges in Computer Vision: (i) Document Image Classification, (ii) Text Region Segmentation, and (iii) Optical Character Recognition (OCR). It is composed of images of Brazilian identification documents divided into eight classes: National Driver's License (CNH) front and back faces, CNH front face, CNH back face, Natural Persons Register (CPF) front face, CPF back face, General Registration (RG) front face, RG back face, and RG front and back faces.

The BID Dataset contains 28,800 document images, with 3,600 samples for each of the eight classes.
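As a quick sanity check on the figures above, the sketch below lists the eight classes and confirms that 3,600 samples per class yields 28,800 images. The label strings are illustrative names chosen here; they are an assumption and may not match the folder or file names used inside the downloaded archive.

```python
# Illustrative class labels (hypothetical names, not necessarily the
# directory names used in the dataset archive).
BID_CLASSES = [
    "CNH front and back",
    "CNH front",
    "CNH back",
    "CPF front",
    "CPF back",
    "RG front",
    "RG back",
    "RG front and back",
]

# Per-class sample count as stated in the README.
SAMPLES_PER_CLASS = 3_600


def total_images(classes, per_class):
    """Return the total image count for a balanced dataset."""
    return len(classes) * per_class


if __name__ == "__main__":
    print(len(BID_CLASSES), "classes")
    print(total_images(BID_CLASSES, SAMPLES_PER_CLASS), "images")
```

Because the dataset is balanced across classes, a classifier that always predicts one class would score 12.5% accuracy, which is a useful baseline to keep in mind for the classification task.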

Download the dataset

Sample Dataset: https://drive.google.com/file/d/144EqqmMtCziua9iYo-3afUEvZrJVxUXU/view?usp=sharing

Full Dataset: https://drive.google.com/file/d/1Oi88TRcpdjZmJ79WDLb9qFlBNG8q2De6/view?usp=sharing
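Both links are Google Drive share URLs. For scripted downloads it is often convenient to work with the file ID embedded in the URL rather than the full link. The following is a minimal sketch, using only the Python standard library, that extracts that ID; the helper name `drive_file_id` is hypothetical, and how the ID is then used (for example, passed to a third-party download tool) is left to the reader.

```python
import re
from typing import Optional

# Share links from the README.
SAMPLE_URL = "https://drive.google.com/file/d/144EqqmMtCziua9iYo-3afUEvZrJVxUXU/view?usp=sharing"
FULL_URL = "https://drive.google.com/file/d/1Oi88TRcpdjZmJ79WDLb9qFlBNG8q2De6/view?usp=sharing"

# Matches the ID segment in URLs of the form .../file/d/<ID>/...
_DRIVE_FILE_RE = re.compile(r"/file/d/([A-Za-z0-9_-]+)")


def drive_file_id(share_url: str) -> Optional[str]:
    """Return the file ID from a Drive share URL, or None if absent.

    Hypothetical helper written for this example.
    """
    match = _DRIVE_FILE_RE.search(share_url)
    return match.group(1) if match else None


if __name__ == "__main__":
    print("sample:", drive_file_id(SAMPLE_URL))
    print("full:  ", drive_file_id(FULL_URL))
```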

If you use this dataset, please cite:

@inproceedings{sibgrapi_estendido,
 author = {Álysson Soares and Ricardo das Neves Junior and Byron Bezerra},
 title = {BID Dataset: a challenge dataset for document processing tasks},
 booktitle = {Anais Estendidos do XXXIII Conference on Graphics, Patterns and Images},
 location = {Evento Online},
 year = {2020},
 keywords = {},
 issn = {0000-0000},
 pages = {143--146},
 publisher = {SBC},
 address = {Porto Alegre, RS, Brasil},
 doi = {10.5753/sibgrapi.est.2020.12997},
 url = {https://sol.sbc.org.br/index.php/sibgrapi_estendido/article/view/12997}
}
