1. Overview

[DocVQA] Demo video: inferring an answer from an image and a question

Introduction

DocVQA (Document Visual Question Answering) is one of the tasks released by the RRC (Robust Reading Competition) in 2021, and it is one step more difficult than conventional DAR approaches.

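More precisely, the task requires not only extracting and interpreting the text in a document image (handwritten, typewritten, or printed), but also using many other visual cues, including layout (page structure, forms, tables), non-text elements (marks, checkboxes, separators, diagrams), and style (font, color, highlighting).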

What we did

Modified the existing Hit algorithm to use Euclidean distance in order to find the answer index, which the dataset does not provide
Category별 Data Annotation, Error Analysis
Visualize Attention Heatmap
Decoder Generate
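
The first item above mentions locating an answer index with a Euclidean-distance variant of the Hit algorithm. This README does not show the repository's actual implementation, so the following is only a minimal sketch of one possible approach. It assumes the OCR output is a list of words and compares bag-of-characters count vectors with Euclidean distance. The helper names (`char_vector`, `euclidean`, `find_answer_span`) and the `max_extra` parameter are hypothetical and do not come from the project.

```python
import math
from collections import Counter


def char_vector(text):
    """Bag-of-characters count vector for a lowercased string, ignoring spaces."""
    return Counter(text.lower().replace(" ", ""))


def euclidean(a, b):
    """Euclidean distance between two sparse character-count vectors."""
    keys = set(a) | set(b)
    # Counter returns 0 for missing keys, so absent characters count as 0.
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


def find_answer_span(words, answer, max_extra=1):
    """Return (start, end, distance) for the word window closest to `answer`.

    Hypothetical helper, not the project's Hit algorithm. `end` is inclusive.
    Windows whose word count differs from the answer's by up to `max_extra`
    are also tried, to tolerate OCR tokenization differences.
    """
    target = char_vector(answer)
    n_answer = max(1, len(answer.split()))
    best = (None, None, float("inf"))
    for length in range(max(1, n_answer - max_extra), n_answer + max_extra + 1):
        for start in range(0, len(words) - length + 1):
            window = " ".join(words[start:start + length])
            dist = euclidean(char_vector(window), target)
            if dist < best[2]:
                best = (start, start + length - 1, dist)
    return best


if __name__ == "__main__":
    ocr_words = ["Invoice", "Date", ":", "12", "March", "1998", "Total"]
    print(find_answer_span(ocr_words, "12 march 1998"))
```

A distance-based match, unlike an exact string match, still returns a candidate span when OCR noise changes a few characters. Because character counts ignore order, a real implementation would likely need an additional check on the selected span.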

2. Project Tree

DocVQA
├─ configs
│  └─ baseline.yaml
├─ data_process
│  └─ LayoutLMPreprocess.py
├─ install
│  └─ install_requirements.sh
├─ jupyter
│  ├─ Datasets.ipynb
│  ├─ inference.ipynb
│  └─ LayoutLMv2.ipynb
├─ model
│  ├─ BaselineModel.py
│  └─ Decoder.py
├─ save
│  └─ model.pt
├─ trainer
│  ├─ BaselineTrainer.py
│  └─ DecoderTrainer.py
├─ utils
│  ├─ check_dir.py
│  ├─ metric.py
│  ├─ seed_setting.py
│  └─ wandb_setting.py
├─ .gitignore
├─ git_convention.md
├─ train.py
├─ generate.py
└─ inference.py

3. Contributors

김근형	김찬	유선종	이헌득

Github	Github	Github	Github

김근형: Decoder, Streamlit Demo, Fine-tuning
김찬: Result Analysis, Encoder, Question Maker Exp.
유선종: AttentionHeatmap, Hit Algorithm, Refactoring Code, Encoder
이헌득: Decoder, Baseline Modeling, BoundingBox Exp., Code Reviewer

4. Project Pipeline

Reference

Mathew, M., Karatzas, D., & Jawahar, C. V. (2021). DocVQA: A dataset for VQA on document images. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 2200-2209).

boostcampaitech4lv23nlp1/final-project-level3-nlp-03
