<img align="right" width="400" src="https://www.fhnw.ch/de/++theme++web16theme/assets/media/img/fachhochschule-nordwestschweiz-fhnw-logo.svg" alt="FHNW Logo">


# Document Question Answering using Transformers

by Fabian Märki

## Summary
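This notebook shows how the Hugging Face Transformers `document-question-answering` pipeline, here with the `impira/layoutlm-document-qa` model, can be used to answer questions about document images such as invoices and financial statements.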


## Links
- [Notebooks](https://huggingface.co/docs/transformers/notebooks) on different topics (fine-tuning, translation, summarization, question answering, audio classification, image classification, etc.)
- [Enabling GPU on Google Colab](https://www.tutorialspoint.com/google_colab/google_colab_using_free_gpu.htm)

<a href="https://colab.research.google.com/github/markif/2022_HS_DAS_NLP_Notebooks/blob/master/08_d_Transformers_Document_Question_Answering.ipynb">
  <img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

In [1]:
%%capture

!pip install 'fhnw-nlp-utils>=0.4.8,<0.5.0'

**Make sure that a GPU is available (see [here](https://www.tutorialspoint.com/google_colab/google_colab_using_free_gpu.htm))!!!**

In [2]:
from fhnw.nlp.utils.system import system_info
print(system_info())

OS name: posix
Platform name: Linux
Platform release: 5.15.0-48-generic
Python version: 3.8.10
CPU cores: 6
RAM: 31.12GB total and 24.97GB available
Tensorflow version: 2.10.0
GPU is available
GPU is a NVIDIA GeForce RTX 2070 with Max-Q Design with 8192MiB


In [33]:
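# Transformers provides the pipeline; Tesseract (via pytesseract) runs the OCR on the document images
!pip install transformers
!apt-get install -y tesseract-ocr
!pip install Pillow pytesseract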

In [5]:
from transformers import pipeline

qa_pipeline = pipeline(
    "document-question-answering",
    model="impira/layoutlm-document-qa",
)


In [11]:
%%time

image = "https://templates.invoicehome.com/invoice-template-us-neat-750px.png"
qa_pipeline(
    image,
    "What is the invoice number?"
)

CPU times: user 946 ms, sys: 83 ms, total: 1.03 s
Wall time: 692 ms


{'score': 0.42514947056770325, 'answer': 'us-001', 'start': 16, 'end': 16}

In [14]:
from IPython.display import Image

# Display the invoice that the question above refers to
Image(url=image)

In [29]:
%%time

image = "https://templates.invoicehome.com/invoice-template-us-neat-750px.png"
qa_pipeline(
    image,
    "What is the due date?"
)

CPU times: user 912 ms, sys: 51.6 ms, total: 964 ms
Wall time: 705 ms


{'score': 0.9999262094497681, 'answer': '26/02/2019', 'start': 42, 'end': 42}

In [30]:
%%time

image = "https://templates.invoicehome.com/invoice-template-us-neat-750px.png"
qa_pipeline(
    image,
    "Who is the buyer?"
)

CPU times: user 909 ms, sys: 29.5 ms, total: 938 ms
Wall time: 625 ms


{'score': 0.13438580930233002, 'answer': 'John Smith', 'start': 17, 'end': 18}
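The scores so far vary widely (about 0.43, 1.00 and 0.13), so it can help to flag low-confidence answers for manual review. The following is a minimal sketch, not part of the original notebook: the helper name `needs_review` and the threshold of 0.5 are assumptions chosen for illustration, and the result dictionaries are copied from the outputs above.

```python
def needs_review(result, threshold=0.5):
    """Return True if the pipeline's confidence score is below the threshold.

    The threshold of 0.5 is an arbitrary assumption, not a recommended value.
    """
    return result["score"] < threshold


# Results copied from the three cells above
results = [
    {"score": 0.42514947056770325, "answer": "us-001", "start": 16, "end": 16},
    {"score": 0.9999262094497681, "answer": "26/02/2019", "start": 42, "end": 42},
    {"score": 0.13438580930233002, "answer": "John Smith", "start": 17, "end": 18},
]

# Collect the answers whose score falls below the threshold
flagged = [r["answer"] for r in results if needs_review(r)]
print(flagged)  # ['us-001', 'John Smith']
```

A low score only indicates that the model is unsure; whether the answer is actually wrong still has to be checked against the document.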

In [31]:
%%time

image = "https://templates.invoicehome.com/invoice-template-us-neat-750px.png"
qa_pipeline(
    image,
    "Who is the issuer?"
)

CPU times: user 921 ms, sys: 30.7 ms, total: 951 ms
Wall time: 637 ms


{'score': 0.8599034547805786,
 'answer': 'East Repair Inc.',
 'start': 1,
 'end': 3}
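The cells above repeat the same call with a different question each time. As a sketch under stated assumptions, the questions could also be asked in one loop. The helper `ask_all` and the stand-in `fake_qa` are hypothetical names that do not appear in the notebook; `fake_qa` only mimics the shape of the pipeline's result so that the example runs without downloading a model.

```python
def ask_all(qa, image, questions):
    """Ask several questions about one document image.

    `qa` is any callable with the signature qa(image, question),
    which is how `qa_pipeline` is called in this notebook.
    """
    return {question: qa(image, question) for question in questions}


def fake_qa(image, question):
    """Hypothetical stand-in that returns a result shaped like the pipeline's."""
    return {"score": 1.0, "answer": f"answer to: {question}", "start": 0, "end": 0}


questions = [
    "What is the invoice number?",
    "What is the due date?",
    "Who is the buyer?",
    "Who is the issuer?",
]

answers = ask_all(fake_qa, "invoice.png", questions)
for question, result in answers.items():
    print(question, "->", result["answer"])
```

In the notebook itself, the call would presumably be `ask_all(qa_pipeline, image, questions)` with the invoice URL assigned to `image`.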

In [15]:
%%time

image = "https://miro.medium.com/max/787/1*iECQRIiOGTmEFLdWkVIH2g.jpeg"
qa_pipeline(
    image,
    "What is the purchase amount?"
)

CPU times: user 1.36 s, sys: 87.2 ms, total: 1.44 s
Wall time: 1.27 s


{'score': 0.9998499155044556,
 'answer': '$1,000,000,000',
 'start': 97,
 'end': 97}

In [16]:
Image(url=image) 

In [18]:
%%time

image = "https://www.accountingcoach.com/wp-content/uploads/2013/10/income-statement-example@2x.png"
qa_pipeline(
    image,
    "What are the 2020 net sales?"
)

CPU times: user 625 ms, sys: 35 ms, total: 660 ms
Wall time: 988 ms


{'score': 0.9938769340515137, 'answer': '$ 3,980', 'start': 15, 'end': 16}

In [19]:
Image(url=image) 

In [28]:
image = "https://www.accountingcoach.com/wp-content/uploads/2013/10/income-statement-example@2x.png"
qa_pipeline(
    image,
    "Issuer?"
)

{'score': 0.9878706932067871,
 'answer': 'Example Corporation',
 'start': 0,
 'end': 1}

In [27]:
image = "https://www.accountingcoach.com/wp-content/uploads/2013/10/income-statement-example@2x.png"
qa_pipeline(
    image,
    "Document type?"
)

{'score': 0.2636066973209381,
 'answer': 'Income Statement',
 'start': 2,
 'end': 3}
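The `start` and `end` values in the results appear to be inclusive word indices into the OCR output rather than character offsets: a single-word answer has `start == end`, and a two-word answer spans two consecutive indices. The sketch below illustrates that reading. The helper `answer_from_words` and the word list are assumptions for illustration; the list is only chosen to be consistent with the last two results and is not the actual OCR output.

```python
def answer_from_words(words, start, end):
    """Join the words covered by an inclusive [start, end] word-index span."""
    return " ".join(words[start:end + 1])


# Hypothetical OCR word list, consistent with the indices reported above
words = ["Example", "Corporation", "Income", "Statement"]

print(answer_from_words(words, 0, 1))  # Example Corporation
print(answer_from_words(words, 2, 3))  # Income Statement
```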