This notebook uses a BERT (Bidirectional Encoder Representations from Transformers) model to answer questions about a given Polish-language context. It relies on the transformers library and measures how much network traffic is received while the model loads. A helper function takes a question as input and returns the answer produced by the question-answering pipeline for the provided context. The goal is to extract a breakdown of payment information with BERT's question-answering capabilities instead of regular expressions.
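
For comparison, the regular-expression approach that this notebook sets out to replace might look like the sketch below. The patterns and variable names are illustrative assumptions written against three lines of the sample statement used later in this notebook; they are not taken from the original project.

```python
import re

# Three lines copied from the sample statement used later in this notebook.
text = """
Ilość dni w IX / 2023 r. - 21 dni
Ilość dni zgłoszonych nieobecności z poprzedniego miesiąca: 14
Kwota do zapłaty na konto : 119 zł.
"""

# Hypothetical patterns, each tied to the exact wording of one line.
days = re.search(r"-\s*(\d+)\s*dni", text)
absences = re.search(r"nieobecności[^:]*:\s*(\d+)", text)
amount = re.search(r"Kwota do zapłaty[^:]*:\s*(\d+(?:,\d{2})?)\s*zł", text)

print(days.group(1), absences.group(1), amount.group(1))
```

Patterns like these stop matching as soon as the wording of the statement changes, which is the motivation for trying a question-answering model instead.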

In [37]:
!pip install transformers
!pip install humanize
!pip install psutil

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [38]:
import psutil
initial_io_counters = psutil.net_io_counters()

In [39]:
from transformers import pipeline

qa_pipeline = pipeline(
    "question-answering",
    model="henryk/bert-base-multilingual-cased-finetuned-polish-squad2",
    tokenizer="henryk/bert-base-multilingual-cased-finetuned-polish-squad2"
)

In [40]:
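import humanize

# bytes_recv counts all traffic received system-wide, so this is an upper bound
# on what the model download itself used.
final_io_counters = psutil.net_io_counters()
data_downloaded = final_io_counters.bytes_recv - initial_io_counters.bytes_recv
print(f'Data downloaded: {humanize.naturalsize(data_downloaded)}')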

Data downloaded: 142.9 kB


In [41]:
context="""
Informacja odnośnie rozliczenia za wyżywienie
Ilość dni w IX / 2023 r. - 21 dni 
Ilość dni zgłoszonych nieobecności z poprzedniego miesiąca: 14 

Wyliczenia: 21-14= 7
7x17,00= 119 


Kwota do zapłaty na konto : 119 zł. 
"""

In [42]:
def ask(question):
    """Answer a question about the global `context` using the QA pipeline."""
    return qa_pipeline({
        'context': context,
        'question': question})

In [43]:
ask("Ilość dni?")

{'score': 0.4355515241622925, 'start': 74, 'end': 76, 'answer': '21'}

In [44]:
ask("Ilość dni nie obecności?")

{'score': 0.00016279886767733842, 'start': 142, 'end': 144, 'answer': '14'}

In [45]:
ask("W jakim miesiącu i roku odbywa się rozliczenie za wyżywienie?")

{'score': 0.005733450409024954, 'start': 59, 'end': 68, 'answer': 'IX / 2023'}

In [46]:
%time ask("Jaka kwota do zapłaty?")

CPU times: user 368 ms, sys: 925 µs, total: 369 ms
Wall time: 375 ms


{'score': 0.8604817986488342, 'start': 212, 'end': 218, 'answer': '119 zł'}
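
The answer '119 zł' can be cross-checked against the calculation written out in the context (21-14= 7, then 7x17,00= 119). The following is a minimal sketch of that arithmetic; the variable names are assumptions, and the figures are copied from the context.

```python
from decimal import Decimal

# Figures copied from the statement in `context`.
days_in_month = 21            # "Ilość dni w IX / 2023 r. - 21 dni"
reported_absences = 14        # absences carried over from the previous month
daily_rate = Decimal("17.00") # from the "7x17,00" line, in zł

billable_days = days_in_month - reported_absences
amount_due = billable_days * daily_rate

print(billable_days, amount_due)
```

This yields 7 billable days and an amount of 119.00, which agrees with the model's answer.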

In [47]:
def create_qa_input(questions, context):
    """Build one {'question', 'context'} dict per question for batch inference."""
    return [{'question': question, 'context': context} for question in questions]

questions = [
    "ilość dni?",
    "Ilość dni nieobecności?",
    "W jakim miesiącu i roku odbywa się rozliczenie za wyżywienie?",
    "Jaka kwota do zapłaty?",
]
qa_pipeline(create_qa_input(questions, context))

[{'score': 0.40143856406211853, 'start': 74, 'end': 76, 'answer': '21'},
 {'score': 0.2581802010536194, 'start': 142, 'end': 144, 'answer': '14'},
 {'score': 0.005733450409024954,
  'start': 59,
  'end': 68,
  'answer': 'IX / 2023'},
 {'score': 0.8604817986488342, 'start': 212, 'end': 218, 'answer': '119 zł'}]
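
The scores in the batch output range from about 0.006 to 0.86, so it may be worth discarding answers the model is unsure about. The sketch below pairs each question with its answer and drops low-scoring ones. The function name `confident_answers` and the threshold of 0.2 are assumptions chosen for illustration; the results are copied from the output above with scores rounded.

```python
questions = [
    "ilość dni?",
    "Ilość dni nieobecności?",
    "W jakim miesiącu i roku odbywa się rozliczenie za wyżywienie?",
    "Jaka kwota do zapłaty?",
]

# Batch output copied from the cell above (scores rounded to four decimals).
results = [
    {"score": 0.4014, "start": 74, "end": 76, "answer": "21"},
    {"score": 0.2582, "start": 142, "end": 144, "answer": "14"},
    {"score": 0.0057, "start": 59, "end": 68, "answer": "IX / 2023"},
    {"score": 0.8605, "start": 212, "end": 218, "answer": "119 zł"},
]

def confident_answers(questions, results, min_score=0.2):
    """Map each question to its answer, or to None when the score is below min_score."""
    return {
        question: (result["answer"] if result["score"] >= min_score else None)
        for question, result in zip(questions, results)
    }

answers = confident_answers(questions, results)
for question, answer in answers.items():
    print(f"{question!r}: {answer!r}")
```

With this threshold the month-and-year answer is rejected even though it is correct, which shows that the score alone is a rough filter and the cut-off would need tuning on real data.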

In [48]:
# TODO: Fine-tune the BERT model on domain-specific data (transfer learning) to improve performance.

In [49]:
!pip install ipywidgets

# Reload imported modules automatically before each cell runs,
# so edits to source files take effect without restarting the runtime.
%reload_ext autoreload
%autoreload 2

from ipywidgets import interact

def f(x):
  return x

interact(f, x=10)

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


interactive(children=(IntSlider(value=10, description='x', max=30, min=-10), Output()), _dom_classes=('widget-…

<function __main__.f(x)>

In [50]:
from ipywidgets import widgets
btn_upload = widgets.FileUpload()
btn_upload

FileUpload(value={}, description='Upload')

In [58]:
lbl_pred = widgets.Label()
lbl_pred.value = "MyWebApp"

style = {'description_width': 'initial'}

submit_button = widgets.Button(
    description='Ask!',
    button_style='primary',
    tooltip='Click to ask the question',
    style=style,
)

context_text = widgets.Textarea(value=context,
                            placeholder='OK',
                            description='context',
                            style=style,
                            rows=10,
                            layout=widgets.Layout(height="auto", width="auto"))

question_text = widgets.Textarea(value=questions[0],
                            placeholder='OK',
                            description='question',
                            style=style,
                            layout=widgets.Layout(height="auto", width="auto"))


ai_answer_text = widgets.Textarea(value='',
                            placeholder='OK',
                            description='ai_answer',
                            style=style,
                            layout=widgets.Layout(height="auto", width="auto"))


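def submit_clicked(b):
    # Run the QA pipeline on the current widget values and show the raw result dict.
    ai_answer_text.value = str(qa_pipeline({
        'context': context_text.value,
        'question': question_text.value}))

submit_button.on_click(submit_clicked)

# Stack the widgets vertically; as the last expression in the cell, the box is displayed.
vbox1 = widgets.VBox([lbl_pred, context_text, question_text, submit_button, ai_answer_text])
vbox1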

VBox(children=(Label(value='MyWebApp'), Textarea(value='\nInformacja odnośnie rozliczenia za wyżywienie\nIlość…
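
The click handler above writes the raw result dictionary into the answer box. A more readable display could be produced by a small formatting helper such as the one sketched here. The name `format_answer` and the 0.1 low-confidence cut-off are assumptions for illustration; the helper only relies on the `answer` and `score` keys seen in the pipeline outputs earlier in this notebook.

```python
def format_answer(result, min_score=0.1):
    """Render a question-answering result dict as one readable line."""
    answer = result["answer"]
    score = result["score"]
    # Flag answers whose score falls below the (hypothetical) cut-off.
    note = "" if score >= min_score else " (low confidence)"
    return f"{answer} [score: {score:.2f}]{note}"

# Result copied from the "Jaka kwota do zapłaty?" cell above.
sample = {"score": 0.8604817986488342, "start": 212, "end": 218, "answer": "119 zł"}
print(format_answer(sample))
```

Inside the handler, the helper's return value would be assigned to `ai_answer_text.value` in place of the `str(...)` of the whole dictionary.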

Turn it into a real web app with Voilà

In [59]:
!pip install voila
!jupyter serverextension enable voila --sys-prefix

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting voila
  Downloading voila-0.4.0-py3-none-any.whl (5.5 MB)
Collecting jupyter-server<2.0.0,>=1.18
  Downloading jupyter_server-1.23.5-py3-none-any.whl (346 kB)
Collecting websockets>=9.0
  Downloading websockets-10.4-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (106 kB)
Collecting nbclient<0.8,>=0.4.0
  Downloading nbclient-0.7.2-py3-none-any.whl (71 kB)