<div style="text-align: right;">
  <img src="https://raw.githubusercontent.com/exasol/ai-lab/refs/heads/main/assets/Exasol_Logo_2025_Dark.svg" style="width:200px; margin: 10px;" />
</div>

# Question answering model

In this notebook, we will load and use a question-answering language model that can retrieve the answer to a question from a given text. Learn more about the Question Answering task <a href="https://huggingface.co/tasks/question-answering" target="_blank" rel="noopener">here</a>. Please also refer to the Transformer Extension <a href="https://github.com/exasol/transformers-extension/blob/main/doc/user_guide/user_guide.md" target="_blank" rel="noopener">User Guide</a> to find more information about the UDF used in this notebook.

We will be running SQL queries using <a href="https://jupysql.ploomber.io/en/latest/quick-start.html" target="_blank" rel="noopener">JupySQL</a> SQL Magic.

## Prerequisites

Prior to using this notebook the following steps need to be completed:
1. [Configure the AI Lab](../main_config.ipynb).
2. [Initialize the Transformer Extension](te_init.ipynb).

## Setup

### Open Secure Configuration Storage

In [None]:
%run ../utils/access_store_ui.ipynb
display(get_access_store_ui('../'))

Let's bring up JupySQL and connect to the database via SQLAlchemy. Please refer to the documentation of <a href="https://github.com/exasol/sqlalchemy-exasol" target="_blank" rel="noopener">sqlalchemy-exasol</a> for details on how to connect to the database using the Exasol SQLAlchemy driver.

In [None]:
%run ../utils/jupysql_init.ipynb

## Get a language model

To demonstrate the question-answering task we will use the [roberta model](https://huggingface.co/deepset/roberta-base-squad2).

We need to load the model from the Hugging Face Hub into [BucketFS](https://docs.exasol.com/db/latest/database_concepts/bucketfs/bucketfs.htm). This can take a long time, depending on the database's network connection. Unfortunately, we cannot tell exactly when it has finished: the notebook's hourglass is not a reliable indicator, because BucketFS may still be doing some work when the call issued by the notebook returns. Please wait a few moments after that before querying the model.

You might see a warning that some weights are newly initialized and the model should be trained on a down-stream task. Please ignore this warning. It is not important for the purpose of this demonstration, and the model should still be able to produce meaningful output.

In [None]:
from exasol.nb_connector.model_installation import install_model, TransformerModel
from transformers import AutoModelForQuestionAnswering

# This is the name of the model at the Hugging Face Hub
MODEL_NAME = 'deepset/roberta-base-squad2'
install_model(ai_lab_config, TransformerModel(MODEL_NAME, 'question_answering', AutoModelForQuestionAnswering))

## Use the language model

We are going to check the model output for the same question with two different contexts, using the `TE_QUESTION_ANSWERING_UDF`. In neither case does the context contain a direct answer to the question. We expect each answer to be relevant to its context.

In [None]:
# This will be our question
TEST_QUESTION = 'What is bitumen used for?'

# Let's first try it with the following context
TEST_CONTEXT1 = """
Apart from the stylish design features of new flat roofs, the other thing that’s moved on considerably is the technology
used to keep them weather-proof. Once flat roofs were notoriously prone to leaking and the problem could only be solved
with a boiling cauldron of tar. These days there are patch repair kits, liquid rubber membranes, and even quick,
efficient waterproofing paint that lasts for ages – and can even be applied in damp weather.
"""

# Make sure our texts can be used in an SQL statement.
TEST_QUESTION = TEST_QUESTION.replace("'", "''")
TEST_CONTEXT1 = TEST_CONTEXT1.replace("'", "''")

The UDF takes the following input parameters:

* device_id: To run on a GPU, specify a valid CUDA device ID. The examples in this notebook pass `NULL`.
* bucketfs_conn: The BucketFS connection name.
* sub_dir: The directory where the model is stored in the BucketFS.
* model_name: The name of the model to use for prediction.
* question: The question text.
* context_text: The context text, associated with the question.
* top_k: The max number of answers to return.

You need to supply these parameters in the order listed above. Further information can be found in the <a href="https://github.com/exasol/transformers-extension/blob/main/doc/user_guide/user_guide.md" target="_blank" rel="noopener">User Guide</a>.
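Because the arguments are positional, it can help to assemble the call in one place. The following is a minimal sketch and not part of the Transformer Extension: the helper names `sql_literal` and `build_qa_call` are hypothetical, and the connection name and sub-directory in the usage lines are placeholder values.

```python
def sql_literal(value):
    """Render a Python value as an SQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Double any single quotes so the text is safe inside '...'
    return "'" + str(value).replace("'", "''") + "'"


def build_qa_call(device_id, bucketfs_conn, sub_dir, model_name,
                  question, context_text, top_k):
    """Build the UDF call with the arguments in the order listed above."""
    args = [device_id, bucketfs_conn, sub_dir, model_name,
            question, context_text, top_k]
    return "TE_QUESTION_ANSWERING_UDF(" + ", ".join(sql_literal(a) for a in args) + ")"


# Placeholder values, for illustration only
call = build_qa_call(None, "my_bfs_conn", "my_models",
                     "deepset/roberta-base-squad2",
                     "What's bitumen used for?", "Some context.", 5)
print(call)
```

A helper like this does the quote escaping itself, so it should be given the raw text rather than variables that have already been escaped. Otherwise the quotes would be doubled twice.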


We will collect at most the 5 best answers.
We will save the result in the variable `udf_output` to support automatic testing of this notebook.

In [None]:
%%sql
WITH MODEL_OUTPUT AS
(
    SELECT TE_QUESTION_ANSWERING_UDF(
        NULL,
        '{{ai_lab_config.bfs_connection_name}}',
        '{{ai_lab_config.bfs_model_subdir}}',
        '{{MODEL_NAME}}',
        '{{TEST_QUESTION}}',
        '{{TEST_CONTEXT1}}',
        5
    )
)
SELECT answer, score, error_message FROM MODEL_OUTPUT ORDER BY SCORE DESC

As you can see, we select only some of the UDF's output columns in these examples. If you need more detail about the output, you can find information on all output columns in the <a href="https://github.com/exasol/transformers-extension/blob/main/doc/user_guide/user_guide.md" target="_blank" rel="noopener">User Guide</a>.

The UDF returns the model output in the following columns:

* answer: the answer found for the input question
* score: the confidence of the answer
* rank: the rank of the answer. All answers for one input are ranked by their score; rank=1 means the best result, i.e. the highest score.
* error_message: any error that occurs while executing the UDF is stored here
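As an illustration of how these columns might be consumed on the Python side, the sketch below picks the highest-scoring answer from a list of result rows and skips rows that report an error. The rows are made-up sample data, not real model output, and `best_answer` is a hypothetical helper.

```python
def best_answer(rows):
    """Return the highest-scoring row without an error, or None if there is none."""
    valid = [r for r in rows if r.get("error_message") is None]
    if not valid:
        return None
    return max(valid, key=lambda r: r["score"])


# Made-up rows shaped like the columns described above
sample_rows = [
    {"answer": "waterproofing", "score": 0.12, "rank": 2, "error_message": None},
    {"answer": "flat roofs", "score": 0.31, "rank": 1, "error_message": None},
    {"answer": None, "score": None, "rank": None, "error_message": "some error"},
]

top = best_answer(sample_rows)
print(top["answer"], top["score"])
```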

Let's now change the context and see a different set of answers.

In [None]:
# New context
TEST_CONTEXT2 = """
You can make a wooden planter in a day, using treated timber. Simply work out how big an area you need,
cut the wood to size and follow our steps to putting the planter together. Make sure your wooden planter
has drainage holes, so plants don’t become waterlogged.
"""

# Make sure our text can be used in an SQL statement.
TEST_CONTEXT2 = TEST_CONTEXT2.replace("'", "''")

In [None]:
%%sql
WITH MODEL_OUTPUT AS
(
    SELECT TE_QUESTION_ANSWERING_UDF(
        NULL,
        '{{ai_lab_config.bfs_connection_name}}',
        '{{ai_lab_config.bfs_model_subdir}}',
        '{{MODEL_NAME}}',
        '{{TEST_QUESTION}}',
        '{{TEST_CONTEXT2}}',
        5
    )
)
SELECT answer, score, error_message FROM MODEL_OUTPUT ORDER BY SCORE DESC

The code above shows how the model works on a toy example. However, the main purpose of having a model deployed in the database is to get a quick response for a batch input. The performance gain comes from two factors: localization and parallelization. Localization means that the input data never crosses machine boundaries. Parallelization means that multiple instances of the model process the data on all available nodes in parallel.

Another advantage of making predictions within the database is enhanced data security. Safeguarding privacy is simpler because the source data never leaves the database machine.

In a more practical application, the question and the context would be stored in columns of a database table. For example, if we wanted to get the best answer for each row of the input table `MY_TEXT_TABLE`, where the question is in the column `MY_QUESTION` and the context is in the column `MY_CONTEXT`, the SQL would look similar to this:
```
SELECT TE_QUESTION_ANSWERING_UDF(..., MY_QUESTION, MY_CONTEXT, 1) FROM MY_TEXT_TABLE;
```
Please note that the response time observed in the single-input example above will not scale linearly with the number of inputs. Much of the latency comes from loading the model from BucketFS into CPU memory. This needs to be done only once, regardless of the number of inputs.
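
To make this amortization concrete, here is a back-of-the-envelope model. The timings are invented for illustration and are not measurements, and the model ignores parallel execution across nodes.

```python
def estimated_total_seconds(n_inputs, load_seconds, per_input_seconds):
    """One-off model load plus a fixed inference cost per input."""
    return load_seconds + n_inputs * per_input_seconds


LOAD_SECONDS = 20.0       # hypothetical one-off cost of loading the model
PER_INPUT_SECONDS = 0.05  # hypothetical inference cost per input

for n in (1, 100, 10_000):
    total = estimated_total_seconds(n, LOAD_SECONDS, PER_INPUT_SECONDS)
    print(f"{n:>6} inputs: {total:8.2f} s total, {total / n:8.4f} s per input")
```

Under these assumed numbers the load time dominates a single call, while the per-input cost approaches the pure inference time as the batch grows.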