<div style="text-align: right;">
  <img src="https://raw.githubusercontent.com/exasol/ai-lab/refs/heads/main/assets/Exasol_Logo_2025_Dark.svg" style="width:200px; margin: 10px;" />
</div>

# Fill-Mask model

In this notebook, we will load and use a masked language model. This kind of model predicts the most likely words to fill the masked positions in a sentence. Learn more about the Fill-Mask task <a href="https://huggingface.co/tasks/fill-mask" target="_blank" rel="noopener">here</a>. Please also refer to the Transformer Extension <a href="https://github.com/exasol/transformers-extension/blob/main/doc/user_guide/user_guide.md" target="_blank" rel="noopener">User Guide</a> for more information about the UDF used in this notebook.

We will be running SQL queries using <a href="https://jupysql.ploomber.io/en/latest/quick-start.html" target="_blank" rel="noopener">JupySQL</a> SQL Magic.

## Prerequisites

Before using this notebook, complete the following steps:
1. [Configure the AI-Lab](../main_config.ipynb).
2. [Initialize the Transformer Extension](te_init.ipynb).

## Setup

### Open Secure Configuration Storage

In [None]:
%run ../utils/access_store_ui.ipynb
display(get_access_store_ui('../'))

Let's bring up JupySQL and connect to the database via SQLAlchemy. Please refer to the documentation of <a href="https://github.com/exasol/sqlalchemy-exasol" target="_blank" rel="noopener">sqlalchemy-exasol</a> for details on how to connect to the database using the Exasol SQLAlchemy driver.

In [None]:
%run ../utils/jupysql_init.ipynb

## Get language model

To demonstrate the fill-mask task, we will use a [RadBERT model](https://huggingface.co/StanfordAIMI/RadBERT), which was pre-trained on radiology reports.

We need to load the model from the Hugging Face Hub into [BucketFS](https://docs.exasol.com/db/latest/database_concepts/bucketfs/bucketfs.htm). This can take a long time, and unfortunately we cannot tell exactly when it has finished. The notebook's hourglass is not a reliable indicator, because BucketFS will still be doing some work when the call issued by the notebook returns. Please wait a few moments after that before querying the model.

You might see a warning that some weights are newly initialized and that the model should be trained on a downstream task. Please ignore this warning. It is not important for the purpose of this demonstration; the model should still be able to produce meaningful output.

In [None]:
from exasol.nb_connector.model_installation import install_model, TransformerModel
from transformers import AutoModelForMaskedLM

# This is the name of the model on the Hugging Face Hub
MODEL_NAME = 'StanfordAIMI/RadBERT'
install_model(ai_lab_config, TransformerModel(MODEL_NAME, 'filling_mask', AutoModelForMaskedLM))

## Use language model

Let's see if the model can fill in a masked word in the following text. This is an instruction usually given to a patient when a radiographer is doing a chest X-ray.

In [None]:
# This is a sentence with a masked word that will be given to the model.
MY_TEXT = 'Take a deep [MASK] and hold it'

# Make sure our text can be used in an SQL statement.
MY_TEXT = MY_TEXT.replace("'", "''")
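
Doubling single quotes, as done above, is the standard SQL way of escaping a quote inside a string literal. As a minimal sketch, the same step could be wrapped in a small helper. The name `escape_sql_literal` and the sample sentence are purely illustrative and are not part of the AI-Lab utilities.

```python
def escape_sql_literal(text: str) -> str:
    """Double every single quote so the text can sit inside '...' in SQL."""
    return text.replace("'", "''")


# Hypothetical input containing an apostrophe.
sample = "Examine the patient's [MASK] first"
escaped = escape_sql_literal(sample)
print(escaped)
```

Text without single quotes, such as the sentence used in this notebook, passes through unchanged.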

We will collect the 5 best answers.
We will save the result in the variable `udf_output` to support automatic testing of this notebook.

In [None]:
%%sql --save udf_output
WITH MODEL_OUTPUT AS
(
    SELECT TE_FILLING_MASK_UDF(
        NULL,
        '{{ai_lab_config.bfs_connection_name}}',
        '{{ai_lab_config.bfs_model_subdir}}',
        '{{MODEL_NAME}}',
        '{{MY_TEXT}}',
        5
    )
)
SELECT filled_text, score, error_message FROM MODEL_OUTPUT ORDER BY SCORE DESC
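
To make the shape of the result easier to picture, here is a small, self-contained sketch of how a list of candidate words turns into `filled_text` rows ordered by score. It assumes that `filled_text` holds the input sentence with `[MASK]` replaced by a candidate word. The candidate words and scores below are invented placeholders, not output of the model.

```python
MASK_TOKEN = "[MASK]"
text = "Take a deep [MASK] and hold it"

# Invented (word, score) pairs for illustration only; not real model output.
candidates = [("rest", 0.10), ("breath", 0.80), ("step", 0.05)]

rows = [
    {"filled_text": text.replace(MASK_TOKEN, word, 1), "score": score}
    for word, score in candidates
]
# Mirror the ORDER BY SCORE DESC clause of the query.
rows.sort(key=lambda row: row["score"], reverse=True)

for row in rows:
    print(f"{row['score']:.2f}  {row['filled_text']}")
```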

The code above shows how the model works on a toy example. However, the main purpose of having a model deployed in the database is to get a quick response for a batch input. The performance gain comes from two factors: localization and parallelization. Localization means that the input data never crosses machine boundaries. Parallelization means that multiple instances of the model process the data on all available nodes in parallel.

Another advantage of making predictions within the database is enhanced data security. Safeguarding privacy becomes simpler because the source data never leaves the database machine.

In a more practical application, the input text would be stored in a column of a database table. For example, if we wanted to get the best answer for each row of the input table `MY_TEXT_TABLE`, where text with a masked word is in the column `MY_TEXT_COLUMN`, the SQL would look similar to this:
```
SELECT TE_FILLING_MASK_UDF(..., MY_TEXT_COLUMN, 1) FROM MY_TEXT_TABLE;
```
Please note that the response time observed in the example with a single input does not scale linearly with the number of inputs. Much of the latency comes from loading the model from BucketFS into CPU memory, which needs to be done only once regardless of the number of inputs.
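
The effect of the one-off loading cost can be put into numbers with a simple cost model: a fixed model loading time plus a per-input inference time. This is only a sketch. The function name and both timings are hypothetical placeholders, not measurements from this notebook, and the model ignores any additional speed-up from parallel execution.

```python
def estimated_total_seconds(
    n_inputs: int, load_seconds: float, per_input_seconds: float
) -> float:
    """One-off model load plus inference time for every input."""
    return load_seconds + n_inputs * per_input_seconds


# Placeholder timings chosen for illustration, not measured values.
LOAD_SECONDS = 20.0
PER_INPUT_SECONDS = 0.5

for n in (1, 10, 100):
    total = estimated_total_seconds(n, LOAD_SECONDS, PER_INPUT_SECONDS)
    print(f"{n:>3} inputs: {total:6.1f} s total, {total / n:5.2f} s per input")
```

With these placeholder values, a hundred inputs take about 3.4 times as long as a single input, rather than 100 times as long.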