# RESTBERTa Application
This notebook demonstrates the application of a RESTBERTa model.
It relies on the same processing pipelines used by the Flask application (see [tools](https://github.com/SebastianKotstein/RESTBERTa/tree/master/tools)).

## Install Dependencies

In [28]:
!pip install tensorflow==2.13.0
!pip install transformers==4.33.2
!pip install numpy==1.24.3

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
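
The warning above can usually be silenced by setting the variable it names before `tokenizers` is first used in the kernel. A minimal sketch, assuming it runs in a cell at the very top of the notebook; the value `"false"` is one of the two values listed in the warning:

```python
import os

# Set this before transformers/tokenizers is imported or used for the first time,
# so that processes forked later (e.g. by `!pip install`) inherit the setting.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

print(os.environ["TOKENIZERS_PARALLELISM"])
```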




In [33]:
from tools.pipeline.pipeline import Pipeline
from tools.pipeline.lru_cache import LRUCache

import pandas as pd

## Arguments
The following models are available on HuggingFace:
- Parameter Matching: [SebastianKotstein/restberta-qa-parameter-matching](https://huggingface.co/SebastianKotstein/restberta-qa-parameter-matching)
- Endpoint Discovery: [SebastianKotstein/restberta-qa-endpoint-discovery](https://huggingface.co/SebastianKotstein/restberta-qa-endpoint-discovery)
- RESTBERTa model for both tasks: [SebastianKotstein/restberta-qa-pm-ed](https://huggingface.co/SebastianKotstein/restberta-qa-pm-ed)

In [34]:
# Model name (see description above)
model = "SebastianKotstein/restberta-qa-parameter-matching"

# For a model with an input (context) length of 512 tokens, there are up to 512x512 possible start-end token pairs, each representing a potential answer span.
# The output interpreter ranks these spans by assigning a score to each start-end pair. This score is computed as the sum of the start and the end logit.
# To improve performance, it is recommended to restrict this computation to only the top 'n' start and end tokens, i.e., those with the highest logits.
# This reduces the number of computations from 512x512 to 20x20, if n = 20, for instance.
best_size = 20

# The pipeline implements an LRU cache that allows us to cache the last 'n' prediction results.
# To disable caching, set this parameter to 'None'.
cache_size = 100

# If 'True', duplicates will be removed from the ranked list of suggested Web API elements so that a suggested Web API element occurs only once with the highest score among all its duplicates.
suppress_duplicates = True

# For each fragment, RESTBERTa outputs a threshold value (also referred to as the 'NULL answer score') indicating whether the fragment contains the queried Web API element or not.
# If this parameter is set to 'treshold', only Web API elements with a score higher than the threshold value of the respective fragment will be returned. 
# If set to 'ignore', these threshold values will be ignored.
no_answer_strategy = "ignore"

# If set, only the 'x' highest ranked suggested Web API elements will be contained in the result list.
top_answers_n = None

# Hugging Face access token for loading custom models from private repositories.
token = None
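
The effect of `best_size` described in the comments above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the output interpreter from the RESTBERTa repository: the function name `rank_spans` is hypothetical, and skipping spans whose end precedes their start is an assumption of this sketch.

```python
import numpy as np


def rank_spans(start_logits, end_logits, best_size=20):
    """Rank candidate spans by start logit + end logit (hypothetical helper).

    Only the `best_size` highest start and end positions are combined,
    so at most best_size * best_size pairs are scored.
    """
    start_logits = np.asarray(start_logits, dtype=float)
    end_logits = np.asarray(end_logits, dtype=float)

    # Indices of the highest logits, best first.
    top_starts = np.argsort(start_logits)[::-1][:best_size]
    top_ends = np.argsort(end_logits)[::-1][:best_size]

    spans = []
    for s in top_starts:
        for e in top_ends:
            if e < s:
                continue  # assumption: a span cannot end before it starts
            score = float(start_logits[s] + end_logits[e])
            spans.append((int(s), int(e), score))

    # Highest combined score first.
    return sorted(spans, key=lambda span: span[2], reverse=True)


# Toy logits for a 4-token input, combining only the top 2 starts and ends.
ranked = rank_spans([0.1, 5.0, 0.2, 0.0], [0.0, 0.3, 4.0, 0.1], best_size=2)
print(ranked[0])
```

With a 512-token input and `best_size = 20`, the nested loop visits at most 400 pairs instead of 262,144.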

## Initialize Pipeline
The following code initializes the processing pipelines that we also use for the Flask application. 
If a GPU is available for inference, the output should look like the following:<br>
`[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]`


In [36]:
if cache_size:
    cache = LRUCache(cache_size,False)
else:
    cache = None
pipeline = Pipeline(model,best_size,cache,token)

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


All model checkpoint layers were used when initializing TFRobertaForQuestionAnswering.

All the layers of TFRobertaForQuestionAnswering were initialized from the model checkpoint at SebastianKotstein/restberta-qa-parameter-matching.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFRobertaForQuestionAnswering for predictions without further training.
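
For readers unfamiliar with LRU caching, the idea behind `cache_size` can be illustrated with a few lines of standard-library Python. The class below is a hypothetical stand-in, not the `LRUCache` from `tools.pipeline.lru_cache`; its method names and the choice of cache keys are assumptions made for illustration.

```python
from collections import OrderedDict


class SimpleLRUCache:
    """Keeps the `capacity` most recently used entries (illustrative only)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        # Mark the entry as most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)


# Hypothetical keys: (schema value, query value) pairs mapped to a prediction.
lru = SimpleLRUCache(capacity=2)
lru.put(("auth.key state", "The auth token"), "auth.key")
lru.put(("auth.key state", "The state"), "state")
lru.get(("auth.key state", "The auth token"))       # refreshes the first entry
lru.put(("auth.key units", "The units"), "units")   # evicts "The state"
print(len(lru))
```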


## Pipeline Input
The pipeline accepts a dictionary as input with the following structure: <br>
- At the root level, the dictionary has a `schemas` property, which is an array. Depending on the task, each `schema` item represents either a payload schema or a list of endpoints.
- A `schema` item has a `value` property containing the linear list of Web API elements in an XPath-like notation. These Web API elements are either the properties of the payload schema or the endpoints of the endpoint list.
- The `queries` array enables the definition of multiple queries per `schema` item. Each query should be expressed in natural language in the `value` property. Assigning a unique identifier to `queryId` is optional but can be helpful for mapping results back to their corresponding queries.
- Set the `verboseOutput` flag to `True` to obtain details about processed tokens and fragment-level results.

Note that each query is processed independently of the others. <br>
For more details, we refer to the [OpenAPI](https://github.com/SebastianKotstein/RESTBERTa/blob/master/tools/static/OpenAPI.yml) documentation of the Flask application.
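
When the Web API elements and queries are available as Python lists, the dictionary described above can be assembled programmatically. The helper below is a hypothetical convenience function, not part of the RESTBERTa tools; it assumes that elements are joined with single spaces in `value` and generates `queryId` and `name` values itself.

```python
def build_pipeline_input(schema_id, schema_name, elements, queries, verbose=False):
    """Assemble the input dictionary for one schema (hypothetical helper)."""
    return {
        "schemas": [
            {
                "schemaId": schema_id,
                "name": schema_name,
                # Linear, space-separated list of XPath-like element names.
                "value": " ".join(elements),
                "queries": [
                    {
                        "queryId": f"q{i}",
                        "name": f"query {i}",
                        "value": text,
                        "verboseOutput": verbose,
                    }
                    for i, text in enumerate(queries, start=1)
                ],
            }
        ]
    }


payload = build_pipeline_input(
    "s1",
    "testSchema",
    ["auth.key", "location.postal_code", "state", "units"],
    ["The ZIP", "The auth token"],
)
print(payload["schemas"][0]["value"])
```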

In [42]:
input = {
    "schemas":[
        {
            "schemaId": "s1",
            "name": "testSchema",
            "value": "auth.key location.city location.city_id location.country location.lat location.lon location.postal_code state units", 
            "queries":[
                {
                    "queryId": "q1",
                    "name": "first query",
                    "value": "The ZIP",
                    "verboseOutput":False
                },
                {
                    "queryId": "q2",
                    "name": "second query",
                    "value": "The auth token",
                    "verboseOutput":False
                }

            ]
        }
    ]
}

results = pipeline.process(input,top_answers_n,suppress_duplicates,no_answer_strategy)
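
The last three arguments of `pipeline.process` control post-processing of the ranked answers. The sketch below illustrates what such a step might look like. It is not the repository's implementation: the function name, the flat answer dictionaries with `name`, `score`, and `threshold` keys, the order of the steps, and the example scores are all assumptions made for illustration.

```python
def postprocess(answers, top_answers_n=None, suppress_duplicates=True,
                no_answer_strategy="ignore"):
    """Filter and rank candidate answers (hypothetical sketch)."""
    # In this sketch, any strategy other than "ignore" applies the
    # per-fragment threshold (NULL answer score) as a filter.
    if no_answer_strategy != "ignore":
        answers = [a for a in answers if a["score"] > a["threshold"]]

    ranked = sorted(answers, key=lambda a: a["score"], reverse=True)

    if suppress_duplicates:
        # The list is sorted, so the first occurrence carries the highest score.
        seen = set()
        unique = []
        for answer in ranked:
            if answer["name"] not in seen:
                seen.add(answer["name"])
                unique.append(answer)
        ranked = unique

    if top_answers_n is not None:
        ranked = ranked[:top_answers_n]
    return ranked


# Illustrative candidates; the second one duplicates the first with a lower score.
candidates = [
    {"name": "location.postal_code", "score": 16.0, "threshold": 2.0},
    {"name": "location.postal_code", "score": 3.5, "threshold": 4.0},
    {"name": "location.lon", "score": 1.1, "threshold": 2.0},
]
print([a["name"] for a in postprocess(candidates)])
```

With the default `"ignore"` strategy, both distinct elements remain in the list. With the threshold strategy, only the first candidate would survive here, because the other two score below the threshold of their fragment.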

## Output Visualization

In [43]:
def display_results(results, sort_by = None):
    pd.set_option('display.max_rows', None)
    df = pd.DataFrame(results)
    if sort_by:
        df = df.sort_values(sort_by, ascending=False)
    df = df.style.set_properties(**{'text-align': 'left'})
    display(df)

In [44]:
query_tables = {}  # Store DataFrames per queryId

for schema in results['schemas']:
    schema_name = schema['name']
    for query in schema['queries']:
        query_name = query['name']
        answers = query['result'].get('answers', [])
        
        # Build DataFrame
        df = pd.DataFrame([
            {'Property': ans['property']['name'], 'Score': float(ans['score'])}
            for ans in answers
            if ans['property'] is not None
        ])
        
        print(f"\nSchema: {schema_name} | Query: {query_name}")
        display_results(df)


Schema: testSchema | Query: first query


| | Property | Score |
|---|---|---|
| 0 | location.postal_code | 16.05726 |
| 1 | location.lon | 1.122429 |
| 2 | location.city_id | -6.211535 |
| 3 | location.city | -6.740502 |
| 4 | location.country | -8.156937 |
| 5 | auth.key | -9.358548 |
| 6 | location.lat | -11.109827 |



Schema: testSchema | Query: second query


| | Property | Score |
|---|---|---|
| 0 | auth.key | 17.928198 |
| 1 | location.city | -11.495063 |
| 2 | location.city_id | -19.220545 |
| 3 | location.postal_code | -19.519812 |
| 4 | location.lon | -19.53437 |
| 5 | location.country | -19.603506 |
| 6 | location.lat | -19.60597 |
| 7 | state | -19.802938 |
| 8 | units | -19.832052 |
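
If only the best match per query is needed, the result dictionary can be reduced without pandas. The helper below is a hypothetical sketch that relies only on the keys already accessed in the visualization cell (`schemas`, `queries`, `name`, `result`, `answers`, `property`, `score`). The example dictionary is hand-written from the first two rows of the first table; the entry without a property is made up to show the `None` check.

```python
def best_answers(results):
    """Map (schema name, query name) to the top-scoring (property, score) pair."""
    best = {}
    for schema in results["schemas"]:
        for query in schema["queries"]:
            answers = [
                a for a in query["result"].get("answers", [])
                if a["property"] is not None
            ]
            key = (schema["name"], query["name"])
            if not answers:
                best[key] = None  # no suggestion for this query
                continue
            top = max(answers, key=lambda a: float(a["score"]))
            best[key] = (top["property"]["name"], float(top["score"]))
    return best


# Hand-written stand-in for the `results` dictionary returned by the pipeline.
results_example = {
    "schemas": [
        {
            "name": "testSchema",
            "queries": [
                {
                    "name": "first query",
                    "result": {
                        "answers": [
                            {"property": {"name": "location.postal_code"}, "score": 16.05726},
                            {"property": {"name": "location.lon"}, "score": 1.122429},
                            # Hypothetical entry without a matched property.
                            {"property": None, "score": 0.0},
                        ]
                    },
                }
            ],
        }
    ]
}
print(best_answers(results_example))
```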
