# Invoking and Testing the Vector Store Inference Service (Optional)

Welcome to the third part of the tutorial series on building a question-answering application over a corpus of private
documents using Large Language Models (LLMs). In the previous Notebooks, you transformed unstructured text data into
structured vector embeddings and deployed an Inference Service (ISVC) to serve the Vector Store that holds these
embeddings.

In this optional Notebook, you focus on invoking the Vector Store ISVC you've created and testing its performance. This
is an essential step, as it allows you to verify the functionality of your service and observe how it performs in
practice. Throughout this Notebook, you construct suitable requests, communicate with the service, and interpret the
responses.

By the end of this Notebook, you will have a practical understanding of how the Vector Store ISVC works and will be
well prepared to integrate it into a larger system, alongside the LLM ISVC that you create in the subsequent Notebook.

## Table of Contents

1. [Invoke the Inference Service](#invoke-the-inference-service)
1. [Conclusion and Next Steps](#conclusion-and-next-steps)

In [None]:
import requests
import ipywidgets as widgets

from IPython.display import display

# Invoke the Inference Service

First, you need to construct the URL for the POST request. This example uses the V1 inference protocol, described
below:

| API         | Verb | Path                              | Request Payload        | Response Payload                            |
|-------------|------|-----------------------------------|------------------------|---------------------------------------------|
| List Models | GET  | `/v1/models`                      |                        | `{"models": [<model_name>]}`                |
| Model Ready | GET  | `/v1/models/<model_name>`         |                        | `{"name": <model_name>, "ready": $bool}`    |
| Predict     | POST | `/v1/models/<model_name>:predict` | `{"instances": []}`\*  | `{"predictions": []}`                       |
| Explain     | POST | `/v1/models/<model_name>:explain` | `{"instances": []}`\*  | `{"predictions": [], "explanations": []}`   |

\* Payload is optional
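
The four paths in the table differ only in the model name and the trailing verb, so they can be generated from a base URL. The following is a minimal sketch, not part of the original Notebook: the helper name `v1_endpoints` and the example host are hypothetical and serve only as an illustration.

```python
from urllib.parse import quote


def v1_endpoints(base_url: str, model_name: str) -> dict:
    """Build the V1 inference protocol URLs for a single model (hypothetical helper)."""
    base = base_url.rstrip("/")          # avoid a double slash when joining
    name = quote(model_name, safe="")    # percent-encode unusual characters in the name
    return {
        "list_models": f"{base}/v1/models",
        "model_ready": f"{base}/v1/models/{name}",
        "predict": f"{base}/v1/models/{name}:predict",
        "explain": f"{base}/v1/models/{name}:explain",
    }


# Example host chosen for illustration only; substitute your own service address.
endpoints = v1_endpoints("https://vectorstore-predictor.example.svc.cluster.local", "vectorstore")
print(endpoints["predict"])
```

Sending a GET request to the `model_ready` URL before calling `predict` is one way to confirm that the service is up.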

You want to invoke the `predict` API. Because the request must carry an access token, first enter the credentials used to obtain one:

In [None]:
# Add heading
heading = widgets.HTML("<h2>Credentials</h2>")
display(heading)

domain_input = widgets.Text(description='Domain:', placeholder="i001ua.tryezmeral.com")
username_input = widgets.Text(description='Username:')
password_input = widgets.Password(description='Password:')
submit_button = widgets.Button(description='Submit')
success_message = widgets.Output()

domain = None
username = None
password = None

def submit_button_clicked(b):
    global domain, username, password
    domain = domain_input.value
    username = username_input.value
    password = password_input.value
    with success_message:
        success_message.clear_output()
        print("Credentials submitted successfully!")
    submit_button.disabled = True

submit_button.on_click(submit_button_clicked)

# Set margin on the submit button
submit_button.layout.margin = '20px 0 20px 0'

# Display inputs and button
display(domain_input, username_input, password_input, submit_button, success_message)

In [None]:
token_url = f"https://keycloak.{domain}/realms/UA/protocol/openid-connect/token"

# Form fields for the OAuth 2.0 password grant.
data = {
    "username": username,
    "password": password,
    "grant_type": "password",
    "client_id": "ua-grant",
}

# verify=False disables TLS certificate verification for this request.
token_response = requests.post(token_url, data=data, allow_redirects=True, verify=False)

token = token_response.json()["access_token"]
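
The token cell indexes the JSON body directly, so a rejected login surfaces as a bare `KeyError`. The following is a minimal sketch of a more descriptive extraction. It assumes the endpoint reports failures with the standard OAuth 2.0 `error` and `error_description` fields. The helper name `extract_access_token` and the sample payload are hypothetical.

```python
def extract_access_token(payload: dict) -> str:
    """Return the access token from a token-endpoint JSON payload, or raise ValueError."""
    if "access_token" in payload:
        return payload["access_token"]
    # OAuth 2.0 error responses carry "error" and, optionally, "error_description".
    error = payload.get("error", "unknown_error")
    description = payload.get("error_description", "no description provided")
    raise ValueError(f"Token request failed: {error} ({description})")


# Made-up payload for illustration; a real one comes from the token response's .json().
sample_token = extract_access_token({"access_token": "abc123", "token_type": "Bearer"})
print(sample_token)
```

In the Notebook, the same helper could be called on the parsed JSON body of the token response.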

In [None]:
DOMAIN_NAME = "svc.cluster.local"
NAMESPACE = username
DEPLOYMENT_NAME = "vectorstore"
MODEL_NAME = DEPLOYMENT_NAME
SVC = f'{DEPLOYMENT_NAME}-predictor.{NAMESPACE}.{DOMAIN_NAME}'
URL = f"https://{SVC}/v1/models/{MODEL_NAME}:predict"

print(URL)

In [None]:
data = {
    "instances": [{
        "input": "Who's Ada Lovelace?",
        "num_docs": 4,  # number of documents to retrieve
    }]
}

headers = {"Authorization": f"Bearer {token}"}

response = requests.post(URL, json=data, headers=headers, verify=False)

In [None]:
response.text
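
`response.text` shows the raw body as a string. According to the V1 protocol table, a successful `predict` call returns a JSON object with a `predictions` list. The following is a minimal sketch of pulling that list out. The helper name `extract_predictions` and the sample body are made up for illustration, and the shape of each prediction depends on how the Vector Store service is implemented.

```python
import json


def extract_predictions(body: str) -> list:
    """Parse a V1 predict response body and return its "predictions" list."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or "predictions" not in payload:
        raise ValueError(f"Unexpected response body: {body[:200]}")
    return payload["predictions"]


# Made-up body for illustration; in the Notebook you would pass response.text instead.
sample_body = '{"predictions": ["first retrieved passage", "second retrieved passage"]}'
for index, prediction in enumerate(extract_predictions(sample_body), start=1):
    print(index, prediction)
```

Checking `response.status_code` before parsing helps distinguish an authentication or routing failure from an empty result.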

# Conclusion and Next Steps

Well done! Through this Notebook, you've successfully interacted with and tested the Vector Store ISVC. You've learned
how to construct and send requests to the service and how to interpret the responses. This hands-on experience is
crucial as it provides a practical understanding of the service's operation, preparing you for real-world applications.

In the next Notebook, you extend your question-answering system by creating an ISVC for the LLM. The LLM ISVC works in
conjunction with the Vector Store ISVC to provide comprehensive and accurate answers to user queries.