### Query embeddings from structured data

### 1) Install dependencies

Use the Python 3 (ipykernel) kernel.

In [None]:
pip install langchain openai pandas azure-search-documents

### 2) Import libraries

In [None]:
import os
import pandas as pd
from openai import AzureOpenAI


### 3) Connect to the index
This is the index you created via [these instructions](https://github.com/STRIDES/NIHCloudLabAzure/blob/main/docs/create_index_from_csv.md).
Look [here](https://learn.microsoft.com/en-us/azure/search/search-create-service-portal#name-the-service) for your endpoint name, and [here](https://learn.microsoft.com/en-us/azure/search/search-security-api-keys?tabs=portal-use%2Cportal-find%2Cportal-query#find-existing-keys) for your index key.

In [None]:
endpoint="<Your AI Search Endpoint>"
index_name="<Your Index Name>"
index_key='<Your Index Key>'

In [None]:
# Connect to the search index (the vector store)
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential

search_client = SearchClient(endpoint, index_name, AzureKeyCredential(index_key))

### 4) Connect to your model
First, make sure you have a [model deployed](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-openai), and if not, deploy a model.
To get your endpoint, key, and version number, just go to the Chat Playground and click **View Code** at the top.

In [None]:
# Connect to the model
os.environ["AZURE_OPENAI_ENDPOINT"] = "<Azure AI Studio Endpoint>"
os.environ["AZURE_OPENAI_API_KEY"] = "<Azure AI Studio API Key>"

client = AzureOpenAI(
    # Read the same variable name that was set above
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2023-08-01-preview",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)
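Before calling the model, it can help to confirm that the settings are filled in. The following is a minimal sketch, not part of the original notebook: `missing_settings` is a hypothetical helper, and it assumes that any value starting with `<` is still a template placeholder.

```python
import os

def missing_settings(names, environ=os.environ):
    """Return the names that are unset, empty, or still hold a '<...' placeholder.

    Hypothetical helper for illustration; the placeholder test is an assumption
    based on the '<Your ...>' style used in this notebook.
    """
    missing = []
    for name in names:
        value = environ.get(name, "").strip()
        if not value or value.startswith("<"):
            missing.append(name)
    return missing

# An empty list means both settings look filled in.
print(missing_settings(["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY"]))
```

If the list is not empty, go back to the cell above and replace the placeholders with the values from **View Code**.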

### 5) Query the Vector Store

First, enter your question. Feel free to experiment with different variations or prompts.

In [None]:
query = " \
    Your input data is a list of grants. \
    Based on only the 'Project_Title' \
    list the 'Project_Number' and 'Total_Cost' \
    of all grants related to breast cancer \
"

Now we feed the query and the search results from the index to our LLM and return the results.

In [None]:
# Retrieve matching documents from the index and pass them to the model as context
search_results = str(list(search_client.search(query)))
response = client.chat.completions.create(
    model="gpt-4",  # for Azure OpenAI this is your deployment name; change it if yours differs
    messages=[
        {"role": "system", "content": "You are an NIH Program Officer"},
        {"role": "user", "content": "Context: " + search_results + "\n\n Query: " + query}
    ],
)
# View the model output
response.choices[0].message.content.strip()
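The cell above passes every field of every search result to the model. If the context grows too large, one option is to keep only the columns the question needs. The following is a sketch under stated assumptions, not the notebook's method: `format_context` is a hypothetical helper, it assumes each search result behaves like a dictionary, and the sample rows are made up for illustration. The field names come from the query above.

```python
from itertools import islice

def format_context(results, fields, max_rows=20):
    """Render up to max_rows results as one 'field: value; ...' line each.

    Hypothetical helper: assumes every result supports dict-style .get().
    Missing fields are rendered as empty strings.
    """
    lines = []
    for row in islice(results, max_rows):
        parts = [f"{name}: {row.get(name, '')}" for name in fields]
        lines.append("; ".join(parts))
    return "\n".join(lines)

# Made-up rows standing in for search results (not real grant data).
sample = [
    {"Project_Number": "X-001", "Project_Title": "Example title A", "Total_Cost": 100},
    {"Project_Number": "X-002", "Project_Title": "Example title B", "Total_Cost": 250},
]
fields = ["Project_Number", "Project_Title", "Total_Cost"]
print(format_context(sample, fields))
```

To try it in this notebook, you could replace `str(list(search_client.search(query)))` with `format_context(search_client.search(query), fields)`, then compare the answers.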

And that is it! You have created a simple chatbot that runs queries against structured data. This is a complex problem, and many good blogs describe more sophisticated architectures. We recommend that you investigate further and see if you can come up with an even better solution for your use case!