### *Use LLMs with your Data* 
### DESCRIPTION
Load tens of thousands of Wikipedia articles into Azure Data Explorer.
Harness its sub milisecond query capabilities to search your data and combine this with LLM to generate a response with Retrieval Augmented Generation pattern.
Use Azure Data Explorer vector store capabilities with embeddings together with Generative AI to generate answers.  


### PREPARATION
* An ADX (Azure Data Explorer or Kusto) cluster  
* In ADX, create a Database named "llm"  
    <img src="../images/1.png" alt="Create Kusto cluster" /> 
* Create a table called wikipedia by ingesting data from ["./data/wikipedia/vector_database_wikipedia_articles_embedded_1000.csv"](./data/wikipedia/vector_database_wikipedia_articles_embedded_1000.csv)   
    <img src="../images/2.png" alt="Create Kusto cluster" /> 
* Create an AAD app registration for Authentication - see below   
    [Create an Azure Active Directory application registration in Azure Data Explorer](https://learn.microsoft.com/en-us/azure/data-explorer/provision-azure-ad-app)

* You need to add ADX function as follows:   
     Run this on ADX Explorer UI  
     
```
//create the cosine similarity function for embeddings
.create-or-alter function with (folder = "Packages\\Series", docstring = "Calculate the Cosine similarity of 2 numerical arrays")
series_cosine_similarity_fl(vec1:dynamic, vec2:dynamic, vec1_size:real=double(null), vec2_size:real=double(null))
{
    let dp = series_dot_product(vec1, vec2);
    let v1l = iff(isnull(vec1_size), sqrt(series_dot_product(vec1, vec1)), vec1_size);
    let v2l = iff(isnull(vec2_size), sqrt(series_dot_product(vec2, vec2)), vec2_size);
    dp/(v1l*v2l)
}
```

In [1]:
import pandas as pd
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder
from azure.kusto.data.exceptions import KustoServiceError
from azure.kusto.data.helpers import dataframe_from_result_table
import pandas as pd
from ast import literal_eval
import os
from tenacity import retry, wait_random_exponential, stop_after_attempt
from dotenv import load_dotenv

# Configure environment variables
load_dotenv()

AAD_TENANT_ID = os.getenv("AAD_TENANT_ID")
KUSTO_CLUSTER = os.getenv("KUSTO_CLUSTER")
KUSTO_DATABASE = os.getenv("KUSTO_DATABASE")
KUSTO_TABLE = os.getenv("KUSTO_TABLE")
KUSTO_MANAGED_IDENTITY_APP_ID = os.getenv("KUSTO_MANAGED_IDENTITY_APP_ID")
KUSTO_MANAGED_IDENTITY_SECRET = os.getenv("KUSTO_MANAGED_IDENTITY_SECRET")

COHERE_API_KEY = os.getenv("COHERE_API_KEY")


In [2]:
# Connect to adx using AAD app registration
cluster = KUSTO_CLUSTER
kcsb = KustoConnectionStringBuilder.with_aad_application_key_authentication(cluster, KUSTO_MANAGED_IDENTITY_APP_ID, KUSTO_MANAGED_IDENTITY_SECRET,  AAD_TENANT_ID)
client = KustoClient(kcsb)
kusto_db = KUSTO_DATABASE
table_name = KUSTO_TABLE

In [3]:
#testing the connection to kusto works - sample query to get the top 10 results from wikipedia
query = table_name + " | take 10"

response = client.execute(kusto_db, query)
for row in response.primary_results[0]:
    print("EventType:{}".format(row["title"]))

EventType:Moneyball (film)
EventType:Ulysses (novel)
EventType:Beirut
EventType:Irish people
EventType:Arsenal F.C.
EventType:Ronda Rousey
EventType:Indian cuisine
EventType:Alfre Woodard
EventType:Tina Turner
EventType:Benedetta (film)


In [4]:
import cohere

co = cohere.Client(COHERE_API_KEY)

@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def embed(query):
        queries_array = [query]
        response = co.embed(model='multilingual-22-12',texts=queries_array)
        return response.embeddings


def get_answer(question, nr_of_answers=1):
        searchedEmbedding = embed(question)
        kusto_query = table_name + " | extend similarity = series_cosine_similarity_fl(dynamic("+str(searchedEmbedding)+"), emb,1,1) | top " + str(nr_of_answers) + " by similarity desc "
        response = client.execute(kusto_db, kusto_query)
        return response

def ask_question(question, nr_of_answers=1):
        response = get_answer(question)

        for row in response.primary_results[0]:
                print("=====================================")
                print(f"Title:{row['title']} \n")
                print(f"Content:{row['text']} \n")


In [5]:
response = embed("how is the president of the United States elected?")
print('Embeddings: {}'.format(response))


Embeddings: [[0.3774414, 0.10998535, -0.049468994, -0.25756836, -0.0104522705, -0.19592285, 0.11340332, -0.33203125, 0.30688477, 0.33642578, -0.08947754, -0.11248779, -0.26098633, 0.54003906, -0.22399902, -0.1973877, 0.2265625, -0.039794922, -0.0019330978, 0.25830078, -0.1381836, 0.42529297, 0.10406494, 0.005710602, 0.066467285, -0.15332031, -0.29248047, 0.28198242, -0.13952637, 0.33422852, 0.24780273, -0.07281494, -0.04071045, 0.4116211, -0.23034668, 0.31689453, 0.07989502, 0.36254883, 0.13952637, 0.6640625, 0.16125488, 0.14575195, -0.058441162, -0.058410645, 0.35375977, -0.33398438, -0.11645508, 0.024230957, -0.3251953, -0.032684326, -0.12054443, -0.019836426, -0.11413574, -0.14697266, -0.57666016, 0.4477539, -0.2866211, 0.5566406, 0.114990234, 0.18237305, 0.26049805, -0.16247559, 0.0181427, -0.014701843, 0.28198242, 0.3071289, -0.06933594, 0.051574707, 0.28051758, 0.34350586, 0.18908691, 0.48828125, 0.55908203, -0.22509766, 0.007507324, -0.11303711, 0.2668457, 0.07720947, 0.3203125,

In [6]:
# here we get our answer but in a long and non concise way
ask_question("What is the size of Argentina?",1)

Title:The Road to El Dorado 

Content:Marylata Jacob, who started DreamWorks' music department in 1995, became the film's music supervisor before the script was completed. Consulting with Katzenberg, Jacob decided the musical approach to the film would be world music. In 1996, Tim Rice and Elton John were asked to compose seven songs which they immediately worked on. Their musical process began with Rice first writing the song lyrics and giving them to John to compose the music. John then recorded a demo which was given to the animators who storyboarded to the demo as the tempo and vocals would remain intact. 



In [None]:
# by using a prompt we can ask the LLM model and get answers in a concise manner
def ask_gpt(text, question):
    prompt = """You are a helpful assistant that answers questions.
                Answer in a clear and concise manner providing answers only from the text below. If the answer is not in the text, please answer with "I don't know".
                Text:

                """
    question_prompt = """"
                Question:
                """
    prompt = prompt + text + question_prompt + question
    response = openai_llm.Completion.create(
        engine=utils.OPENAI_DEPLOYMENT_NAME,
        prompt=prompt,
        temperature=0,
        max_tokens=2000,
        top_p=0.5,
        frequency_penalty=0,
        presence_penalty=0,
        stop=None)
    response = response['choices'][0]['text']
    response = utils.remove_chars("\n", response)
    response=utils.start_after_string("Answer:", response)
    response=utils.remove_tail_tags("<|im_end|>", response)
    return response

In [None]:
#get the relevant results from the Database
answer = get_answer("What is the size of Argentina?",1)
text = answer.primary_results[0].rows[0]['text']
#send the results to GPT to get a more concise answer
ask_gpt(text, "What is the size of Argentina?")


In [None]:
ask_gpt(text, "What is the size of Argentina in km?")

In [None]:
ask_gpt(text, "What is the sweetest fruit?")