# Queries with and without Azure OpenAI

Now that we have our Search Engine loaded **from two different data sources in two diferent indexes**, we are going to try some example queries and then use Azure OpenAI service to see if we can get even better results.

The idea is that a user can ask a question about Computer Science (first datasource/index) or about Covid (second datasource/index), and the engine will respond accordingly.
This **Multi-Index** demo, mimics the scenario where a company loads multiple type of documents of different types and about completly different topics and the search engine must respond with the most relevant results.

## Set up variables

In [1]:
# 2023Apr25 mike edits to work inside SHARED RG w SHARED cog services..
# looks like users cannot share COMPUTES... so each user must have their own compute...
# mike has his own compute...    must set up using    GPT-Azure-Search-Engine$pip install -r ./requirements.txt<enter>
# 2023Apr20 mikes copy of the workshop notebooks...
# NOTE - BEFORE this notebook can be executed, the computer instance MUST have had pip install requirements executed, so the langChain packages install.....
import os
import urllib
import requests
import random
from collections import OrderedDict
from IPython.display import display, HTML
from langchain.llms import AzureOpenAI
from langchain.chat_models import AzureChatOpenAI
from langchain.vectorstores import FAISS
from langchain.docstore.document import Document
from langchain.chains.question_answering import load_qa_chain
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.embeddings import HuggingFaceEmbeddings

from app.embeddings import OpenAIEmbeddings
from app.prompts import STUFF_PROMPT, REFINE_PROMPT, REFINE_QUESTION_PROMPT

# Don't mess with this unless you really know what you are doing
AZURE_SEARCH_API_VERSION = '2021-04-30-Preview'
AZURE_OPENAI_API_VERSION = "2023-03-15-preview"

# Change these below with your own services credentials
#AZURE_SEARCH_ENDPOINT = "Enter your Azure Cognitive Search Endpoint ..."
#AZURE_SEARCH_ENDPOINT = "https://azure-cog-search-hw3sksxnht5m6.search.windows.net"
#mike edits to used shared resources
AZURE_SEARCH_ENDPOINT = "https://azure-cog-search-gtekhenxlqzvu.search.windows.net"

#AZURE_SEARCH_KEY = "Enter your Azure Cognitive Search Key ..."
#AZURE_SEARCH_KEY = "ujyMyQ6sOIPDeTghq3cKIOP8J2liyrNSVqSn0CKMU0AzSeCisufr"
#mike edits to used shared resources
AZURE_SEARCH_KEY = "EPpDuVjPveOV8hhzKu7e17H0AB7QIzrWGqumQK87uEAzSeB9FqUZ"

#AZURE_OPENAI_ENDPOINT = "Enter your Azure OpenAI Endpoint ..."
#AZURE_OPENAI_ENDPOINT = "https://aoai-exerciseresource-01.openai.azure.com/"
#mike edits to used shared resources
AZURE_OPENAI_ENDPOINT = "https://oai-2023-mondelez-chatgpt-01.openai.azure.com/"

#AZURE_OPENAI_API_KEY = "Enter your Azure OpenAI Key ..."
#AZURE_OPENAI_API_KEY = "21f30f0fa43b4027b3179dd340eb9523"
#mike edits to used shared resources
AZURE_OPENAI_API_KEY = "f86736ea92444e9f836b69de0512ca55"


In [2]:
# Setup the Payloads header
headers = {'Content-Type': 'application/json','api-key': AZURE_SEARCH_KEY}

## Multi-Index Search queries

In [3]:
# Index that we are going to query (from Notebook 01 and 02)
index1_name = "cogsrch-index-files"
index2_name = "cogsrch-index-csv"
indexes = [index1_name, index2_name]

Try questions that you think might be answered or addressed in computer science papers in 2020-2021 or that can be addressed by medical publications about COVID in 2020. Try comparing the results with the open version of ChatGPT.<br>
The idea is that the answers using Azure OpenAI only looks at the information contained on these publications.

**Example Questions you can ask**:
- What is CLP?
- How Markov chains work?
- What are some examples of reinforcement learning?
- What are the main risk factors for Covid-19?
- What medicine reduces inflamation in the lungs?
- Why Covid doesn't affect kids that much compared to adults?
- Does chloroquine really works against covid?
- tell me Use cases where I can use deep learning to solve it

In [4]:
#QUESTION = "What is CLP?" 
# This questions is interesting since CLP means something in Computer science and means something different in medical field 

# MIKES questions, selected for relevence to his CDP data and for the MSFT COVID abstract data...
# QUESTION = "What does COVID stand for?" # answer mike is looking for is related to Corona Virus...
#QUESTION = "What is a CDP plan?" # answer mike is looking for is related to Client Data Protection...
#QUESTION = "Who has responsibility of CDP?" # answer mike is looking for is related to Client Data Protection...
#QUESTION = "what do ISL responsibilities include?"  # mike is looking for a bulleted list from a specific CDP .pdf slide deck...
QUESTION = "If you do not know who the ISL of your plan is, who should you reach out to?" # reach out to your Delivery Lead, Project Manager or the CDP Help Desk"



### Search on both indexes individually and aggragate results

Note: In order to standarize the indexes we are setting 4 mandatory fields to be present on each index: id, title, content, pages, language. These fields must be present in each index so that each document can be treated the same along the code.

In [5]:
# MIKE - at first errored, mike had created the NEW Cog Search Resource but did NOT set the Semantic Search Availability to Either FREE or STANDARD <- it was UNset, and so caused error....
agg_search_results = []

for index in indexes:
    print(index) #mikes debugging, wants to see what index was processed during each iteration...
    url = AZURE_SEARCH_ENDPOINT + '/indexes/'+ index + '/docs'
    url += '?api-version={}'.format(AZURE_SEARCH_API_VERSION)
    url += '&search={}'.format(QUESTION)
    url += '&select=*'
#    url += '&$top=10'  # You can change this to anything you need/want
    url += '&$top=5'  # You can change this to anything you need/want
    url += '&queryLanguage=en-us'
    url += '&queryType=semantic'
    url += '&semanticConfiguration=my-semantic-config'
    url += '&$count=true'
    url += '&speller=lexicon'
    url += '&answers=extractive|count-3'
    url += '&captions=extractive|highlight-false'

    resp = requests.get(url, headers=headers)
    print(url) #specific to index processed during this iteration...
    print(resp.status_code) #specific to the index processed during This iteration...
    search_results = resp.json() #secific to the index being processed during THIS iteration...
    #agg_search_results.append(search_results) # NOW, append current index search results to the aggregated search results....
    print("Results Found: {}, Results Returned: {}".format(search_results['@odata.count'], len(search_results['value']))) #display only latest iteration results...
    # when Done w processing this iterations index, Then wrap up by appending to the running total 'agg_search_results'...
    agg_search_results.append(search_results) # NOW, append current index search results to the aggregated search results....

    

cogsrch-index-files
https://azure-cog-search-gtekhenxlqzvu.search.windows.net/indexes/cogsrch-index-files/docs?api-version=2021-04-30-Preview&search=If you do not know who the ISL of your plan is, who should you reach out to?&select=*&$top=5&queryLanguage=en-us&queryType=semantic&semanticConfiguration=my-semantic-config&$count=true&speller=lexicon&answers=extractive|count-3&captions=extractive|highlight-false
200
Results Found: 25, Results Returned: 5
cogsrch-index-csv
https://azure-cog-search-gtekhenxlqzvu.search.windows.net/indexes/cogsrch-index-csv/docs?api-version=2021-04-30-Preview&search=If you do not know who the ISL of your plan is, who should you reach out to?&select=*&$top=5&queryLanguage=en-us&queryType=semantic&semanticConfiguration=my-semantic-config&$count=true&speller=lexicon&answers=extractive|count-3&captions=extractive|highlight-false
200
Results Found: 46560, Results Returned: 5


In [6]:
# # process only one index...   index1_name 'cogsrch_index_files'....
# agg_search_results = []

# #for index in indexes:
# index = 'cogsrch-index-files'
# url = AZURE_SEARCH_ENDPOINT + '/indexes/'+ index + '/docs'
# url += '?api-version={}'.format(AZURE_SEARCH_API_VERSION)
# url += '&search={}'.format(QUESTION)
# url += '&select=*'
# #    url += '&$top=10'  # You can change this to anything you need/want
# url += '&$top=5'  # You can change this to anything you need/want
# url += '&queryLanguage=en-us'
# url += '&queryType=semantic'
# url += '&semanticConfiguration=my-semantic-config'
# url += '&$count=true'
# url += '&speller=lexicon'
# url += '&answers=extractive|count-3'
# url += '&captions=extractive|highlight-false'

# resp = requests.get(url, headers=headers)
# print(url)
# print(resp.status_code)
# search_results = resp.json()
# agg_search_results.append(search_results)
# print("Results Found: {}, Results Returned: {}".format(search_results['@odata.count'], len(search_results['value'])))
# print("search_results for Mike")
# print(search_results)


In [7]:
# # process only one index...   index1_name 'cogsrch_index_csv'....
# agg_search_results = []

# #for index in indexes:
# index = 'cogsrch-index-csv'
# url = AZURE_SEARCH_ENDPOINT + '/indexes/'+ index + '/docs'
# url += '?api-version={}'.format(AZURE_SEARCH_API_VERSION)
# url += '&search={}'.format(QUESTION)
# url += '&select=*'
# #    url += '&$top=10'  # You can change this to anything you need/want
# url += '&$top=5'  # You can change this to anything you need/want
# url += '&queryLanguage=en-us'
# url += '&queryType=semantic'
# url += '&semanticConfiguration=my-semantic-config'
# url += '&$count=true'
# url += '&speller=lexicon'
# url += '&answers=extractive|count-3'
# url += '&captions=extractive|highlight-false'

# resp = requests.get(url, headers=headers)
# print(url)
# print(resp.status_code)
# search_results = resp.json()
# agg_search_results.append(search_results)
# print("Results Found: {}, Results Returned: {}".format(search_results['@odata.count'], len(search_results['value'])))
# print("search_results for Mike")
# print(search_results)


In [8]:
# note - this should only print out the search_results of the last index processed...     it wont print any earlier ones as this assignment will just be the residuals from the for loop...
#print(index) # mikes debugging info...  so can see which iteration's (from above) index was being processed/displayed...

print("\n") # just for formatting output...
#search_results # mike - at THIS point, shouldnt we be more concerned w agg_search_results vs. just the latest search_results???

print("\n") # just for formatting output...
agg_search_results # mike - isnt this more important to display????








[{'@odata.context': "https://azure-cog-search-gtekhenxlqzvu.search.windows.net/indexes('cogsrch-index-files')/$metadata#docs(*)",
  '@odata.count': 25,
  '@search.answers': [],
  'value': [{'@search.score': 6.8654,
    '@search.rerankerScore': 0.27257728576660156,
    '@search.captions': [{'text': 'True or False \tMobilization/Plan\t2\tDifference between a work product and a deliverable and examples\t"Deliverable" is an outcome / product that needs to be created to deliver a solution. True or False \tMobilization/Plan\t2\tUnderstand Client Data Protection\tClient Data Protection is the responsibility of the Security Officer.',
      'highlights': ''}],
    'id': 'aHR0cHM6Ly9hc2Fmb3Jhb2Fpc2FuZGJveDAxLmJsb2IuY29yZS53aW5kb3dzLm5ldC9jb250YWluZXItb3BlbmFpLXNhbmRib3gtMDEvRXhhbVByZXAtUXVlc3Rpb25zLnhsc3g1',
    'title': None,
    'content': 'Exam Prep\n\tArea\tDM Day Covered\tLearning Objective\tAssessment question\n\tADF\t2\tADF Mandatory Deliverables\tWhat is not a mandatory work product for

### Display the top results (from both searches) based on the score

In [9]:
display(HTML('<h4>Top Answers</h4>'))

for search_results in agg_search_results:
    for result in search_results['@search.answers']:
#        if result['score'] > 0.5: # Show answers that are at least 50% of the max possible score=1, mike thinks this might be screening out his answers.....
        if result['score'] > 0.1: # Show answers that are at least 30% of the max possible score=1, mike thinks this might be screening out his answers.....
            display(HTML('<h5>' + 'Answer - score: ' + str(result['score']) + '</h5>'))
            display(HTML(result['text']))
            
print("\n\n")



display(HTML('<h4>Top Results</h4>'))

file_content = OrderedDict()
content = dict()

for search_results in agg_search_results:
    for result in search_results['value']:
#        if result['@search.rerankerScore'] > 1: # Filter results that are at least 25% of the max possible score=4
        if result['@search.rerankerScore'] > 0.5: # Filter results that are at least 12% of the max possible score=4
            content[result['id']]={
                                    "title": result['title'],
                                    "chunks": result['pages'],
                                    "language": result['language'], 
                                    "caption": result['@search.captions'][0]['text'],
                                    "score": result['@search.rerankerScore'],
                                    "location": result['metadata_storage_path']                  
                                }
    
    #After results have been filtered we will Sort and add them as an Ordered list\n",
    for id in sorted(content, key= lambda x: content[x]["score"], reverse=True):
        file_content[id] = content[id]
        display(HTML('<h5>' + str(file_content[id]['title']) + '&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;score: '+ str(file_content[id]['score']) + '</h5>'))
        display(HTML(file_content[id]['caption']))






In [10]:
agg_search_results


[{'@odata.context': "https://azure-cog-search-gtekhenxlqzvu.search.windows.net/indexes('cogsrch-index-files')/$metadata#docs(*)",
  '@odata.count': 25,
  '@search.answers': [],
  'value': [{'@search.score': 6.8654,
    '@search.rerankerScore': 0.27257728576660156,
    '@search.captions': [{'text': 'True or False \tMobilization/Plan\t2\tDifference between a work product and a deliverable and examples\t"Deliverable" is an outcome / product that needs to be created to deliver a solution. True or False \tMobilization/Plan\t2\tUnderstand Client Data Protection\tClient Data Protection is the responsibility of the Security Officer.',
      'highlights': ''}],
    'id': 'aHR0cHM6Ly9hc2Fmb3Jhb2Fpc2FuZGJveDAxLmJsb2IuY29yZS53aW5kb3dzLm5ldC9jb250YWluZXItb3BlbmFpLXNhbmRib3gtMDEvRXhhbVByZXAtUXVlc3Rpb25zLnhsc3g1',
    'title': None,
    'content': 'Exam Prep\n\tArea\tDM Day Covered\tLearning Objective\tAssessment question\n\tADF\t2\tADF Mandatory Deliverables\tWhat is not a mandatory work product for

## Comments on Query results

As seen above the semantic search feature of Azure Cognitive Search service is good. It gives us some answers and also the top results with the corresponding file and the paragraph where the answers is possible located.

Let's see if we can make this better with Azure OpenAI

# Using Azure OpenAI

Of course we want OpenAI to give a better answer, so we instead of sending these results, <u>_we send the content of the documents of the search result articles to OpenAI and lets GPT model give the answer._</u>
In MIKES case, he had reduced the Top 10 to Top 3, so, only had three (3) results returned to select from....
The problem is that the content of the search result files is or can be very lengthy, more than the allowed tokens allowed by the GPT Azure OpenAI models. So what we need to do is to split in chunks, vectorize and do a vector semantic search. 

Notice that **the documents chunks are already done in Azure Search**. file_content dictionary (created in the cell above) contains the pages (chunks) of each document. So we dont really need to chunk them again, each doc page for sure will fit on the max tokens limit of the completions LLM and of the embedding LLM.

We will use a genius library call LangChain that wraps a lot of boiler plate code.

In [11]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
# NOTE the below clever Python with two ='s     if a = b if b exists, else a = c....  go look it up...
os.environ["OPENAI_API_BASE"] = os.environ["AZURE_OPENAI_ENDPOINT"] = AZURE_OPENAI_ENDPOINT
os.environ["OPENAI_API_KEY"] = os.environ["AZURE_OPENAI_API_KEY"] = AZURE_OPENAI_API_KEY
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"] = AZURE_OPENAI_API_VERSION

In [12]:
docs = []
for key,value in file_content.items():
    for page in value["chunks"]:
        docs.append(Document(page_content=page, metadata={"source": value["location"]}))
        
print("Number of chunks:",len(docs))

Number of chunks: 1


Depending of the amount of chunks/pages returned from the search result, which is very related to the size of the documents returned
we pick the embedding model that give us fast results. <br>The logic is, if there is less than 50 chunks (of 5000 chars each) to vectorize then we use 
OpenAI models which currently don't offer batch processing, but if there is more than 50 chunks we use a BERT based in-memory model that processes in batches and in parallel (it is recommended a VM of at least 4 cores).

For more information on in-memory models that you can use, see [HERE](https://www.sbert.net/docs/pretrained_models.html)

In [13]:
# Select the Embedder model
if len(docs) < 50:
    # OpenAI models are accurate but slower
    embedder = OpenAIEmbeddings(document_model_name="text-embedding-ada-002", query_model_name="text-embedding-ada-002") 
else:
    # Bert based models are faster (3x-10x) but not as great in accuracy as OpenAI models
    # Since this repo supports Multiple languages we need to use a multilingual model. 
    # But if English only is the requirement, use "multi-qa-MiniLM-L6-cos-v1"
    # The fastest english model is "all-MiniLM-L12-v2"
    if random.choice(list(file_content.items()))[1]["language"] == "en":
        embedder = HuggingFaceEmbeddings(model_name = 'multi-qa-MiniLM-L6-cos-v1')
    else:
        embedder = HuggingFaceEmbeddings(model_name = 'distiluse-base-multilingual-cased-v2')

In [14]:
embedder

OpenAIEmbeddings(client=<class 'openai.api_resources.embedding.Embedding'>, document_model_name='text-embedding-ada-002', query_model_name='text-embedding-ada-002', openai_api_key=None)

In [15]:
%%time
if(len(docs)>1):
    db = FAISS.from_documents(docs, embedder)
else:
    print("No results Found")

No results Found
CPU times: user 42 µs, sys: 6 µs, total: 48 µs
Wall time: 51.3 µs


In [16]:
docs_db = db.similarity_search(QUESTION, k=4)

NameError: name 'db' is not defined

At this point we already have the most similar chunks (in order of relevance given by the in-memory vector cosine similarity search) in docs_db

### Now we use GPT-3.5(Turbo) using map-reduce chain in order to stay within the limits of the allow model's token count

for more information on the different types of prompts for these chains please see here:

https://github.com/hwchase17/langchain/tree/master/langchain/chains/question_answering

In [None]:
# Make sure you have the deployment named "gpt-35-turbo" for the model "gpt-35-turbo (0301)". 
# Use "gpt-4" if you have it available.
llm = AzureChatOpenAI(deployment_name="gpt-35-turbo", temperature=0.9, max_tokens=500)
chain = load_qa_with_sources_chain(llm, chain_type="map_reduce", return_intermediate_steps=True)

In [None]:
%%time
response = chain({"input_documents": docs_db, "question": QUESTION}, return_only_outputs=True)

In [None]:
# 2023Apr20 replaced with better if else test from mev... avoid 'Invalid Index error..'
# answer = response['output_text']

# display(HTML('<h4>Azure OpenAI ChatGPT Answer:</h4>'))
# print(answer.split("SOURCES:")[0])
# print("Sources:")
# print(answer.split("SOURCES:")[1].replace(" ","").split(","))

In [None]:
answer = response['output_text']

display(HTML('<h4>Azure OpenAI ChatGPT Answer:</h4>'))
print(answer.split("SOURCES:")[0])
#20230419 Mark Vogt (Avanade) ADDED exception-handling in case "answer" comes back with NO "SOURCES:"...
if(answer.find("SOURCES:") < 0):
    print("Sources: none")
else:
    print("Sources:")
    print(answer.split("SOURCES:")[1].replace(" ","").split(","))

In [None]:
# Uncomment if you want to inspect the results from each top similar chunk (k=4 by default)
response['intermediate_steps']

# Summary
##### This answer is way better than taking just the result from Azure Cognitive Search. So the summary is:
- Azure Cognitive Search give us the top results (context)
- Azure OpenAI takes these results and understand the content and uses it as context to give the best answer
- Best of two worlds!

# NEXT
We know now how to do a Smart Search Engine!! great!

But, does this solve all the possible scenarios that a virtual assistant will require?  **What about if the answer to the Smart Search Engine is not related to text, but instead requires to look into tabular data?** The next notebook 04 explains and solves this problem