# Queries with and without Azure OpenAI

Now that our search engine is loaded and running, we will try some example queries and then use the Azure OpenAI service to see whether we can get even better results.

## Set up variables

In [1]:
import os
import urllib
import requests
from IPython.display import display, HTML
from langchain.llms import AzureOpenAI
from langchain.chat_models import AzureChatOpenAI
from langchain.vectorstores import FAISS
from langchain.docstore.document import Document
from langchain.chains.question_answering import load_qa_chain
from langchain.chains.qa_with_sources import load_qa_with_sources_chain

from app.embeddings import OpenAIEmbeddings
from app.prompts import STUFF_PROMPT, REFINE_PROMPT, REFINE_QUESTION_PROMPT
from app.credentials import (
    DATASOURCE_CONNECTION_STRING,
    AZURE_SEARCH_API_VERSION,
    AZURE_SEARCH_ENDPOINT,
    AZURE_SEARCH_KEY,
    COG_SERVICES_NAME,
    COG_SERVICES_KEY,
    AZURE_OPENAI_ENDPOINT,
    AZURE_OPENAI_KEY,
    AZURE_OPENAI_API_VERSION
)

In [2]:
# Set up the request headers and query parameters
headers = {'Content-Type': 'application/json','api-key': AZURE_SEARCH_KEY}
params = {'api-version': AZURE_SEARCH_API_VERSION}

## Without Azure OpenAI

In [3]:
# Index that we are going to query (from Notebook 01)
index_name = "cogsrch-index"

In [12]:
QUESTION = "What are decission trees?" 

# Try questions that might be answered or addressed in computer science papers from 2020-2021,
# and compare the results with the public version of ChatGPT.
# The idea is that answers from Azure OpenAI only draw on the information contained in these publications.

# For Example:
# What is CLP?
# How do Markov chains work?
# What are some examples of reinforcement learning?

In [24]:
url = AZURE_SEARCH_ENDPOINT + '/indexes/'+ index_name + '/docs'
url += '?api-version={}'.format(AZURE_SEARCH_API_VERSION)
url += '&search={}'.format(QUESTION)
url += '&select=pages'
url += '&$top=5'  # You can change this to anything you need/want
url += '&queryLanguage=en-us'
url += '&queryType=semantic'
url += '&semanticConfiguration=my-semantic-config'
url += '&$count=true'
url += '&speller=lexicon'
url += '&answers=extractive|count-3'
url += '&captions=extractive|highlight-false'

resp = requests.get(url, headers=headers)
print(url)
print(resp.status_code)

search_results = resp.json()
print("Results Found: {}, Results Returned: {}".format(search_results['@odata.count'], len(search_results['value'])))

https://azure-cog-search-cstevuxaqrxcm.search.windows.net/indexes/cogsrch-index/docs?api-version=2021-04-30-Preview&search=What are decission trees?&select=pages&$top=5&queryLanguage=en-us&queryType=semantic&semanticConfiguration=my-semantic-config&$count=true&speller=lexicon&answers=extractive|count-3&captions=extractive|highlight-false
200
Results Found: 9745, Results Returned: 5
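
The query string in the cell above is concatenated by hand, so the question text goes into the URL as typed. As a minimal sketch, the same parameters could be assembled with the standard library's `urlencode`, which percent-encodes spaces and the trailing `?`. The helper name `build_search_url` and the `example.search.windows.net` endpoint are assumptions made for illustration; the parameter names and values are copied from the cell above.

```python
from urllib.parse import quote, urlencode


def build_search_url(endpoint, index_name, api_version, question, top=5):
    """Build the search GET URL with the question text percent-encoded."""
    query = {
        "api-version": api_version,
        "search": question,
        "select": "pages",
        "$top": top,
        "queryLanguage": "en-us",
        "queryType": "semantic",
        "semanticConfiguration": "my-semantic-config",
        "$count": "true",
        "speller": "lexicon",
        "answers": "extractive|count-3",
        "captions": "extractive|highlight-false",
    }
    # Keep "$" and "|" literal so the URL stays close to the hand-built one;
    # quote_via=quote encodes spaces as %20 instead of "+".
    return f"{endpoint}/indexes/{index_name}/docs?" + urlencode(
        query, safe="$|", quote_via=quote
    )


# Hypothetical endpoint, used only to show the resulting URL shape.
url = build_search_url(
    "https://example.search.windows.net",
    "cogsrch-index",
    "2021-04-30-Preview",
    "What is CLP?",
)
print(url)
```

The resulting string can be passed to `requests.get(url, headers=headers)` exactly as before.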


In [25]:
# Answers from semantic Search
display(HTML('<h4>Top Answers</h4>'))
for result in search_results['@search.answers']:
    if result['score'] > 0.5: # Show answers that are at least 50% of the max possible score=1
        display(HTML('<h5>' + 'Answer - score: ' + str(result['score']) + '</h5>'))
        display(HTML(result['text']))

        
# Top results, ranked by the semantic reranker score
file_content = dict()

print("\n\n")
display(HTML('<h4>Top Results</h4>'))
for result in search_results['value']:
    if result['@search.rerankerScore'] > 0.4: # Show results that are at least 10% of the max possible score=4
        display(HTML('<h5>' + result['metadata_storage_name'] + '&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;score: ' + str(result['@search.rerankerScore']) + '</h5>'))
        display(HTML(result['@search.captions'][0]['text']))
        file_content[result['metadata_storage_path']]={
                            "content": result['pages'],  
                            "score": result['@search.rerankerScore'], 
                            "caption": result['@search.captions'][0]['text']        
                            }






## Comments on Query results

As seen above, the semantic search feature of Azure Cognitive Search is pretty good. It gives us the top answers as well as the top results, each with its file name and the paragraph where the answer is likely located.
Let's see if we can make this better with Azure OpenAI.

## Using Azure OpenAI

Of course we want OpenAI to give a better answer, so instead of sending these results, we send the content (pages) of the search result articles to OpenAI and let the GPT model give the answer.

The problem is that the content of the search result files can be very lengthy, exceeding the number of tokens allowed by the Azure OpenAI GPT models. So we need to split the content into chunks, vectorize them, and run a vector similarity search.

Notice that **the documents are already chunked in Azure Search**. The `file_content` dictionary (created in the cell above) contains the pages (chunks) of each document, so we don't need to chunk them again: each page will fit within the maximum token limit of the davinci-003 and text-embedding-ada-002 models.


We will use a library called LangChain, which wraps a lot of boilerplate code.

In [15]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
os.environ["OPENAI_API_BASE"] = os.environ["AZURE_OPENAI_ENDPOINT"] = AZURE_OPENAI_ENDPOINT
os.environ["OPENAI_API_KEY"] = os.environ["AZURE_OPENAI_API_KEY"] = AZURE_OPENAI_KEY
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"] = AZURE_OPENAI_API_VERSION

In [16]:
# In Azure OpenAI create a deployment for the model "text-embedding-ada-002"
# and VERY IMPORTANT name the deployment the same: "text-embedding-ada-002"
embeddings = OpenAIEmbeddings()

In [17]:
docs = []
for key,value in file_content.items():
    for page in value["content"]:
        docs.append(Document(page_content=page, metadata={"source": key}))

In [18]:
%%time
if docs:  # build the index whenever at least one page was returned
    db = FAISS.from_documents(docs, embeddings)
else:
    print("No results Found")

CPU times: user 145 ms, sys: 9.11 ms, total: 154 ms
Wall time: 4.55 s


In [19]:
docs_db = db.similarity_search(QUESTION)

At this point we already have the most similar chunks in `docs_db`, in order of relevance as given by the in-memory vector cosine similarity search.
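
To make the idea of ranking by cosine similarity concrete, here is a toy sketch in plain Python. The three-dimensional vectors and chunk names are made up for illustration (real embeddings have many more dimensions), and this is not how FAISS is implemented internally; it only shows the ordering principle.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the names of the k chunks most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in chunk_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical embeddings for three chunks and one query.
toy_chunks = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.7, 0.7, 0.0],
    "chunk_c": [0.0, 0.0, 1.0],
}
toy_query = [1.0, 0.1, 0.0]

ranked = top_k(toy_query, toy_chunks, k=2)
print(ranked)  # ['chunk_a', 'chunk_b']
```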

### Now we use GPT-3.5 Turbo with a map-reduce chain to stay within the model's token limit

For more information on the different types of prompts for these chains, see:

https://github.com/hwchase17/langchain/tree/master/langchain/chains/question_answering
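
As a rough sketch of the map-reduce idea (not LangChain's actual prompts or implementation), each chunk is sent to the model in its own call, and the partial answers are then combined in one final call, so no single request has to hold all the chunks at once. The function `map_reduce_answer`, the prompt wording, and the `stub_llm` stand-in are all assumptions made for illustration.

```python
def map_reduce_answer(question, chunks, llm):
    """Answer a question over many chunks without sending them all at once."""
    # Map step: one model call per chunk, each small enough for the token limit.
    partial_answers = [
        llm(f"Using only this text, answer '{question}':\n{chunk}")
        for chunk in chunks
    ]
    # Reduce step: a final call that sees only the short partial answers.
    combined = "\n".join(partial_answers)
    return llm(f"Combine these partial answers to '{question}':\n{combined}")


# Hypothetical stand-in for a real model: it records each prompt and
# returns a numbered placeholder instead of generated text.
calls = []


def stub_llm(prompt):
    calls.append(prompt)
    return f"summary-{len(calls)}"


final = map_reduce_answer(
    "What is CLP?", ["page one", "page two", "page three"], stub_llm
)
print(len(calls), final)  # 4 summary-4
```

With three chunks the model is called four times: three map calls plus one reduce call. This is one reason the chain in the following cells can take noticeably longer than a single request.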

In [20]:
# Make sure you have the deployment named "gpt-35-turbo" for the model "gpt-35-turbo (0301)"
llm = AzureChatOpenAI(deployment_name="gpt-35-turbo", temperature=0.3, max_tokens=500)
chain = load_qa_with_sources_chain(llm, chain_type="map_reduce", return_intermediate_steps=True)

In [21]:
%%time
response = chain({"input_documents": docs_db, "question": QUESTION}, return_only_outputs=True)

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised Timeout: Request timed out: HTTPSConnectionPool(host='open-ai-pinternal.openai.azure.com', port=443): Read timed out. (read timeout=60).
Token indices sequence length is longer than the specified maximum sequence length for this model (2145 > 1024). Running this sequence through the model will result in indexing errors


CPU times: user 612 ms, sys: 93.2 ms, total: 705 ms
Wall time: 1min 19s


In [22]:
answer = response['output_text']

display(HTML('<h4>Azure OpenAI ChatGPT Answer:</h4>'))
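# partition() avoids an IndexError when the output has no "SOURCES:" marker
answer_text, _, sources = answer.partition("SOURCES:")
print(answer_text)
print("Sources:")
print([s for s in sources.strip().replace(" ", "").split(",") if s])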

Decision trees are a fundamental programming abstraction used to determine one of n outcome events. They can be defined as a rooted directed tree in which each internal node is labeled by a coordinate and each leaf is labeled by an element of the output set. Decision trees are used to compute a function and have a cost on input x as the length of the root-leaf path T follows on input x. There are also zero-error randomized decision tree complexities R(T) and R(f). Decision trees are also used in dynamic branch prediction, where branches are predicted based on prior instances of the same and different branch instructions. There is a binary tree function where the internal nodes have labels from {1, 2, . . . , n} and whose leaves have labels from {0, 1}. If a node has label i, then the test performed at that node is to examine the ith bit of the input. If the result is 0, one descends into the left subtree, whereas if the result is 1, one descends into the right subtree. The label of the

In [23]:
# Uncomment if you want to inspect the results from each top similar chunk
# response['intermediate_steps']

# Summary
##### This answer is much better than the results from Azure Cognitive Search alone. In summary:
- Azure Cognitive Search gives us the top results (context)
- Azure OpenAI takes these results, understands their content, and uses it as context to give the best answer
- Best of two worlds!