# Azure OpenAI Service - Q&A with semantic answering exercise

In this tutorial, you'll build a simple Q&A system that gives semantic answers to questions. Three sample documents from the Azure documentation are provided. Fill in the missing pieces in the source code (marked with `#FIXME`) to get everything working.

In [1]:
import os
import tiktoken
import openai
import numpy as np
from dotenv import load_dotenv
from openai.embeddings_utils import cosine_similarity

# Load environment variables
load_dotenv('../.env')

# Configure Azure OpenAI Service API
openai.api_type = "azure"
openai.api_version = os.getenv('OPENAI_API_VERSION', "2022-12-01")
OPENAI_API_BASE = openai.api_base = os.getenv('OPENAI_API_BASE')
openai.api_key = os.getenv("OPENAI_API_KEY")

# Define embedding model and encoding
EMBEDDING_MODEL = os.getenv('OPENAI_EMBEDDING_MODEL', 'text-embedding-ada-002')
EMBEDDING_ENCODING = os.getenv('OPENAI_EMBEDDING_ENCODING', 'cl100k_base')
# os.getenv returns a string when the variable is set, so cast to int
EMBEDDING_CHUNK_SIZE = int(os.getenv('OPENAI_EMBEDDING_CHUNK_SIZE', 8000))
COMPLETION_MODEL = os.getenv('OPENAI_COMPLETION_MODEL', 'gpt-35-turbo')

# initialize tiktoken for encoding text
encoding = tiktoken.get_encoding(EMBEDDING_ENCODING)

params_gathered = dict(
    EMBEDDING_MODEL=EMBEDDING_MODEL,
    EMBEDDING_ENCODING=EMBEDDING_ENCODING,
    EMBEDDING_CHUNK_SIZE=EMBEDDING_CHUNK_SIZE,
    COMPLETION_MODEL=COMPLETION_MODEL,
    OPENAI_API_VERSION=openai.api_version,
    OPENAI_API_BASE=OPENAI_API_BASE
)
for key, val in params_gathered.items():
    print(key, val)


EMBEDDING_MODEL text-embedding-ada-002
EMBEDDING_ENCODING cl100k_base
EMBEDDING_CHUNK_SIZE 8000
COMPLETION_MODEL gpt-35-turbo
OPENAI_API_VERSION 2022-12-01
OPENAI_API_BASE https://handsonoctober.openai.azure.com/


Next, let's read our sample documents from `../data/qna/*.txt`:

In [2]:
# list all files in the samples directory
samples_dir = os.path.join(os.getcwd(), "../data/qna/")
sample_files = os.listdir(samples_dir)

# read each file and remove any newlines (better for embeddings later)
documents = []
for file in sample_files:
    with open(os.path.join(samples_dir, file), "r") as f:
        content = f.read()
        content = content.replace("\n", " ")
        content = content.replace("  ", " ")
        documents.append(content)

# print some stats about the documents
print(f"Loaded {len(documents)} documents")
for doc in documents:
    num_tokens = len(encoding.encode(doc))
    print(f"Content: {doc[:80]}... \n---> Tokens: {num_tokens}\n")

Loaded 3 documents
Content:  # What is Azure Cognitive Services Translator? Translator Service is a cloud-ba... 
---> Tokens: 739

Content:  # What is conversational language understanding? Conversational language unders... 
---> Tokens: 1341

Content:  # What is Azure OpenAI? The Azure OpenAI service provides REST API access to Op... 
---> Tokens: 1891



Now that we have all documents loaded, we can embed them using our embedding model:

In [3]:
# Use the encoder we already initialized to encode the documents:
#encoding = tiktoken.get_encoding(EMBEDDING_ENCODING)
API = [func for func in dir(encoding) if str(func)[0]!='_']
print("This is the encoder API", API)


# Create embeddings for all docs. Use a local model to avoid running out of quota.
embeddings = #FIXME

# print some stats about the embeddings
for e in embeddings:
    print(len(e))

SyntaxError: invalid syntax (3554952104.py, line 8)
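
One possible shape for the embedding step is sketched below. It is not the intended solution to the `#FIXME`: the names `toy_embed` and `embed_documents` are hypothetical, and `toy_embed` is only a hashed bag-of-words stand-in that runs offline (so it uses no quota) and carries no real semantic meaning. The point is the structure: one function that maps a text to a vector, applied to every document.

```python
import hashlib
import numpy as np

def toy_embed(text, dim=64):
    """Hypothetical offline stand-in for an embedding model:
    a hashed bag-of-words vector, L2-normalized."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        # hashlib is stable across runs (the built-in hash() is salted per process)
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def embed_documents(docs, embed_fn):
    """Apply embed_fn to each document and return the list of vectors."""
    return [embed_fn(doc) for doc in docs]

# Toy inputs, not the notebook's real documents
sample_docs = [
    "Translator is a cloud-based translation service.",
    "Azure OpenAI provides REST API access to language models.",
]
sample_embeddings = embed_documents(sample_docs, toy_embed)

# Same kind of stats the notebook prints: one vector length per document
for e in sample_embeddings:
    print(len(e))
```

In the notebook itself, `embed_fn` would instead call the deployed embedding model. Assuming the pre-1.0 `openai` SDK that the imports above imply, that could be a small wrapper around `openai.Embedding.create(input=doc, engine=EMBEDDING_MODEL)` that reads `["data"][0]["embedding"]` from the response.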

Now that we have our embeddings, we can try to ask some questions and see if it retrieves the correct document. You can try the following questions:

* what is azure openai service?
* can translator be fine tuned?
* what is the difference between luis and clu?
* what is form recognizer? (should yield no result)

In [None]:
# create embedding for question
question = "what is azure openai service?"
qe = #FIXME  Embed the question here

# calculate cosine similarity between question and each document
similarities = #FIXME  Calculate cosine similarity between question and each document

# Get the matching document, in this case we just use argmax of similarities
max_i = #FIXME  Get the index of the max similarity

# print some stats about the similarities
for i, s in enumerate(similarities):
    print(f"Similarity to {sample_files[i]} is {s}")
print(f"Matching document is {sample_files[max_i]}")
best_match = documents[max_i]  # get the best match
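
The retrieval idea in this cell can be illustrated on toy vectors, independent of any API. The sketch below is an assumption-laden example, not the exercise solution: the vectors are made up, `cosine_sim` is a local helper rather than the `cosine_similarity` imported above, and `MIN_SIMILARITY` is an arbitrary illustrative cutoff for the "should yield no result" case that would need tuning on real embeddings.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 3-dimensional "embeddings" for three documents and one question
toy_doc_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
toy_question_vector = [0.0, 2.0, 0.0]

# Score every document against the question, then pick the best one
scores = [cosine_sim(toy_question_vector, d) for d in toy_doc_vectors]
best_index = int(np.argmax(scores))

# Assumed cutoff: below this, treat the question as having no matching document
MIN_SIMILARITY = 0.5
has_match = scores[best_index] >= MIN_SIMILARITY

print(scores, best_index, has_match)
```

Because cosine similarity ignores vector length, the question vector `[0, 2, 0]` matches the second document exactly even though their magnitudes differ.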

In [None]:
# Generate a prompt that we use for completion, in this case we put the matched document and the question in the prompt
prompt = #FIXME

# get response from completion model
response = #FIXME # get response from completion model
answer = #FIXME # get the answer from the response

# print the question and answer
print(f"Question was: {question}\nRetrieved answer was: {answer}")
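
As a rough illustration of the prompt step, the sketch below places the matched document and the question into one string. The helper name `build_prompt` and the instruction wording are assumptions made for this example; other phrasings may work equally well or better, and the call to the completion model is deliberately left out.

```python
def build_prompt(document, question):
    """Combine the retrieved document and the user's question into one prompt.
    The wording is illustrative only."""
    return (
        "Answer the question using only the document below. "
        "If the document does not contain the answer, say that you don't know.\n\n"
        f"Document:\n{document}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Toy inputs, not the notebook's real best_match document
example_prompt = build_prompt(
    "Azure OpenAI provides REST API access to language models.",
    "what is azure openai service?",
)
print(example_prompt)
```

Asking the model to admit when the document lacks the answer is one way to handle questions such as the Form Recognizer one, where no sample document applies.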

Great, that worked. You should now have a basic understanding of how Q&A can work using Azure OpenAI Service embeddings and completions. Possible next steps:

* Chunking longer documents (you might run into token limits for embeddings and the answering prompt)
* Using a vector database (Pinecone, Redis, etc.) to scale the search to a larger number of documents
* Evaluating the top k results instead of just the best-matching document
* ...and a few more!
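
Two of these next steps, chunking and top-k selection, can be sketched in a few lines. This is a simplified illustration under stated assumptions: `chunk_words` and `top_k_indices` are hypothetical helpers, and chunks are measured in whitespace-separated words as a stand-in for tokens (a real version would count tokens with the `encoding` object from the first cell).

```python
import numpy as np

def chunk_words(text, chunk_size=200, overlap=20):
    """Split text into overlapping chunks of at most chunk_size words.
    Words are a stand-in for tokens in this sketch."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks

def top_k_indices(scores, k=2):
    """Return the indices of the k highest scores, best first."""
    order = np.argsort(scores)[::-1]
    return [int(i) for i in order[:k]]

# Toy data: ten numbered words and three made-up similarity scores
demo_text = " ".join(f"w{i}" for i in range(10))
demo_chunks = chunk_words(demo_text, chunk_size=4, overlap=1)
demo_top = top_k_indices([0.1, 0.9, 0.5], k=2)

print(demo_chunks)
print(demo_top)
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighboring chunks, at the cost of embedding some text twice.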