# Azure OpenAI Service - Q&A with semantic answering exercise

In this tutorial, you'll build a simple Q&A system that gives semantic answers to questions. Three sample documents from the Azure documentation are provided. Fill in the missing pieces in the source code (marked with `#FIXME`) to get everything working.

In [3]:
pip install "openai>=1.0"


In [40]:
import os
import json
import tiktoken
import openai
import numpy as np
from dotenv import load_dotenv
# from openai.embeddings_utils import cosine_similarity

# Load environment variables
load_dotenv()

# Configure Azure OpenAI Service API
api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
api_key = os.getenv("AZURE_OPENAI_API_KEY")

# Client used for the embedding deployment
client = openai.AzureOpenAI(
    azure_endpoint=api_base,
    api_key=api_key,
    api_version="2023-09-01-preview"
)
# Client used for the completion deployment, which lives in a second Azure OpenAI resource.
# Never hard-code endpoints or keys in a notebook. The two variable names below are
# assumed names; define them in your .env file.
client2 = openai.AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_COMPLETION_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_COMPLETION_API_KEY"),
    api_version="2022-12-01"
)

# Define embedding model and encoding
EMBEDDING_MODEL = 'text-embedding-ada-002'
EMBEDDING_ENCODING = 'cl100k_base'
EMBEDDING_CHUNK_SIZE = 8000
COMPLETION_MODEL = 'davinci-002'

# initialize tiktoken for encoding text
encoding = tiktoken.get_encoding(EMBEDDING_ENCODING)

Next, let's read our sample documents from `data/qna/*.txt`:

In [17]:
# list all files in the samples directory (sorted so that indices are stable across runs)
samples_dir = os.path.join(os.getcwd(), "data", "qna")
sample_files = sorted(os.listdir(samples_dir))

# read each file and remove any newlines (better for embeddings later)
documents = []
for file in sample_files:
    with open(os.path.join(samples_dir, file), "r", encoding="utf-8") as f:
        content = f.read()
        content = content.replace("\n", " ")
        content = content.replace("  ", " ")
        documents.append(content)

# print some stats about the documents
print(f"Loaded {len(documents)} documents")
for doc in documents:
    num_tokens = len(encoding.encode(doc))
    print(f"Content: {doc[:80]}... \n---> Tokens: {num_tokens}\n")

Loaded 3 documents
Content:  # What is conversational language understanding? Conversational language unders... 
---> Tokens: 1341

Content:  # What is Azure OpenAI? The Azure OpenAI service provides REST API access to Op... 
---> Tokens: 1891

Content:  # What is Azure Cognitive Services Translator? Translator Service is a cloud-ba... 
---> Tokens: 739



Now that we have all documents loaded, we can embed them using our embedding model:

In [22]:
# Create embeddings for all docs
embeddings = []
for doc in documents:
    response = client.embeddings.create(model=EMBEDDING_MODEL, input=doc)
    embeddings.append(response.data[0].embedding)


# print some stats about the embeddings
for e in embeddings:
    print(len(e))

1536
1536
1536


Now that we have our embeddings, we can ask some questions and check whether the correct document is retrieved. You can try the following questions:

* what is azure openai service?
* can translator be fine tuned?
* what is the difference between luis and clu?
* what is form recognizer? (none of the sample documents covers this, so ideally it should yield no result)

In [35]:
from sklearn.metrics.pairwise import cosine_similarity

# create embedding for question
question = "what is azure openai service?"
qe = client.embeddings.create(model=EMBEDDING_MODEL, input=question) #FIXME
qe_embedding = qe.data[0].embedding

# calculate cosine similarity between question and each document
similarities = cosine_similarity([qe_embedding], embeddings)[0] #FIXME

# Get the matching document, in this case we just use argmax of similarities
max_i = np.argmax(similarities) #FIXME

# print some stats about the similarities
for i, s in enumerate(similarities):
    print(f"Similarity to {sample_files[i]} is {s}")
print(f"Matching document is {sample_files[max_i]}")

Similarity to overview_clu.txt is 0.7744814220828408
Similarity to overview_openai.txt is 0.8682318308941729
Similarity to overview_translator.txt is 0.7929442705877128
Matching document is overview_openai.txt
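
Note that `argmax` always returns some document, even for a question that none of the sample documents covers. One way to get a "no result" outcome is to require a minimum similarity. The block below is a minimal sketch of that idea in plain Python; the helper names `cosine` and `best_match` and the threshold of `0.8` are assumptions for illustration, not part of the exercise, and a sensible threshold would need to be tuned on your own data.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(question_vec, doc_vecs, min_similarity=0.8):
    """Return (index, score) of the most similar document.

    The index is None when there are no documents or when the best score
    is below min_similarity (an assumed, untuned threshold).
    """
    if not doc_vecs:
        return None, 0.0
    scores = [cosine(question_vec, d) for d in doc_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < min_similarity:
        return None, scores[best]
    return best, scores[best]
```

In the notebook this could be called as `best_match(qe_embedding, embeddings)`; a returned index of `None` would mean "no document is similar enough to answer this question".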


In [43]:
# Generate a prompt that we use for completion, in this case we put the matched document and the question in the prompt
prompt = f"Document: {documents[max_i]}\n\nQuestion: {question}\n\nAnswer:" #FIXME

# get response from completion model
response = client2.completions.create(
    model=COMPLETION_MODEL,
    prompt=prompt
) #FIXME
answer = response.choices[0].text #FIXME

# print the question and answer
print(f"Question was: {question}\nRetrieved answer was: {answer}")

Question was: what is azure openai service?
Retrieved answer was:  Azure OpenAI Service is a [PLAINTEXT.](https://en
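
The answer above stops mid-sentence. A plausible cause is that the request sets no `max_tokens`, and the Completions API defaults to 16 generated tokens. The sketch below builds the request arguments with an explicit token budget, a temperature of 0 for more repeatable answers, and a stop sequence so the model does not continue with a new question. The helper name `build_completion_request` and the chosen values are assumptions for illustration.

```python
def build_completion_request(document, question, model, max_tokens=200):
    """Build keyword arguments for a completions call (hypothetical helper)."""
    prompt = f"Document: {document}\n\nQuestion: {question}\n\nAnswer:"
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,      # assumed budget; the API default is 16
        "temperature": 0,              # favour deterministic answers
        "stop": ["\n\nQuestion:"],     # stop before the model invents another question
    }

# In the notebook, the result could be passed on like this:
# params = build_completion_request(documents[max_i], question, COMPLETION_MODEL)
# response = client2.completions.create(**params)
```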


Great, that worked. You should now have a basic understanding of how Q&A can work using Azure OpenAI Service embeddings and completions. Possible next steps:

* Chunking of longer documents (you might run into token limits for embeddings and the answering prompt)
* Usage of a vector database (Pinecone, Redis, etc.) to scale the search part to a larger number of documents
* Evaluation of the top k results, instead of just the best matching document
* ...and a few more!
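
Two of the steps above, chunking and top-k retrieval, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `chunk_tokens` and `top_k` are hypothetical helper names, the overlap between chunks is an assumed design choice, and in the notebook the token list would come from `encoding.encode(doc)` and the scores from the `similarities` array.

```python
def chunk_tokens(tokens, chunk_size, overlap=0):
    """Split a token list into chunks of at most chunk_size tokens.

    Consecutive chunks share `overlap` tokens so that text cut at a
    chunk boundary still appears with some context in the next chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the document
        start += step
    return chunks

def top_k(scores, k):
    """Return the indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]
```

Each chunk could then be decoded with `encoding.decode(chunk)`, embedded separately, and the `top_k` chunks (rather than one whole document) placed into the answering prompt.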