# Azure OpenAI Service - Q&A with semantic answering exercise

In this tutorial, you'll build a simple Q&A system that gives semantic answers to questions. Three sample documents from the Azure documentation are provided. Fill in the missing pieces of the source code (marked with `#FIXME`) to get everything working.

In [None]:
import os
import json
import tiktoken
import openai
import numpy as np
from dotenv import load_dotenv
from openai.embeddings_utils import cosine_similarity

# Load environment variables
load_dotenv()

# Configure OpenAI API
openai.api_type = "azure"
openai.api_version = "2022-12-01"
openai.api_base = os.getenv('OPENAI_API_BASE')
openai.api_key = os.getenv("OPENAI_API_KEY")

# Define embedding model and encoding
EMBEDDING_MODEL = 'text-embedding-ada-002'
EMBEDDING_ENCODING = 'cl100k_base'
EMBEDDING_CHUNK_SIZE = 8000
COMPLETION_MODEL = 'text-davinci-003'

# initialize tiktoken for encoding text
encoding = tiktoken.get_encoding(EMBEDDING_ENCODING)

Next, let's read the sample documents in `data-qna/*.json`. The `content` field is the interesting piece of information for this tutorial:

```json
{
  "content": "\n# What is Azure OpenAI?\n\nThe ...",
  "product_name": "cognitive-services",
  "title": "Azure Cognitive Services",
  "description": "Apply advanced language models to variety of use cases with the Azure OpenAI service",
  "topic": "overview",
  "date": "11/07/2022"
}
```

In [None]:
# list all files in the samples directory
samples_dir = os.path.join(os.getcwd(), "data-qna/")
sample_files = os.listdir(samples_dir)

# read the content field from each file, remove any newlines (better for embeddings later), and append it to documents
documents = []
for file in sample_files:
    with open(os.path.join(samples_dir, file), "r") as f:
        content = json.load(f)["content"]
        content = content.replace("\n", " ")
        content = content.replace("  ", " ")
        documents.append(content)

# print some stats about the documents
print(f"Loaded {len(documents)} documents")
for doc in documents:
    num_tokens = len(encoding.encode(doc))
    print(f"Content: {doc[:80]}... \n---> Tokens: {num_tokens}\n")
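
The two `replace` calls above turn newlines into spaces, but a single pass of `replace("  ", " ")` only shortens longer runs of spaces rather than removing them. A minimal sketch of an alternative, assuming it is acceptable to collapse every run of whitespace into one space (the helper name `normalize_whitespace` is hypothetical and not part of the exercise):

```python
def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace (spaces, tabs, newlines) into one space."""
    # str.split() without arguments splits on any whitespace run and drops
    # leading/trailing whitespace, so joining with " " yields single spaces only.
    return " ".join(text.split())


# Toy input modelled on the shape of the `content` field shown earlier.
cleaned = normalize_whitespace("\n# What is  Azure\n\n\nOpenAI?\n")
print(cleaned)
```

Whether this is preferable depends on the documents: it also strips leading and trailing whitespace, which the original two `replace` calls do not.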

Now that we have all documents loaded, we can embed them using our embedding model:

In [None]:
# Create embeddings for all docs
embeddings = #FIXME

# print some stats about the embeddings
for e in embeddings:
    print(len(e))
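
For `text-embedding-ada-002`, each printed length should be 1536. The overall shape of this step can be sketched without calling the service: one vector per document, in the same order as `documents`. The sketch below is an illustration under assumptions, not the exercise solution: `embed_documents` and `fake_embed` are hypothetical names, and the commented-out call assumes the legacy `openai<1.0` SDK and an Azure deployment named like `EMBEDDING_MODEL`.

```python
from typing import Callable, List, Sequence


def embed_documents(
    documents: Sequence[str],
    embed_fn: Callable[[str], List[float]],
) -> List[List[float]]:
    """Return one embedding vector per document, preserving the input order."""
    return [embed_fn(doc) for doc in documents]


# Assumption: with the legacy openai<1.0 SDK configured as in the first cell,
# embed_fn could look roughly like this (needs network access and a deployment):
#   lambda text: openai.Embedding.create(
#       input=text, engine=EMBEDDING_MODEL)["data"][0]["embedding"]


def fake_embed(text: str) -> List[float]:
    """Offline stand-in for a real embedder: [character count, word count]."""
    return [float(len(text)), float(len(text.split()))]


vectors = embed_documents(["hello world", "azure"], fake_embed)
print(vectors)
```

Passing the embedder in as a function keeps the loop testable offline and makes it easy to swap in a batched or cached implementation later.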

Now that we have our embeddings, we can ask some questions and check whether the search retrieves the correct document. You can try the following questions:

* what is azure openai service?
* can translator be fine tuned?
* what is the difference between luis and clu?
* what is form recognizer? (should yield no result)

In [None]:
# create embedding for question
question = "what is azure openai service?"
qe = #FIXME

# calculate cosine similarity between question and each document
similarities = #FIXME

# Get the matching document, in this case we just use argmax of similarities
max_i = #FIXME

# print some stats about the similarities
for i, s in enumerate(similarities):
    print(f"Similarity to {sample_files[i]} is {s}")
print(f"Matching document is {sample_files[max_i]}")
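
The ranking step can be illustrated on toy vectors. The notebook imports `cosine_similarity` from `openai.embeddings_utils`; the sketch below reimplements it with NumPy so that it runs offline. Note that a plain argmax always returns some document, so a question such as "what is form recognizer?" only yields no result if a minimum similarity is enforced. The `min_similarity` parameter and the helper names are assumptions for illustration, and a suitable threshold has to be found empirically.

```python
import numpy as np


def cosine_sim(a, b) -> float:
    """Cosine similarity between two non-zero vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def best_match(question_embedding, doc_embeddings, min_similarity=None):
    """Return (index of best document or None, list of all similarities)."""
    sims = [cosine_sim(question_embedding, d) for d in doc_embeddings]
    best = int(np.argmax(sims))
    # Hypothetical cut-off: treat weak matches as "no result".
    if min_similarity is not None and sims[best] < min_similarity:
        return None, sims
    return best, sims


# Toy 2-D "embeddings"; real ada-002 vectors have 1536 dimensions.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_question = [0.9, 0.1]

index, sims = best_match(toy_question, toy_docs)
print(index, [round(s, 3) for s in sims])
```

Here the first toy document wins because it points in almost the same direction as the question vector; vector length does not matter for cosine similarity.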

In [None]:
# Build the prompt for the completion model; here it contains the matched document and the question
prompt = #FIXME

# get response from completion model
response = #FIXME
answer = #FIXME

# print the question and answer
print(f"Question was: {question}\nRetrieved answer was: {answer}")
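
One possible way to combine the matched document and the question into a prompt is sketched below. The wording of the template and the name `build_prompt` are assumptions, not the exercise's reference solution. The commented-out request assumes the legacy `openai<1.0` SDK and an Azure deployment named like `COMPLETION_MODEL`.

```python
def build_prompt(document: str, question: str) -> str:
    """Combine the retrieved document and the question into one completion prompt."""
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say \"I don't know\".\n\n"
        f"Context:\n{document}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Toy context standing in for documents[max_i].
example_prompt = build_prompt(
    "(toy context) Azure OpenAI Service gives access to OpenAI language models.",
    "what is azure openai service?",
)
print(example_prompt)

# Assumption: with the legacy openai<1.0 SDK, the prompt could be sent roughly so:
#   response = openai.Completion.create(
#       engine=COMPLETION_MODEL, prompt=example_prompt,
#       max_tokens=200, temperature=0)
#   answer = response["choices"][0]["text"].strip()
```

Telling the model to rely only on the supplied context, and to admit when the answer is missing, is a common way to reduce made-up answers when the retrieved document does not match the question.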

Great, that worked. You should now have a basic understanding of how Q&A can work using OpenAI embeddings and completions. Possible next steps include:

* Chunking of longer documents (you might run into token limits for embeddings and the answering prompt)
* Usage of a vector database (Pinecone, Redis, etc.) to scale the search part to a larger number of documents
* Evaluation of the top k results, instead of just the best matching document
* ...and a few more!
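
Two of these follow-ups, chunking and top-k retrieval, can be sketched in a few lines. Both helpers are hypothetical illustrations: `chunk_tokens` works on a plain list of token ids (in the notebook these could come from `encoding.encode(doc)`, and each chunk could be turned back into text with `encoding.decode(chunk)`), and `top_k_indices` ranks documents by similarity instead of taking only the argmax.

```python
import numpy as np


def chunk_tokens(tokens, chunk_size):
    """Split a list of token ids into consecutive chunks of at most chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def top_k_indices(similarities, k):
    """Return the indices of the k highest similarities, best first."""
    order = np.argsort(similarities)[::-1]  # ascending sort, then reversed
    return [int(i) for i in order[:k]]


# Toy data: ten fake token ids and three fake similarity scores.
chunks = chunk_tokens(list(range(10)), chunk_size=4)
ranked = top_k_indices([0.2, 0.9, 0.5], k=2)
print(chunks)
print(ranked)
```

Fixed-size chunks are the simplest option; splitting on headings or sentences, possibly with some overlap between chunks, often preserves more context but needs document-specific logic.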