## Retrieval Augmented Generation in Action

This notebook implements every step of the RAG pattern, using the same dataset as the previous notebook.

In [1]:
# Import required libraries
import os
import json
import openai
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chat_models import AzureChatOpenAI
from langchain.llms import AzureOpenAI
from dotenv import load_dotenv
from tenacity import retry, wait_random_exponential, stop_after_attempt
from IPython.display import display, HTML, JSON, Markdown

# Configure environment variables
load_dotenv()

True

In [2]:
service_endpoint = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
index_name = os.getenv("AZURE_SEARCH_INDEX_NAME")
key = os.getenv("AZURE_SEARCH_ADMIN_KEY")

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_DEPLOYMENT_ENDPOINT = os.getenv("OPENAI_DEPLOYMENT_ENDPOINT")
OPENAI_DEPLOYMENT_NAME = os.getenv("OPENAI_DEPLOYMENT_NAME")
OPENAI_MODEL_NAME = os.getenv("OPENAI_MODEL_NAME")
OPENAI_DEPLOYMENT_VERSION = os.getenv("OPENAI_DEPLOYMENT_VERSION")

OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME = os.getenv(
    "OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME")
OPENAI_ADA_EMBEDDING_MODEL_NAME = os.getenv("OPENAI_ADA_EMBEDDING_MODEL_NAME")

# Configure OpenAI API
openai.api_type = "azure"
openai.api_version = OPENAI_DEPLOYMENT_VERSION
openai.api_base = OPENAI_DEPLOYMENT_ENDPOINT
openai.api_key = OPENAI_API_KEY
# ---
credential = AzureKeyCredential(key)

print(OPENAI_DEPLOYMENT_ENDPOINT)

https://openailabs303474.openai.azure.com/


In [3]:
def init_llm(model=OPENAI_MODEL_NAME,
             deployment_name=OPENAI_DEPLOYMENT_NAME,
             temperature=0,
             max_tokens=500,
             ):

    llm = AzureOpenAI(deployment_name=deployment_name,
                      model=model,
                      temperature=temperature,
                      max_tokens=max_tokens,
                      model_kwargs={"stop": ["<|im_end|>"]}
                      )
    return llm

In [9]:
# Generate document embeddings using OpenAI Ada 002.
# The same function produces embeddings for the title and content fields
# and for the user's query.
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def generate_embeddings(page):
    response = openai.Embedding.create(
        input=page, engine="text-embedding-ada-002")

    embeddings = response['data'][0]['embedding']
    return embeddings
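
The `@retry` decorator above retries the embedding call with a randomized, exponentially growing wait, for up to six attempts. The idea can be sketched in plain Python as follows. This is a simplified stand-in, not tenacity's implementation; `retry_with_backoff` and its parameters are hypothetical names used only for illustration, and unlike tenacity's default it re-raises the original exception when the attempts run out.

```python
import random
import time


def retry_with_backoff(func, max_attempts=6, min_wait=1.0, max_wait=20.0,
                       sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The upper bound doubles on each attempt and is capped at max_wait
            upper = min(max_wait, max(min_wait, 2 ** (attempt - 1)))
            sleep(random.uniform(min_wait, upper))


# A function that fails twice before succeeding
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

# Pass a no-op sleep so the demo does not actually wait
result = retry_with_backoff(flaky, sleep=lambda seconds: None)
print(result, calls["count"])
```

The randomized wait spreads out retries from concurrent callers, which helps when the embedding endpoint is rate limited.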

In [10]:
template = """ System:
You are an assistant helping the company's technical support users with their questions about different product features.
Answer ONLY with the facts listed in the sources below delimited by triple backticks.
If there isn't enough information in the Sources, say you don't know and ask the user to provide more details.
Do not generate answers that don't use the Sources below.
If asking a clarifying question to the user would help, ask the question.

Sources:
```{sources}```
The user question is below, delimited by triple backticks
User:
```{question}``` 

Your answer is below:
Answer:
<|im_end|>
"""
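
The template has two placeholders, `{sources}` and `{question}`, which are filled in at query time. A minimal sketch of that substitution, using plain `str.format` and a shortened, hypothetical stand-in for the template (`demo_template` and `build_prompt` are illustrative names, not part of the notebook):

```python
# Build the delimiter from single characters so this example stays readable
FENCE = "`" * 3

# Shortened stand-in for the notebook's template
demo_template = "\n".join([
    "Answer ONLY with the facts listed in the sources below.",
    "Sources:",
    FENCE + "{sources}" + FENCE,
    "User:",
    FENCE + "{question}" + FENCE,
    "Answer:",
])


def build_prompt(template, sources, question):
    """Substitute the retrieved text and the user question into the template."""
    return template.format(sources=sources, question=question)


demo_prompt = build_prompt(
    demo_template,
    sources="Semantic Kernel is an open-source SDK.",
    question="What is Semantic Kernel?",
)
print(demo_prompt)
```

Wrapping both the sources and the question in delimiters makes it easier for the model to tell instructions apart from retrieved text and user input.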

### Init LLM model 

Here we use LangChain's ConversationChain and ConversationBufferMemory to keep the context of the conversation.

In [4]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain.prompts import ChatPromptTemplate

llm = init_llm()
# ConversationBufferMemory stores the conversation history
memory = ConversationBufferMemory()
# Set verbose=True to see the full prompt sent to the model
conversation = ConversationChain(llm=llm, memory=memory, verbose=False)
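
To illustrate what a buffer memory does, here is a simplified sketch in plain Python. It is not LangChain's implementation; `BufferMemorySketch`, `run_turn`, and `fake_llm` are hypothetical names. The point is that every earlier turn is prepended to the next prompt, so the prompt grows with each question.

```python
class BufferMemorySketch:
    """Simplified stand-in for a conversation buffer."""

    def __init__(self):
        self.turns = []  # list of (user_input, model_output) pairs

    def history(self):
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)

    def save(self, user_input, model_output):
        self.turns.append((user_input, model_output))


def run_turn(memory, llm, user_input):
    """Prepend the stored history to the new input, call the model, record the turn."""
    prompt = f"{memory.history()}\nHuman: {user_input}\nAI:".lstrip()
    output = llm(prompt)
    memory.save(user_input, output)
    return output


# Fake model that records each prompt it receives
seen_prompts = []

def fake_llm(prompt):
    seen_prompts.append(prompt)
    return f"reply {len(seen_prompts)}"


memory_sketch = BufferMemorySketch()
run_turn(memory_sketch, fake_llm, "what is RAG?")
run_turn(memory_sketch, fake_llm, "and what is a vector index?")
print(seen_prompts[1])
```

In this notebook the whole formatted prompt, including the retrieved sources, is passed as the input of each turn, so the stored history likely grows quickly. For long sessions it may be worth clearing the memory or storing only the bare question.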

### Init Azure Cognitive Search client

In [5]:

search_client = SearchClient(
    service_endpoint, index_name="sk-cogsrch-vector-index-2", credential=credential)

#### Search method for searching in Azure Cognitive Search

In [6]:

def search(question, top_k=3):
    results = search_client.search(
        search_text=None,
        vector=generate_embeddings(question),
        top_k=top_k,
        vector_fields="contentVector",
        select=["title", "content"],
    )

    # Keep only the text of each hit and join the hits into one string
    sources = "\n\n".join(result['content'] for result in results)
    return sources
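
Conceptually, the vector query above ranks the indexed documents by how close their `contentVector` is to the query embedding and returns the `top_k` closest ones. A brute-force sketch of that ranking, assuming cosine similarity as the metric and using toy vectors (the search service typically uses an approximate nearest-neighbor index rather than a full scan; `cosine_similarity` and `top_k_by_similarity` are hypothetical helper names):

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_by_similarity(query_vector, documents, top_k=3):
    """Return the contents of the top_k documents closest to query_vector.

    documents: list of dicts with 'content' and 'contentVector' keys,
    mirroring the field names used in the notebook's index.
    """
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc["contentVector"]),
        reverse=True,
    )
    return [doc["content"] for doc in ranked[:top_k]]


# Toy 3-dimensional vectors (real ada-002 embeddings have 1536 dimensions)
toy_docs = [
    {"content": "kernel overview", "contentVector": [1.0, 0.0, 0.0]},
    {"content": "planner guide", "contentVector": [0.0, 1.0, 0.0]},
    {"content": "kernel and plugins", "contentVector": [0.9, 0.1, 0.0]},
]
top_two = top_k_by_similarity([1.0, 0.0, 0.0], toy_docs, top_k=2)
print("\n\n".join(top_two))
```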

#### Calling OpenAI with the retrieved documents

In [7]:

def ask_openai(sources, question):
    prompt = ChatPromptTemplate.from_template(template=template)
    response = conversation.run(input=prompt.format(sources=sources,
                                                    question=question))
    return response

## Put all RAG phases together

In [12]:
question = "what's semnatic kernel"
# 1. Retrieval. Call Azure Cognitive Search to retrieve relevant documents
sources = search(question, top_k=3)
# print(sources)
# 2. Generation. Call OpenAI to generate a final answer
response = ask_openai(sources, question)
display(Markdown(response))

 Semantic Kernel is an open-source SDK that lets you easily combine AI services like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C# and Python. By doing so, you can create AI apps that combine the best of both worlds. During Kevin Scott's talk The era of the AI Copilot, he showed how Microsoft powers its Copilot system with a stack of AI models and plugins. At the center of this stack is an AI orchestration layer that allows us to combine AI models and plugins together to create brand new experiences for users. Semantic Kernel is at the center of the copilot stack.

In [13]:
question = "which programming languages are supported by semantic kernel?"
# 1. Retrieval. Call Azure Cognitive Search to retrieve relevant documents
sources = search(question, top_k=3)
# print(sources)
# 2. Generation. Call OpenAI to generate a final answer
response = ask_openai(sources, question)
display(Markdown(response))

  Semantic Kernel plans on providing support to the following languages: C#, Python, Java. While the overall architecture of the kernel is consistent across all languages, we made sure the SDK for each language follows common paradigms and styles in each language to make it feel native and easy to use. Today, not all features are available in all languages. The following tables show which features are available in each language. The 🔄 symbol indicates that the feature is partially implemented, please see the associated note column for more details. The ❌ symbol indicates that the feature is not yet available in that language; if you would like to see a feature implemented in a language, please consider contributing to the project or opening an issue.
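
Since every question goes through the same two calls, the retrieval and generation phases can be wrapped in one helper. The sketch below is one possible way to do it; `answer_question` is a hypothetical name, and the two fake functions stand in for `search` and `ask_openai` so that the example runs without Azure credentials.

```python
def answer_question(question, search_fn, generate_fn, top_k=3):
    """Run retrieval, then generation; return the answer and the sources used."""
    sources = search_fn(question, top_k=top_k)   # 1. Retrieval
    answer = generate_fn(sources, question)      # 2. Generation
    return answer, sources


# Stand-in functions for the demo
def fake_search(question, top_k=3):
    return "Semantic Kernel supports C#, Python and Java."

def fake_generate(sources, question):
    return f"Based on the sources: {sources}"


demo_answer, demo_sources = answer_question(
    "which languages are supported?", fake_search, fake_generate)
print(demo_answer)
```

In this notebook the call would be `answer_question(question, search, ask_openai)`. Returning the sources alongside the answer makes it easier to check what the model was given.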

### Vector Search is multi-lingual

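In this example we ask a question in Spanish about text that is in English. The search still retrieves the relevant English passages, because the query and the documents are compared by their embeddings rather than by keywords.

The answer is in Spanish because OpenAI generates the answer in the same language as the question.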

In [15]:
question = "¿Qué son el planificador semántico del kernel y el kernel?"
sources = search(question, top_k=3)
# print(sources)
response = ask_openai(sources, question)
display(Markdown(response))

   El kernel orquesta el ASK del usuario expresado como un objetivo. El planificador lo descompone en pasos en función de los recursos disponibles. Los plugins son recursos personalizables construidos a partir de LLM AI prompts y código nativo.

In [17]:

question = "explain how to deploy Semantic Kernel to Azure as a web app service"
sources = search(question, top_k=3)
# print(sources)
response = ask_openai(sources, question)
display(Markdown(response))

    You can deploy Semantic Kernel to Azure as a web app service by using the standard methods available to deploy an ASP.net web app. Alternatively, you can follow the steps below to manually build and upload your customized version of the Semantic Kernel service to Azure. First, at the command line, go to the '/webapi' directory and enter the following command: PowerShell. This will create a directory which contains all the files needed for a deployment. Zip the contents of that directory and store the resulting zip file on cloud storage, e.g. Azure Blob Container. Put its URI in the "Package Uri" field in the web deployment page you access through the "Deploy to Azure" buttons or use its URI as the value for the PackageUri parameter of the deployment scripts found on this page. Your deployment will then use your customized deployment package. That package will be used to create a new Azure web app, which will be configured to run your customized version of the Semantic Kernel service.