### Langchain
[LangChain](https://python.langchain.com/en/latest/index.html) is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model, but will also be:
- Data-aware: connect a language model to other sources of data
- Agentic: allow a language model to interact with its environment

The LangChain framework is designed around these principles.

We will use Langchain framework for rest of the workshop.

#### Question Answering over the docs/index
Question answering in this context refers to question answering over your document data.  For question answering over many documents, you almost always want to create an index over the data. This can be used to smartly access the most relevant documents for a given question, allowing you to avoid having to pass all the documents to the LLM (saving you time and money).

#### Set Environment Variables

In [22]:
import os  
import json  
import openai
from Utilities.envVars import *

# Set Search Service endpoint, index name, and API key from environment variables
indexName = SearchIndex

# Set OpenAI API key and endpoint
openAiEndPoint = f"{OpenAiEndPoint}"
assert openAiEndPoint, "ERROR: Azure OpenAI Endpoint is missing"

#### Generate answer for a question from the document we already indexed in Vector Store

In [29]:
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.chat_models import AzureChatOpenAI, ChatOpenAI, ChatLiteLLM
from langchain.embeddings.openai import OpenAIEmbeddings
from Utilities.cogSearch import performCogSearch
from langchain.docstore.document import Document
from langchain.prompts import PromptTemplate
from IPython.display import display, HTML

embeddingModelType = "azureopenai"
temperature = 0.3
tokenLength = 1000

if (embeddingModelType == 'azureopenai'):
        #os.environ["AZURE_API_BASE"] = OpenAiEndPoint
        #os.environ["AZURE_API_VERSION"] = OpenAiVersion
        #os.environ["AZURE_API_KEY"] = OpenAiKey
        print("Azure OpenAI Chat")
        llm = ChatLiteLLM(
                temperature=temperature,
                max_tokens=tokenLength,
                model_kwargs={"custom_llm_provider": "azure", "deployment_id":OpenAiChat, "AZURE_API_BASE": OpenAiEndPoint, "AZURE_API_VERSION": OpenAiVersion, "AZURE_API_KEY": OpenAiKey})
        embeddings = OpenAIEmbeddings(deployment=OpenAiEmbedding, openai_api_key=OpenAiKey, openai_api_type="azure")
        logging.info("LLM Setup done")
elif embeddingModelType == "openai":
        print("OpenAI Chat")
        #os.environ['OPENAI_API_KEY'] = OpenAiApiKey
        llm = ChatLiteLLM(temperature=temperature,
        openai_api_key=OpenAiApiKey,
        model_name="gpt-3.5-turbo",
        max_tokens=tokenLength)
        embeddings = OpenAIEmbeddings(openai_api_key=OpenAiApiKey)

Azure OpenAI Chat


In [30]:
# We already created our index and loaded the data, so we can skip that part. Let's try to ask a question:
# Question answering involves fetching multiple documents, and then asking a question of them. 
# The LLM response will contain the answer to your question, based on the content of the documents.
# The simplest way of using Langchain and LLM is to use load_qa_chain and run it with a query and a list of documents.

chainType = "stuff"
topK = 3
query = "What is Microsoft Fabric"

# Since we already index our document, we can perform the search on the query to retrieve "TopK" documents
r = performCogSearch(OpenAiEndPoint, OpenAiKey, OpenAiVersion, OpenAiApiKey, SearchService, SearchKey, embeddingModelType, OpenAiEmbedding, query, indexName, topK)

if r == None:
    docs = [Document(page_content="No results found")]
else :
    docs = [
        Document(page_content=doc['content'], metadata={"id": doc['id'], "source": doc['sourcefile']})
        for doc in r
        ]

qaChain = load_qa_with_sources_chain(llm, chain_type=chainType)
#qaChain.run(input_documents=docs, question=query)
answer = qaChain({"input_documents": docs, "question": query}, return_only_outputs=True)
outputAnswer = answer['output_text']
print(outputAnswer)

TypeError: completion() got an unexpected keyword argument 'AZURE_API_BASE'

#### How about we ask a question for which the answer is not in the document we have indexed in Vector Store

In [5]:
chainType = "stuff"
topK = 3
query = "Tell me a Joke"
#query = "Who is the CEO of Microsoft"
# Since we already index our document, we can perform the search on the query to retrieve "TopK" documents
r = performCogSearch(OpenAiEndPoint, OpenAiKey, OpenAiVersion, OpenAiApiKey, SearchService, SearchKey, embeddingModelType, OpenAiEmbedding, query, indexName, topK)

if r == None:
    docs = [Document(page_content="No results found")]
else :
    docs = [
        Document(page_content=doc['content'], metadata={"id": doc['id'], "source": doc['sourcefile']})
        for doc in r
        ]

qaChain = load_qa_with_sources_chain(llm, chain_type=chainType)
#qaChain.run(input_documents=docs, question=query)
answer = qaChain({"input_documents": docs, "question": query}, return_only_outputs=True)
outputAnswer = answer['output_text']
print(outputAnswer)


I'm sorry, I don't have the capability to tell jokes.


#### What if we don't want to have LLM answer the question outside of the document we have indexed in Vector Store. We can use the custom prompt to do that.

In [6]:
chainType = "stuff"
topK = 3
query = "Who is the CEO of Microsoft"

# Since we already index our document, we can perform the search on the query to retrieve "TopK" documents
r = performCogSearch(OpenAiEndPoint, OpenAiKey, OpenAiVersion, OpenAiApiKey, SearchService, SearchKey, embeddingModelType, OpenAiEmbedding, query, indexName, topK)

if r == None:
    docs = [Document(page_content="No results found")]
else :
    docs = [
        Document(page_content=doc['content'], metadata={"id": doc['id'], "source": doc['sourcefile']})
        for doc in r
        ]

template = """
            Given the following extracted parts of a long document and a question, create a final answer. 
            If you don't know the answer, just say that you don't know. Don't try to make up an answer. 
            If the answer is not contained within the text below, say \"I don't know\".

            QUESTION: {question}
            =========
            {summaries}
            =========
            """
#qaPrompt = load_prompt('lc://prompts/qa_with_sources/stuff/basic.json')
qaPrompt = PromptTemplate(template=template, input_variables=["summaries", "question"])
qaChain = load_qa_with_sources_chain(llm, chain_type=chainType, prompt=qaPrompt)
#qaChain.run(input_documents=docs, question=query)
answer = qaChain({"input_documents": docs, "question": query}, return_only_outputs=True)
outputAnswer = answer['output_text']
print(outputAnswer)

I don't know.


#### Chain type
This category of chains are used for interacting with indexes. The purpose these chains is to combine your own data (stored in the indexes) with LLMs. The best example of this is question answering over your own documents.

A big part of this is understanding how to pass multiple documents to the language model. There are a few different methods, or chains, for doing so. LangChain supports four of the more common ones - and we are actively looking to include more, so if you have any ideas please reach out! Note that there is not one best method - the decision of which one to use is often very context specific. In order from simplest to most complex

##### Stuff
Stuffing is the simplest method, whereby you simply stuff all the related data into the prompt as context to pass to the language model. This is implemented in LangChain as the StuffDocumentsChain.

- Pros: Only makes a single call to the LLM. When generating text, the LLM has access to all the data at once.
- Cons: Most LLMs have a context length, and for large documents (or many documents) this will not work as it will result in a prompt larger than the context length.

##### Map-Reduce
This method involves running an initial prompt on each chunk of data (for summarization tasks, this could be a summary of that chunk; for question-answering tasks, it could be an answer based solely on that chunk). Then a different prompt is run to combine all the initial outputs. This is implemented in the LangChain as the MapReduceDocumentsChain.

- Pros: Can scale to larger documents (and more documents) than StuffDocumentsChain. The calls to the LLM on individual documents are independent and can therefore be parallelized.
- Cons: Requires many more calls to the LLM than StuffDocumentsChain. Loses some information during the final combined call.

##### Refine
This method involves running an initial prompt on the first chunk of data, generating some output. For the remaining documents, that output is passed in, along with the next document, asking the LLM to refine the output based on the new document.

- Pros: Can pull in more relevant context, and may be less lossy than MapReduceDocumentsChain.
- Cons: Requires many more calls to the LLM than StuffDocumentsChain. The calls are also NOT independent, meaning they cannot be paralleled like MapReduceDocumentsChain. There is also some potential dependencies on the ordering of the documents.

##### Map-Rerank
This method involves running an initial prompt on each chunk of data, that not only tries to complete a task but also gives a score for how certain it is in its answer. The responses are then ranked according to this score, and the highest score is returned.

- Pros: Similar pros as MapReduceDocumentsChain. Requires fewer calls, compared to MapReduceDocumentsChain.
- Cons: Cannot combine information between documents. This means it is most useful when you expect there to be a single simple answer in a single document.

##### Let's test the same question with Map Reduce Chaintype

In [7]:
topK = 3
query = "What is Microsoft Fabric"
chainType = "map_reduce"

# Since we already index our document, we can perform the search on the query to retrieve "TopK" documents
r = performCogSearch(OpenAiEndPoint, OpenAiKey, OpenAiVersion, OpenAiApiKey, SearchService, SearchKey, embeddingModelType, OpenAiEmbedding, query, indexName, topK)

if r == None:
    docs = [Document(page_content="No results found")]
else :
    docs = [
        Document(page_content=doc['content'], metadata={"id": doc['id'], "source": doc['sourcefile']})
        for doc in r
        ]

qaTemplate = """Use the following portion of a long document to see if any of the text is relevant to answer the question.
            Return any relevant text.
            {context}
            Question: {question}
            Relevant text, if any :"""

qaPrompt = PromptTemplate(
    template=qaTemplate, input_variables=["context", "question"]
)

combinePromptTemplate = """Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES").
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.

QUESTION: {question}
=========
{summaries}
=========
"""
combinePrompt = PromptTemplate(
    template=combinePromptTemplate, input_variables=["summaries", "question"]
)

qaChain = load_qa_with_sources_chain(llm, chain_type=chainType, question_prompt=qaPrompt, 
                                     combine_prompt=combinePrompt, 
                                     return_intermediate_steps=True)
answer = qaChain({"input_documents": docs, "question": query})
outputAnswer = answer['output_text']
print(outputAnswer)

Microsoft Fabric is an all-in-one analytics solution for enterprises that brings together components from Power BI, Azure Synapse, and Azure Data Explorer into a single integrated environment. It offers a comprehensive suite of services, including data movement, data engineering, data integration, data science, real-time analytics, and business intelligence. Fabric is designed to simplify analytics needs by providing a highly integrated, end-to-end, and easy-to-use product. It is built on a foundation of Software as a Service (SaaS), which enhances simplicity and integration. Fabric allows creators to focus on their work without the need to manage or understand the underlying infrastructure. 

SOURCES:
- Fabric Get Started.pdf


In [8]:
# For the chaintype of MapReduce and Refine, we can also get insight into intermediate steps of the pipeline.
# This way you can inspect the results from map_reduce chain type, each top similar chunk summary
intermediateSteps = answer['intermediate_steps']
for step in intermediateSteps:
        display(HTML("<b>Chunk Summary:</b> " + step))

##### This time with Refine Chain Type

In [9]:
topK = 3
query = "What is Microsoft Fabric"
chainType = "refine"

# Since we already index our document, we can perform the search on the query to retrieve "TopK" documents
r = performCogSearch(OpenAiEndPoint, OpenAiKey, OpenAiVersion, OpenAiApiKey, SearchService, SearchKey, embeddingModelType, OpenAiEmbedding, query, indexName, topK)

if r == None:
    docs = [Document(page_content="No results found")]
else :
    docs = [
        Document(page_content=doc['content'], metadata={"id": doc['id'], "source": doc['sourcefile']})
        for doc in r
        ]
    
refineTemplate = (
                    "The original question is as follows: {question}\n"
                    "We have provided an existing answer, including sources: {existing_answer}\n"
                    "We have the opportunity to refine the existing answer"
                    "(only if needed) with some more context below.\n"
                    "------------\n"
                    "{context_str}\n"
                    "------------\n"
                    "Given the new context, refine the original answer to better "
                    "If you do update it, please update the sources as well. "
                    "If the context isn't useful, return the original answer."
                )
refinePrompt = PromptTemplate(
    input_variables=["question", "existing_answer", "context_str"],
    template=refineTemplate,
)

qaTemplate = (
    "Answer the question as truthfully as possible using the provided text below, and if the answer is not contained within the text below, say \"I don't know\"\n"
    "Context information is below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {question}\n"
    "\n---------------------\n"
)
qaPrompt = PromptTemplate(
    input_variables=["context_str", "question"], template=qaTemplate
)
qaChain = load_qa_with_sources_chain(llm, chain_type=chainType, question_prompt=qaPrompt, refine_prompt=refinePrompt,
                                     return_intermediate_steps=True)

answer = qaChain({"input_documents": docs, "question": query}, return_only_outputs=True)
modifiedAnswer = answer['output_text']
print(modifiedAnswer)

Microsoft Fabric is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, Real-Time Analytics, and business intelligence. It offers a comprehensive suite of services, including data lake, data engineering, and data integration, all in one place. Fabric brings together new and existing components from Power BI, Azure Synapse, and Azure Data Explorer into a single integrated environment. This integration provides advantages such as an extensive range of deeply integrated analytics, shared experiences across familiar and easy-to-learn interfaces, easy access and reuse of assets for developers, a unified data lake that retains data where it is while using preferred analytics tools, and centralized administration and governance across all experiences. With the Microsoft Fabric SaaS experience, data and services are seamlessly integrated, allowing IT teams to centrally configure core enterprise capabilities and apply permissions automatic

In [10]:
# For the chaintype of MapReduce and Refine, we can also get insight into intermediate steps of the pipeline.
# This way you can inspect the results from map_reduce chain type, each top similar chunk summary
intermediateSteps = answer['intermediate_steps']
for step in intermediateSteps:
        display(HTML("<b>Chunk Summary:</b> " + step))