# Notebook 5: LangChain Integrations 

[LangChain](https://python.langchain.com/docs/get_started/introduction.html) is a popular library for developing applications powered by language models. You can use LangChain with LLMs to build applications such as a [chatbot](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/langchain/transformers_int4/chat.py), [document Q&A](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/langchain/transformers_int4/docqa.py), and a [voice assistant](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/langchain/transformers_int4/voiceassistant.py). BigDL-LLM provides LangChain integrations (i.e. LLM wrappers and embeddings), and you can use them the same way as [other LLM wrappers in LangChain](https://python.langchain.com/docs/integrations/llms/).

This notebook shows how to use LangChain with BigDL-LLM.

## 5.1 Installation

First, install BigDL-LLM in your prepared environment. For environment setup best practices, refer to [Chapter 2]() in this tutorial.

In [None]:
!pip install bigdl-llm[all]

Then install LangChain.

In [None]:
!pip install -U langchain==0.0.248

> **Note**
> 
> We recommend using `langchain==0.0.248`, the version verified in this tutorial.

## 5.3 LLM Wrapper

BigDL-LLM provides `TransformersLLM` and `TransformersPipelineLLM`, which implement the standard interface of LLM wrapper of LangChain.

`TransformersLLM` can be instantiated with `TransformersLLM.from_model_id` from a Hugging Face model_id or path. Generation-related parameters (e.g. `temperature`, `max_length`) can be passed as a dictionary in `model_kwargs`. Let's use the [`vicuna-7b-v1.5`](https://huggingface.co/lmsys/vicuna-7b-v1.5) model as an example to instantiate `TransformersLLM`.


In [None]:
from bigdl.llm.langchain.llms import TransformersLLM

llm = TransformersLLM.from_model_id(
    model_id="lmsys/vicuna-7b-v1.5",
    model_kwargs={"temperature": 0, "max_length": 1024, "trust_remote_code": True},
)

> **Note**
>
> `TransformersPipelineLLM` can be instantiated in a similar way to `TransformersLLM`, from a Hugging Face model_id or path together with `model_kwargs` and `pipeline_kwargs`. It also takes an extra `task` parameter that specifies the type of task to perform.

Use a prompt template to format the prompt and simply call `llm` to test generation.

> **Note**
>
> `max_new_tokens` parameter defines the maximum number of tokens to generate.

In [197]:
prompt = "What is AI?"
VICUNA_PROMPT_TEMPLATE = "USER: {prompt}\nASSISTANT:"
result = llm(prompt=VICUNA_PROMPT_TEMPLATE.format(prompt=prompt), max_new_tokens=128)

AI stands for "Artificial Intelligence." It refers to the development of computer systems that can perform tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. AI can be achieved through a combination of techniques such as machine learning, natural language processing, computer vision, and robotics. The ultimate goal of AI research is to create machines that can think and learn like humans, and can even exceed human capabilities in certain areas.
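
As a quick check that needs no model, the snippet below formats the same Vicuna-style template used above and prints the exact string that would be passed to `llm`. It is a minimal sketch that relies only on Python's built-in `str.format`; the second question is an illustrative assumption.

```python
# The template string from the cell above; {prompt} is the only placeholder.
VICUNA_PROMPT_TEMPLATE = "USER: {prompt}\nASSISTANT:"

# Format two questions to see the text the model would actually receive.
first = VICUNA_PROMPT_TEMPLATE.format(prompt="What is AI?")
second = VICUNA_PROMPT_TEMPLATE.format(prompt="What is BigDL?")  # illustrative question

# repr() makes the newline between the two roles visible.
print(repr(first))
print(repr(second))
```

Printing with `repr` is a small design choice: it shows that the user turn and the `ASSISTANT:` cue are separated by a single newline, which is easy to miss in rendered output.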


You can also call `generate` on the LLM to get results for a batch of prompts.

In [None]:
prompts = [
    VICUNA_PROMPT_TEMPLATE.format(prompt="Tell me a joke"),
    VICUNA_PROMPT_TEMPLATE.format(prompt="Tell me a poem"),
] * 3
llm_result = llm.generate(prompts)

In [199]:
print("-"*20+"number of generations"+"-"*20)
print(len(llm_result.generations))
print("-"*20+"the first generation"+"-"*20)
print(llm_result.generations[0][0].text)

--------------------number of generations--------------------
6
--------------------the first generation--------------------
USER: Tell me a joke
ASSISTANT: Why did the tomato turn red?

Because it saw the salad dressing!
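
The count of 6 above follows from how the prompt list is built: a list of two prompts multiplied by 3 repeats the pair three times, and `generate` returns one entry per prompt in the same order. The sketch below reproduces only the list construction, without calling any model.

```python
VICUNA_PROMPT_TEMPLATE = "USER: {prompt}\nASSISTANT:"
questions = ["Tell me a joke", "Tell me a poem"]

# Multiplying a list repeats its elements in order: joke, poem, joke, poem, ...
prompts = [VICUNA_PROMPT_TEMPLATE.format(prompt=q) for q in questions] * 3

print(len(prompts))
for index, prompt in enumerate(prompts):
    # Show only the first line (the USER turn) of each prompt.
    print(index, prompt.splitlines()[0])
```

Under this ordering, even indices correspond to the joke prompt and odd indices to the poem prompt, which is why `llm_result.generations[0]` holds a joke.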


## 5.4 Using Chains

Now let's begin using LLM wrappers and embeddings in [Chains](https://docs.langchain.com/docs/components/chains/).

> **Note**
>
> A chain is an important component in LangChain that combines a sequence of modular components (even other chains) to achieve a particular purpose. The components in a chain may be prompt templates, models, memory buffers, etc.

### 5.4.1 LLMChain

Let's first try a simple chain, `LLMChain`.

Create a simple prompt template as below. 

In [252]:
from langchain import PromptTemplate

template = "USER: {question}\nASSISTANT:"
prompt = PromptTemplate(template=template, input_variables=["question"])

Now use the `llm` we created in the previous section and the prompt template we just created to instantiate an `LLMChain`.

In [201]:
from langchain import LLMChain

llm_chain = LLMChain(prompt=prompt, llm=llm)

Now let's ask the llm a question and get the response by calling `run` on `LLMChain`.

In [203]:
question = "What is AI?"
result = llm_chain.run(question)

AI stands for "Artificial Intelligence." It refers to the development of computer systems that can perform tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. AI can be achieved through a combination of techniques such as machine learning, natural language processing, computer vision, and robotics. The ultimate goal of AI research is to create machines that can think and learn like humans, and can even exceed human capabilities in certain areas.
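
Conceptually, a chain like the one above does two things in sequence: it fills the prompt template with the inputs, then passes the result to the model. The sketch below illustrates that idea in plain Python. It is not LangChain's implementation; `make_chain` and `echo_llm` are hypothetical names, and `echo_llm` is a stand-in for a real model.

```python
from typing import Callable, Dict


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[[Dict[str, str]], str]:
    """Return a function that formats the template, then calls the model."""

    def run(inputs: Dict[str, str]) -> str:
        prompt = template.format(**inputs)  # step 1: fill in the template
        return llm(prompt)                  # step 2: call the model

    return run


def echo_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM: reports what it received.
    return f"[model received {len(prompt)} characters]"


chain = make_chain("USER: {question}\nASSISTANT:", echo_llm)
answer = chain({"question": "What is AI?"})
print(answer)
```

Because each step is just a callable, the same pattern extends to longer sequences, for example adding a post-processing step after the model call.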


### 5.4.2 Conversation Chain

To build a chat application, we can use a more complex chain with a memory buffer that remembers the chat history. This enables a multi-turn chat experience.

> **Note**
>
> `ConversationBufferMemory` is a type of memory in LangChain that allows for storing messages from a conversation and extracting them in different formats.

In [3]:
from langchain import PromptTemplate
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory

template = "The following is a friendly conversation between a human and an AI.\
    The AI is talkative and provides lots of specific details from its context.\
    If the AI does not know the answer to a question, it truthfully says it does not know.\
    \nCurrent conversation:\n{history}\nHuman: {input}\nAI Asistant:"
prompt = PromptTemplate(template=template, input_variables=["history", "input"])
conversation_chain = ConversationChain(
    verbose=True,
    prompt=prompt,
    llm=llm,
    memory=ConversationBufferMemory(),
    llm_kwargs={"max_new_tokens": 256},
)

In [4]:
query = "Good morning AI!"
result = conversation_chain.run(query)

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI.    The AI is talkative and provides lots of specific details from its context.    If the AI does not know the answer to a question, it truthfully says it does not know.    
Current conversation:

H: Good morning AI!
AI Asistant:
Good morning! How can I assist you today?

> Finished chain.


In [5]:
query = "Tell me about Intel."
result = conversation_chain.run(query)

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI.    The AI is talkative and provides lots of specific details from its context.    If the AI does not know the answer to a question, it truthfully says it does not know.    
Current conversation:
Human: Good morning AI!
AI: The following is a friendly conversation between a human and an AI.    The AI is talkative and provides lots of specific details from its context.    If the AI does not know the answer to a question, it truthfully says it does not know.    
Current conversation:

H: Good morning AI!
AI Asistant: Good morning! How can I assist you today?
Human: Tell me about Intel.
AI Asistant:
Intel is a multinational technology company that specializes in the development and manufacturing of computer processors and related technologies. It was founded in 1976 by Robert Noyce and Gordon Moore, and is headquartered in San
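
The growing `Current conversation` block in the prompts above comes from the memory buffer, which records each turn and renders it back into the `{history}` slot. The sketch below shows that mechanism in plain Python. `BufferMemory` is a hypothetical, simplified stand-in, not LangChain's `ConversationBufferMemory`, and the shortened template is an assumption for illustration.

```python
class BufferMemory:
    """Simplified conversation buffer: stores turns, renders them as text."""

    def __init__(self, human_prefix="Human", ai_prefix="AI"):
        self.human_prefix = human_prefix
        self.ai_prefix = ai_prefix
        self.turns = []  # list of (speaker, text) pairs, oldest first

    def save_turn(self, user_input, model_output):
        # Record one full exchange: the user's message and the model's reply.
        self.turns.append((self.human_prefix, user_input))
        self.turns.append((self.ai_prefix, model_output))

    def history(self):
        # Render all stored turns as "Speaker: text" lines.
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


template = "Current conversation:\n{history}\nHuman: {input}\nAI:"
memory = BufferMemory()

# First turn: the history is still empty.
first_prompt = template.format(history=memory.history(), input="Good morning AI!")
memory.save_turn("Good morning AI!", "Good morning! How can I assist you today?")

# Second turn: the first exchange now appears in the prompt.
second_prompt = template.format(history=memory.history(), input="Tell me about Intel.")
print(second_prompt)
```

Since the buffer stores whatever text the model call returns, any extra text in that return value is carried into later prompts, so the prompt grows with every turn.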

### 5.4.3 MathChain

Let's use the LLM to solve a math problem with `LLMMathChain`.

> **Note**
>
> `LLMMathChain` usually needs the LLM to be instantiated with a larger `max_length`, e.g. 1024.


In [255]:
from langchain.chains import LLMMathChain

MATH_CHAIN_TEMPLATE = "Question: {question}\nAnswer:"
prompt = PromptTemplate(template=MATH_CHAIN_TEMPLATE, input_variables=["question"])
llm_math = LLMMathChain.from_llm(prompt=prompt, llm=llm, verbose=True)

In [256]:
question = "What is 13 raised to the 2 power"
llm_math.run(question)

> Entering new LLMMathChain chain...
What is 13 raised to the 2 power

13 raised to the 2 power is equal to 13 \* 13, which is 169.
Question: What is 13 raised to the 2 power
Answer: 13 raised to the 2 power is equal to 13 \* 13, which is 169.
> Finished chain.


'Answer:  13 raised to the 2 power is equal to 13 \\* 13, which is 169.'
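
With the custom prompt above, the answer is free text written by the model, so it can be worth checking the arithmetic independently. The sketch below is one possible way to do that with Python's standard `ast` module. `eval_arithmetic` is a hypothetical helper, not part of LangChain or BigDL-LLM, and it only supports basic numeric operators.

```python
import ast
import operator

# Map supported AST operator nodes to the functions that implement them.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def eval_arithmetic(expression: str):
    """Evaluate a plain arithmetic expression without using eval()."""

    def _eval(node):
        # Numeric literals (booleans are excluded on purpose).
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)) \
                and not isinstance(node.value, bool):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


print(eval_arithmetic("13 ** 2"))  # 169
```

Walking the syntax tree instead of calling `eval` means names, function calls, and attribute access are rejected with a `ValueError` rather than executed.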

### 5.5 Question Answering over Docs
Suppose you have some text documents (PDF, blog, Notion pages, etc.) and want to ask questions related to the contents of those documents. LLMs, given their proficiency in understanding text, are a great tool for this.

#### 5.5.1 Installation

Install the required dependency before running the example.

In [None]:
!pip install -U faiss-cpu

#### 5.5.1 Load documents

For convenience, we use a text string as the loaded document here. Unstructured data can be loaded from many sources; use the [LangChain integration hub](https://integrations.langchain.com/) to browse the full set of loaders.

In [76]:
input_doc = "\
    BigDL: fast, distributed, secure AI for Big Data\n\n\
    BigDL seamlessly scales your data analytics & AI applications from laptop to cloud, with the following libraries:\
        Orca: Distributed Big Data & AI (TF & PyTorch) Pipeline on Spark and Ray\
        Nano: Transparent Acceleration of Tensorflow & PyTorch Programs on XPU\
        DLlib: “Equivalent of Spark MLlib” for Deep Learning\
        Chronos: Scalable Time Series Analysis using AutoML\
        Friesian: End-to-End Recommendation Systems\
        PPML: Secure Big Data and AI (with SGX Hardware Security)\
        LLM: A library for running large language models with very low latency using low-precision techniques on Intel platforms\n\n\
    "

#### 5.5.2 Split texts of input documents

[Text splitters](https://python.langchain.com/docs/modules/data_connection/document_transformers/) break Documents into splits of specified size. Here, we split the Document into chunks for embedding and vector storage.

> **Note**
> 
> `CharacterTextSplitter` splits only on the separator (`'\n\n'` by default).
>
> `chunk_size` is the maximum number of characters per chunk; a chunk can exceed it when there is no separator to split on.
>
> `chunk_overlap` is the number of characters shared between adjacent chunks.

In [None]:
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=650, chunk_overlap=0)
texts = text_splitter.split_text(input_doc)
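
To make the roles of the separator and `chunk_size` concrete, the sketch below splits text on a separator and then greedily merges neighbouring pieces while they still fit. It is a simplified illustration, not LangChain's actual algorithm: `split_on_separator` is a hypothetical helper and it ignores `chunk_overlap`.

```python
def split_on_separator(text, separator="\n\n", chunk_size=650):
    """Split on `separator`, then merge adjacent pieces up to `chunk_size`."""
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # The piece fits, or it is the first piece of a new chunk
            # (an oversized piece is kept whole: there is nowhere to split it).
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccccccccc"
chunks = split_on_separator(sample, chunk_size=10)
print(chunks)
```

In this toy input the first two pieces fit together in 10 characters, so they are merged, while the third piece starts a new chunk.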

#### 5.5.3 Create embeddings and store into vector stores

After splitting the documents, we need to store the splits so that we can later search them based on an input query. The most common way to do this is to embed the contents of each split and store the embedding vectors in a vector store.

As we know, Transformer models contain embedding layers that turn unstructured data into embedding vectors (lists of numbers) that downstream operations can work with. Embedding vectors represent real-world objects and concepts, such as words and documents.

BigDL-LLM provides `TransformersEmbeddings`, which lets you obtain embeddings from text input using an LLM.

`TransformersEmbeddings` can be instantiated in a similar way to `TransformersLLM`.

In [None]:
from bigdl.llm.langchain.embeddings import TransformersEmbeddings

embeddings = TransformersEmbeddings.from_model_id(model_id="lmsys/vicuna-7b-v1.5")

With `TransformersEmbeddings` in place, let's create the embeddings and store them in a vector store. A vector store takes care of storing embedded data and performing vector search for you. Here we use [Faiss](https://faiss.ai/index.html), a library for efficient similarity search and clustering of dense vectors, as an example.

In [82]:
from langchain.vectorstores import FAISS

docsearch = FAISS.from_texts(texts, embeddings, metadatas=[{"source": str(i)} for i in range(len(texts))]).as_retriever()

#### 5.5.4 Get relevant documents

As mentioned, embedding vectors can represent both queries and documents. This representation translates semantic similarity as perceived by humans into proximity in a vector space, so we can search the documents by similarity.

In [83]:
query = "What is BigDL?"
docs = docsearch.get_relevant_documents(query)
print("-"*20+"number of relevant documents"+"-"*20)
print(len(docs))

--------------------number of relevant documents--------------------
2
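
The idea of "proximity in a vector space" can be illustrated with a tiny nearest-neighbour search. The sketch below uses made-up 3-dimensional vectors and Euclidean distance; real embeddings have far more dimensions, and the vectors, names, and distance choice here are assumptions for illustration rather than a description of what the retriever above does internally.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query_vec, doc_vecs, k=2):
    """Return the names of the k documents closest to the query."""
    ranked = sorted(doc_vecs.items(), key=lambda item: euclidean(query_vec, item[1]))
    return [name for name, _ in ranked[:k]]


# Toy embeddings (hypothetical values chosen by hand).
doc_vecs = {
    "bigdl overview": [0.9, 0.1, 0.0],
    "cooking recipe": [0.0, 0.2, 0.9],
    "llm library": [0.7, 0.3, 0.1],
}
query_vec = [0.8, 0.2, 0.0]

top = nearest(query_vec, doc_vecs)
print(top)
```

The two documents whose vectors lie closest to the query are returned first, and the unrelated one is ranked last.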


#### 5.5.5 Prepare chain

In [84]:
from langchain.chains.chat_vector_db.prompts import QA_PROMPT
from langchain.chains.question_answering import load_qa_chain

doc_chain = load_qa_chain(
    llm, chain_type="stuff", prompt=QA_PROMPT
)

#### 5.5.6 Generate

In [85]:
result = doc_chain.run(input_documents=docs, question=query)

BigDL is a fast, distributed, and secure AI library for Big Data. It enables seamless scaling of data analytics and AI applications from laptops to the cloud. BigDL supports various libraries, including Orca, Nano, DLlib, Chronos, Friesian, PPML, and LLM. These libraries cater to different use cases, such as distributed Big Data processing, transparent acceleration of TensorFlow and PyTorch programs, scalable time series analysis, end-to-end recommendation systems, and secure Big Data and AI with SGX hardware security. BigDL aims to provide a unified platform for AI and data analytics, making it easier for developers to build and deploy their applications at scale.
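
The `"stuff"` chain type used above works by placing all retrieved documents into a single prompt as context. The sketch below shows that idea in plain Python. `stuff_documents` and `QA_TEMPLATE` are hypothetical names: the template is an assumption for illustration and is not LangChain's `QA_PROMPT`.

```python
def stuff_documents(docs, question, template):
    """Join all documents into one context block and fill the template."""
    context = "\n\n".join(docs)
    return template.format(context=context, question=question)


# Hypothetical QA template with the two slots the sketch needs.
QA_TEMPLATE = (
    "Use the following context to answer the question.\n\n"
    "{context}\n\n"
    "Question: {question}\nAnswer:"
)

docs = [
    "BigDL: fast, distributed, secure AI for Big Data",
    "LLM: A library for running large language models with very low latency "
    "using low-precision techniques on Intel platforms",
]

qa_prompt = stuff_documents(docs, "What is BigDL?", QA_TEMPLATE)
print(qa_prompt)
```

This approach is simple and needs only one model call, but the combined context has to fit within the model's maximum input length, which is one reason the documents are split into chunks first.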
