# LangChain RAG with Mistral

This notebook accompanies a Medium article that I wrote about the Mistral 7B model. It shows a simple example of retrieval-augmented generation (RAG) with LangChain and the Mistral model. The same approach can also be used with a range of other LLMs. See the table below.

First, install the Mistral 7B model through Ollama here:

## [Mistral 7B](https://ollama.ai/download)

## [Medium Article](Link to medium article)

In [15]:
!pip install langchain
!pip install gpt4all
!pip install chromadb
!pip install langchainhub



# Load data from the web
Let's load some data from the web. We will use the BBC article about the Japanese method of upcycling.

In [28]:
from langchain.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://www.bbc.com/culture/article/20231013-the-300-year-old-japanese-method-of-upcyling")
data = loader.load()

# Split into chunks
We will split the document into chunks of 1500 characters with 100 characters overlap.

In [29]:
# Split into chunks 
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=100)
all_splits = text_splitter.split_documents(data)
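
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification and not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraph and line breaks). The function name `split_with_overlap` is hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 1500, chunk_overlap: int = 100) -> List[str]:
    """Cut text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Illustrative input only: 4000 placeholder characters
sample = "x" * 4000
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])  # [1500, 1500, 1200]
```

With the settings used in this notebook, each new chunk starts 1400 characters after the previous one, and its first 100 characters repeat the end of the previous chunk.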

# Embed and store
We will embed the chunks and store them in a Chroma vectorstore. The cell below uses OllamaEmbeddings; GPT4AllEmbeddings is imported as an alternative.

In [30]:
# Embed and store
from langchain.vectorstores import Chroma
from langchain.embeddings import GPT4AllEmbeddings
from langchain.embeddings import OllamaEmbeddings  # Used below; GPT4AllEmbeddings is an alternative
vectorstore = Chroma.from_documents(documents=all_splits,
                                    embedding=OllamaEmbeddings())

In [31]:
# Retrieve
question = "What is this article about?"
docs = vectorstore.similarity_search(question)
len(docs)

4
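
The search returns 4 documents because `similarity_search` defaults to `k=4` in LangChain. As a rough illustration of what a vector similarity search does, the sketch below ranks toy 2-dimensional vectors by cosine similarity to a query vector. The vectors, the helper names `cosine_similarity` and `top_k`, and the choice of cosine similarity are all assumptions for illustration; the metric Chroma actually uses depends on how the collection is configured.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          doc_vecs: Sequence[Sequence[float]],
          k: int = 4) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings"; real embedding vectors have hundreds or thousands of dimensions
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0], [0.9, 0.1]]
query_vec = [1.0, 0.0]
hits = top_k(query_vec, doc_vecs)
print([i for i, _ in hits])  # [0, 4, 2, 1]
```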

In [32]:
# RAG prompt
from langchain import hub
QA_CHAIN_PROMPT = hub.pull("rlm/rag-prompt-llama")

# Choose a model
We will use the Mistral model, served locally by Ollama.

In [33]:
# LLM
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
llm = Ollama(model="mistral",
             verbose=True,
             callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))

In [34]:
# QA chain
from langchain.chains import RetrievalQA
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)
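
`RetrievalQA.from_chain_type` defaults to the "stuff" chain type, which places all retrieved chunks into a single prompt together with the question. The sketch below shows that idea in plain Python. The template text and the helper `build_prompt` are hypothetical stand-ins; the actual wording of `rlm/rag-prompt-llama` is defined on the LangChain Hub.

```python
from typing import List

# Hypothetical template for illustration only, not the real hub prompt
TEMPLATE = (
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)


def build_prompt(question: str, chunks: List[str]) -> str:
    """Join the retrieved chunks and fill them into the prompt template."""
    context = "\n\n".join(chunks)
    return TEMPLATE.format(question=question, context=context)


prompt = build_prompt(
    "What is this article about?",
    ["First retrieved chunk.", "Second retrieved chunk."],
)
print(prompt)
```

Because every retrieved chunk ends up in one prompt, the chunk size and the number of retrieved documents together determine how much of the model's context window is used.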

In [35]:
question = "What is this article about?"
result = qa_chain({"query": question})

This article is about patching and mending fabrics, specifically providing tips for selecting patching fabric that is the same weight or even slightly thinner than the fabric being repaired. The advice is based on avoiding rubbing or feeling stiffness when repairing the fabric, according to Jessica Smulders Cohen.

# Ask a question
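Let's ask a question about the article. This time we also attach a callback that prints the generation statistics returned by Ollama once the answer is complete.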

In [38]:
from langchain.schema import LLMResult
from langchain.callbacks.base import BaseCallbackHandler

class GenerationStatisticsCallback(BaseCallbackHandler):
    def on_llm_end(self, response: LLMResult, **kwargs) -> None:
        print(response.generations[0][0].generation_info)
        
callback_manager = CallbackManager([StreamingStdOutCallbackHandler(), GenerationStatisticsCallback()])

llm = Ollama(base_url="http://localhost:11434",
             model="mistral",
             verbose=True,
             callback_manager=callback_manager)

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)

question = "What is the technic mentioned in the article?"
result = qa_chain({"query": question})

The technic mentioned in the article is patching fabric repair, where the weight of the patching fabric should be the same or slightly thinner than the fabric being mended to avoid rubbing and feeling stiff.

{'model': 'mistral', 'created_at': '2023-10-16T21:45:39.717684Z', 'response': '', 'done': True, 'context': [733, 16289, 28793, 10649, 28747, 733, 16289, 28793, ... (output truncated)
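
The printed `generation_info` is dominated by the `context` field, which in this output is a long list of integer token ids. A minimal sketch of a helper that keeps the other fields and reports only the length of `context` is shown below. The name `summarize_generation_info` and the `context_tokens` key are hypothetical, and the example dictionary is a shortened stand-in for the real output.

```python
from typing import Any, Dict, Optional


def summarize_generation_info(info: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Copy generation_info, replacing the long 'context' list with its length."""
    if not info:
        return {}
    summary = {key: value for key, value in info.items() if key != "context"}
    context = info.get("context")
    if isinstance(context, list):
        summary["context_tokens"] = len(context)  # hypothetical key name
    return summary


# Shortened stand-in for the dictionary printed by the callback
example = {"model": "mistral", "done": True, "context": [733, 16289, 28793]}
print(summarize_generation_info(example))
# {'model': 'mistral', 'done': True, 'context_tokens': 3}
```

Inside `GenerationStatisticsCallback.on_llm_end`, one could print `summarize_generation_info(response.generations[0][0].generation_info)` instead of the raw dictionary to keep the cell output readable.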

In [37]:
question = "What was the former question?"
result = qa_chain({"query": question})

I'm sorry, but I couldn't find any previous question for context. Could you please provide me with a specific question that you would like me to answer?

{'model': 'mistral', 'created_at': '2023-10-16T21:44:24.396601Z', 'response': '', 'done': True, 'context': [733, 16289, 28793, 10649, 28747, 733, 16289, 28793, ... (output truncated)