# Retrieval-Augmented Generation (RAG)

In [1]:
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOllama
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

In [2]:
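# Local Llama 2 13B served by Ollama; generated tokens are streamed to stdout.
chat_model = ChatOllama(
    model="llama2:13b",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)
# Embedding model for queries; it must match the model the index was built with.
embedder = HuggingFaceEmbeddings(
    model_name='sentence-transformers/all-MiniLM-L12-v2'
)
# Load the FAISS index previously saved in the 'core_knowledge' directory.
store = FAISS.load_local('core_knowledge', embeddings=embedder)
# Retrieve relevant chunks from the index and pass them to the chat model
# together with the question.
qa_chain = RetrievalQA.from_chain_type(
    chat_model,
    retriever=store.as_retriever()
)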
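
The chain above retrieves the chunks most similar to the query and, with the default "stuff" chain type, pastes them into the prompt sent to the chat model. The following is a minimal pure-Python sketch of that retrieve-then-prompt idea, not LangChain's actual implementation. Everything in it is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for the sentence-transformers model, `retrieve` does a brute-force cosine ranking instead of a FAISS search, and the `chunks` list and prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute force)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, chunks, k=2):
    """Paste the retrieved chunks into a single prompt (hypothetical wording)."""
    context = "\n\n".join(retrieve(query, chunks, k))
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )


# Hypothetical knowledge-base chunks, for illustration only.
chunks = [
    "LLaMA was released by Meta AI in February 2023.",
    "FAISS is a library for similarity search over dense vectors.",
    "Ollama runs large language models locally.",
]

prompt = build_prompt("When was LLaMA released?", chunks, k=1)
print(prompt)
```

In the real chain the prompt built this way is what the chat model sees, so answer quality depends on whether the retrieved chunks actually contain the requested fact.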

In [3]:
qa_chain({'query': 'When was LLaMA released?'})

 LLaMA was released on February 23, 2023.

{'query': 'When was LLaMA released?',
 'result': ' LLaMA was released on February 23, 2023.'}

In [4]:
qa_chain({'query': 'What is LLaMA?'})

 LLaMA (LLaMA: Open and Efficient Foundation Language Models, Anthropic Harmless, and Multitask Helpful) is a series of language models developed by researchers at Meta AI. The models are designed to be open and efficient, with a focus on safety and multitask learning.

LLaMA is based on the transformer architecture and was trained on a large corpus of text data. The model is available in various sizes, ranging from 1.3 billion parameters (LLaMA-B) to 6.5 billion parameters (LLaMA-7B). The larger models have been shown to achieve better performance on a variety of natural language processing tasks, such as text classification, sentiment analysis, and question answering.

One of the key features of LLaMA is its ability to be finetuned for specific tasks with minimal additional data. This makes it a useful tool for a wide range of applications, from chatbots and virtual assistants to content generation and language translation. Additionally, the model is designed to be "anthropic" and "h

{'query': 'What is LLaMA?',
 'result': ' LLaMA (LLaMA: Open and Efficient Foundation Language Models, Anthropic Harmless, and Multitask Helpful) is a series of language models developed by researchers at Meta AI. The models are designed to be open and efficient, with a focus on safety and multitask learning.\n\nLLaMA is based on the transformer architecture and was trained on a large corpus of text data. The model is available in various sizes, ranging from 1.3 billion parameters (LLaMA-B) to 6.5 billion parameters (LLaMA-7B). The larger models have been shown to achieve better performance on a variety of natural language processing tasks, such as text classification, sentiment analysis, and question answering.\n\nOne of the key features of LLaMA is its ability to be finetuned for specific tasks with minimal additional data. This makes it a useful tool for a wide range of applications, from chatbots and virtual assistants to content generation and language translation. Additionally, th