## Prototype for Document Question Answering Using Llama and LangChain

Importing the packages and setting the model path

In [1]:
from llama_cpp import Llama
from langchain import PromptTemplate
from langchain.llms import LlamaCpp
from langchain.chains import LLMChain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma


MODEL_PATH = "./models/llama-7b.ggmlv3.q4_K_M.bin"


Initializing the Llama model through LangChain's `LlamaCpp` wrapper and loading the document on which question answering is to be performed. `TextLoader` reads plain-text (.txt) files.

In [2]:
llm = LlamaCpp(model_path=MODEL_PATH)
loader = TextLoader("./docs/sample_input.txt")
docs = loader.load()

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 


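Transforming the document by breaking it into smaller chunks using `CharacterTextSplitter`. By default it splits the text at the '\n\n' separator and then merges neighbouring pieces as long as the result stays within `chunk_size`. A piece that is already longer than `chunk_size` is kept whole, which is why the cell below emits warnings: with `chunk_size=10`, every paragraph exceeds the limit and becomes its own chunk. If the separator is instead set to an empty string (`separator=""`), the text is cut purely by length, so each chunk is at most `chunk_size` characters and, with no overlap, the number of chunks is roughly the document length divided by the chunk size, rounded up: list length ≈ ceil(length of doc / chunk size).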
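The two splitting behaviours described above can be sketched in plain Python. This is only an illustration of the idea, not LangChain's implementation: `split_fixed`, `split_on_separator`, and the `sample` string are hypothetical names and data introduced here, and the real splitter additionally handles overlap and whitespace stripping.

```python
import math


def split_fixed(text: str, chunk_size: int) -> list[str]:
    """Cut text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def split_on_separator(text: str, separator: str, chunk_size: int) -> list[str]:
    """Split on separator, then greedily merge neighbouring pieces while the
    merged chunk stays within chunk_size. A single piece that is longer than
    chunk_size is kept whole rather than cut."""
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if current and len(candidate) > chunk_size:
            # Merging would exceed the limit: close the current chunk.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Hypothetical three-paragraph sample, not the notebook's input file.
sample = "Batman fights crime.\n\nHe lives in Gotham.\n\nHe has no superpowers."

fixed = split_fixed(sample, 10)
by_paragraph = split_on_separator(sample, "\n\n", 10)

# Fixed-length splitting yields ceil(len / chunk_size) chunks.
print(len(fixed), math.ceil(len(sample) / 10))
# Separator-based splitting keeps each oversized paragraph as one chunk.
print(by_paragraph)
```

With a chunk size smaller than every paragraph, the separator-based variant returns one chunk per paragraph, which mirrors the "longer than the specified 10" warnings in the notebook output.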


In [3]:
text_splitter = CharacterTextSplitter(chunk_size=10, chunk_overlap=0)
texts = text_splitter.split_documents(docs)
texts

Created a chunk of size 251, which is longer than the specified 10
Created a chunk of size 425, which is longer than the specified 10
Created a chunk of size 327, which is longer than the specified 10
Created a chunk of size 313, which is longer than the specified 10
Created a chunk of size 371, which is longer than the specified 10


[Document(page_content='Batman is a popular fictional superhero who first appeared in comic books published by DC Comics in 1939. Created by writer Bill Finger and artist Bob Kane, Batman is known for being a masked vigilante who fights crime in the fictional city of Gotham.', metadata={'source': './docs/sample_input.txt'}),
 Document(page_content='The character of Batman, also known as Bruce Wayne, witnessed the murder of his parents as a child, which fueled his desire to combat crime and seek justice. Wayne, a billionaire industrialist, uses his wealth and resources to create a secret identity as Batman. He relies on his intelligence, detective skills, martial arts training, and an array of gadgets and technology to fight against criminals and protect Gotham City.', metadata={'source': './docs/sample_input.txt'}),
 Document(page_content="One of Batman's defining characteristics is his lack of superhuman powers. Instead, he relies on his physical prowess, tactical thinking, and gadget

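Embedding the document chunks and the query. The same model file is reused for embeddings, and each text is mapped to a 4096-dimensional vector, as the outputs below show.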

In [4]:
from langchain.embeddings import LlamaCppEmbeddings
embeddings = LlamaCppEmbeddings(model_path=MODEL_PATH)
_texts = [doc.page_content for doc in texts]
texts[0]

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 


Document(page_content='Batman is a popular fictional superhero who first appeared in comic books published by DC Comics in 1939. Created by writer Bill Finger and artist Bob Kane, Batman is known for being a masked vigilante who fights crime in the fictional city of Gotham.', metadata={'source': './docs/sample_input.txt'})

In [5]:
embedded_texts = embeddings.embed_documents(_texts)
len(embedded_texts), len(embedded_texts[0])

(6, 4096)

In [6]:
embedded_texts[0][:4]

[2.635892152786255,
 -0.2077806144952774,
 0.5032820105552673,
 -4.212489128112793]

In [7]:
query = "Which city does Batman fight crimes in?"
embedded_query = embeddings.embed_query(query)
len(embedded_query)

4096

Creating a vector store using ChromaDB to store embedded data and to perform vector search operations.

In [8]:
db = Chroma.from_documents(texts, embeddings)
query_vector = embeddings.embed_query(query)
docs = db.similarity_search_by_vector(query_vector, k=1)
docs

[Document(page_content='The character of Batman, also known as Bruce Wayne, witnessed the murder of his parents as a child, which fueled his desire to combat crime and seek justice. Wayne, a billionaire industrialist, uses his wealth and resources to create a secret identity as Batman. He relies on his intelligence, detective skills, martial arts training, and an array of gadgets and technology to fight against criminals and protect Gotham City.', metadata={'source': './docs/sample_input.txt'})]
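Conceptually, a similarity search by vector ranks the stored embeddings by their distance to the query embedding and returns the closest `k`. The sketch below shows that idea as a brute-force scan over toy 3-dimensional vectors. The helper names, the toy data, and the choice of Euclidean distance are all assumptions made for illustration; a real vector store would normally use an index rather than a full scan, and its metric may differ.

```python
import math


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query_vec: list[float], vectors: list[list[float]], k: int = 1) -> list[int]:
    """Return the indices of the k stored vectors closest to query_vec."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: euclidean_distance(query_vec, vectors[i]),
    )
    return ranked[:k]


# Toy stand-ins for the 4096-dimensional embeddings in the notebook.
toy_vectors = [
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.0, 0.0]

best = nearest(toy_query, toy_vectors, k=1)
print(best)  # index of the stored vector closest to the query
```

The returned index plays the role of the single `Document` that `similarity_search_by_vector(..., k=1)` hands back.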

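Defining the prompt template that will be given to the Llama model. It has two placeholders: `{context}` for the retrieved chunk and `{question}` for the user's query.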

In [9]:
from langchain.prompts import PromptTemplate

template = """Use only the following context to answer the question at the end briefly. 
If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Q: {question}
A:"""

prompt = PromptTemplate.from_template(template)
prompt.input_variables

['context', 'question']
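To make the placeholder mechanism concrete, the same template can be inspected and filled with Python's built-in string formatting. This is a stand-in for what a prompt template does, not LangChain's own code path, and the context sentence used here is a made-up example.

```python
import string

template = """Use only the following context to answer the question at the end briefly.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Q: {question}
A:"""

# Discover the placeholder names, analogous to prompt.input_variables.
fields = [name for _, name, _, _ in string.Formatter().parse(template) if name]

# Fill the placeholders; the context value is a hypothetical example.
filled = template.format(
    context="Batman protects Gotham City.",
    question="Which city does Batman fight crimes in?",
)

print(fields)
print(filled)
```

The filled string ends with `A:`, so the model's completion is read as the answer to the question.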

In [10]:
query = "Which city does Batman fight crimes in?"
similar_doc = db.similarity_search(query, k=1)
context = similar_doc[0].page_content
print(context)

The character of Batman, also known as Bruce Wayne, witnessed the murder of his parents as a child, which fueled his desire to combat crime and seek justice. Wayne, a billionaire industrialist, uses his wealth and resources to create a secret identity as Batman. He relies on his intelligence, detective skills, martial arts training, and an array of gadgets and technology to fight against criminals and protect Gotham City.


In [11]:
query_llm = LLMChain(llm=llm, prompt=prompt)
response = query_llm.run({"context": context, "question": query})
print(response)

 Gotham City.


In [12]:
llm.seed

-1