# PDF QA PoC using Llamafile

We will use LangChain and Llamafile to build a simple PDF question-answering (QA) proof of concept, using a sample PDF file and a single question to demonstrate the process. Before we can do this, we need to set up the environment: a Llamafile runs as a local server, and LangChain talks to it over HTTP.

We will run one server for embeddings and one for chat.

In [37]:
%%bash
LLAMAFILE="TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile"

# Check if the file exists
if [ ! -f "$LLAMAFILE" ]; then
    echo "File $LLAMAFILE not found!"
    # Download the file
    wget -q --show-progress --progress=bar:force "https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/$LLAMAFILE"

    # Make the file executable
    chmod +x "$LLAMAFILE"
fi

# Run the file as an embeddings server (background process, port 8080)
./"$LLAMAFILE" --server --nobrowser --embedding --port 8080 > embeddings.log 2>&1 &
pid=$!
echo "Embeddings server started with PID $pid"
echo "Logs are being written to embeddings.log"
echo "$pid" > embeddings.pid

# Run the file as a chat server (background process, port 8081)
./"$LLAMAFILE" --server --nobrowser --port 8081 > chat.log 2>&1 &
pid=$!
echo "Chat server started with PID $pid"
echo "Logs are being written to chat.log"
echo "$pid" > chat.pid

File TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile not found!




Embeddings server started with PID 77029
Logs are being written to embeddings.log
Chat server started with PID 77030
Logs are being written to chat.log
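
Both servers are started in the background, so a later cell may run before they accept connections. The following is a minimal sketch of a readiness check, assuming the ports 8080 and 8081 used above. The helper name `wait_for_port` is hypothetical, and an open TCP port is only a rough proxy for the model having finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Example usage (assumes the two servers started in the cell above):
# wait_for_port("localhost", 8080)  # embeddings server
# wait_for_port("localhost", 8081)  # chat server
```

If either call returns `False`, the corresponding log file (`embeddings.log` or `chat.log`) is the first place to look.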


Now it is time to set up LangChain to interact with the servers.

In [18]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load a PDF document
loader = PyPDFLoader("data/sample_en.pdf")
# Create a splitter with some overlap
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
# Load the PDF and split it into chunks
documents = loader.load_and_split(splitter)

# Inspect the first chunk
print(documents[0])

page_content='Stock prices quickly incorporate information from earnings
announcements, making it difficult to beat the market by
trading on these events. A replication of Martineau (2022).
Efficient-market hypothesis
The efficient-market hypothesis (EMH)[a] is
a hypothesis in financial econom ics that states
that asset prices reflect all available
information. A direct implication is that it is
impossible to "beat the market" consistently on
a risk-adjusted basis since market prices should' metadata={'source': 'data/sample_en.pdf', 'page': 0}
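
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping chunking. It uses a plain fixed-size sliding window, which is not the algorithm of `RecursiveCharacterTextSplitter` (that splitter prefers to break on separators such as paragraphs and newlines). The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# Small values make the overlap visible: "de" and "gh" each appear in two chunks
print(chunk_text("abcdefghij", chunk_size=5, chunk_overlap=2))
# → ['abcde', 'defgh', 'ghij']
```

The overlap means a sentence cut at a chunk boundary still appears intact in at least one neighbouring chunk, at the cost of embedding some text twice.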


In [20]:
from langchain_chroma import Chroma
from langchain_community.embeddings import LlamafileEmbeddings

vector_store = Chroma.from_documents(documents=documents, embedding=LlamafileEmbeddings(
  base_url="http://localhost:8080"
))

# Test the vector store
# Ask a question to find similar text
question = "Is it possible to know how the market will evolve in the future?"
docs = vector_store.similarity_search(question)

print(len(docs))

4
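
The count of 4 appears to correspond to the default number of results (`k=4`) returned by `similarity_search`. As an illustration of what such a search does conceptually, the sketch below ranks toy vectors by cosine similarity to a query vector. The vectors and the helper names `cosine_similarity` and `top_k` are made up for this example, and Chroma's actual index and distance metric may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: list[list[float]], k: int = 4) -> list[int]:
    """Return the indices of the k vectors most similar to the query, best first."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy 2-D "embeddings"; real embedding vectors have many more dimensions
vectors = [[1, 0], [0, 1], [1, 1], [-1, 0], [0.9, 0.1]]
print(top_k([1, 0], vectors, k=4))
# → [0, 4, 2, 1]
```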


In [21]:
from langchain_community.llms.llamafile import Llamafile

llm = Llamafile(base_url="http://localhost:8081")

In [22]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

# Create a prompt
prompt = PromptTemplate.from_template("Summarize the following text in 3 sentences: {docs}")

# Next we define the chain using a utility function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

chain = { "docs": format_docs } | prompt | llm | StrOutputParser()

# Run the chain on the docs retrieved above (the prompt only summarizes them; the question itself is not passed in)
chain.invoke(docs)

' Marwala argues that the efficient market hypothesis is applicable to AI-based markets,\nhowever, its applicability is more limited than the one proposed by the EMH. The efficiency of AI-based\nmarkets arises from the ability of these systems to learn from past data and adapt accordingly, Marwala argues,\nwhile the efficient market hypothesis applies only to human-produced markets where trading decisions are made\nby individuals using their knowledge of the past.\nS2CID 853397 (https://api.semanticscholar.org/CorpusID:853397).\n\ni%2Fcbi014). The efficient market hypothesis is relevant to stock trading because it suggests that investors have\nlimited information when making trading decisions, and thus it should be difficult to determine whether stocks are\noverpriced or undervalued. The EMH argues against this claim by showing that markets are not inefficient even\nwhen there is little market information available. This can be seen as an example of the "invisible hand" argument that a
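
The chain above summarizes the retrieved chunks but never shows the question to the model. One way to turn this into question answering is to put both the retrieved context and the question into the prompt. The sketch below shows only the prompt assembly in plain Python. The template wording, the `Doc` stand-in class, and `build_qa_prompt` are assumptions for illustration, not part of LangChain.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document; only page_content is used."""
    page_content: str


# Assumed template wording; adjust to suit the model
QA_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\nAnswer:"
)


def format_docs(docs) -> str:
    """Join chunk texts with blank lines, as in the summarization chain."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_qa_prompt(question: str, docs) -> str:
    """Fill the template with the retrieved context and the user's question."""
    return QA_TEMPLATE.format(context=format_docs(docs), question=question)


example_docs = [Doc("Asset prices reflect all available information.")]
print(build_qa_prompt("Can the market be beaten consistently?", example_docs))
```

With the objects from the cells above, the string returned by `build_qa_prompt(question, docs)` could be passed to `llm.invoke(...)` to get an answer grounded in the retrieved chunks.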

In [38]:
%%bash

# cleanup: kill the llamafile server processes

kill $(cat chat.pid)
rm chat.pid

kill $(cat embeddings.pid)
rm embeddings.pid
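
The shell cleanup above fails if a PID file is missing or the process has already exited. As a more forgiving alternative, here is a minimal Python sketch that stops a server from its PID file. The helper name `stop_from_pid_file` is hypothetical, and the sketch assumes a POSIX system where `SIGTERM` is available.

```python
import os
import signal
from pathlib import Path


def stop_from_pid_file(pid_file: str) -> bool:
    """Send SIGTERM to the PID stored in pid_file, then delete the file.

    Returns True if a signal was sent, False if the file was missing,
    unreadable as an integer, or the process no longer exists.
    """
    path = Path(pid_file)
    if not path.exists():
        return False
    try:
        pid = int(path.read_text().strip())
        os.kill(pid, signal.SIGTERM)
        stopped = True
    except (ValueError, ProcessLookupError):
        stopped = False
    path.unlink()
    return stopped


# Example usage (assumes the PID files written by the first cell):
# stop_from_pid_file("chat.pid")
# stop_from_pid_file("embeddings.pid")
```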