In [20]:
import PyPDF2

In [21]:
pdf_file = "../data/Erlang-in-Anger.pdf"
pdf_reader = PyPDF2.PdfReader(pdf_file)
pdf_reader.pages[20].extract_text()

'CHAPTER 2. BUILDING OPEN SOURCE ERLANG SOFTWARE 15\n1{relx, [\n2{release, {demo, "1.0.0"},\n3[myapp1, myapp2, ..., recon]},\n4\n5{include_erts, false} % will use local Erlang install\n6]}\nCallingrebar3 release will build a release, to be found in the _build/default/rel/\ndirectory. Calling rebar3 tar will generate a tarball at\n_build/default/rel/demo/demo-1.0.0.tar.gz , ready to be deployed.\n2.2 Supervisors and start_link Semantics\nIn complex production systems, most faults and errors are transient, and retrying an opera-\ntion is a good way to do things — Jim Gray’s paper6quotesMean Times Between Failures\n(MTBF) of systems handling transient bugs being better by a factor of 4 when doing this.\nStill, supervisors aren’t just about restarting.\nOne very important part of Erlang supervisors and their supervision trees is that their\nstart phases are synchronous . Each OTP process has the potential to prevent its siblings\nand cousins from booting. If the process dies, it’s retried 

In [22]:
from langchain_core.documents import Document

# Split up the PDF document into Pages.
docs = []
for page_index, page in enumerate(pdf_reader.pages):
    # Document's keyword argument is `metadata` (not `meta_data`).
    docs.append(Document(page_content=page.extract_text(), metadata={'file': pdf_file, 'page': page_index}))


In [23]:
docs[6]

Document(page_content='Introduction\nOn Running Software\nThere’s something rather unique in Erlang in how it approaches failure compared to most\nother programming languages. There’s this common way of thinking where the language,\nprogramming environment, and methodology do everything possible to prevent errors.\nSomething going wrong at run-time is something that needs to be prevented, and if it\ncannot be prevented, then it’s out of scope for whatever solution people have been thinking\nabout.\nTheprogramiswrittenonce, andafterthat, it’soﬀtoproduction, whatevermayhappen\nthere. If there are errors, new versions will need to be shipped.\nErlang, on the other hand, takes the approach that failures will happen no matter what,\nwhether they’re developer-, operator-, or hardware-related. It is rarely practical or even\npossible to get rid of all errors in a program or a system.1If you can deal with some errors\nrather than preventing them at all cost, then most undeﬁned behaviours of a 

In [24]:
docs[0].page_content

"'3&%\x01)&#&35\n"

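The cover page (page 0) extracts to symbol noise rather than readable text. A minimal sketch of a heuristic for spotting such pages follows; the helper name `looks_garbled` and the 0.5 threshold are assumptions made for illustration, not part of PyPDF2 or LangChain.

```python
def looks_garbled(text, min_letter_ratio=0.5):
    """Heuristic: True when fewer than min_letter_ratio of the visible characters are letters."""
    # Ignore whitespace so that line breaks do not skew the ratio.
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return True  # an empty page carries no usable text either
    letters = sum(ch.isalpha() for ch in visible)
    return letters / len(visible) < min_letter_ratio


print(looks_garbled("'3&%\x01)&#&35\n"))                     # True
print(looks_garbled("Introduction\nOn Running Software\n"))  # False
```

Dropping pages from `docs` would shift the list indexes that later cells use as page numbers, so it is probably safer to skip garbled pages while splitting than to filter `docs` itself.
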
In [25]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=200,
    chunk_overlap=20,
)

# Build up a list of Documents, each holding a chunk of at most 200 characters
# from a page (consecutive chunks overlap by up to 20 characters).
# Metadata identifies the source file and the page number of each chunk.
doc_texts = []
for pindex, page in enumerate(docs):
    doc_texts.extend(text_splitter.create_documents(texts=[page.page_content], metadatas=[{'file': pdf_file, 'page': pindex}]))


In [26]:
doc_texts[200].metadata

{'file': '../data/Erlang-in-Anger.pdf', 'page': 20}
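
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sketch of overlapping chunking. It is a plain fixed-width sliding window, not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as newlines); the function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-width windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one, which is the purpose of the overlap: a sentence cut at a chunk boundary still appears intact in at least one chunk when the overlap is large enough.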

In [27]:
from qutils import VectorStore

# Open the persistent Vector Store if it already exists;
# otherwise prepare a new one in the given directory.
v = VectorStore(persist_directory="AngerDB")

In [28]:
v.persist_directory

'AngerDB'

In [29]:
# If no Vector Store exists yet, create one and load the documents.
if not v.db:
    v.store_documents(doc_texts)

In [30]:
query = "How do I profile my Erlang program using fprof?"

# Search for text chunks similar to our Query.
answer = v.similarity_search(
    query=query,
    num_results=5
)
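
`similarity_search` comes from the local `qutils` helper, whose internals are not shown in this notebook. As a rough illustration of what such a search commonly does, the sketch below ranks toy vectors by cosine similarity. The two-dimensional "embeddings" are made up, and the function names are hypothetical; a real store would obtain vectors from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return the indexes of the k vectors most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


toy_docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]  # made-up embeddings
toy_query = [0.9, 0.1]
print(top_k(toy_query, toy_docs, k=2))  # [0, 1]
```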

In [31]:
# Find out what Pages the text chunks are referring to.
s = set()
for a in answer:
    s.add(a.metadata['page'])
    print(a.metadata['page'])

50
7
91
24
55


In [32]:
s

{7, 24, 50, 55, 91}
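
A `set` removes duplicate page numbers but also discards the order in which the search returned them. If that order reflects relevance (an assumption about the store's behaviour), a sketch like the following keeps it; `unique_in_order` is a hypothetical helper name.

```python
def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each item in place."""
    # dict preserves insertion order, so its keys are the de-duplicated sequence.
    return list(dict.fromkeys(items))


ranked_pages = [50, 7, 91, 24, 55, 50]  # page numbers as printed above, plus one repeat
print(unique_in_order(ranked_pages))    # [50, 7, 91, 24, 55]
```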

In [33]:
# Use the full text of every matching page (not just the matching chunk) as context.
context = []
for i in s:
    print(docs[i])
    print(type(docs[i]))
    context.append(docs[i].page_content)

context = '\n'.join(context)
context

page_content='2\nthe immune system where errors at run time can be dealt with and seen as survivable.\nBecausethesystemdoesn’tcollapsetheﬁrsttimesomethingbadtouchesit,Erlang/OTP\nalso allows you to be a doctor. You can go in the system, pry it open right there in pro-\nduction, carefully observe everything inside as it runs, and even try to ﬁx it interactively.\nTo continue with the analogy, Erlang allows you to perform extensive tests to diagnose the\nproblem and various degrees of surgery (even very invasive surgery), without the patients\nneeding to sit down or interrupt their daily activities.\nThis book intends to be a little guide about how to be the Erlang medic in a time of\nwar. It is ﬁrst and foremost a collection of tips and tricks to help understand where failures\ncome from, and a dictionary of diﬀerent code snippets and practices that helped developers\ndebug production systems that were built in Erlang.\nWho is this for?\nThis book is not for beginners. There is a gap le

'2\nthe immune system where errors at run time can be dealt with and seen as survivable.\nBecausethesystemdoesn’tcollapsetheﬁrsttimesomethingbadtouchesit,Erlang/OTP\nalso allows you to be a doctor. You can go in the system, pry it open right there in pro-\nduction, carefully observe everything inside as it runs, and even try to ﬁx it interactively.\nTo continue with the analogy, Erlang allows you to perform extensive tests to diagnose the\nproblem and various degrees of surgery (even very invasive surgery), without the patients\nneeding to sit down or interrupt their daily activities.\nThis book intends to be a little guide about how to be the Erlang medic in a time of\nwar. It is ﬁrst and foremost a collection of tips and tricks to help understand where failures\ncome from, and a dictionary of diﬀerent code snippets and practices that helped developers\ndebug production systems that were built in Erlang.\nWho is this for?\nThis book is not for beginners. There is a gap left between mo

In [34]:
# The template below is a plain string filled in with str.format(),
# so no PromptTemplate import is needed.

template = """
    - You are a technical assistant good at searching documents.
    - You will use the provided context to answer the question. 
    - If you do not have an answer from the provided information say so.

    Question: {question}
    
    Context: {context}
    """

prompt = template.format(question=query, context=context)
print(prompt)



    - You are a technical assistant good at searching documents.
    - You will use the provided context to answer the question. 
    - If you do not have an answer from the provided information say so.

    Question: How do I profile my Erlang program using fprof?
    
    Context: 2
the immune system where errors at run time can be dealt with and seen as survivable.
Becausethesystemdoesn’tcollapsetheﬁrsttimesomethingbadtouchesit,Erlang/OTP
also allows you to be a doctor. You can go in the system, pry it open right there in pro-
duction, carefully observe everything inside as it runs, and even try to ﬁx it interactively.
To continue with the analogy, Erlang allows you to perform extensive tests to diagnose the
problem and various degrees of surgery (even very invasive surgery), without the patients
needing to sit down or interrupt their daily activities.
This book intends to be a little guide about how to be the Erlang medic in a time of
war. It is ﬁrst and foremost a collection of t
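
Five full pages make for a long prompt. One way to keep it bounded is to stop adding pages once a character budget is reached. The sketch below assumes a budget of 4000 characters, which is an arbitrary illustrative value, and `build_prompt` is a hypothetical helper that mirrors the template above.

```python
def build_prompt(question, pages, max_context_chars=4000):
    """Join page texts into a context string, stopping before the budget is exceeded."""
    kept = []
    used = 0
    for text in pages:
        cost = len(text) + (1 if kept else 0)  # +1 for the newline separator
        if used + cost > max_context_chars:
            break
        kept.append(text)
        used += cost
    context = "\n".join(kept)
    return (
        "- You are a technical assistant good at searching documents.\n"
        "- You will use the provided context to answer the question.\n"
        "- If you do not have an answer from the provided information say so.\n\n"
        f"Question: {question}\n\n"
        f"Context: {context}\n"
    )


demo = build_prompt("q", ["aaaa", "bbbb", "cccc"], max_context_chars=9)
print(demo.endswith("Context: aaaa\nbbbb\n"))  # True
```

Pages are taken in the order given, so passing them in relevance order means the least relevant pages are the ones that get dropped.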

In [35]:
import ollama

output = ollama.generate(model="llama3", prompt=prompt, stream=False)



In [36]:
output

{'model': 'llama3',
 'created_at': '2024-04-23T04:19:08.125307Z',
 'response': 'This is a collection of notes on Erlang programming, specifically focused on runtime metrics, ports, and tracing.\n\n**Runtime Metrics**\n\n* `process_info/2`: returns information about the current process, including:\n\t+ `monitored_by`: a list of processes monitoring the current process.\n\t+ `monitors`: a list of processes being monitored by the current process.\n\t+ `trap_exit`: a boolean indicating whether the process is trapping exits.\n* `current_function/0`: returns the current running function as a tuple `{Mod, Fun, Arity}`.\n\n**Ports**\n\n* `port_info/2`: returns information about a port, including:\n\t+ `id`: internal index of the port.\n\t+ `name`: type of the port (e.g. "tcp_inet", "udp_inet", etc.).\n\t+ `os_pid`: OS pid related to an external program (if applicable).\n\t+ `connected`: process ID of the controlling process.\n\t+ `links`: list of linked processes.\n\t+ `monitors`: list of proc

In [37]:
import textwrap

# To make the text easier to read, we wrap it to a maximum width of 62 characters.
# Split the response into lines, wrap each line, then join them back together
print("\n".join([textwrap.fill(line, width=62, break_long_words=False) for line in output['response'].split('\n')]))

This is a collection of notes on Erlang programming,
specifically focused on runtime metrics, ports, and tracing.

**Runtime Metrics**

* `process_info/2`: returns information about the current
process, including:
        + `monitored_by`: a list of processes monitoring the
current process.
        + `monitors`: a list of processes being monitored by
the current process.
        + `trap_exit`: a boolean indicating whether the
process is trapping exits.
* `current_function/0`: returns the current running function
as a tuple `{Mod, Fun, Arity}`.

**Ports**

* `port_info/2`: returns information about a port, including:
        + `id`: internal index of the port.
        + `name`: type of the port (e.g. "tcp_inet",
"udp_inet", etc.).
        + `os_pid`: OS pid related to an external program (if
applicable).
        + `connected`: process ID of the controlling process.
        + `links`: list of linked processes.
        + `monitors`: list of processes being monitored by the
port.
* `recon: