# Interact with your book 📖❓🙋🏻‍♀️

A simple demonstration of how to implement retrieval-augmented generation (RAG) for a book.

## How retrieval-augmented generation works

The high-level steps for implementing retrieval-augmented generation are:

1. Extract text from the source. If the source is unstructured, like a PDF, extraction can be a challenge.
2. Index the extracted text, often as vector embeddings, and store the index.
3. Let the user ask questions related to the source.
4. Perform a similarity search in the index and retrieve the relevant text chunks.
5. Insert these text chunks into the prompt along with the question.
6. Ask an LLM (e.g. ChatGPT) to produce an answer based *only* on the context.
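
The six steps can be sketched end to end with only the standard library. This is a toy illustration, not the notebook's implementation: word-count vectors stand in for learned embeddings, a plain list stands in for the vector index, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, k=3):
    """Steps 3-4: rank indexed chunks by similarity to the question, keep the top k."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Step 5: place the retrieved chunks in the prompt next to the question."""
    context = "\n\n".join(chunks)
    return (
        "Use only the following context to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


# Steps 1-2: pretend these chunks were extracted from a book, then index them.
chunks = [
    "Check digits are used to identify errors in data entry.",
    "Sensors are used in both monitoring and control applications.",
    "An actuator converts an electrical signal into movement.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

question = "What are check digits used for?"
top_chunks = retrieve(question, index, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)  # Step 6: this prompt would be sent to an LLM
```

A real embedding model captures meaning rather than word overlap, so it can match a question to a chunk that shares no words with it; the ranking and prompt-building logic stays the same.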


### Steps 1 & 2
Run the Python file `extract_text_and_save_index.py` to extract the text and save the index. Run it again if you change the source PDF file or want to change how the text is extracted.
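
The script itself is not shown here. As a rough, stdlib-only illustration of what the splitting part of such a script has to do, the sketch below groups sentences from each page into chunks and keeps the page number as metadata, which is what the later cells read via `doc.metadata['page']`. The function name `chunk_pages`, the regex-based sentence splitter, and the `max_chars` limit are assumptions for illustration; the imports in this notebook suggest the real script relies on `pdfplumber` and `NLTKTextSplitter` instead.

```python
import re


def chunk_pages(pages, max_chars=200):
    """Split each page's text into sentence-aligned chunks of roughly max_chars.

    `pages` is a list of (page_number, text) tuples. Each returned chunk is a
    dict with `page_content` and `metadata`, mirroring the fields used later.
    A single sentence longer than max_chars is kept whole.
    """
    chunks = []
    for page_number, text in pages:
        # Naive sentence split on ., ! or ? followed by whitespace (assumption).
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
        current = ""
        for sentence in sentences:
            candidate = f"{current} {sentence}".strip()
            if current and len(candidate) > max_chars:
                # Adding this sentence would exceed the limit: close the chunk.
                chunks.append({"page_content": current, "metadata": {"page": page_number}})
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append({"page_content": current, "metadata": {"page": page_number}})
    return chunks


# Made-up page texts for illustration.
pages = [
    (1, "Sensors measure physical quantities. They send data to a computer."),
    (2, "Actuators act on signals. Valves and motors are common actuators."),
]
for chunk in chunk_pages(pages, max_chars=40):
    print(chunk["metadata"]["page"], repr(chunk["page_content"]))
```

Chunks never cross a page boundary in this sketch, so every chunk maps to exactly one page number.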

The remaining steps are provided below.

## Import necessary packages

In [4]:
import os
import glob
import time

from langchain.schema import Document
from langchain.text_splitter import NLTKTextSplitter
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain import PromptTemplate
from langchain.vectorstores import FAISS
from tqdm import tqdm
import pandas as pd
import pdfplumber

### Load the previously saved vector embeddings

In [7]:
%%time

### download embeddings model
embeddings = HuggingFaceInstructEmbeddings(
    model_name = 'sentence-transformers/all-MiniLM-L6-v2',
    model_kwargs = {"device": "cpu"}
)

### load vector DB embeddings
vectordb : FAISS = FAISS.load_local(
    "faiss_index_hp",
    embeddings
)



load INSTRUCTOR_Transformer
max_seq_length  512
CPU times: user 3.41 s, sys: 1.48 s, total: 4.88 s
Wall time: 6.74 s


### Verify that similarity search is working

In [8]:
### test if vector DB was loaded correctly
results = vectordb.similarity_search('check digit')
results

[Document(page_content='2 Data transmission\n(VIN).\n\nCheck digits are used to identify errors in data entry caused by mis-typing\nor mis-scanning a barcode.\n\nThey can usually detect the following types of error:\n» an incorrect digit entered, for example 5327 entered instead of 5307\n» transposition errors where two numbers have changed order, for example 5037\ninstead of 5307\n» omitted or extra digits, for example 537 instead of 5307 or 53107 instead\nof 5307\n» phonetic errors, for example 13 (thirteen), instead of 30 (thirty).\n\nThere are a number of different methods used to generate a check digit.\n\nTwo\ncommon methods will be considered here:\n» ISBN 13\n» Modulo-11\nExample 1: ISBN 13\nThe check digit in ISBN 13 is the thirteenth digit in the number.\n\nWe will now consider\ntwo different calculations.\n\nThe first calculation is the generation of the check digit.\n\nThe second calculation is a verification of the check digit (that is, a recalculation).\n\nCalculation 1 –

### Create a prompt template that requires the LLM to answer based only on the provided context

In [9]:
prompt_template = """
Don't try to make up an answer, if you don't know just say that you don't know.
Answer in the same language the question was asked.
Use only the following pieces of context to answer the question at the end.

{context}

Question: {question}
Answer:"""
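
The template is an ordinary Python string with two placeholders, so `str.format` is enough to fill it. A small sketch with a shortened template and made-up context, showing that braces inside the substituted text are inserted literally and do not need escaping:

```python
template = """
Use only the following pieces of context to answer the question at the end.

{context}

Question: {question}
Answer:"""

# Made-up context for illustration; note the literal braces in the text.
context = "A set can be written as {1, 2, 3}."
question = "How can a set be written?"

filled = template.format(context=context, question=question)
print(filled)
```

Only braces in the template itself are interpreted as placeholders, so retrieved book text can contain any characters.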

### Configure the retriever to use the top 3 results from similarity search

In [10]:
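from langchain.memory.vectorstore import VectorStoreRetriever

### search_type is a parameter of as_retriever itself; search_kwargs carries the search parameters such as k
retriever : VectorStoreRetriever = vectordb.as_retriever(
    search_type = "similarity",
    search_kwargs = {"k": 3}
)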

### Provide the question you want answered
- The final prompt, including the context, is copied to the clipboard.
- Paste the prompt into an LLM interface (e.g. chat.openai.com) to get your answer!

In [14]:
query = """
Describe the main differences between control and monitoring of a process.
"""

docs = retriever.get_relevant_documents(query)
merged_context = ''
reference = 'Page Numbers: '
for doc in docs:
    merged_context = merged_context + ' ' + doc.page_content
    reference = reference + ' ' + str(doc.metadata['page'])
    print('\n\nPage Number: ' + str(doc.metadata['page']))
    print(doc.page_content)


final_prompt = prompt_template.format(context=merged_context, question=query)
print('\n*\n*\n*\n')
print('FINAL PROMPT')
print(final_prompt)
print('\n\n')
print(reference)
print('\n\n')
import pyperclip
pyperclip.copy(final_prompt)



Page Number: 113
3.2 Input and output devices
Sensors are used in both monitoring and control applications.

There is a
subtle difference between how these two methods work (the flowchart is a
simplification of the process):
Examples of monitoring
» Monitoring of a patient in a hospital for vital signs such as heart rate, temperature,
etc.

» Monitoring of intruders in a burglar alarm system
» Checking the temperature levels in a car engine
» Monitoring pollution levels in a river.

Examples of control
» Turning street lights on at night and turning them off again during daylight
» Controlling the temperature in a central heating/air conditioning system
» Chemical process control (for example, maintaining temperature and pH of
process)


Page Number: 220
6 AutomAted And emerging technologies
Example 2: Manufacture of paracetamol
This automated system also depends on sensors, a computer, actuators and
software.

Process 1 is the manufacture of the paracetamol.

Process 2 is the making
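
The loop in the cell above builds the context and the page reference by repeated string concatenation. A similar sketch using `str.join`, which also lists each page only once, is shown below. Plain dictionaries stand in for LangChain `Document` objects (an assumption so that the example runs without any dependencies), the helper name `merge_docs` is hypothetical, and the chunk texts are made up.

```python
def merge_docs(docs):
    """Join chunk texts into one context string and collect their page numbers."""
    merged_context = " ".join(doc["page_content"] for doc in docs)
    # Keep each page once, in retrieval order.
    pages = list(dict.fromkeys(doc["metadata"]["page"] for doc in docs))
    reference = "Page Numbers: " + ", ".join(str(p) for p in pages)
    return merged_context, reference


# Stand-in documents with made-up content.
docs = [
    {"page_content": "Sensors are used in monitoring.", "metadata": {"page": 113}},
    {"page_content": "Sensors are used in control.", "metadata": {"page": 113}},
    {"page_content": "Automated systems use actuators.", "metadata": {"page": 220}},
]
merged_context, reference = merge_docs(docs)
print(reference)  # Page Numbers: 113, 220
```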

