Steps:
1. Load the smartphone manual PDFs
2. Split the text into chunks
3. Embed the chunks
4. Store the embeddings in a FAISS vector database
5. Use an OpenAI API-based LLM to answer questions from the retrieved chunks

In [1]:
!nvidia-smi

Sat Apr 13 10:46:51 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA A100-SXM4-40GB          Off | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0              53W / 400W |      2MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [4]:
!pip install langchain openai tiktoken rapidocr-onnxruntime

Collecting langchain
  Downloading langchain-0.1.16-py3-none-any.whl (817 kB)
Collecting openai
  Downloading openai-1.17.1-py3-none-any.whl (268 kB)
Collecting tiktoken
  Downloading tiktoken-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.8 MB)
Collecting rapidocr-onnxruntime
  Downloading rapidocr_onnxruntime-1.3.16-py3-none-any.whl (14.9 MB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain)
  Downloading dataclasses_json-0.6.4-py3-none-any.whl (28 kB)
Collectin

In [3]:
# list the PDF manuals stored in the Google Drive folder
!ls drive/MyDrive/llms/smartphone_manuals

OnePlus_5_User_Manual.pdf		   Samsung_Galaxy_S6_active_G890A.pdf
sam-f946-f731-en-um-os13-072723-final.pdf


In [6]:
!pip install pypdf

Collecting pypdf
  Downloading pypdf-4.2.0-py3-none-any.whl (290 kB)
Installing collected packages: pypdf
Successfully installed pypdf-4.2.0


In [7]:
# load every PDF in the directory (the loader returns one Document per page)
from langchain.document_loaders import PyPDFDirectoryLoader

data = PyPDFDirectoryLoader('drive/MyDrive/llms/smartphone_manuals').load()

# printing the data
print(data)



In [8]:
data[0].page_content

'SMARTPHONE\nUser Manual\nPlease read this manual before operating your device \nand keep it for future reference.'

In [9]:
# split the pages into chunks of at most 500 characters, with a 20-character overlap
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
text_chunks = text_splitter.split_documents(data)
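
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window splitting in plain Python. It is a simplification, not LangChain's implementation: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and line breaks, so its chunks are usually shorter than the limit. The function name `split_with_overlap` and the sample text are hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 20) -> list[str]:
    """Cut text into fixed-size windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical sample: 1200 characters of repeating digits.
sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])
print(chunks[0][-20:] == chunks[1][:20])
```

The overlap means a sentence that straddles a chunk boundary is still partly visible in both neighbouring chunks, at the cost of a few duplicated characters per chunk.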

In [13]:
type(text_chunks), len(text_chunks)

(list, 854)

In [14]:
text_chunks[0].page_content

'SMARTPHONE\nUser Manual\nPlease read this manual before operating your device \nand keep it for future reference.'

In [15]:
text_chunks[1].page_content



In [17]:
from google.colab import userdata
import os
os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

In [18]:
# embedding model that will convert the text chunks into vectors
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

In [23]:
embed_length = len(embeddings.embed_query("hi i am fine"))
embed_length

1536

In [19]:
# the embeddings will be stored in an in-memory FAISS index
!pip install faiss-cpu

Collecting faiss-cpu
  Downloading faiss_cpu-1.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27.0 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.8.0


In [26]:
!free -h

               total        used        free      shared  buff/cache   available
Mem:            50Gi       1.1Gi        44Gi       4.0Mi       4.9Gi        49Gi
Swap:             0B          0B          0B


In [28]:
from langchain.vectorstores import FAISS

# embed every chunk with the OpenAI model and build the FAISS index from the vectors
vectorstore = FAISS.from_documents(text_chunks, embeddings)
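
As a rough illustration of what the index does at query time, the sketch below runs a brute-force nearest-neighbour search by Euclidean (L2) distance, which to my understanding is the default metric of the LangChain FAISS wrapper. The three-dimensional vectors and their labels are made-up stand-ins for the 1536-dimensional OpenAI embeddings, and `nearest` is a hypothetical helper, not a FAISS call.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query_vec: list[float], stored: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the labels of the k stored vectors closest to query_vec."""
    ranked = sorted(stored.items(), key=lambda item: l2_distance(query_vec, item[1]))
    return [label for label, _ in ranked[:k]]


# Toy "embeddings" (assumed values, for illustration only).
toy_store = {
    "multi window": [0.9, 0.1, 0.0],
    "screensaver": [0.1, 0.9, 0.0],
    "battery": [0.0, 0.1, 0.9],
}
print(nearest([0.8, 0.2, 0.0], toy_store, k=2))
```

A real FAISS index returns the same kind of ranking, only with optimised native code instead of a Python loop.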

In [29]:
!free -h

               total        used        free      shared  buff/cache   available
Mem:            50Gi       1.1Gi        44Gi       4.0Mi       4.9Gi        49Gi
Swap:             0B          0B          0B


In [30]:
vectorstore

<langchain_community.vectorstores.faiss.FAISS at 0x7821dd169120>

In [31]:
# wrap the index as a retriever (similarity search over the stored chunks)
retriever = vectorstore.as_retriever()

In [32]:
from langchain.prompts import ChatPromptTemplate

In [33]:
template = """You are a smartphone assistant bot. Users will come to you and ask you various questions
about their smartphone. You need to use the retrieved context that is provided to you and provide an answer
to the user based on their question. Try to be precise in your response and use up to 5 sentences in your reply.
Also, if you don't know something, tell them politely that you have no idea about this topic.
Question: {query}
Context: {context}
Answer:
"""

In [34]:
template

"You are a smartphone assistant bot. Users will come to you and ask you various questions\nabout their smartphone. You need to use the retrieved context that is provided to you and provide an answer\nto the user based on their question. Try to be precise in your response and use up to 5 sentences in your reply.\nAlso, if you don't know something, tell them politely that you have no idea about this topic.\nQuestion: {query}\nContext: {context}\nAnswer:\n"

In [35]:
prompt=ChatPromptTemplate.from_template(template)

In [36]:
from langchain.chat_models import ChatOpenAI
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser

In [37]:
output_parser=StrOutputParser()

In [38]:
llm_model=ChatOpenAI(model_name="gpt-3.5-turbo")


In [41]:
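rag_chain = (
    # the input string is sent to the retriever ("context") and passed through unchanged ("query")
    {"context": retriever, "query": RunnablePassthrough()}
    | prompt         # fill the template with both values
    | llm_model      # call the chat model
    | output_parser  # return the reply as a plain string
)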
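
The data flow through the chain can be sketched in plain Python with stand-ins for the retriever and the model. Everything here is hypothetical scaffolding (`fake_retriever`, `fake_llm`, the shortened `TEMPLATE`); it only mirrors the order of the steps, not LangChain's internals.

```python
TEMPLATE = "Question: {query}\nContext: {context}\nAnswer:"


def fake_retriever(query: str) -> list[str]:
    # Hypothetical stand-in for the FAISS retriever: returns chunk texts.
    return [
        "Multi window lets two apps share one screen.",
        "Drag the window border to resize.",
    ]


def fake_llm(prompt_text: str) -> str:
    # Hypothetical stand-in for the chat model.
    return f"(reply based on a {len(prompt_text)}-character prompt)"


def build_prompt(user_input: str) -> str:
    # Step 1: the same input feeds both keys of the dict.
    inputs = {"context": fake_retriever(user_input), "query": user_input}
    # Step 2: fill the template; the chunk texts are joined into one string.
    return TEMPLATE.format(query=inputs["query"], context="\n\n".join(inputs["context"]))


def run_chain(user_input: str) -> str:
    # Steps 3 and 4: call the model and hand back its reply as a plain string.
    return fake_llm(build_prompt(user_input))


print(build_prompt("how does multi window work"))
print(run_chain("how does multi window work"))
```

One difference worth noting: in the notebook's chain the retriever returns `Document` objects, and as far as I can tell they are placed into the prompt through their string representation. Joining only the `page_content` of each document, as the sketch does with plain strings, is a common refinement.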

In [42]:
rag_chain.invoke('how does multi window feature work on samsung smartphone')

'Multi window feature on Samsung smartphones allows you to use multiple applications simultaneously on a split screen. You can activate Multi Window by pressing and holding the Recent apps key and selecting the apps you want to display together. You can switch between the apps, adjust the size of their windows, and even copy information from one app to another. Note that not all apps support Multi Window feature, so make sure to check if the app you want to use is compatible. To adjust the size of the windows, simply drag the middle of the window border to resize them accordingly.'

In [43]:
rag_chain.invoke('how to set screen saver on One Plus 5 mobile')

'To set a screensaver on your OnePlus 5 mobile, you can follow these steps:\n1. Go to Settings and tap on Display.\n2. Then select Screensaver from the options.\n3. Choose from the available options like Colors, Phototable, Photoframe, or Photos.\n4. You can also tap Preview to see a demonstration of the selected screensaver.\n5. This way, you can customize the screensaver on your OnePlus 5 device.'

In [44]:
rag_chain.invoke('how to set screen saver on Oppo mobile')

'To set a screen saver on your Oppo mobile, follow these steps:\n1. Go to Settings and tap on Display.\n2. Then select Screensaver from the options available.\n3. Choose from options like None, Colors, Phototable, Photoframe, or Photos.\n4. You can preview the selected screensaver before confirming.\n5. This will allow you to display colors or photos when the screen turns off or while charging.'
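
The folder listed earlier holds only OnePlus 5 and Samsung manuals, yet the Oppo question receives steps that closely match the OnePlus 5 answer, so the reply appears to come from another brand's manual. One possible guard, sketched below under stated assumptions, is to check the question for brand names that have no manual before calling the chain. The brand sets and the helper names (`uncovered_brands`, `guarded_invoke`) are illustrative assumptions, not part of the notebook.

```python
# Brands that have a manual in the folder (assumed from the file names listed earlier).
COVERED_BRANDS = {"oneplus", "one plus", "samsung"}
# Brands the guard should recognise in a question (illustrative, not exhaustive).
KNOWN_BRANDS = COVERED_BRANDS | {"oppo", "xiaomi", "apple", "nokia"}


def uncovered_brands(question: str) -> set[str]:
    """Return the known brands named in the question that have no manual loaded."""
    q = question.lower()
    return {brand for brand in KNOWN_BRANDS if brand in q and brand not in COVERED_BRANDS}


def guarded_invoke(question: str, chain_invoke) -> str:
    """Refuse politely when the question names an uncovered brand, else run the chain."""
    missing = uncovered_brands(question)
    if missing:
        return "Sorry, I have no manual for: " + ", ".join(sorted(missing))
    return chain_invoke(question)


# A lambda stands in for rag_chain.invoke so the sketch runs without an API key.
print(guarded_invoke("how to set screen saver on Oppo mobile", lambda q: "chain answer"))
print(guarded_invoke("how to set screen saver on One Plus 5 mobile", lambda q: "chain answer"))
```

In the notebook this would be called as `guarded_invoke(question, rag_chain.invoke)`. Keyword matching is crude; filtering retrieved chunks by their source file would be a more robust variant of the same idea.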

In [45]:
rag_chain.invoke('Which is better phone for the price between Samsung Galaxy and OnePlus 5')

'Both Samsung Galaxy and OnePlus 5 offer great value for their price. Samsung Galaxy phones are known for their excellent display quality and camera capabilities. On the other hand, OnePlus 5 is praised for its top-notch hardware, smooth user experience, and dual camera system. Ultimately, the choice between the two will depend on your personal preferences and priorities in a smartphone.'