
In [13]:
!pip -q install langchain==0.0.215 chromadb pypdf sentence_transformers InstructorEmbedding

In [14]:
!pip show langchain

Name: langchain
Version: 0.0.215
Summary: Building applications with LLMs through composability
Home-page: https://www.github.com/hwchase17/langchain
Author: 
Author-email: 
License: MIT
Location: /usr/local/lib/python3.10/dist-packages
Requires: aiohttp, async-timeout, dataclasses-json, langchainplus-sdk, numexpr, numpy, openapi-schema-pydantic, pydantic, PyYAML, requests, SQLAlchemy, tenacity
Required-by: 


In [15]:
# from langchain.document_loaders import TextLoader
from langchain.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.vectorstores import Chroma
# from langchain.llms import OpenAI
from langchain.chains import RetrievalQA, RetrievalQAWithSourcesChain

In [16]:
loader = DirectoryLoader('./', glob="./*.pdf", loader_cls=PyPDFLoader)

documents = loader.load()

In [17]:
# Split into ~500-character chunks; neighbouring chunks share up to 100 characters.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
texts = text_splitter.split_documents(documents)
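
The splitter settings above amount to chunks of about 500 characters with up to 100 characters shared between neighbours. The overlap idea can be sketched in plain Python, assuming fixed-width windows. `RecursiveCharacterTextSplitter` itself prefers to break on separators such as paragraphs and newlines, so its real output will differ; `sliding_chunks` is a hypothetical helper for illustration, not a LangChain function.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into fixed-width windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample text: 1200 digits, so the window boundaries are easy to inspect.
sample = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])
print(chunks[0][-100:] == chunks[1][:100])  # the tail of one chunk reappears at the head of the next
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.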

In [18]:
instructor_embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl",
                                                      model_kwargs={"device": "cuda"})

load INSTRUCTOR_Transformer
max_seq_length  512


In [19]:
persist_directory = 'db'

## Here are the new embeddings being used
embedding = instructor_embeddings

vectordb = Chroma.from_documents(documents=texts,
                                 embedding=embedding,
                                 persist_directory=persist_directory)
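
Conceptually, the vector store keeps one embedding per chunk and, at query time, returns the chunks whose embeddings are closest to the query embedding. A toy sketch of that lookup in plain Python, assuming cosine similarity and a brute-force scan; Chroma's actual distance metric and index may differ. `cosine`, `top_k`, and the three-dimensional vectors are hypothetical stand-ins for real Instructor embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k stored vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings": the first two point in nearly the same direction, the third does not.
toy_vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
print(top_k([1.0, 0.0, 0.0], toy_vectors))
```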

In [20]:
from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

In [21]:
!pip install gpt4all==0.3.6



In [34]:
llm_name = "ggml-gpt4all-j-v1.3-groovy.bin"

In [22]:
import gpt4all
gptj = gpt4all.GPT4All(llm_name)


Found model file at  /root/.cache/gpt4all/ggml-gpt4all-j-v1.3-groovy.bin


In [23]:
%%time
# generate() can also be called directly on raw input, but performance will degrade.
res = gptj.generate('I am Ethan and I live in Taiwan. What is my name?')
print(res)


Ethan
CPU times: user 25.7 s, sys: 17.4 ms, total: 25.8 s
Wall time: 16.1 s


In [30]:
context = texts[0].page_content

In [36]:
print(texts[0].page_content)

1 / 6 
This document  belongs to Wareconn  Technology Services (Tianjin) Co., Ltd. It is only intended to be 
used to introduce Wareconn functions and procedures. Please do not use it for other purposes.  1 wareconn standard operating procedure  
 
 
 
 
 
 
wareconn standard operating procedure  
RMA request  SOP  
 
Version : 
Version  Date  Editor  Description  
v 1.0  2023/03/16 Eric Sun First draft (English version)  
 
 
Content


In [37]:
print(texts[0])

page_content='1 / 6 \nThis document  belongs to Wareconn  Technology Services (Tianjin) Co., Ltd. It is only intended to be \nused to introduce Wareconn functions and procedures. Please do not use it for other purposes.  1 wareconn standard operating procedure  \n \n \n \n \n \n \nwareconn standard operating procedure  \nRMA request  SOP  \n \nVersion : \nVersion  Date  Editor  Description  \nv 1.0  2023/03/16 Eric Sun First draft (English version)  \n \n \nContent' metadata={'source': 'MSFT RMA request SOP v1.0.pdf', 'page': 0}


In [31]:
query = "How can I navigate to the RMA request page in the Wareconn Customer Portal?"

In [32]:
input_prompt = f"""Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {query}
Answer:"""

In [33]:
%%time
# generate() can also be called directly on raw input, but performance will degrade.
res = gptj.generate(input_prompt)
print(res)

 To navigate to the RMA request page in the Wareconn Customer Portal, follow these steps:
1. Log in to your Wareconn account by entering the username and password.
2. Once you are logged in, click on the "My Account" tab.
3. In the "My Account" page, click on the "Wareconn Customer Portal" tab.
4. In the "Wareconn Customer Portal" page, click on the "RMAs" tab.
5. In the "RMAs" page, click on the specific RMA you want to navigate to
CPU times: user 7min 31s, sys: 1.05 s, total: 7min 32s
Wall time: 4min 40s


In [24]:
# https://python.langchain.com/docs/modules/chains/popular/vector_db_qa
from langchain.prompts import PromptTemplate
prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Answer in Italian:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

In [None]:
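# Build a vector store from the chunks using the Instructor embeddings assigned to `embedding` above.
docsearch = Chroma.from_documents(texts, embedding)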

In [None]:
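chain_type_kwargs = {"prompt": PROMPT}
# RetrievalQA needs a LangChain LLM, so wrap the model file with langchain.llms.GPT4All (path assumed from the "Found model file" output above).
llm = GPT4All(model=f"/root/.cache/gpt4all/{llm_name}")
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.as_retriever(), chain_type_kwargs=chain_type_kwargs)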
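
With `chain_type="stuff"`, the retrieved chunks are placed together into the `{context}` slot of the prompt and the LLM is called once. A rough sketch of that assembly step in plain Python, assuming the chunks are joined with blank lines; `build_stuff_prompt` and the sample chunks are hypothetical and only illustrate the idea, they are not LangChain's implementation.

```python
PROMPT_TEMPLATE = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Answer in Italian:"""


def build_stuff_prompt(chunks, question, template=PROMPT_TEMPLATE, separator="\n\n"):
    """Join all retrieved chunks into one context string and fill the template."""
    context = separator.join(chunks)
    return template.format(context=context, question=question)


# Hypothetical retrieved chunks standing in for the retriever's results.
example_chunks = ["chunk A", "chunk B"]
example_prompt = build_stuff_prompt(
    example_chunks,
    "How can I navigate to the RMA request page in the Wareconn Customer Portal?",
)
print(example_prompt)
```

Because every chunk goes into a single prompt, the number of retrieved chunks times the chunk size has to stay within the model's context window.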