# LangChain + Pinecone
You will learn how to:

* Create embeddings from a document
* Save these embeddings to a Pinecone index
* Query the Pinecone index

Install all the dependencies listed in `requirements.txt` into a new virtual environment.

In [1]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.vectorstores import Pinecone
from PyPDF2 import PdfReader
import pinecone

from config import config



* Reading a PDF file from the path
* Creating and saving the content of the PDF file into an `output.txt` file

In [2]:
file_path = 'sample.pdf'

In [3]:
reader = PdfReader(file_path)
text = ''
for page in reader.pages:
    text += page.extract_text()

with open('output.txt', 'w') as file:
    file.write(text)

In [4]:
text

'Deepak Chopra\nThe Seven\nspiritual\nlaws\nof\nSuccessContents\nAcknowledgments ......................................................................................................... 3\nIntroduction .................................................................................................................. 4The law of pure potentiality ........................................................................................... 6The Law of Giving ...................................................................................................... 11\nThe law of \x93karma\x94 or cause and effect......................................................................... 15\nThe law of least effort .................................................................................................. 19The law of Intention and Desire .................................................................................... 23The law of Detachment ..............................................
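The extracted text above fuses some words (for example `SuccessContents`), likely because each page's text is appended with no separator between pages. A minimal sketch of one way to avoid that, assuming the per-page text is available as an iterable of strings; `join_pages` is a hypothetical helper, not part of PyPDF2:

```python
def join_pages(page_texts, separator="\n"):
    """Join per-page text with a separator, skipping empty or missing pages."""
    cleaned = [t.strip() for t in page_texts if t and t.strip()]
    return separator.join(cleaned)


# Toy page texts standing in for what page.extract_text() might return.
pages = ["The Seven spiritual laws of Success", "", "Contents"]
joined = join_pages(pages)
print(joined)
```

With the reader from the cell above, this could be called as `join_pages(page.extract_text() for page in reader.pages)`.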

* Loading every `.txt` file in the folder (the `**/*.txt` glob also matches `requirements.txt`, as the output below shows)

In [5]:
loader = DirectoryLoader(
    './',
    glob='**/*.txt',
    loader_cls=TextLoader
)

In [6]:
documents = loader.load()

In [8]:
documents

[Document(page_content='aiofiles==23.1.0\naiohttp==3.8.4\naiosignal==1.3.1\naltair==5.0.1\nanyio==3.7.1\nasttokens==2.2.1\nasync-timeout==4.0.2\nattrs==23.1.0\nautopep8==2.0.2\nbackcall==0.2.0\nbackoff==2.2.1\ncertifi==2023.5.7\ncharset-normalizer==3.2.0\nclick==8.1.4\nclickhouse-connect==0.6.6\ncoloredlogs==15.0.1\ncomm==0.1.3\ncontourpy==1.1.0\ncycler==0.11.0\ndataclasses-json==0.5.9\ndebugpy==1.6.7\ndecorator==5.1.1\ndnspython==2.3.0\nduckdb==0.8.1\nexceptiongroup==1.1.2\nexecuting==1.2.0\nfastapi==0.100.0\nffmpy==0.3.0\nfilelock==3.12.2\nflatbuffers==23.5.26\nfonttools==4.40.0\nfrozenlist==1.3.3\nfsspec==2023.6.0\ngradio==3.36.1\ngradio_client==0.2.7\ngreenlet==2.0.2\nh11==0.14.0\nhnswlib==0.7.0\nhttpcore==0.17.3\nhttptools==0.6.0\nhttpx==0.24.1\nhuggingface-hub==0.16.4\nhumanfriendly==10.0\nidna==3.4\nimportlib-metadata==6.8.0\nimportlib-resources==6.0.0\nipykernel==6.24.0\nipython==8.14.0\nipywidgets==8.0.7\njedi==0.18.2\nJinja2==3.1.2\njsonschema==4.18.0\njsonschema-specificatio
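The first document above is a pinned package list, which suggests the `**/*.txt` glob picked up `requirements.txt` alongside `output.txt`. The standard-library sketch below shows how such a glob behaves and how a narrower pattern excludes the extra file. `find_text_files` is a hypothetical helper, and the temporary directory only imitates this project's layout:

```python
import tempfile
from pathlib import Path


def find_text_files(root, pattern="**/*.txt"):
    """Return files under root matching the glob, as sorted relative paths."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    # Two files that both end in .txt, as in the notebook's folder.
    (Path(tmp) / "output.txt").write_text("book text", encoding="utf-8")
    (Path(tmp) / "requirements.txt").write_text("aiohttp==3.8.4", encoding="utf-8")
    broad = find_text_files(tmp)                 # matches both files
    narrow = find_text_files(tmp, "output.txt")  # matches only the book text

print(broad)
print(narrow)
```

Passing a narrower `glob` such as `'output.txt'` to `DirectoryLoader` should have a similar effect, on the assumption that the loader applies the pattern with `pathlib`-style globbing.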

* Split the documents into chunks of text

In [9]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1024,
    chunk_overlap=0
)

In [10]:
texts = text_splitter.split_documents(documents)

In [11]:
texts

[Document(page_content='aiofiles==23.1.0\naiohttp==3.8.4\naiosignal==1.3.1\naltair==5.0.1\nanyio==3.7.1\nasttokens==2.2.1\nasync-timeout==4.0.2\nattrs==23.1.0\nautopep8==2.0.2\nbackcall==0.2.0\nbackoff==2.2.1\ncertifi==2023.5.7\ncharset-normalizer==3.2.0\nclick==8.1.4\nclickhouse-connect==0.6.6\ncoloredlogs==15.0.1\ncomm==0.1.3\ncontourpy==1.1.0\ncycler==0.11.0\ndataclasses-json==0.5.9\ndebugpy==1.6.7\ndecorator==5.1.1\ndnspython==2.3.0\nduckdb==0.8.1\nexceptiongroup==1.1.2\nexecuting==1.2.0\nfastapi==0.100.0\nffmpy==0.3.0\nfilelock==3.12.2\nflatbuffers==23.5.26\nfonttools==4.40.0\nfrozenlist==1.3.3\nfsspec==2023.6.0\ngradio==3.36.1\ngradio_client==0.2.7\ngreenlet==2.0.2\nh11==0.14.0\nhnswlib==0.7.0\nhttpcore==0.17.3\nhttptools==0.6.0\nhttpx==0.24.1\nhuggingface-hub==0.16.4\nhumanfriendly==10.0\nidna==3.4\nimportlib-metadata==6.8.0\nimportlib-resources==6.0.0\nipykernel==6.24.0\nipython==8.14.0\nipywidgets==8.0.7\njedi==0.18.2\nJinja2==3.1.2\njsonschema==4.18.0\njsonschema-specificatio
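To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-width splitter. It is only a sketch: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as blank lines, newlines, and spaces, which this version ignores. `split_with_overlap` is a hypothetical name.

```python
def split_with_overlap(text, chunk_size, chunk_overlap=0):
    """Cut text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


no_overlap = split_with_overlap("abcdefghij", chunk_size=4)
with_overlap = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(no_overlap)
print(with_overlap)
```

With `chunk_overlap=0`, as in the notebook, chunks never share text; a small overlap can help when a sentence straddles a chunk boundary.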

In [12]:
embeddings = OpenAIEmbeddings(
    openai_api_key=config.OPENAI_API_KEY
)

* Initialize `Pinecone`

In [13]:
pinecone.init(
    api_key=config.PINECONE_API_KEY,
    environment=config.PINECONE_ENVIRONMENT,
)

In [14]:
index_name = 'test'

In [15]:
pinecone.list_indexes()

[]

In [16]:
pinecone.create_index(
    name=index_name,
    dimension=1536
)
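The `dimension=1536` passed to `create_index` has to match the length of the vectors the embedding model produces. 1536 is the vector length of OpenAI's `text-embedding-ada-002`, assumed here to be the model behind `OpenAIEmbeddings`. A small sketch of a sanity check that could run before uploading vectors; `check_dimension` is a hypothetical helper and the vectors are placeholders, not real embeddings:

```python
def check_dimension(vectors, expected_dim):
    """Return True if every vector has expected_dim entries, else raise ValueError."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"vector {i} has length {len(vec)}, expected {expected_dim}"
            )
    return True


INDEX_DIMENSION = 1536  # same value as passed to create_index above

# Placeholder vectors; real ones would come from the embedding model.
fake_vectors = [[0.0] * INDEX_DIMENSION, [0.1] * INDEX_DIMENSION]
ok = check_dimension(fake_vectors, INDEX_DIMENSION)
print(ok)
```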

* Embed the chunks and store them in the Pinecone index

In [17]:
Pinecone.from_documents(
    documents=texts,
    embedding=embeddings,
    index_name=index_name
)

<langchain.vectorstores.pinecone.Pinecone at 0x7ff5581712b0>

* Load the existing index

In [18]:
db = Pinecone.from_existing_index(
    index_name=index_name,
    embedding=embeddings
)

In [19]:
db

<langchain.vectorstores.pinecone.Pinecone at 0x7ff552a10730>
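Conceptually, querying a vector store means embedding the question and returning the stored chunks whose vectors are closest to it. The in-memory sketch below illustrates that idea with cosine similarity, which is assumed here as the metric. It is not how Pinecone is implemented, and the three-dimensional vectors are toy values rather than real embeddings:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k stored vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy (text, vector) pairs standing in for embedded chunks.
store = [
    ("law of giving", [1.0, 0.0, 0.0]),
    ("law of karma", [0.0, 1.0, 0.0]),
    ("law of least effort", [0.7, 0.7, 0.0]),
]
results = top_k([1.0, 0.1, 0.0], store, k=2)
print(results)
```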

* QnA chain without `memory`

In [20]:
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(temperature=0.0),
    chain_type="stuff",
    retriever=db.as_retriever()
)

In [21]:
result = qa.run('what are the different spiritual laws?')

In [22]:
print(result)

The text mentions four of the seven spiritual laws of success: 
1. The Law of Pure Potentiality: This law states that our true self is pure consciousness and pure potentiality. By aligning with this power, we can manifest anything we desire.
2. The Law of Giving: This law emphasizes the importance of giving and serving others. By giving, we open ourselves up to receiving abundance and creating positive energy.
3. The Law of Karma or Cause and Effect: This law states that every action we take has consequences. By being aware of our actions and making conscious choices, we can create positive outcomes.
4. The Law of Unity: This law highlights the underlying unity and interconnectedness of all life. By recognizing this unity, we can experience the divinity within ourselves and in everything around us.

The text does not mention the remaining three spiritual laws.
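The `"stuff"` chain type places all retrieved chunks into a single prompt and sends that prompt to the model in one call. Because the retriever returns only a handful of chunks, the model sees just part of the book, which is a plausible reason the answer above lists four laws rather than seven. A minimal sketch of the prompt assembly follows; the prompt wording and `build_stuff_prompt` are illustrative assumptions, not LangChain's actual template:

```python
def build_stuff_prompt(question, documents):
    """Concatenate all retrieved chunks into one context block above the question."""
    context = "\n\n".join(doc.strip() for doc in documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks standing in for what the retriever would return.
retrieved = ["Chunk about the Law of Giving.", "Chunk about the Law of Karma."]
prompt = build_stuff_prompt("what are the different spiritual laws?", retrieved)
print(prompt)
```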


* QnA chain with memory (the chat history is passed in on each call)

In [23]:
cqa = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(temperature=0.0),
    retriever=db.as_retriever()
)

In [24]:
chat_history = []
query = 'what are the different spiritual laws?'
result = cqa({'question': query, 'chat_history': chat_history})

In [25]:
result

{'question': 'what are the different spiritual laws?',
 'chat_history': [],
 'answer': 'The text mentions four of the seven spiritual laws of success: \n1. The Law of Pure Potentiality: This law states that our true self is pure consciousness and pure potentiality. By aligning with this power, we can manifest anything we desire.\n2. The Law of Giving: This law emphasizes the importance of giving and serving others. By giving, we open ourselves up to receiving abundance and creating positive energy.\n3. The Law of Karma or Cause and Effect: This law states that every action we take has consequences. By being aware of our actions and making conscious choices, we can create positive outcomes.\n4. The Law of Unity: This law highlights the underlying unity and interconnectedness of all life. By recognizing this unity, we can experience the divinity within ourselves and in everything around us.\n\nThe text does not mention the remaining three spiritual laws.'}

In [26]:
chat_history.append((query, result['answer']))
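The pattern in the cells above (call the chain, then append the `(question, answer)` pair to `chat_history`) can be wrapped in a small helper so the history is never forgotten. In this sketch `ask` is a hypothetical helper and `fake_chain` is a stand-in that only mimics the input and output shape of the `cqa` call shown above, so the example runs without an API key:

```python
def ask(chain, question, chat_history):
    """Call the chain with the running history, then record the new turn."""
    result = chain({"question": question, "chat_history": chat_history})
    chat_history.append((question, result["answer"]))
    return result["answer"]


def fake_chain(inputs):
    """Stand-in for the real chain: reports how many earlier turns it received."""
    turns = len(inputs["chat_history"])
    return {"answer": f"answer to {inputs['question']!r} after {turns} earlier turn(s)"}


history = []
first = ask(fake_chain, "what are the different spiritual laws?", history)
second = ask(fake_chain, "can you explain the first one?", history)
print(first)
print(second)
```

With the real chain, the call would be `ask(cqa, query, chat_history)`; the follow-up question can then refer back to "the first one" because the earlier turn is included in the history.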

In [27]:
query = 'can you explain the first one?'
result = cqa({'question': query, 'chat_history': chat_history})

In [28]:
result

{'question': 'can you explain the first one?',
 'chat_history': [('what are the different spiritual laws?',
   'The text mentions four of the seven spiritual laws of success: \n1. The Law of Pure Potentiality: This law states that our true self is pure consciousness and pure potentiality. By aligning with this power, we can manifest anything we desire.\n2. The Law of Giving: This law emphasizes the importance of giving and serving others. By giving, we open ourselves up to receiving abundance and creating positive energy.\n3. The Law of Karma or Cause and Effect: This law states that every action we take has consequences. By being aware of our actions and making conscious choices, we can create positive outcomes.\n4. The Law of Unity: This law highlights the underlying unity and interconnectedness of all life. By recognizing this unity, we can experience the divinity within ourselves and in everything around us.\n\nThe text does not mention the remaining three spiritual laws.')],
 'ans