## LangChain 101 - Data Connection
<img src="images/LangChain101_DataConnection.png" width=800 height=500 />

In [1]:
from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader("./data/lobel-frog-toad.pdf")
pages = loader.load_and_split()

In [5]:
pg_no = 66  # index into the list of chunks returned by load_and_split(), not the PDF page number
print(f"Content:\n {pages[pg_no].page_content} \n metadata:\n {pages[pg_no].metadata}")

Content:
 “Oh, yes there will,” said Frog,
“because I have sent you a letter.”
“You have?” said Toad.
“What did you write in the letter?”
Frog said, “I wrote ‘Dear Toad, I am
glad that you are my best friend.
Your best friend, Frog.’”
“Oh,” said Toad, “that makes a very
good letter.”
Then Frog and Toad went out onto
the front porch to wait for the mail.
They sat there, feeling happy
together. 
 metadata:
 {'source': './data/lobel-frog-toad.pdf', 'page': 82}
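
Note that the list index (66) and the `page` value in the metadata (82) differ: `load_and_split()` returns text chunks rather than exactly one entry per PDF page, so positions in `pages` need not line up with page numbers. A minimal sketch of looking chunks up by their metadata instead follows. The `Chunk` class and the sample data are hypothetical stand-ins for illustration, not the real LangChain `Document` class or the contents of the PDF.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a document chunk (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


# Made-up sample data: list positions do not match the recorded page numbers.
chunks = [
    Chunk("first chunk", {"source": "sample.pdf", "page": 3}),
    Chunk("second chunk", {"source": "sample.pdf", "page": 7}),
    Chunk("third chunk", {"source": "sample.pdf", "page": 7}),
]


def chunks_on_page(chunks, page_number):
    """Return every chunk whose metadata records the given page number."""
    return [c for c in chunks if c.metadata.get("page") == page_number]


matches = chunks_on_page(chunks, 7)
print([c.page_content for c in matches])
```

The same filter should work on the real `pages` list, since each entry exposes `page_content` and `metadata` as shown in the output above.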


In [3]:
# Import the sentence-transformer embedding wrapper and the Chroma vector store
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

In [None]:
# create the open-source embedding function
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# load it into Chroma
db = Chroma.from_documents(pages, embedding_function)
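
Conceptually, the vector store keeps one embedding vector per chunk and, at query time, returns the chunks whose vectors are closest to the query vector. The sketch below illustrates that idea with cosine similarity in plain Python. The three-dimensional vectors and the sentences are made up for illustration (real embeddings from `all-MiniLM-L6-v2` are much longer), and the distance metric Chroma actually applies depends on its configuration, so treat this as an assumption-laden toy rather than a description of Chroma's internals.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "vector store": made-up texts mapped to made-up embedding vectors.
toy_store = {
    "Frog sent Toad a letter": [0.9, 0.1, 0.0],
    "Toad baked cookies": [0.1, 0.9, 0.0],
    "The garden needed rain": [0.0, 0.2, 0.9],
}


def top_k(query_vector, store, k=2):
    """Return the k texts whose vectors are most similar to the query vector."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# A made-up query vector that points mostly along the first dimension.
print(top_k([0.8, 0.2, 0.0], toy_store, k=1))
```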

In [5]:
# retriever
retriever = db.as_retriever()

In [8]:
# Initialize the OpenAI model
import configparser
import os

config = configparser.ConfigParser()
config.read('secret.ini')
# secret.ini is expected to contain:
# [OpenAI]
# openai_api_key = your_Open_AI_key

# Expose the OpenAI key as an environment variable
os.environ["OPENAI_API_KEY"] = config['OpenAI']['openai_api_key']
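
`ConfigParser.read()` silently skips files it cannot open, so a missing `secret.ini` only surfaces later as a `KeyError` on `config['OpenAI']`. A minimal sketch of a more defensive lookup is shown below. The helper name `load_api_key` is hypothetical, and the INI text is passed in as a string (using the placeholder value from the cell above) so the example runs without a file on disk.

```python
import configparser

# Same layout as secret.ini, with the placeholder value from the cell above.
SAMPLE_INI = """
[OpenAI]
openai_api_key = your_Open_AI_key
"""


def load_api_key(ini_text, section="OpenAI", option="openai_api_key"):
    """Return the configured key, or None if the section or option is missing."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get(section, option, fallback=None)


key = load_api_key(SAMPLE_INI)
if key is None:
    raise RuntimeError("openai_api_key not found in the [OpenAI] section")
print("key loaded:", key is not None)
```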


In [11]:
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

llm_model_openai = OpenAI()
# chain_type="stuff" places all retrieved chunks into a single prompt
qa_openAI = RetrievalQA.from_chain_type(llm=llm_model_openai, chain_type="stuff", retriever=retriever)

query = "Why did Frog send a letter to Toad?"
print(qa_openAI.run(query))

 Frog sent a letter to Toad to let him know that he is Frog's best friend.
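
The `"stuff"` chain type takes the chunks returned by the retriever and places all of them into one prompt for the model. The sketch below shows that idea in plain Python. The function name `build_stuff_prompt` and the prompt wording are illustrative assumptions, not LangChain's actual template, and the context strings are made up.

```python
def build_stuff_prompt(question, documents):
    """Concatenate every retrieved chunk into one context block, then append the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up chunks standing in for what a retriever might return.
retrieved = [
    "Frog wrote a letter to Toad.",
    "They waited on the front porch for the mail.",
]
prompt = build_stuff_prompt("Why did Frog send a letter to Toad?", retrieved)
print(prompt)
```

Because every chunk lands in the same prompt, this approach only fits when the retrieved text stays within the model's context window.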


In [9]:
# Switch the model to Vertex AI and rerun the same query
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'participant-sa-12-ghc-022_git.json'  # path to the service account JSON key

In [12]:
from langchain.llms import VertexAI

llm_model_vertexai = VertexAI()
qa = RetrievalQA.from_chain_type(llm=llm_model_vertexai, chain_type="stuff", retriever=retriever)

In [14]:
print(qa.run(query))

Frog sent a letter to Toad because he wanted to tell him that he was his best friend.
