Build a Simple LLM Application with LCEL (LangChain Expression Language)

In [125]:
!pip install langchain



In [126]:
# Install the OpenAI integration package (not used below; this notebook uses Gemini)

!pip install -qU langchain-openai

In [127]:
!pip install langchain_community



In [128]:
%pip install --upgrade --quiet  langchain-google-genai pillow

In [129]:
from langchain_google_genai import ChatGoogleGenerativeAI

In [130]:
from google.colab import userdata
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')

In [131]:
import os

In [132]:
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY

In [133]:
model = ChatGoogleGenerativeAI(model="gemini-1.5-pro")

In [134]:
model.invoke("what is a capital of india?").content

'The capital of India is **New Delhi**. \n'

In [135]:
from langchain_core.prompts import ChatPromptTemplate

In [136]:
system_template = "Translate the following into {language}:"

In [137]:
prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

In [138]:
prompt = prompt_template.invoke({"language": "italian", "text": "hi"})

In [139]:
prompt

ChatPromptValue(messages=[SystemMessage(content='Translate the following into italian:'), HumanMessage(content='hi')])

In [140]:
prompt.to_messages()

[SystemMessage(content='Translate the following into italian:'),
 HumanMessage(content='hi')]

In [141]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

Chain the prompt, model, and parser with the pipe (|) operator

In [142]:
chain = prompt_template | model | parser

In [143]:
chain.invoke({"language": "italian", "text": "hi"})

'Ciao! 😊 \n'
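
What the pipe does can be sketched in plain Python. The Step class and the fake_prompt, fake_model, and fake_parser objects below are hypothetical stand-ins written for illustration; they are not LangChain classes and this is not how LangChain implements its runnables. The sketch only shows the idea that `a | b` builds a new object whose invoke() runs a and passes the result to b.

```python
class Step:
    """Toy stand-in for a runnable (hypothetical, for illustration only)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` returns a new Step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for prompt_template, model, and parser.
fake_prompt = Step(lambda d: f"Translate the following into {d['language']}: {d['text']}")
fake_model = Step(lambda prompt: {"content": prompt.upper()})  # "model" just upper-cases
fake_parser = Step(lambda message: message["content"])

toy_chain = fake_prompt | fake_model | fake_parser
result = toy_chain.invoke({"language": "italian", "text": "hi"})
print(result)
```

The real chain has the same shape: the dict goes into the prompt template, the formatted prompt goes into the model, and the parser pulls the text out of the model's message.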

In [144]:
!pip install pypdf



In [145]:
from langchain_community.document_loaders import PyPDFLoader

In [146]:
loader = PyPDFLoader("/content/RAG.pdf")

In [147]:
pages = loader.load_and_split()

In [148]:
print(pages[0].page_content)

Retrieval-Augmented Generation for
Knowledge-Intensive NLP Tasks
Patrick Lewis†‡, Ethan Perez⋆,
Aleksandra Piktus†, Fabio Petroni†, Vladimir Karpukhin†, Naman Goyal†, Heinrich Küttler†,
Mike Lewis†, Wen-tau Yih†, Tim Rocktäschel†‡, Sebastian Riedel†‡, Douwe Kiela†
†Facebook AI Research;‡University College London;⋆New York University;
plewis@fb.com
Abstract
Large pre-trained language models have been shown to store factual knowledge
in their parameters, and achieve state-of-the-art results when ﬁne-tuned on down-
stream NLP tasks. However, their ability to access and precisely manipulate knowl-
edge is still limited, and hence on knowledge-intensive tasks, their performance
lags behind task-speciﬁc architectures. Additionally, providing provenance for their
decisions and updating their world knowledge remain open research problems. Pre-
trained models with a differentiable access mechanism to explicit non-parametric
memory have so far been only investigated for extractive downstream tas

In [149]:
len(pages)

30

In [150]:
# The CPU build is enough for an index of this size
!pip install faiss-cpu



In [152]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

In [153]:
from langchain_community.vectorstores import FAISS

In [154]:
# Create embeddings using a Google Generative AI model

gemini_embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

In [155]:
faiss_index = FAISS.from_documents(pages, embedding=gemini_embeddings)


In [156]:
docs = faiss_index.similarity_search("what is RAG?", k=2)

In [157]:
len(docs)

2

In [158]:
docs[0].page_content

'minimize the negative marginal log-likelihood of each target,∑\nj−logp(yj|xj)using stochastic\ngradient descent with Adam [ 28]. Updating the document encoder BERTdduring training is costly as\nit requires the document index to be periodically updated as REALM does during pre-training [ 20].\nWe do not ﬁnd this step necessary for strong performance, and keep the document encoder (and\nindex) ﬁxed, only ﬁne-tuning the query encoder BERT qand the BART generator.\n2.5 Decoding\nAt test time, RAG-Sequence and RAG-Token require different ways to approximate arg maxyp(y|x).\nRAG-Token The RAG-Token model can be seen as a standard, autoregressive seq2seq genera-\ntor with transition probability: p′\nθ(yi|x,y 1:i−1) =∑\nz∈top-k(p(·|x))pη(zi|x)pθ(yi|x,zi,y1:i−1)To\ndecode, we can plug p′\nθ(yi|x,y 1:i−1)into a standard beam decoder.\nRAG-Sequence For RAG-Sequence, the likelihood p(y|x)does not break into a conventional per-\ntoken likelihood, hence we cannot solve it with a single beam search.

In [159]:
docs[1].page_content

'Document 1 : his works are considered classics of American\nliterature ... His wartime experiences formed the basis for his novel\n”A Farewell to Arms” (1929) ...\nDocument 2 : ... artists of the 1920s ”Lost Generation” expatriate\ncommunity. His debut novel, ”The Sun Also Rises” , was published\nin 1926.\nBOS”\nTheSunAlsoRises”isa\nnovelbythis\nauthorof”A\nFarewellto\nArms”Doc 1\nDoc 2\nDoc 3\nDoc 4\nDoc 5Figure 2: RAG-Token document posterior p(zi|x,yi,y−i)for each generated token for input “Hem-\ningway" for Jeopardy generation with 5 retrieved documents. The posterior for document 1 is high\nwhen generating “A Farewell to Arms" and for document 2 when generating “The Sun Also Rises".\nTable 3: Examples from generation tasks. RAG models generate more speciﬁc and factually accurate\nresponses. ‘?’ indicates factually incorrect responses, * indicates partially correct responses.\nTask Input Model Generation\nMS-\nMARCOdeﬁne middle\nearBART?The middle ear is the part of the ear betwee
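
Neither of the two returned chunks defines RAG directly, which is a reminder that similarity_search ranks chunks by embedding distance, not by whether they answer the question. A minimal sketch of that ranking step, assuming Euclidean (L2) distance and using made-up 3-dimensional vectors in place of real embeddings. The toy_index contents, l2_distance, and top_k are hypothetical names for illustration, not part of FAISS or LangChain.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, indexed, k=2):
    """Return the texts of the k entries whose vectors are closest to query_vec."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Made-up (text, vector) pairs; real embeddings have hundreds of dimensions.
toy_index = [
    ("RAG combines retrieval with generation", [0.9, 0.1, 0.0]),
    ("Beam search decoding", [0.1, 0.8, 0.1]),
    ("Hemingway novels", [0.0, 0.1, 0.9]),
]

query_vec = [0.8, 0.2, 0.0]  # pretend embedding of "what is RAG?"
nearest = top_k(query_vec, toy_index, k=2)
print(nearest)
```

A brute-force scan like this is fine for 30 chunks; a vector index does the same job efficiently at larger scale.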

In [160]:
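# The number of results is passed through search_kwargs, not as a direct keyword
retriever = faiss_index.as_retriever(search_kwargs={"k": 3})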

In [161]:
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

In [162]:
prompt = ChatPromptTemplate.from_template(template)

In [163]:
from langchain_core.runnables import RunnablePassthrough

In [164]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

In [167]:
chain.invoke("what is RAG?")

'This document describes RAG, which stands for **Retrieval-Augmented Generation**, as a model for knowledge-intensive tasks in natural language processing. \n'
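
In the chain above, the retriever's output (a list of Document objects) goes straight into {context}, so the prompt presumably receives the documents' string representation, metadata included. A common refinement is to join only the page text before it reaches the prompt. The sketch below assumes a helper named format_docs and uses a FakeDocument dataclass as a hypothetical stand-in for LangChain's Document so that it runs without any API key.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    """Join the text of each retrieved chunk, separated by a blank line."""
    return "\n\n".join(doc.page_content for doc in docs)


sample = [
    FakeDocument("RAG pairs a retriever with a generator.", {"page": 0}),
    FakeDocument("The index is kept fixed during fine-tuning.", {"page": 5}),
]

context = format_docs(sample)
print(context)
```

In the notebook this would be wired in as {"context": retriever | format_docs, "question": RunnablePassthrough()}, which relies on LCEL wrapping a plain function as a runnable when it is piped after one.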