In [1]:
from langchain_community.document_loaders import PyPDFLoader

# load() returns one Document per PDF page, with the page index in metadata.
loader = PyPDFLoader("Generative AI with LangChain.pdf")
data = loader.load()

In [2]:
data

[Document(metadata={'source': 'Generative AI with LangChain.pdf', 'page': 0}, page_content=''),
 Document(metadata={'source': 'Generative AI with LangChain.pdf', 'page': 1}, page_content='Generative AI with LangChain\nBuild large language model (LLM) apps with Python, \nChatGPT, and other LLMs\nBen Auffarth\nBIRMINGHAM—MUMBAI'),
 Document(metadata={'source': 'Generative AI with LangChain.pdf', 'page': 2}, page_content='Generative AI with LangChain\nCopyright © 2023 Packt Publishing\nAll rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in \nany form or by any means, without the prior written permission of the publisher, except in the case of brief \nquotations embedded in critical articles or reviews.\nEvery effort has been made in the preparation of this book to ensure the accuracy of the information \npresented. However, the information contained in this book is sold without warranty, either express or \nimplied. Neither the author,

In [3]:
len(data)

361

In [5]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# chunk_size is measured in characters; chunk_overlap is left at its default.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
docs = text_splitter.split_documents(data)

print("Total number of documents: ", len(docs))

Total number of documents:  499


In [6]:
docs[0]

Document(metadata={'source': 'Generative AI with LangChain.pdf', 'page': 1}, page_content='Generative AI with LangChain\nBuild large language model (LLM) apps with Python, \nChatGPT, and other LLMs\nBen Auffarth\nBIRMINGHAM—MUMBAI')
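
To build intuition for what `chunk_size` controls, here is a minimal sketch of fixed-width chunking with overlap in plain Python. It is a simplification, not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraph and line boundaries, and a short page like the title page above comes through as a single chunk. The function name `sliding_chunks` and the sizes used are illustrative assumptions.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    Hypothetical helper, not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
```

Each window repeats the last three characters of the previous one, which is the role overlap plays when a sentence straddles a chunk boundary.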

### Gemini API Key

##### Get a Gemini API key from: https://ai.google.dev/gemini-api/docs/api-key

##### Embedding models: https://python.langchain.com/v0.1/docs/integrations/text_embedding/

In [8]:
import os

from dotenv import find_dotenv, load_dotenv
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Load GOOGLE_API_KEY from a .env file; the lookup raises KeyError if it is missing.
_ = load_dotenv(find_dotenv())
google_api_key = os.environ["GOOGLE_API_KEY"]

# Embed a sample query and inspect the first five dimensions.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector = embeddings.embed_query("hello, world!")
vector[:5]

[0.05168594419956207,
 -0.030764883384108543,
 -0.03062233328819275,
 -0.02802734263241291,
 0.01813093200325966]
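
Embedding vectors are usually compared by how closely they point in the same direction. The sketch below computes cosine similarity in plain Python on made-up three-dimensional vectors; real vectors would come from `embeddings.embed_query` or `embeddings.embed_documents`. The helper name and the toy values are assumptions for illustration, and the metric a vector store actually applies depends on its configuration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (assumed values, not real embeddings).
query = [1.0, 0.0, 1.0]
close = [2.0, 0.0, 2.0]      # same direction as the query, different length
unrelated = [0.0, 1.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query, close))
print(cosine_similarity(query, unrelated))
```

The first score is close to 1 because vector length does not matter, only direction; the second is 0 because the vectors share no component.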

### Vector Database

In [9]:
# Reuse the embeddings object created earlier instead of constructing a second one.
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings)

In [10]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k":10})

retrieved_docs = retriever.invoke("What is new in Generative AI with LangChain?")

In [11]:
retrieved_docs

[Document(metadata={'page': 16, 'source': 'Generative AI with LangChain.pdf'}, page_content='Prefacexvi\nWhether you are a beginner or an experienced developer, this book will be a valuable resource \nfor anyone who wants to get the most out of LLMs and to stay ahead of the curve about LLMs \nand LangChain.\nWhat this book covers\nChapter 1, What Is Generative AI? , explains how generative AI has revolutionized the processing of \ntext, images, and video, with deep learning at its core. This chapter introduces generative models \nsuch as LLMs, detailing their technical underpinnings and transformative potential across various \nsectors. This chapter covers the theory behind these models, highlighting neural networks and \ntraining approaches, and the creation of human-like content. The chapter outlines the evolution \nof AI, Transformer architecture, text-to-image models like Stable Diffusion, and touches on sound \nand video applications.\nChapter 2, LangChain for LLM Apps, uncovers t

In [12]:
len(retrieved_docs)

10
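
Conceptually, `search_type="similarity"` with `k=10` asks for the ten chunks whose embeddings lie closest to the query embedding. The following brute-force sketch ranks a toy index that way in plain Python. It is only an illustration under assumed names (`normalize`, `top_k`) and made-up two-dimensional vectors; a real vector store generally relies on an index structure rather than scanning every vector.

```python
import heapq
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the k texts whose vectors are most similar to query_vec.

    index is a list of (text, vector) pairs. Hypothetical helper.
    """
    q = normalize(query_vec)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(vec))), text)
        for text, vec in index
    ]
    # nlargest returns the highest-scoring pairs in descending order.
    return [text for _, text in heapq.nlargest(k, scored)]


# Toy index (assumed values).
toy_index = [
    ("preface", [0.9, 0.1]),
    ("copyright", [0.1, 0.9]),
    ("chapter 1", [0.8, 0.3]),
    ("index", [0.0, 1.0]),
]
best = top_k([1.0, 0.0], toy_index, k=2)
print(best)
```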

### Connect to the LLM

In [13]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0.3, max_tokens=500)

In [14]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [15]:
question_answer_chain = create_stuff_documents_chain(llm, prompt)

rag_chain = create_retrieval_chain(retriever, question_answer_chain)
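
"Stuffing" here means concatenating every retrieved chunk into the `{context}` slot of the prompt before the model is called. The sketch below mimics that step with plain string formatting so the final prompt is easy to inspect. The helper names, the shortened template, and the `"\n\n"` separator are assumptions for illustration, not LangChain's implementation.

```python
def stuff_context(chunks: list[str], separator: str = "\n\n") -> str:
    """Join retrieved chunk texts into one context string (hypothetical helper)."""
    return separator.join(chunks)


def build_messages(system_template: str, question: str, chunks: list[str]) -> list[tuple[str, str]]:
    """Fill the {context} placeholder and pair each message with its role."""
    system_text = system_template.format(context=stuff_context(chunks))
    return [("system", system_text), ("human", question)]


# Shortened stand-in for the system prompt, with toy chunk texts.
template = "Answer using only this context.\n\n{context}"
messages = build_messages(
    template,
    "What does Chapter 1 cover?",
    ["Chapter 1 explains generative AI.", "Chapter 2 introduces LangChain."],
)
for role, text in messages:
    print(f"{role}: {text}")
```

Because every chunk lands in a single prompt, the combined size grows with both `k` and `chunk_size`, which is worth keeping in mind when choosing those values.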

In [16]:
response = rag_chain.invoke({"input": "What is new in Generative AI with LangChain?"})
print(response["answer"])

LangChain offers a framework to build more powerful applications by combining LLMs with other data sources and tools.  It addresses LLM limitations like outdated knowledge and lack of action capabilities.  The book also covers model integrations, building assistants, and chatbot creation.
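
The dictionary returned by a chain built with `create_retrieval_chain` also carries the retrieved documents under the `"context"` key, so the pages behind an answer can be listed. The sketch below uses a stand-in `FakeDocument` class and a made-up response so that it runs without an API key; with the real chain, `response["context"]` would be passed instead. `source_pages` is a hypothetical helper, and it adds 1 because the `page` metadata shown earlier starts at 0. The result is a position in the PDF, not the page number printed in the book.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Stand-in with the two attributes a LangChain Document exposes here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def source_pages(context_docs) -> list[int]:
    """Return sorted, de-duplicated 1-based PDF pages for the given documents."""
    return sorted(
        {doc.metadata["page"] + 1 for doc in context_docs if "page" in doc.metadata}
    )


# Made-up response in the same shape: "input", "context", "answer".
fake_response = {
    "input": "example question",
    "context": [
        FakeDocument("Preface text", {"page": 16}),
        FakeDocument("Chapter text", {"page": 40}),
        FakeDocument("More preface text", {"page": 16}),
    ],
    "answer": "example answer",
}
pages = source_pages(fake_response["context"])
print(pages)
```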

