## Summarize large documents using LangChain and Gemini

In [4]:
# Installing the necessary libraries

!pip install langchain
!pip install langchain-google-genai
!pip install langchain-community

Collecting langchain
  Downloading langchain-0.2.1-py3-none-any.whl (973 kB)
Collecting langchain-core<0.3.0,>=0.2.0 (from langchain)
  Downloading langchain_core-0.2.1-py3-none-any.whl (308 kB)
Collecting langchain-text-splitters<0.3.0,>=0.2.0 (from langchain)
  Downloading langchain_text_splitters-0.2.0-py3-none-any.whl (23 kB)
Collecting langsmith<0.2.0,>=0.1.17 (from langchain)
  Downloading langsmith-0.1.63-py3-none-any.whl (122 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain-core<0.3.0,>=0.2.0->langchain)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl (12 kB)
Collecting packaging<24.0,>=23.2 (from langcha

In [2]:
# Accessing the API keys
import os
import getpass

os.environ['GOOGLE_API_KEY'] = getpass.getpass('Gemini API Key:')

Gemini API Key:··········


In [5]:
# Importing the required libraries

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_community.document_loaders import WebBaseLoader
from langchain.schema import StrOutputParser
from langchain.schema.prompt_template import format_document
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.summarize import load_summarize_chain
from langchain.chains import MapReduceDocumentsChain, ReduceDocumentsChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain

In [6]:
# Loading the Data

loader = WebBaseLoader("https://blog.google/technology/ai/google-gemini-ai/#sundar-note")
docs = loader.load()

len(docs)

1

In [7]:
# Splitting the documents into chunks

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
data = text_splitter.split_documents(docs)

len(data)

42
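
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a minimal sketch. This is a simplification, not how `RecursiveCharacterTextSplitter` is implemented: the real splitter first tries to break on separators such as paragraphs, lines, and spaces, while the hypothetical helper `split_with_overlap` below just slides a fixed-size character window.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters of toy input
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=2)
for chunk in chunks:
    print(repr(chunk))
```

Each chunk begins with the last two characters of the previous one, so a sentence cut at a boundary still appears with a little context in the next chunk.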

In [8]:
# Initialize the Gemini API

from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.5, top_p=0.85)

In [26]:
# Creating a map_reduce chain
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=False)

In [27]:
import textwrap
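# Pass the chunks in `data` (not the single unsplit document in `docs`)
# so that the map step has separate pieces to summarize
output_summary = chain.run(data)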
wrapped_text = textwrap.fill(output_summary, width=100)
print(wrapped_text)

Google's Gemini AI model excels in various benchmarks, showcasing multimodal capabilities across
text, code, audio, images, and videos. It prioritizes responsibility and safety through
comprehensive evaluations and external partnerships. Gemini is being integrated into Google products
and will be accessible via APIs for developers and enterprise customers.
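
Conceptually, a map_reduce summary summarizes each chunk on its own and then summarizes the combined partial summaries. The sketch below shows only that control flow, under stated assumptions: `fake_llm` is a hypothetical stand-in for a model call (it keeps the first sentence), and `map_reduce_summarize` is an illustrative helper, not LangChain's implementation, which may also collapse intermediate summaries in several rounds.

```python
from typing import Callable

calls: list[str] = []  # records every text sent to the stand-in model


def fake_llm(text: str) -> str:
    """Hypothetical stand-in for an LLM call: keep only the first sentence."""
    calls.append(text)
    return text.split(".")[0].strip() + "."


def map_reduce_summarize(chunks: list[str], summarize: Callable[[str], str]) -> str:
    # Map step: summarize every chunk independently.
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce step: join the partial summaries and summarize them once more.
    combined = "\n".join(partial_summaries)
    return summarize(combined)


toy_chunks = [
    "Gemini is multimodal. It handles text and images.",
    "Safety is evaluated. External partners help.",
]
result = map_reduce_summarize(toy_chunks, fake_llm)
print(result)
print(f"model calls: {len(calls)}")
```

The number of model calls grows with the number of chunks (one per chunk plus the reduce call), which is the price paid for handling documents that do not fit into a single prompt.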


In [10]:
# Creating stuff chain

prompt_template = """Write a concise summary of the following:
"{text}"
CONCISE SUMMARY:"""

prompt = PromptTemplate.from_template(prompt_template)

In [31]:
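# Wrap the model and the prompt in an LLMChain
# (LLMChain is deprecated in this LangChain version and emits a deprecation warning)
llm_chain = LLMChain(llm=llm, prompt=prompt)
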

In [33]:
# document_variable_name must match the {text} placeholder in the prompt template
stuff_chain = StuffDocumentsChain(
    llm_chain=llm_chain,
    document_variable_name="text"
)

In [34]:
# Run the stuff chain on the unsplit document: all of its text goes into a single prompt
output_summary = stuff_chain.run(docs)

In [39]:
wrapped_text = textwrap.fill(output_summary,
                             width=200,
                             break_long_words=False,
                             replace_whitespace=False)
print(wrapped_text)

Google introduces Gemini, its most advanced AI model yet, with state-of-the-art performance across various benchmarks. Gemini is multimodal, flexible, and optimized for different sizes. It excels in
sophisticated reasoning, understanding text, images, audio, and advanced coding. Gemini is designed with responsibility and safety in mind, undergoing comprehensive safety evaluations and
incorporating dedicated safety classifiers. Google is rolling out Gemini Pro in products like Bard and Pixel and making it available via APIs for developers and enterprise customers. Gemini Ultra, the
most capable version, will be released after further safety checks and feedback from select users.
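
What "stuffing" amounts to can be sketched in a few lines: the text of every document is joined and placed into the `{text}` slot of the prompt, and the model is called once. The helper `build_stuff_prompt` and the toy `pages` list are hypothetical, and the blank-line separator is an assumption for illustration.

```python
prompt_template = """Write a concise summary of the following:
"{text}"
CONCISE SUMMARY:"""


def build_stuff_prompt(page_contents: list[str], separator: str = "\n\n") -> str:
    """Join all document texts and insert them into the single {text} placeholder."""
    stuffed = separator.join(page_contents)
    return prompt_template.format(text=stuffed)


pages = ["First document text.", "Second document text."]  # toy stand-ins for loaded documents
prompt_text = build_stuff_prompt(pages)
print(prompt_text)
```

Because everything goes into one prompt, this approach needs only one model call but works only while the combined text fits into the model's context window; beyond that, a map_reduce chain over chunks is the usual fallback.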
